Science.gov

Sample records for massively parallel ligation

  1. Validation of an Oligonucleotide Ligation Assay for Quantification of Human Immunodeficiency Virus Type 1 Drug-Resistant Mutants by Use of Massively Parallel Sequencing

    PubMed Central

    Beck, Ingrid A.; Deng, Wenjie; Payant, Rachel; Hall, Robert; Bumgarner, Roger E.; Mullins, James I.

    2014-01-01

    Global HIV treatment programs need sensitive and affordable tests to monitor HIV drug resistance. We compared mutant detection by the oligonucleotide ligation assay (OLA), an economical and simple test, to massively parallel sequencing. Nonnucleoside reverse transcriptase inhibitor (K103N, V106M, Y181C, and G190A) and lamivudine (M184V) resistance mutations were quantified in blood-derived plasma RNA and cell DNA specimens by OLA and 454 pyrosequencing. A median of 1,000 HIV DNA or RNA templates (range, 163 to 1,874 templates) from blood specimens collected in Mozambique (n = 60) and Kenya (n = 51) were analyzed at 4 codons in each sample (n = 441 codons assessed). Mutations were detected at 75 (17%) codons by OLA sensitive to 2.0%, at 71 codons (16%; P = 0.78) by pyrosequencing using a cutoff value of ≥2.0%, and at 125 codons (28%; P < 0.0001) by pyrosequencing sensitive to 0.1%. Discrepancies between the assays included 15 codons with mutant concentrations of ∼2%, one at 8.8% by pyrosequencing and not detected by OLA, and one at 69% by OLA and not detected by pyrosequencing. The latter two cases were associated with genetic polymorphisms in the regions critical for ligation of the OLA probes and pyrosequencing primers, respectively. Overall, mutant concentrations quantified by the two methods correlated well across the codons tested (R2 > 0.8). Repeat pyrosequencing of 13 specimens showed reproducible detection of 5/24 mutations at <2% and 6/6 at ≥2%. In conclusion, the OLA and pyrosequencing performed similarly in the quantification of nonnucleoside reverse transcriptase inhibitor and lamivudine mutations present at >2% of the viral population in clinical specimens. While pyrosequencing was more sensitive, detection of mutants below 2% was not reproducible. PMID:24740080

  2. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
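
    As a concrete illustration of the decomposition idea, here is a minimal Python sketch (block decomposition rather than the paper's scattered decomposition on a hypercube, and not Montry's code): each worker independently sieves its own slice of [2, N] using the shared base primes up to sqrt(N), so all slices can be processed in parallel.

    ```python
    # Block-decomposed parallel Sieve of Eratosthenes (illustrative sketch).
    from math import isqrt
    from multiprocessing import Pool

    def base_primes(limit):
        """Serial sieve for the small primes up to sqrt(N)."""
        mark = bytearray([1]) * (limit + 1)
        mark[0:2] = b"\x00\x00"
        for p in range(2, isqrt(limit) + 1):
            if mark[p]:
                mark[p * p :: p] = bytearray(len(mark[p * p :: p]))
        return [p for p in range(2, limit + 1) if mark[p]]

    def sieve_block(args):
        lo, hi, primes = args                  # sieve [lo, hi) using the base primes
        mark = bytearray([1]) * (hi - lo)
        for p in primes:
            start = max(p * p, (lo + p - 1) // p * p)
            mark[start - lo :: p] = bytearray(len(mark[start - lo :: p]))
        return [lo + i for i, m in enumerate(mark) if m]

    if __name__ == "__main__":
        N, nworkers = 1_000_000, 8
        primes = base_primes(isqrt(N))
        step = (N - 1 + nworkers - 1) // nworkers
        blocks = [(2 + w * step, min(2 + (w + 1) * step, N + 1), primes)
                  for w in range(nworkers)]
        with Pool(nworkers) as pool:
            total = sum(len(b) for b in pool.map(sieve_block, blocks))
        print(total)                           # 78498 primes below 1,000,000
    ```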

  3. Massively Parallel Genetics.

    PubMed

    Shendure, Jay; Fields, Stanley

    2016-06-01

    Human genetics has historically depended on the identification of individuals whose natural genetic variation underlies an observable trait or disease risk. Here we argue that new technologies now augment this historical approach by allowing the use of massively parallel assays in model systems to measure the functional effects of genetic variation in many human genes. These studies will help establish the disease risk of both observed and potential genetic variants and help overcome the problem of "variants of uncertain significance."

  4. Massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Fung, L. W. (Inventor)

    1983-01-01

    An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array is described. It comprises a large number (e.g., 16,384 in a 128 x 128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered parallel data, including spatial translation by shifting or sliding of bits vertically or horizontally to neighboring processing elements.

  5. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper describes recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques, presenting rendering algorithms, developed for massively parallel processors (MPPs), for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations for these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  6. Non-invasive prenatal diagnosis of fetal aneuploidies using massively parallel sequencing-by-ligation and evidence that cell-free fetal DNA in the maternal plasma originates from cytotrophoblastic cells.

    PubMed

    Faas, Brigitte H W; de Ligt, Joep; Janssen, Irene; Eggink, Alex J; Wijnberger, Lia D E; van Vugt, John M G; Vissers, Lisenka; Geurts van Kessel, Ad

    2012-06-01

    Blood plasma of pregnant women contains circulating cell-free fetal DNA (ccffDNA), originating from the placenta. The use of this DNA for non-invasive detection of fetal aneuploidies using massively parallel sequencing (MPS)-by-synthesis has been proven previously. Sequence performance may, however, depend on the MPS platform and therefore we have explored the possibility of multiplex MPS-by-ligation, using the Applied Biosystems SOLiD™ 4 system. DNA isolated from plasma samples from 52 pregnant women, carrying normal or aneuploid fetuses, was sequenced in multiplex runs of 4, 8 or 16 samples simultaneously. The sequence reads were mapped to the human reference genome and quantified according to their genomic location. In the case of a fetal aneuploidy, the number of reads of the aberrant chromosome is expected to be higher or lower than in normal reference samples. To determine this statistically, Z-scores per chromosome were calculated as described previously, with thresholds for aneuploidies set at > +3.0 and < -3.0 for chromosomal over- or underrepresentation, respectively. All samples from fetal aneuploidies yielded Z-scores outside the thresholds for the aberrant chromosomes, with no false negative or positive results. Full-blown fetal aneuploidies can thus be reliably detected in maternal plasma using a multiplex MPS-by-ligation approach. Furthermore, the results obtained with a sample from a pregnancy with 45,X in the cytotrophoblastic cell layer and 46,XX in the mesenchymal core cells show that ccffDNA originates from the cytotrophoblastic cell layer. Discrepancies between the genetic constitution of this cell layer and the fetus itself are well known, and therefore care should be taken when translating results to the fetus itself.
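
    The per-chromosome Z-score test described above is easy to sketch; here is a toy numpy version with synthetic read counts (not the authors' pipeline; the ±3.0 thresholds are the ones quoted in the abstract, and the ~2% excess roughly corresponds to a trisomic fetus at about 4% fetal fraction).

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # 20 euploid reference samples and one test sample: reads per chromosome (24 = 22 + X + Y)
    refs = rng.normal(1_000_000, 3_000, size=(20, 24))     # includes run-to-run variability
    test = rng.poisson(1_000_000, 24).astype(float)
    test[20] *= 1.02                                       # ~2% overrepresentation of chr21

    ref_frac = refs / refs.sum(axis=1, keepdims=True)      # per-chromosome read fractions
    test_frac = test / test.sum()
    z = (test_frac - ref_frac.mean(axis=0)) / ref_frac.std(axis=0, ddof=1)

    flagged = np.flatnonzero((z > 3.0) | (z < -3.0))
    print(flagged)                                         # expected: [20], i.e. chromosome 21
    ```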

  7. Fast, Massively Parallel Data Processors

    NASA Technical Reports Server (NTRS)

    Heaton, Robert A.; Blevins, Donald W.; Davis, ED

    1994-01-01

    The proposed fast, massively parallel data processor contains an 8x16 array of processing elements with an efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on an "X" interconnection grid and with external memory via a high-capacity input/output bus. This approach to conditional operation nearly doubles the speed of various arithmetic operations.

  8. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Parallel imaging was originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas; the term now also includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts, relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and the changes in RF methodology needed to construct massively parallel MRI detector arrays, and show some state-of-the-art examples of highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  9. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  10. Multigrid on massively parallel architectures

    SciTech Connect

    Falgout, R D; Jones, J E

    1999-09-17

    The scalable implementation of multigrid methods for machines with several thousands of processors is investigated. Parallel performance models are presented for three different structured-grid multigrid algorithms, and a description is given of how these models can be used to guide implementation. Potential pitfalls are illustrated when moving from moderate-sized parallelism to large-scale parallelism, and results are given from existing multigrid codes to support the discussion. Finally, the use of mixed programming models is investigated for multigrid codes on clusters of SMPs.

  11. Massively Parallel Computing: A Sandia Perspective

    SciTech Connect

    Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

    1999-05-06

    The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant breakthroughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.

  12. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template-based parallel checkpoint save for a massively parallel supercomputer system using a parallel variation of the rsync protocol and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in storage and was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high-speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional lossless data compression algorithms to further reduce the overall checkpoint size.
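
    The block-comparison idea can be sketched in a few lines of Python (hypothetical block size and hash; the patented wire format and broadcast machinery are not reproduced): each node ships only the blocks whose checksums differ from the stored template, compressed losslessly.

    ```python
    import hashlib, zlib

    BLOCK = 64 * 1024                                   # hypothetical block size

    def block_sums(data):
        return [hashlib.sha1(data[i:i + BLOCK]).digest() for i in range(0, len(data), BLOCK)]

    def delta_checkpoint(state, template_sums):
        """Map block index -> compressed block, for blocks differing from the template."""
        delta = {}
        for off in range(0, len(state), BLOCK):
            blk, i = state[off:off + BLOCK], off // BLOCK
            if i >= len(template_sums) or hashlib.sha1(blk).digest() != template_sums[i]:
                delta[i] = zlib.compress(blk)           # lossless compression, per the abstract
        return delta

    def restore(template, delta):
        """Rebuild a node's checkpoint from the template plus its delta blocks."""
        blocks = [bytearray(template[i:i + BLOCK]) for i in range(0, len(template), BLOCK)]
        for i, comp in delta.items():
            blocks[i][:] = zlib.decompress(comp)
        return b"".join(blocks)

    template = bytes(1_000_000)                         # previously stored template checkpoint
    state = bytearray(template)
    state[123_456:123_466] = b"x" * 10                  # this node diverged in one block
    delta = delta_checkpoint(bytes(state), block_sums(template))
    assert restore(template, delta) == bytes(state)
    print(f"{len(delta)} of {len(block_sums(template))} blocks shipped")
    ```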

  13. Massively Parallel Algorithms for Solution of Schrodinger Equation

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Barhen, Jacob; Toomerian, Nikzad

    1994-01-01

    In this paper, massively parallel algorithms for the solution of the Schrödinger equation are developed. Our results clearly indicate that the Crank-Nicolson method, in addition to its excellent numerical properties, is also highly suitable for massively parallel computation.
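
    For reference, the Crank-Nicolson update solves (I + i*dt*H/2) psi_new = (I - i*dt*H/2) psi_old at each step; because this is a Cayley transform of a Hermitian H, it is unconditionally stable and norm-preserving. A minimal serial 1D sketch (hbar = m = 1, dense solve, illustrative harmonic potential; the paper's parallel algorithms are not reproduced here):

    ```python
    import numpy as np

    nx, dx, dt = 400, 0.05, 0.002
    x = dx * np.arange(nx)
    V = 0.5 * (x - x.mean()) ** 2                     # illustrative harmonic potential

    # Tridiagonal Hamiltonian H = -(1/2) d^2/dx^2 + V (finite differences)
    H = (np.diag(1.0 / dx**2 + V)
         + np.diag(-0.5 / dx**2 * np.ones(nx - 1), 1)
         + np.diag(-0.5 / dx**2 * np.ones(nx - 1), -1))

    A = np.eye(nx) + 0.5j * dt * H                    # implicit (left-hand) operator
    B = np.eye(nx) - 0.5j * dt * H                    # explicit (right-hand) operator

    psi = np.exp(-(x - x.mean() - 2.0) ** 2) * np.exp(1j * x)   # Gaussian wave packet
    psi = psi / np.linalg.norm(psi)

    for _ in range(100):
        psi = np.linalg.solve(A, B @ psi)             # one Crank-Nicolson step

    print(abs(np.linalg.norm(psi) - 1.0) < 1e-10)     # norm conserved -> True
    ```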

  14. Aerodynamic simulation on massively parallel systems

    NASA Technical Reports Server (NTRS)

    Haeuser, Jochem; Simon, Horst D.

    1992-01-01

    This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design to be used for advanced transportation in commercial applications as well as in space flight. The discussion clearly shows that massively parallel systems are the only alternative that is both cost effective and able to provide the TeraFlops needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations, which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary-fitted grids and the fully unstructured tetrahedral grids. A fully structured grid best represents the flow physics, while the unstructured grid gives the best geometrical flexibility. The multiblock grid employed is structured within a block, but completely unstructured at the block level. While a completely unstructured grid is not straightforward to parallelize, the multiblock grid mentioned above is inherently parallel, in particular for multiple instruction multiple datastream (MIMD) machines. In this paper, guidelines are provided for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system. Some preliminary results for an 8K CM2 machine are also mentioned. The code run is the two-dimensional grid generation module of Grid, which is a general two- and three-dimensional grid generation code for complex geometries. A system of nonlinear Poisson equations is solved. This code is also a good test case for complex fluid dynamics codes, since the same data structures are used. All systems provided good speedups, but

  15. Massively-Parallel Dislocation Dynamics Simulations

    SciTech Connect

    Cai, W; Bulatov, V V; Pierce, T G; Hiratani, M; Rhee, M; Bartelt, M; Tang, M

    2003-06-18

    Prediction of the plastic strength of single crystals based on the collective dynamics of dislocations has been a challenge for computational materials science for a number of years. The difficulty lies in the inability of the existing dislocation dynamics (DD) codes to handle a sufficiently large number of dislocation lines, in order to be statistically representative and to reproduce experimentally observed microstructures. A new massively-parallel DD code is developed that is capable of modeling million-dislocation systems by employing thousands of processors. We discuss the general aspects of this code that make such large scale simulations possible, as well as a few initial simulation results.

  16. Plasma simulation using the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Lin, C. S.; Thring, A. L.; Koga, J.; Janetzke, R. W.

    1987-01-01

    Two-dimensional electrostatic simulation codes using the particle-in-cell model are developed on the Massively Parallel Processor (MPP). The conventional plasma simulation procedure that computes electric fields at particle positions by means of a gridded system is found to be inefficient on the MPP. The MPP simulation code is thus based on a gridless system in which particles are assigned to processing elements and electric fields are computed directly via the Discrete Fourier Transform. Currently, the gridless model on the MPP in two dimensions is about nine times slower than the gridded system on the CRAY X-MP, not counting I/O time. However, the gridless system on the MPP can be improved by incorporating faster I/O between the staging memory and the Array Unit and a more efficient procedure for taking floating-point sums over processing elements. The initial results suggest that parallel processors have the potential for performing large-scale plasma simulations.
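
    The gridless field solve is easy to illustrate in 1D (a numpy sketch, not the MPP code): charge-density Fourier modes are accumulated directly from the particle positions and the field is synthesized back at each particle, with no deposition grid in between; the O(N x K) particle-mode products are the part that maps naturally onto the processing elements.

    ```python
    import numpy as np

    L, N, K = 2 * np.pi, 10_000, 32           # domain length, particles, retained modes
    rng = np.random.default_rng(1)
    xp = rng.uniform(0, L, N)                  # particle positions
    q = -L / N                                 # electron macro-charge; uniform ion background

    k = 2 * np.pi / L * np.arange(1, K + 1)    # k = 0 is cancelled by the ion background
    phase = np.exp(-1j * np.outer(k, xp))      # (K, N) table of particle-mode phases
    rho_k = (q / L) * phase.sum(axis=1)        # density modes, straight from particles

    E_k = rho_k / (1j * k)                     # Poisson in Fourier space: ik E_k = rho_k
    E = 2.0 * np.real(E_k @ np.conj(phase))    # field synthesized back at each particle
    ```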

  17. Seismic imaging on massively parallel computers

    SciTech Connect

    Ober, C.C.; Oldfield, R.A.; Womble, D.E.; Mosher, C.C.

    1997-07-01

    A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Pre-stack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar-wave equation using finite differences. Current industry computational capabilities are insufficient for the application of finite-difference, 3-D, prestack, depth-migration algorithms. High performance computers and state-of-the-art algorithms and software are required to meet this need. As part of an ongoing ACTI project funded by the US Department of Energy, the authors have developed a finite-difference, 3-D prestack, depth-migration code for massively parallel computer systems. The goal of this work is to demonstrate that massively parallel computers (thousands of processors) can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite-difference, prestack, depth migration practical for oil and gas exploration.

  18. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them both to take advantage of current HPC architectures and to prepare efficiently for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  19. Massive hybrid parallelism for fully implicit multiphysics

    SciTech Connect

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W.

    2013-07-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them both to take advantage of current HPC architectures and to prepare efficiently for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  20. Parallel Impurity Spreading During Massive Gas Injection

    NASA Astrophysics Data System (ADS)

    Izzo, V. A.

    2016-10-01

    Extended-MHD simulations of disruption mitigation in DIII-D demonstrate that both pre-existing islands (locked-modes) and plasma rotation can significantly influence toroidal spreading of impurities following massive gas injection (MGI). Given the importance of successful disruption mitigation in ITER and the large disparity in device parameters, empirical demonstrations of disruption mitigation strategies in present tokamaks are insufficient to inspire unreserved confidence for ITER. Here, MHD simulations elucidate how impurities injected as a localized jet spread toroidally and poloidally. Simulations with large pre-existing islands at the q = 2 surface reveal that the magnetic topology strongly influences the rate of impurity spreading parallel to the field lines. Parallel spreading is largely driven by rapid parallel heat conduction, and is much faster at low order rational surfaces, where a short parallel connection length leads to faster thermal equilibration. Consequently, the presence of large islands, which alter the connection length, can slow impurity transport; but the simulations also show that the appearance of a 4/2 harmonic of the 2/1 mode, which breaks up the large islands, can increase the rate of spreading. This effect is seen both for simulations with spontaneously growing and directly imposed 4/2 modes. Given the prevalence of locked-modes as a cause of disruptions, understanding the effect of large islands is of particular importance. Simulations with and without islands also show that rotation can alter impurity spreading, even reversing the predominant direction of spreading, which is toward the high-field-side in the absence of rotation. Given expected differences in rotation for ITER vs. DIII-D, rotation effects are another important consideration when extrapolating experimental results. Work supported by US DOE under DE-FG02-95ER54309.

  1. Computational chaos in massively parallel neural networks

    NASA Technical Reports Server (NTRS)

    Barhen, Jacob; Gulati, Sandeep

    1989-01-01

    A fundamental issue which directly impacts the scalability of current theoretical neural network models to massively parallel embodiments, in both software as well as hardware, is the inherent and unavoidable concurrent asynchronicity of emerging fine-grained computational ensembles and the possible emergence of chaotic manifestations. Previous analyses attributed dynamical instability to the topology of the interconnection matrix, to parasitic components or to propagation delays. However, researchers have observed the existence of emergent computational chaos in a concurrently asynchronous framework, independent of the network topology. The researchers present a methodology enabling the effective asynchronous operation of large-scale neural networks. Necessary and sufficient conditions guaranteeing concurrent asynchronous convergence are established in terms of contracting operators. Lyapunov exponents are computed formally to characterize the underlying nonlinear dynamics. Simulation results are presented to illustrate network convergence to the correct results, even in the presence of large delays.

  2. A nanofluidic system for massively parallel PCR

    NASA Astrophysics Data System (ADS)

    Brenan, Colin; Morrison, Tom; Roberts, Douglas; Hurley, James

    2008-02-01

    Massively parallel nanofluidic systems are lab-on-a-chip devices where solution phase biochemical and biological analyses are implemented in high density arrays of nanoliter holes micro-machined in a thin platen. Polymer coatings make the interior surfaces of the holes hydrophilic and the exterior surface of the platen hydrophobic for precise and accurate self-metered loading of liquids into each hole without cross-contamination. We have created a "nanoplate" based on this concept, equivalent in performance to standard microtiter plates, having 3072 thirty-three nanoliter holes in a stainless steel platen the dimensions of a microscope slide. We report on the performance of this device for PCR-based single nucleotide polymorphism (SNP) genotyping or quantitative measurement of gene expression by real-time PCR in applications ranging from plant and animal diagnostics, agricultural genetics and human disease research.

  3. Broadband monitoring simulation with massively parallel processors

    NASA Astrophysics Data System (ADS)

    Trubetskov, Mikhail; Amotchkina, Tatiana; Tikhonravov, Alexander

    2011-09-01

    Modern efficient optimization techniques, namely needle optimization and gradual evolution, enable one to design optical coatings of any type. Moreover, these techniques allow one to obtain multiple solutions with close spectral characteristics. It is important, therefore, to develop software tools that can allow one to choose a practically optimal solution from a wide variety of possible theoretical designs. A practically optimal solution provides the highest production yield when the optical coating is manufactured. Computational manufacturing is a low-cost tool for choosing a practically optimal solution. The theory of probability predicts that reliable production yield estimations require many hundreds or even thousands of computational manufacturing experiments. As a result, reliable estimation of the production yield may require too much computational time. The most time-consuming operation is calculation of the discrepancy function used by a broadband monitoring algorithm. This function is formed by a sum of terms over a wavelength grid. These terms can be computed simultaneously in different threads, which opens great opportunities for parallelization. Multi-core and multi-processor systems can provide speedups of several times. Additional potential for further acceleration of computations is connected with using Graphics Processing Units (GPU). A modern GPU consists of hundreds of massively parallel processors and is capable of performing floating-point operations efficiently.
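
    The structure being exploited is simply that the discrepancy function is a sum of independent per-wavelength terms; a minimal thread-pool sketch follows (the spectral model is a stand-in, not a thin-film computation; on a GPU each term would map to one of the many cores).

    ```python
    from concurrent.futures import ThreadPoolExecutor
    import math

    wavelengths = [400e-9 + i * 0.1e-9 for i in range(5001)]
    target = 0.5                                       # hypothetical target transmittance

    def computed_spectrum(wl):
        """Placeholder for the coating's computed spectral characteristic at wl."""
        return 0.5 + 0.05 * math.sin(2e7 * wl)

    def term(wl):
        return (computed_spectrum(wl) - target) ** 2   # one independent term per wavelength

    with ThreadPoolExecutor(max_workers=8) as pool:
        discrepancy = sum(pool.map(term, wavelengths))
    ```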

  4. Parallel molecular dynamics: Communication requirements for massively parallel machines

    NASA Astrophysics Data System (ADS)

    Taylor, Valerie E.; Stevens, Rick L.; Arnold, Kathryn E.

    1995-05-01

    Molecular mechanics and dynamics are becoming widely used to perform simulations of molecular systems from large-scale computations of materials to the design and modeling of drug compounds. In this paper we address two major issues: a good decomposition method that can take advantage of future massively parallel processing systems for modest-sized problems in the range of 50,000 atoms and the communication requirements needed to achieve 30 to 40% efficiency on MPPs. We analyzed a scalable benchmark molecular dynamics program executing on the Intel Touchstone Delta parallelized with an interaction decomposition method. Using a validated analytical performance model of the code, we determined that for an MPP with a four-dimensional mesh topology and 400 MHz processors the communication startup time must be at most 30 clock cycles and the network bandwidth must be at least 2.3 GB/s. This configuration results in 30 to 40% efficiency of the MPP for a problem with 50,000 atoms executing on 50,000 processors.

  5. Multiplexed microsatellite recovery using massively parallel sequencing.

    PubMed

    Jennings, T N; Knaus, B J; Mullins, T D; Haig, S M; Cronn, R C

    2011-11-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356,958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5 M (USD).
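
    The screening step lends itself to a compact sketch (hypothetical repeat-length thresholds; a production pipeline would also check read quality and flanking sequence): flag reads containing a di- or trinucleotide motif repeated enough times, excluding homopolymer runs.

    ```python
    import re

    DI = re.compile(r"([ACGT]{2})\1{4,}")     # a 2-mer repeated >= 5 times (>= 10 bp)
    TRI = re.compile(r"([ACGT]{3})\1{3,}")    # a 3-mer repeated >= 4 times (>= 12 bp)

    def find_ssrs(read):
        """Return (motif, repeat tract, position) for candidate microsatellites."""
        hits = []
        for pattern in (DI, TRI):
            for m in pattern.finditer(read):
                if len(set(m.group(1))) > 1:  # drop homopolymers such as AAAA...
                    hits.append((m.group(1), m.group(0), m.start()))
        return hits

    read = "TTG" + "AC" * 6 + "GG" + "CAG" * 4 + "TT"
    print(find_ssrs(read))   # [('AC', 'ACACACACACAC', 3), ('CAG', 'CAGCAGCAGCAG', 17)]
    ```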

  6. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).

  7. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  8. Experimental free-space optical network for massively parallel computers

    NASA Astrophysics Data System (ADS)

    Araki, S.; Kajita, M.; Kasahara, K.; Kubota, K.; Kurihara, K.; Redmond, I.; Schenfeld, E.; Suzaki, T.

    1996-03-01

    A free-space optical interconnection scheme is described for massively parallel processors based on the interconnection-cached network architecture. The optical network operates in a circuit-switching mode. Combined with a packet-switching operation among the circuit-switched optical channels, a high-bandwidth, low-latency network for massively parallel processing results. The design and assembly of a 64-channel experimental prototype is discussed, and operational results are presented.

  9. Scan line graphics generation on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1988-01-01

    Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel, and their results are applied to the Z buffer in large groups. Performing the pixel value calculations, balancing the load across the processors, and applying the results to the Z buffer efficiently in parallel all require special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.
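
    The grouped-update pattern can be mimicked with array operations (a numpy sketch, not the MPP's sort-based virtual routing): depths for a whole batch of fragments are computed as arrays "in parallel", then folded into the Z buffer in one conflict-safe step.

    ```python
    import numpy as np

    H, W = 256, 256
    zbuf = np.full((H, W), np.inf)              # far plane everywhere

    # Hypothetical batch of shaded fragments: pixel coordinates and depths
    rng = np.random.default_rng(2)
    ys = rng.integers(0, H, 10_000)
    xs = rng.integers(0, W, 10_000)
    depth = rng.uniform(0.0, 1.0, 10_000)

    # Grouped Z-buffer update: keep the nearest fragment at each pixel, correct
    # even when several fragments in the same batch land on the same (y, x).
    np.minimum.at(zbuf, (ys, xs), depth)
    ```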

  10. Advances in Domain Mapping of Massively Parallel Scientific Computations

    SciTech Connect

    Leland, Robert W.; Hendrickson, Bruce A.

    2015-10-01

    One of the most important concerns in parallel computing is the proper distribution of workload across processors. For most scientific applications on massively parallel machines, the best approach to this distribution is to employ data parallelism; that is, to break the data structures supporting a computation into pieces and then to assign those pieces to different processors. Collectively, these partitioning and assignment tasks comprise the domain mapping problem.

  11. IMPAIR: massively parallel deconvolution on the GPU

    NASA Astrophysics Data System (ADS)

    Sherry, Michael; Shearer, Andy

    2013-02-01

    The IMPAIR software is a high throughput image deconvolution tool for processing large out-of-core datasets of images, varying from large images with spatially varying PSFs to large numbers of images with spatially invariant PSFs. IMPAIR implements a parallel version of the tried and tested Richardson-Lucy deconvolution algorithm regularised via a custom wavelet thresholding library. It exploits the inherently parallel nature of the convolution operation to achieve quality results on consumer grade hardware: through the NVIDIA Tesla GPU implementation, the multi-core OpenMP implementation, and the cluster computing MPI implementation of the software. IMPAIR aims to address the problem of parallel processing in both top-down and bottom-up approaches: by managing the input data at the image level, and by managing the execution at the instruction level. These combined techniques will lead to a scalable solution with minimal resource consumption and maximal load balancing. IMPAIR is being developed as both a stand-alone tool for image processing, and as a library which can be embedded into non-parallel code to transparently provide parallel high throughput deconvolution.
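
    For reference, the core iteration that IMPAIR parallelises is the classic Richardson-Lucy multiplicative update; here is a serial numpy sketch for a spatially invariant, normalised PSF and float images (the wavelet-thresholding regulariser and the GPU/OpenMP/MPI machinery are omitted).

    ```python
    import numpy as np
    from scipy.signal import fftconvolve

    def richardson_lucy(observed, psf, iters=25, eps=1e-12):
        """est <- est * (psf_mirror * (observed / (psf * est))), '*' denoting convolution."""
        est = np.full_like(observed, observed.mean())
        psf_mirror = psf[::-1, ::-1]
        for _ in range(iters):
            blurred = fftconvolve(est, psf, mode="same")
            ratio = observed / np.maximum(blurred, eps)   # guard against division by zero
            est = est * fftconvolve(ratio, psf_mirror, mode="same")
        return est
    ```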

  12. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  13. Massively parallel sequencing: the next big thing in genetic medicine.

    PubMed

    Tucker, Tracy; Marra, Marco; Friedman, Jan M

    2009-08-01

    Massively parallel sequencing has reduced the cost and increased the throughput of genomic sequencing by more than three orders of magnitude, and it seems likely that costs will fall and throughput improve even more in the next few years. Clinical use of massively parallel sequencing will provide a way to identify the cause of many diseases of unknown etiology through simultaneous screening of thousands of loci for pathogenic mutations and by sequencing biological specimens for the genomic signatures of novel infectious agents. In addition to providing these entirely new diagnostic capabilities, massively parallel sequencing may also replace arrays and Sanger sequencing in clinical applications where they are currently being used. Routine clinical use of massively parallel sequencing will require higher accuracy, better ways to select genomic subsets of interest, and improvements in the functionality, speed, and ease of use of data analysis software. In addition, substantial enhancements in laboratory computer infrastructure, data storage, and data transfer capacity will be needed to handle the extremely large data sets produced. Clinicians and laboratory personnel will require training to use the sequence data effectively, and appropriate methods will need to be developed to deal with the incidental discovery of pathogenic mutations and variants of uncertain clinical significance. Massively parallel sequencing has the potential to transform the practice of medical genetics and related fields, but the vast amount of personal genomic data produced will increase the responsibility of geneticists to ensure that the information obtained is used in a medically and socially responsible manner.

  14. Contact-impact simulations on massively parallel SIMD supercomputers

    SciTech Connect

    Plaskacz, E.J.; Belytschko, T.; Chiang, H.Y.

    1992-01-01

    The implementation of explicit finite element methods with contact-impact on massively parallel SIMD computers is described. The basic parallel finite element algorithm employs an exchange process which minimizes interprocessor communication at the expense of redundant computations and storage. The contact-impact algorithm is based on the pinball method in which compatibility is enforced by preventing interpenetration on spheres embedded in elements adjacent to surfaces. The enhancements to the pinball algorithm include a parallel assembled surface normal algorithm and a parallel detection of interpenetrating pairs. Some timings with and without contact-impact are given.

  15. Staging memory for massively parallel processor

    NASA Technical Reports Server (NTRS)

    Batcher, Kenneth E. (Inventor)

    1988-01-01

    The invention herein relates to a computer organization capable of rapidly processing extremely large volumes of data. A staging memory is provided having a main stager portion consisting of a large number of memory banks which are accessed in parallel to receive, store, and transfer data words simultaneous with each other. Substager portions interconnect with the main stager portion to match input and output data formats with the data format of the main stager portion. An address generator is coded for accessing the data banks for receiving or transferring the appropriate words. Input and output permutation networks arrange the lineal order of data into and out of the memory banks.

  16. Solving unstructured grid problems on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1990-01-01

    A highly parallel graph mapping technique that enables one to efficiently solve unstructured grid problems on massively parallel computers is presented. Many implicit and explicit methods for solving discretized partial differential equations require each point in the discretization to exchange data with its neighboring points every time step or iteration. The cost of this communication can negate the high performance promised by massively parallel computing. To eliminate this bottleneck, the graph of the irregular problem is mapped into the graph representing the interconnection topology of the computer such that the sum of the distances that the messages travel is minimized. It is shown that using the heuristic mapping algorithm significantly reduces the communication time compared to a naive assignment of processes to processors.

  17. Laparoscopic splenectomy for massive splenomegaly: technical aspects of initial ligation of splenic artery and extraction without hand-assisted technique.

    PubMed

    Trelles, Nelson; Gagner, Michel; Pomp, Alfons; Parikh, Manish

    2008-06-01

    A 37-year-old man was referred for massive splenomegaly. In November 2005, he was diagnosed with non-Hodgkin's B-cell lymphoma in the setting of splenomegaly and thrombocytopenia. His laboratory results showed a coagulopathy owing to lupus anticoagulant. A computed tomography scan showed a 36 x 26 x 11 cm spleen and a prominent and sinuous splenic artery. The authors performed a laparoscopic splenectomy with an initial ligation of the splenic artery. The patient tolerated the procedure well and was discharged home on the fourth postoperative day in stable condition. Discussed in this paper is the safety and feasibility of the minimally invasive approach in massive splenomegaly.

  18. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  19. Development of massively parallel quantum chemistry program SMASH

    SciTech Connect

    Ishimura, Kazuya

    2015-12-31

    A massively parallel program for quantum chemistry calculations, SMASH, was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program development simple. The speed-up of the B3LYP energy calculation for (C₁₅₀H₃₀)₂ with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  20. Solving mazes with memristors: A massively parallel approach

    NASA Astrophysics Data System (ADS)

    Pershin, Yuriy V.; di Ventra, Massimiliano

    2011-10-01

    Solving mazes is not just a fun pastime: They are prototype models in several areas of science and technology. However, when maze complexity increases, their solution becomes cumbersome and very time consuming. Here, we show that a network of memristors—resistors with memory—can solve such a nontrivial problem quite easily. In particular, maze solving by the network of memristors occurs in a massively parallel fashion since all memristors in the network participate simultaneously in the calculation. The result of the calculation is then recorded into the memristors’ states and can be used and/or recovered at a later time. Furthermore, the network of memristors finds all possible solutions in multiple-solution mazes and sorts out the solution paths according to their length. Our results demonstrate not only the application of memristive networks to the field of massively parallel computing, but also an algorithm to solve mazes, which could find applications in different fields.
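
    The principle can be reproduced in the linear-resistor limit (an idealized numpy sketch, not a simulation of the memristive dynamics): every corridor is a unit resistor, a unit current is driven from entrance to exit, and a single Kirchhoff solve lets the whole network "compute" at once; in a loop-free maze the solution path is exactly the set of corridors carrying the current.

    ```python
    import numpy as np

    maze = ["#########",
            "#S# #   #",
            "# # # # #",
            "#     #E#",
            "#########"]

    cells = [(r, c) for r, row in enumerate(maze) for c, ch in enumerate(row) if ch != "#"]
    idx = {cell: i for i, cell in enumerate(cells)}
    edges = [(idx[(r, c)], idx[(r + dr, c + dc)])
             for (r, c) in cells for dr, dc in ((0, 1), (1, 0))
             if (r + dr, c + dc) in idx]

    n = len(cells)
    lap = np.zeros((n, n))                      # Kirchhoff (graph Laplacian) matrix
    for u, v in edges:                          # unit conductance per corridor
        lap[u, u] += 1; lap[v, v] += 1
        lap[u, v] -= 1; lap[v, u] -= 1

    s = next(i for (r, c), i in idx.items() if maze[r][c] == "S")
    e = next(i for (r, c), i in idx.items() if maze[r][c] == "E")

    b = np.zeros(n); b[s] = 1.0                 # inject 1 A at S; ground and extract at E
    keep = [i for i in range(n) if i != e]
    phi = np.zeros(n)
    phi[keep] = np.linalg.solve(lap[np.ix_(keep, keep)], b[keep])

    # Corridors on the S-E path carry the full unit current; dead ends carry none.
    on_path = {i for u, v in edges if abs(phi[u] - phi[v]) > 0.5 for i in (u, v)}
    print(sorted(cells[i] for i in on_path))
    ```

    In a memristive network, the same currents would progressively lower the resistance of the path corridors, recording the solution in the device states as the abstract describes.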

  1. Massively parallel sequencing, a new method for detecting adventitious agents.

    PubMed

    Onions, David; Kolman, John

    2010-05-01

    There has been an upsurge of interest in developing new veterinary and human vaccines and, in turn, this has involved the development of new mammalian and insect cell substrates. Excluding adventitious agents from these cells can be problematic, particularly for cells derived from species with limited virological investigation. Massively parallel sequencing is a powerful new method for the identification of viruses and other adventitious agents, without prior knowledge of the nature of the agent. We have developed methods using random priming to detect viruses in the supernatants from cell substrates or in virus seed stocks. Using these methods we have recently discovered a new parvovirus in bovine serum. When applied to sequencing the transcriptome, massively parallel sequencing can reveal latent or silent infections. Enormous amounts of data are generated in this process, usually between 100 and 400 Mbp. Consequently, sophisticated bioinformatic algorithms are required to analyse and verify virus targets.

  2. Nearest Neighbor Search Applications for the Terasys Massively Parallel Workstation.

    DTIC Science & Technology

    1996-08-01

    [OCR fragments of the report's front matter and table of contents; the recoverable content indicates that the report, by Eric W. Johnson, describes nearest neighbor search applications for the Terasys massively parallel workstation, including placing sample records and training records in the Terasys, with acknowledgments to Harold E. Conn for Terasys programming and test runs.]

  3. Massively parallel Wang-Landau sampling on multiple GPUs

    NASA Astrophysics Data System (ADS)

    Yin, Junqi; Landau, D. P.

    2012-08-01

    Wang-Landau sampling is implemented on the Graphics Processing Unit (GPU) with the Compute Unified Device Architecture (CUDA). Performances on three different GPU cards, including the new generation Fermi architecture card, are compared with that on a Central Processing Unit (CPU). The parameters for massively parallel Wang-Landau sampling are tuned in order to achieve fast convergence. For simulations of the water cluster systems, we obtain an average of over 50 times speedup for a given workload.
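
    A serial reference version of the random walk being accelerated (a tiny 2D Ising lattice so the sketch runs quickly; in the GPU version many such walkers run concurrently and their estimates are combined): propose single-spin flips, accept with probability min(1, g(E_old)/g(E_new)), and refine ln g until the modification factor is small.

    ```python
    import numpy as np

    Lside = 4
    N = Lside * Lside
    rng = np.random.default_rng(3)
    spins = rng.choice([-1, 1], size=(Lside, Lside))

    def total_energy(s):
        return -int((s * np.roll(s, 1, 0)).sum() + (s * np.roll(s, 1, 1)).sum())

    # Allowed energies of the periodic 2D Ising model: -2N..2N in steps of 4,
    # with -2N + 4 and 2N - 4 unreachable.
    allowed = [E for E in range(-2 * N, 2 * N + 1, 4) if E not in (-2 * N + 4, 2 * N - 4)]
    levels = {E: i for i, E in enumerate(allowed)}
    lng = np.zeros(len(levels))                  # running estimate of ln g(E)
    hist = np.zeros(len(levels))
    lnf, E = 1.0, total_energy(spins)

    while lnf > 1e-4:
        for _ in range(10_000):
            i, j = rng.integers(Lside, size=2)
            nb = (spins[(i + 1) % Lside, j] + spins[(i - 1) % Lside, j]
                  + spins[i, (j + 1) % Lside] + spins[i, (j - 1) % Lside])
            dE = 2 * spins[i, j] * nb
            if np.log(rng.random()) < lng[levels[E]] - lng[levels[E + dE]]:
                spins[i, j] *= -1                # accept with prob min(1, g(E)/g(E'))
                E += dE
            lng[levels[E]] += lnf                # refine ln g and the visit histogram
            hist[levels[E]] += 1
        visited = hist[hist > 0]
        if visited.min() > 0.8 * visited.mean(): # histogram flat enough: reduce f
            hist[:] = 0
            lnf /= 2                             # i.e. f -> sqrt(f) in log space

    lng -= lng[levels[-2 * N]] - np.log(2)       # normalise: ground-state degeneracy is 2
    print(lng[levels[0]])                        # estimated ln g(E = 0)
    ```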

  4. A massively parallel corpus: the Bible in 100 languages.

    PubMed

    Christodouloupoulos, Christos; Steedman, Mark

    We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.

  5. Massively parallel Wang Landau sampling on multiple GPUs

    SciTech Connect

    Yin, Junqi; Landau, D. P.

    2012-01-01

    Wang Landau sampling is implemented on the Graphics Processing Unit (GPU) with the Compute Unified Device Architecture (CUDA). Performances on three different GPU cards, including the new generation Fermi architecture card, are compared with that on a Central Processing Unit (CPU). The parameters for massively parallel Wang Landau sampling are tuned in order to achieve fast convergence. For simulations of the water cluster systems, we obtain an average of over 50 times speedup for a given workload.

  6. Development of a Massively Parallel NOGAPS Forecast Model

    DTIC Science & Technology

    2016-06-07

    to exploit massively parallel processor (MPP), distributed memory computer architectures. Future increases in computer power from MPPs will allow...substantial increases in model resolution, more realistic physical processes, and more sophisticated data assimilation methods, all of which will...immediate objective of the project is to redesign the model's numerical algorithms and data structures to allow efficient execution on MPP

  7. TSE computers - A means for massively parallel computations

    NASA Technical Reports Server (NTRS)

    Strong, J. P., III

    1976-01-01

    A description is presented of hardware concepts for building a massively parallel processing system for two-dimensional data. The processing system is to use logic arrays of 128 x 128 elements which perform over 16 thousand operations simultaneously. Attention is given to image data, logic arrays, basic image logic functions, a prototype negator, an interleaver device, image logic circuits, and an image memory circuit.

  8. 3D seismic imaging on massively parallel computers

    SciTech Connect

    Womble, D.E.; Ober, C.C.; Oldfield, R.

    1997-02-01

    The ability to image complex geologies such as salt domes in the Gulf of Mexico and thrusts in mountainous regions is a key to reducing the risk and cost associated with oil and gas exploration. Imaging these structures, however, is computationally expensive. Datasets can be terabytes in size, and the processing time required for the multiple iterations needed to produce a velocity model can take months, even with the massively parallel computers available today. Some algorithms, such as 3D, finite-difference, prestack, depth migration remain beyond the capacity of production seismic processing. Massively parallel processors (MPPs) and algorithms research are the tools that will enable this project to provide new seismic processing capabilities to the oil and gas industry. The goals of this work are to (1) develop finite-difference algorithms for 3D, prestack, depth migration; (2) develop efficient computational approaches for seismic imaging and for processing terabyte datasets on massively parallel computers; and (3) develop a modular, portable, seismic imaging code.

  9. Requirements for supercomputing in energy research: The transition to massively parallel computing

    SciTech Connect

    Not Available

    1993-02-01

    This report discusses: The emergence of a practical path to TeraFlop computing and beyond; requirements of energy research programs at DOE; implementation: supercomputer production computing environment on massively parallel computers; and implementation: user transition to massively parallel computing.

  10. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    NASA Technical Reports Server (NTRS)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  11. Massively parallel processing of remotely sensed hyperspectral images

    NASA Astrophysics Data System (ADS)

    Plaza, Javier; Plaza, Antonio; Valencia, David; Paz, Abel

    2009-08-01

    In this paper, we develop several parallel techniques for hyperspectral image processing that have been specifically designed to be run on massively parallel systems. The techniques developed cover the three relevant areas of hyperspectral image processing: 1) spectral mixture analysis, a popular approach to characterizing mixed pixels in hyperspectral data, addressed in this work via an efficient implementation of a morphological algorithm for the automatic identification of pure spectral signatures, or endmembers, from the input data; 2) supervised classification of hyperspectral data using multi-layer perceptron neural networks with back-propagation learning; and 3) automatic target detection in the hyperspectral data using orthogonal subspace projection concepts. The scalability of the proposed parallel techniques is investigated using Barcelona Supercomputing Center's MareNostrum facility, one of the most powerful supercomputers in Europe.
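
    Of the three areas, the target-detection stage is the most compact to illustrate. Below is a numpy sketch of an orthogonal-subspace-projection (OSP) detector on synthetic data (U and d are assumed background and target signatures, not values from the paper); the per-pixel independence is what makes the workload embarrassingly parallel.

    ```python
    import numpy as np

    def osp_scores(cube, U, d):
        """cube: (pixels, bands); U: (bands, k) background; d: (bands,) target signature."""
        P = np.eye(U.shape[0]) - U @ np.linalg.pinv(U)   # projector onto U's orthogonal complement
        return cube @ (P @ d)                            # one independent score per pixel

    rng = np.random.default_rng(5)
    bands, k, npix = 50, 3, 10_000
    U = rng.random((bands, k))                           # assumed background endmembers
    d = rng.random(bands)                                # assumed target signature

    abund = rng.random((npix, k + 1))
    abund[:, -1] *= rng.random(npix) < 0.01              # target present in ~1% of pixels
    cube = abund @ np.vstack([U.T, d])                   # linear mixing model

    scores = osp_scores(cube, U, d)                      # background pixels score ~0
    print((scores > 0.5 * scores.max()).sum(), "candidate target pixels")
    ```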

  12. Bit-parallel arithmetic in a massively-parallel associative processor

    NASA Technical Reports Server (NTRS)

    Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.

    1992-01-01

    A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m²) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.
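
    The O(m)-cycle claim is easy to mimic in software with a bit-slice layout (a sketch of the idea, not the proposed VLSI design): bit i of every word is packed into one machine word (here, one Python integer), so a ripple-carry add takes m bitwise steps regardless of how many operands are processed simultaneously.

    ```python
    M = 16                                   # operand width in bits

    def to_slices(words):
        """slices[i] holds bit i of every word (word j at bit position j)."""
        return [sum(((w >> i) & 1) << j for j, w in enumerate(words)) for i in range(M)]

    def add_slices(a, b):
        out, carry = [], 0
        for i in range(M):                   # m cycles, independent of the word count
            out.append(a[i] ^ b[i] ^ carry)
            carry = (a[i] & b[i]) | (carry & (a[i] ^ b[i]))
        return out

    def from_slices(slices, nwords):
        return [sum(((slices[i] >> j) & 1) << i for i in range(M)) for j in range(nwords)]

    xs, ys = [3, 500, 41_000], [10, 12, 17_000]
    print(from_slices(add_slices(to_slices(xs), to_slices(ys)), 3))   # [13, 512, 58000]
    ```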

  13. Routing performance analysis and optimization within a massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  14. Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1991-01-01

    This evaluation of designs for ordered radix-2 decimation-in-frequency FFT algorithms on massively parallel hypercube multiprocessors focuses on reducing communication, which dominates the overall computation time. To this end, the ordering and computational phases of the FFT are combined, in conjunction with sequence-to-processor maps that reduce communication. Two orderings, 'standard' and 'cyclic', in which the order of the transform is the same as that of the input sequence, can be implemented with ease on the Connection Machine (where orderings are determined by geometries and priorities). A parallel method for trigonometric coefficient computation is presented which employs neither trigonometric functions nor interprocessor communication.

  15. A biconjugate gradient type algorithm on massively parallel architectures

    NASA Technical Reports Server (NTRS)

    Freund, Roland W.; Hochbruck, Marlis

    1991-01-01

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal proposed a novel BCG-type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation of QMR is presented, based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products except for one can be computed in parallel at the end of each block; in the standard Lanczos process, by contrast, inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.

  16. A massively parallel fractional step solver for incompressible flows

    SciTech Connect

    Houzeaux, G. Vazquez, M. Aubry, R. Cela, J.M.

    2009-09-20

    This paper presents a parallel implementation of fractional step solvers for the incompressible Navier-Stokes equations using an algebraic approach. Under this framework, predictor-corrector and incremental projection schemes are seen as sub-classes of the same class, making their differences and similarities apparent. An additional advantage of this approach is that it sets a common basis for a parallelization strategy, which can be extended to other splitting techniques or to compressible flows. The predictor-corrector scheme consists of solving the momentum equation and a modified 'continuity' equation (namely a simple iteration for the pressure Schur complement) consecutively in order to converge to the monolithic solution, thus avoiding fractional errors. The incremental projection scheme, on the other hand, solves only one iteration of the predictor-corrector per time step and adds a correction equation to fulfill mass conservation. As shown in the paper, these two schemes are very well suited to massively parallel implementation. In fact, compared with monolithic schemes, simpler solvers and preconditioners can be used to solve the non-symmetric momentum equations (GMRES, Bi-CGSTAB) and the symmetric continuity equation (CG, Deflated CG). This gives the algorithm good speedup properties. The implementation of the mesh partitioning technique is presented, as well as the parallel performance and speedups for thousands of processors.
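
    A schematic of the predictor-corrector loop, with random sparse matrices standing in for an actual Navier-Stokes discretization (an illustrative sketch under loose assumptions, not the paper's solver), alternates a Bi-CGSTAB momentum solve with a CG pressure-Schur update until the divergence constraint is met:

      import numpy as np
      from scipy.sparse import identity, random as sprandom
      from scipy.sparse.linalg import bicgstab, cg

      # Synthetic saddle-point system: A u + G p = f, D u = 0.
      nu, npr = 40, 10
      A = (identity(nu) * 10 + sprandom(nu, nu, 0.1, random_state=3)).tocsr()
      G = sprandom(nu, npr, 0.3, random_state=4).tocsr()   # discrete gradient
      D = G.T.tocsr()                                      # discrete divergence
      f = np.random.default_rng(3).random(nu)

      # Approximate pressure Schur complement: S ~ D diag(A)^-1 G (symmetric).
      S = (D.multiply(1.0 / A.diagonal()) @ G).tocsr()

      p = np.zeros(npr)
      for it in range(100):
          u, _ = bicgstab(A, f - G @ p)     # momentum predictor (non-symmetric)
          dp, _ = cg(S, D @ u)              # pressure correction (symmetric)
          p += dp
          if np.linalg.norm(D @ u) < 1e-4:  # mass conservation (loose tolerance)
              break
      print(it, np.linalg.norm(D @ u))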

  17. Ordered fast fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1989-01-01

    Design alternatives for ordered fast Fourier transform (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end, the ordering and computational phases of the FFT were combined, and sequence-to-processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely, standard-order and A-order, which can be implemented with equal ease on the Connection Machine, where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions, which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.
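
    To make the transmission counts concrete (a worked example, not taken from the abstract): for N = 2^20 sequence elements on P = 2^10 processors, i.e. r = 20 and d = 10, a standard-order FFT needs d + r/2 + 1 = 10 + 10 + 1 = 21 parallel transmissions, while an A-order transform needs 2d - r/2 = 20 - 10 = 10, which is indeed r - d + 1 = 11 fewer.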

  18. Learning Quantitative Sequence-Function Relationships from Massively Parallel Experiments

    NASA Astrophysics Data System (ADS)

    Atwal, Gurinder S.; Kinney, Justin B.

    2016-03-01

    A fundamental aspect of biological information processing is the ubiquity of sequence-function relationships—functions that map the sequence of DNA, RNA, or protein to a biochemically relevant activity. Most sequence-function relationships in biology are quantitative, but only recently have experimental techniques for effectively measuring these relationships been developed. The advent of such "massively parallel" experiments presents an exciting opportunity for the concepts and methods of statistical physics to inform the study of biological systems. After reviewing these recent experimental advances, we focus on the problem of how to infer parametric models of sequence-function relationships from the data produced by these experiments. Specifically, we retrace and extend recent theoretical work showing that inference based on mutual information, not the standard likelihood-based approach, is often necessary for accurately learning the parameters of these models. Closely connected with this result is the emergence of "diffeomorphic modes"—directions in parameter space that are far less constrained by data than likelihood-based inference would suggest. Analogous to Goldstone modes in physics, diffeomorphic modes arise from an arbitrarily broken symmetry of the inference problem. An analytically tractable model of a massively parallel experiment is then described, providing an explicit demonstration of these fundamental aspects of statistical inference. This paper concludes with an outlook on the theoretical and computational challenges currently facing studies of quantitative sequence-function relationships.
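
    As a toy illustration of the mutual-information-based inference the abstract advocates, the sketch below computes a crude plug-in estimate of the mutual information between a model's predicted activities and noisy measurements; the histogram estimator and Gaussian data are illustrative simplifications (real analyses must also handle finite-sample bias).

      import numpy as np

      def mutual_information(x, y, bins=20):
          # Plug-in estimate (in bits) from a joint 2-D histogram.
          pxy, _, _ = np.histogram2d(x, y, bins=bins)
          pxy = pxy / pxy.sum()
          px, py = pxy.sum(axis=1), pxy.sum(axis=0)
          nz = pxy > 0
          return float((pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])).sum())

      rng = np.random.default_rng(4)
      prediction = rng.normal(size=100000)                          # model output
      measurement = prediction + rng.normal(scale=0.5, size=100000)
      print(f"I(model; data) = {mutual_information(prediction, measurement):.2f} bits")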

  19. Massively parallel density functional calculations for thousands of atoms: KKRnano

    NASA Astrophysics Data System (ADS)

    Thiess, A.; Zeller, R.; Bolten, M.; Dederichs, P. H.; Blügel, S.

    2012-06-01

    Applications of existing precise electronic-structure methods based on density functional theory are typically limited to the treatment of about 1000 inequivalent atoms, which leaves unresolved many open questions in material science, e.g., on complex defects, interfaces, dislocations, and nanostructures. KKRnano is a new massively parallel linear scaling all-electron density functional algorithm in the framework of the Korringa-Kohn-Rostoker (KKR) Green's-function method. We conceptualized, developed, and optimized KKRnano for large-scale applications of many thousands of atoms without compromising on the precision of a full-potential all-electron method, i.e., it is a method without any shape approximation of the charge density or potential. A key element of the new method is the iterative solution of the sparse linear Dyson equation, which we parallelized atom by atom, across energy points in the complex plane and for each spin degree of freedom using the message passing interface standard, followed by a lower-level OpenMP parallelization. This hybrid four-level parallelization allows for an efficient use of up to 100000 processors on the latest generation of supercomputers. The iterative solution of the Dyson equation is significantly accelerated, employing preconditioning techniques making use of coarse-graining principles expressed in a block-circulant preconditioner. In this paper, we will describe the important elements of this new algorithm, focusing on the parallelization and preconditioning and showing scaling results for NiPd alloys up to 8192 atoms and 65536 processors. At the end, we present an order-N algorithm for large-scale simulations of metallic systems, making use of the nearsighted principle of the KKR Green's-function approach by introducing a truncation of the electron scattering to a local cluster of atoms, the size of which is determined by the requested accuracy. By exploiting this algorithm, we show linear scaling calculations of more
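
    The multi-level parallelization over atoms, energy points, and spins can be sketched with MPI communicator splitting. The mpi4py sketch below uses a hypothetical 4 x 2 x 2 rank layout and a toy energy reduction; it is not KKRnano's actual code, and the lower-level OpenMP stage is omitted.

      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()
      N_ATOMS, N_ENERGY, N_SPIN = 4, 2, 2            # hypothetical layout: 16 ranks
      assert comm.Get_size() == N_ATOMS * N_ENERGY * N_SPIN

      atom = rank % N_ATOMS
      energy = (rank // N_ATOMS) % N_ENERGY
      spin = rank // (N_ATOMS * N_ENERGY)

      # Ranks sharing one (energy, spin) slice exchange data between atoms ...
      atom_comm = comm.Split(color=energy * N_SPIN + spin, key=atom)
      # ... while ranks handling the same atom reduce over energy points.
      energy_comm = comm.Split(color=atom * N_SPIN + spin, key=energy)

      local = float(energy)                          # stand-in contribution
      total = energy_comm.allreduce(local)           # integrate over energies
      if rank == 0:
          print("toy energy-integrated value:", total)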

  20. Automation of Molecular-Based Analyses: A Primer on Massively Parallel Sequencing

    PubMed Central

    Nguyen, Lan; Burnett, Leslie

    2014-01-01

    Recent advances in genetics have been enabled by new genetic sequencing techniques called massively parallel sequencing (MPS) or next-generation sequencing. Through the ability to sequence in parallel hundreds of thousands to millions of DNA fragments, the cost and time required for sequencing has dramatically decreased. There are a number of different MPS platforms currently available and being used in Australia. Although they differ in the underlying technology involved, their overall processes are very similar: DNA fragmentation, adaptor ligation, immobilisation, amplification, sequencing reaction and data analysis. MPS is being used in research, translational and increasingly now also in clinical settings. Common applications include sequencing of whole genomes, whole exomes or targeted genes for disease-causing gene discovery, genetic diagnosis and targeted cancer therapy. Even though the revolution that is occurring with MPS is exciting due to its increasing use, improving and emerging technologies and new applications, significant challenges still exist. Particularly challenging issues are the bioinformatics required for data analysis, interpretation of results and the ethical dilemma of ‘incidental findings’. PMID:25336762

  1. Representing and computing regular languages on massively parallel networks

    SciTech Connect

    Miller, M.I.; O'Sullivan, J.A. ); Boysam, B. ); Smith, K.R. )

    1991-01-01

    This paper proposes a general method for incorporating rule-based constraints corresponding to regular languages into stochastic inference problems, thereby allowing for a unified representation of stochastic and syntactic pattern constraints. The authors' approach first establishes the formal connection of rules to Chomsky grammars, and generalizes the original work of Shannon on the encoding of rule-based channel sequences to Markov chains of maximum entropy. This maximum-entropy probabilistic view leads to Gibbs representations with potentials whose number of minima grows at precisely the exponential rate at which the language of deterministically constrained sequences grows. These representations are coupled to stochastic diffusion algorithms, which sample the language-constrained sequences by visiting the energy minima according to the underlying Gibbs probability law. The coupling to stochastic search methods yields the all-important practical result that fully parallel stochastic cellular automata may be derived to generate samples from the rule-based constraint sets. The production rules and neighborhood state structure of the language of sequences directly determine the necessary connection structures of the required parallel computing surface. Representations of this type have been mapped to the DAP-510 massively-parallel processor, consisting of 1024 mesh-connected bit-serial processing elements, for performing automated segmentation of electron-micrograph images.
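
    A toy version of such a sampler (illustrative only; the forbidden-bigram "language" and the temperature are invented for the example) resamples each site from its conditional Gibbs distribution, so that low-energy, rule-satisfying sequences come to dominate:

      import numpy as np

      rng = np.random.default_rng(5)
      alphabet, length, beta = list("abc"), 20, 5.0

      def energy(seq):
          # Each occurrence of the forbidden bigram "ab" costs one unit.
          return sum(1 for i in range(len(seq) - 1) if seq[i] + seq[i + 1] == "ab")

      seq = [str(rng.choice(alphabet)) for _ in range(length)]
      for sweep in range(200):
          for i in range(length):                    # site-wise Gibbs update
              weights = np.array([np.exp(-beta * energy(seq[:i] + [c] + seq[i + 1:]))
                                  for c in alphabet])
              seq[i] = alphabet[rng.choice(len(alphabet), p=weights / weights.sum())]
      print("".join(seq), "violations:", energy(seq))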

  2. Direct stereo radargrammetric processing using massively parallel processing

    NASA Astrophysics Data System (ADS)

    Balz, Timo; Zhang, Lu; Liao, Mingsheng

    2013-05-01

    Synthetic Aperture Radar (SAR) offers many ways to reconstruct digital surface models (DSMs). The two most commonly used methods are SAR interferometry (InSAR) and stereo radargrammetry. Stereo radargrammetry is a very stable and reliable process and is far less affected by temporal decorrelation than InSAR. It is therefore often used for DSM generation in heavily vegetated areas. However, stereo radargrammetry often produces rather noisy DSMs, sometimes containing large outliers. In this manuscript, we present a new approach to stereo radargrammetric processing, in which the homologous points between the images are found by geocoding a large number of points. This offers a very flexible approach, allowing the simultaneous processing of multiple images and of cross-heading image pairs. Our approach relies on a good initial geocoding accuracy of the data and on very fast processing using a massively parallel implementation. The approach is demonstrated using TerraSAR-X images from Mount Song, China, and from Trento, Italy.

  3. Transmissive Nanohole Arrays for Massively-Parallel Optical Biosensing

    PubMed Central

    2015-01-01

    A high-throughput optical biosensing technique is proposed and demonstrated. This hybrid technique combines optical transmission of nanoholes with colorimetric silver staining. The size and spacing of the nanoholes are chosen so that individual nanoholes can be independently resolved, massively in parallel, using an ordinary transmission optical microscope, and, in place of determining a spectral shift, the brightness of each nanohole is recorded to greatly simplify the readout. Each nanohole then acts as an independent sensor, and the blocking of nanohole optical transmission by enzymatic silver staining defines the specific detection of a biological agent. Nearly 10,000 nanoholes can be simultaneously monitored within the field of view of a typical microscope. As an initial proof of concept, biotinylated lysozyme (biotin-HEL) was used as a model analyte, giving a detection limit as low as 0.1 ng/mL. PMID:25530982

  4. A Computational Fluid Dynamics Algorithm on a Massively Parallel Computer

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    The discipline of computational fluid dynamics is demanding ever-increasing computational power to deal with complex fluid flow problems. We investigate the performance of a finite-difference computational fluid dynamics algorithm on a massively parallel computer, the Connection Machine. Of special interest is an implicit time-stepping algorithm; to obtain maximum performance from the Connection Machine, it is necessary to use a nonstandard algorithm to solve the linear systems that arise in the implicit algorithm. We find that the Connection Machine can achieve very high computation rates on both explicit and implicit algorithms. The performance of the Connection Machine puts it in the same class as today's most powerful conventional supercomputers.

  5. Integration of IR focal plane arrays with massively parallel processor

    NASA Astrophysics Data System (ADS)

    Esfandiari, P.; Koskey, P.; Vaccaro, K.; Buchwald, W.; Clark, F.; Krejca, B.; Rekeczky, C.; Zarandy, A.

    2008-04-01

    The intent of this investigation is to replace the low fill factor visible sensor of a Cellular Neural Network (CNN) processor with an InGaAs Focal Plane Array (FPA) using both bump bonding and epitaxial layer transfer techniques for use in the Ballistic Missile Defense System (BMDS) interceptor seekers. The goal is to fabricate a massively parallel digital processor with a local as well as a global interconnect architecture. Currently, this unique CNN processor is capable of processing a target scene in excess of 10,000 frames per second with its visible sensor. What makes the CNN processor so unique is that each processing element includes memory, local data storage, local and global communication devices and a visible sensor supported by a programmable analog or digital computer program.

  6. Development of a massively parallel parachute performance prediction code

    SciTech Connect

    Peterson, C.W.; Strickland, J.H.; Wolfe, W.P.; Sundberg, W.D.; McBride, D.D.

    1997-04-01

    The Department of Energy has given Sandia full responsibility for the complete life cycle (cradle to grave) of all nuclear weapon parachutes. Sandia National Laboratories is initiating development of a complete numerical simulation of parachute performance, beginning with parachute deployment and continuing through inflation and steady state descent. The purpose of the parachute performance code is to predict the performance of stockpile weapon parachutes as these parachutes continue to age well beyond their intended service life. A new massively parallel computer will provide unprecedented speed and memory for solving this complex problem, and new software will be written to treat the coupled fluid, structure and trajectory calculations as part of a single code. Verification and validation experiments have been proposed to provide the necessary confidence in the computations.

  7. A massively parallel solution strategy for efficient thermal radiation simulation

    NASA Astrophysics Data System (ADS)

    Nguyen, P. D.; Moureau, V.; Vervisch, L.; Perret, N.

    2012-06-01

    A novel and efficient methodology to solve the radiative transfer equations (RTE) in thermal radiation is discussed. The BiCGStab(2) iterative solution method, designed for non-symmetric linear equation systems, is used to solve the discretized RTE. Numerical upwind and central schemes are blended to provide a stable numerical scheme (MUCS) for interpolation of the cell-face radiation intensities in the finite volume formulation. The combination of the BiCGStab(2) and MUCS methods proved very efficient when coupled with the discrete ordinates method (DOM) to solve the RTE. A cost-effective tabulation technique for the gaseous radiative property model SNB-FSCK using a 7-point Gauss-Lobatto quadrature scheme is also introduced. The whole methodology is implemented in a massively parallel unstructured CFD code in which the radiative and fluid flow solutions share the same domain decomposition, which is the bottleneck in current radiative solvers. A dual mesh decomposition, at the cell-group and processor levels, is adopted to optimize the CFD code for massively parallel computing. The method is applied to simulate radiative heat transfer in a 3D rectangular enclosure containing non-isothermal CO2 and H2O mixtures. Two test cases are studied, for homogeneous and inhomogeneous distributions of CO2 and H2O in the enclosure. Results are reported for the heat flux and the radiation energy source, and comparisons are made between the present BiCGStab(2)/MUCS/tabulated SNB-FSCK methodology, the benchmark method SNB-CK (implemented at 25 cm^-1 narrow-band resolution) and other methods available in the literature. The present method yields more accurate predictions, particularly for the radiation source term. When compared with the benchmark solution, the relative error of the radiation source term is remarkably reduced, to less than 4%, and the CPU time is drastically diminished.

  8. Massively Parallel Simulations of Diffusion in Dense Polymeric Structures

    SciTech Connect

    Faulon, Jean-Loup, Wilcox, R.T. , Hobbs, J.D. , Ford, D.M.

    1997-11-01

    An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9,216 Pentium Pro processors) and Intel Paragon (1,840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally in both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.
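
    From such trajectories the diffusion coefficient is typically extracted with the Einstein relation, D = lim <|r(t) - r(0)|^2> / (6t) in 3D. The sketch below does this for synthetic random-walk trajectories standing in for the MD output; units and parameters are illustrative only.

      import numpy as np

      rng = np.random.default_rng(1)
      n_part, n_frames, dt = 1000, 500, 1.0          # particles, frames, ps/frame
      steps = rng.normal(scale=0.05, size=(n_frames, n_part, 3))   # nm per frame
      traj = np.cumsum(steps, axis=0)                # (frame, particle, xyz)

      msd = ((traj - traj[0]) ** 2).sum(axis=2).mean(axis=1)   # mean sq. displacement
      t = np.arange(n_frames) * dt
      D = np.polyfit(t[1:], msd[1:], 1)[0] / 6.0     # slope / 6, in nm^2/ps
      print(f"estimated D = {D:.5f} nm^2/ps")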

  9. Particle simulation of plasmas on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Gledhill, I. M. A.; Storey, L. R. O.

    1987-01-01

    Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.

  10. Massively Parallel Single-Molecule Manipulation Using Centrifugal Force

    NASA Astrophysics Data System (ADS)

    Wong, Wesley; Halvorsen, Ken

    2011-03-01

    Precise manipulation of single molecules has led to remarkable insights in physics, chemistry, biology, and medicine. However, two issues that have impeded the widespread adoption of these techniques are equipment cost and the laborious nature of making measurements one molecule at a time. To meet these challenges, we have developed an approach that enables massively parallel single-molecule force measurements using centrifugal force. This approach is realized in the centrifuge force microscope, an instrument in which objects in an orbiting sample are subjected to a calibration-free, macroscopically uniform force field while their micro-to-nanoscopic motions are observed. We demonstrate high-throughput single-molecule force spectroscopy with this technique by performing thousands of rupture experiments in parallel, characterizing force-dependent unbinding kinetics of an antibody-antigen pair in minutes rather than days. Currently, we are taking steps to integrate high-resolution detection, fluorescence, temperature control and a greater dynamic range in force. With significant benefits in efficiency, cost, simplicity, and versatility, single-molecule centrifugation has the potential to expand single-molecule experimentation to a wider range of researchers and experimental systems.
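
    For a sense of scale (a back-of-the-envelope example with invented parameters, not numbers from the abstract), the buoyancy-corrected centrifugal force F = m_eff * omega^2 * r on a micron-scale bead lands in the piconewton range relevant to single-molecule experiments:

      import numpy as np

      rho_bead, rho_water = 1050.0, 1000.0   # kg/m^3 (polystyrene in water)
      radius = 0.5e-6                        # 0.5 um bead
      r_orbit = 0.1                          # 10 cm from the rotation axis
      rpm = 5000.0

      volume = 4.0 / 3.0 * np.pi * radius**3
      m_eff = (rho_bead - rho_water) * volume        # buoyancy-corrected mass
      omega = 2.0 * np.pi * rpm / 60.0
      force = m_eff * omega**2 * r_orbit
      print(f"{force * 1e12:.2f} pN")                # ~0.7 pN for these numbers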

  11. Massively Parallel Atomic Force Microscope with Digital Holographic Readout

    NASA Astrophysics Data System (ADS)

    Sache, L.; Kawakatsu, H.; Emery, Y.; Bleuler, H.

    2007-03-01

    Massively parallel Scanning Probe Microscopy is an obvious path for data storage (E Grochowski, R F Hoyt, Future Trends in Hard Disc Drives, IEEE Trans. Magn. 1996, 32, 1850-1854; J L Griffin, S W Schlosser, G R Ganger and D F Nagle, Modeling and Performance of MEMS-Based Storage Devices, Proc. ACM SIGMETRICS, 2000). Current experimental systems still lag far behind the Hard Disc Drive (HDD) or Digital Video Disk (DVD), be it in access speed, data throughput, storage density or cost per bit. This paper presents an entirely new approach with the promise of breaking several of these barriers. The key idea is readout of a Scanning Probe Microscope (SPM) array by Digital Holographic Microscopy (DHM). This technology directly gives phase information at each pixel of a CCD array. This means that no contact line to each individual SPM probe is needed; the data are directly available in parallel form. Moreover, the optical setup needs, in principle, no expensive components, optical (or, to a large extent, mechanical) imperfections being compensated in the signal processing, i.e. in electronics. This gives the system the potential for a low-cost device with fast Terabit readout capability.

  12. Efficiently modeling neural networks on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Farber, Robert M.

    1993-01-01

    Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead, with the exception of the communication required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks that have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent-processor interprocessor communications. This paper considers only the simulation of feed-forward neural networks, although the method is extendable to recurrent networks.
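
    The O(log P) global summation mentioned above is the classic pairwise (tree) reduction; in the minimal serial emulation below, each pass of the while-loop corresponds to a single parallel step on a SIMD machine.

      import numpy as np

      def tree_sum(values):
          # Halve the number of partial sums each round: O(log P) rounds.
          values = list(values)
          while len(values) > 1:
              if len(values) % 2:
                  values.append(0.0)                 # pad odd-length rounds
              values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
          return values[0]

      partials = np.random.default_rng(2).random(64)
      assert np.isclose(tree_sum(partials), partials.sum())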

  13. CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation

    NASA Astrophysics Data System (ADS)

    Schneider, Evan E.; Robertson, Brant E.

    2015-04-01

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.

  14. CHOLLA: A NEW MASSIVELY PARALLEL HYDRODYNAMICS CODE FOR ASTROPHYSICAL SIMULATION

    SciTech Connect

    Schneider, Evan E.; Robertson, Brant E.

    2015-04-15

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.

  15. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    SciTech Connect

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high-affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes; as a class they have been shown to bind a wide variety of target molecules. Methodology/Principal Findings: High-density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer-mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high-information-content approach to optimizing aptamer function. They also provide a foundation from which to better understand and manipulate this important class of high-affinity biomolecules.

  16. Scalable, massively parallel approaches to upstream drainage area computation

    NASA Astrophysics Data System (ADS)

    Richardson, A.; Hill, C. N.; Perron, T.

    2011-12-01

    Accumulated drainage area maps of large regions are required for several applications. Among these are assessments of regional patterns of flow and sediment routing, high-resolution landscape evolution models in which drainage basin geometry evolves with time, and surveys of the characteristics of river basins that drain to continental margins. The computation of accumulated drainage areas is accomplished by inferring the vector field of drainage flow directions from a two-dimensional digital elevation map, and then computing the integrated upstream area that drains to each tile of the map. Generally this last step is done with a recursive algorithm that accumulates upstream areas sequentially. The inherently serial nature of this approach restricts the number of tiles that can be included, thereby limiting the resolution of continental-size domains, because of the requirements of both memory, which rises proportionally to the number of tiles, N, and computing time, which is O(N^2). The fundamentally sequential character of this approach also prohibits effective use of large-scale parallelism. An alternate method of calculating accumulated drainage area from drainage direction data can be arrived at by reformulating the problem as the solution of a system of simultaneous linear equations. The equations express the relation that the total upslope area of a particular tile is the sum of the upslope areas of all tiles immediately adjacent to it that drain into it, plus the tile's own area. Solving these equations amounts to solving a sparse, nine-diagonal matrix system whose right-hand side is simply the vector of individual tile areas and whose diagonals are determined by the landscape geometry. We show how an iterative method, Bi-CGSTAB, can be used to solve this problem in a scalable, massively parallel manner. However, this introduces
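
    In this formulation the upstream areas A satisfy (I - M) A = a, where a holds the per-tile areas and M[i, j] = 1 when tile j drains into tile i. A toy version on a five-tile chain, solved with SciPy's Bi-CGSTAB as the abstract suggests (the grid and flow directions are invented for the example), looks like:

      import numpy as np
      from scipy.sparse import identity, lil_matrix
      from scipy.sparse.linalg import bicgstab

      n = 5
      downstream = [1, 2, 3, 4, None]     # tile i drains into downstream[i]
      a = np.ones(n)                      # unit area per tile

      M = lil_matrix((n, n))
      for j, i in enumerate(downstream):
          if i is not None:
              M[i, j] = 1.0               # tile j contributes its area to tile i

      A, info = bicgstab(identity(n) - M.tocsr(), a)
      print(np.round(A))                  # -> [1. 2. 3. 4. 5.]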

  17. Analysis of composite ablators using massively parallel computation

    NASA Technical Reports Server (NTRS)

    Shia, David

    1995-01-01

    In this work, the feasibility of using massively parallel computation to study the response of ablative materials is investigated. Explicit and implicit finite difference methods are used on a massively parallel computer, the Thinking Machines CM-5. The governing equations are a set of nonlinear partial differential equations, developed for three sample problems: (1) transpiration cooling, (2) an ablative composite plate, and (3) restrained thermal growth testing. The transpiration cooling problem is solved using a solution scheme based solely on the explicit finite difference method. The results are compared with available analytical steady-state through-thickness temperature and pressure distributions, and good agreement between the numerical and analytical solutions is found. It is also found that a solution scheme based on the explicit finite difference method has the following advantages: it incorporates complex physics easily, results in a simple algorithm, and is easily parallelizable. However, a solution scheme of this kind needs very small time steps to maintain stability. A solution scheme based on the implicit finite difference method has the advantage that it does not require very small time steps to maintain stability. However, it has the disadvantages that complex physics cannot be easily incorporated into the algorithm and that the scheme is difficult to parallelize. A hybrid solution scheme is then developed to combine the strengths of the explicit and implicit finite difference methods and minimize their weaknesses. This is achieved by identifying the critical time scale associated with the governing equations and applying the appropriate finite difference method according to this critical time scale. The hybrid solution scheme is then applied to the ablative composite plate and restrained thermal growth problems. The gas storage term is included in the explicit pressure calculation of both

  18. Cloud identification using genetic algorithms and massively parallel computation

    NASA Technical Reports Server (NTRS)

    Buckles, Bill P.; Petry, Frederick E.

    1996-01-01

    As a Guest Computational Investigator under the NASA-administered component of the High Performance Computing and Communication Program, we implemented a massively parallel genetic algorithm on the MasPar SIMD computer. Experiments were conducted using Earth Science data in the domains of meteorology and oceanography. Results obtained in these domains are competitive with, and in most cases better than, similar problems solved using other methods. In the meteorological domain, we chose to identify clouds using AVHRR spectral data. Four cloud speciations were used, although most researchers settle for three. Results were remarkably consistent across all tests (91% accuracy). Refinements of this method may lead to more timely and complete information for the Global Circulation Models (GCMs) that are prevalent in weather forecasting and global environment studies. In the oceanographic domain, we chose to identify ocean currents from a spectrometer having similar characteristics to AVHRR. Here the results were mixed (60% to 80% accuracy). If one is willing to run the experiment several times (say 10), it is acceptable to claim the higher accuracy rating. This problem has never been successfully automated. Therefore, these results are encouraging even though less impressive than the cloud experiment. A successful conclusion of an automated ocean current detection system would impact coastal fishing, naval tactics, and the study of micro-climates. Finally, we contributed to the basic knowledge of genetic algorithm (GA) behavior in parallel environments. We developed better knowledge of the use of subpopulations in the context of shared breeding pools and the migration of individuals. Rigorous experiments were conducted based on quantifiable performance criteria. While much of the work confirmed current wisdom, for the first time we were able to submit conclusive evidence. The software developed under this grant was placed in the public domain. An extensive user

  19. Massively Parallel Processing for Fast and Accurate Stamping Simulations

    NASA Astrophysics Data System (ADS)

    Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu

    2005-08-01

    The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). Over the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in GM's math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to supporting fast VDPs and tooling readiness. Since 1997, the General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass-production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed-memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.

  20. Fmoc-based peptide thioester synthesis with self-purifying effect: heading to native chemical ligation in parallel formats.

    PubMed

    Thomas, Franziska

    2013-03-01

    The chemical synthesis of proteins has facilitated functional studies of proteins due to the site-specific incorporation of post-translational modifications, labels, and non-proteinogenic amino acids. Moreover, native chemical ligation provides facile access to proteins by chemical means. However, the application of the native chemical ligation reaction in the synthesis of parallel formats such as protein arrays has been complicated because of the often cumbersome and time-consuming synthesis of the required peptide thioesters. An Fmoc-based peptide thioester synthesis with self-purification on the sulfonamide 'safety-catch' linker widens this bottleneck because HPLC purification can be avoided. The method is based on an on-resin cyclization-thiolysis reaction sequence. A macrocyclization via the N-terminus of the full-length peptide followed by a thiolytic C-terminal ring opening allows selective detachment of the truncation products and the full-length peptide. A brief overview of the chemical aspects of this method is provided including the optimization steps and the automation process. Furthermore, the application of the cyclization-thiolysis approach combined with the native chemical ligation reaction in the parallel synthesis of a library of 16 SH3-domain variants of SHO1 in yeast is described, demonstrating the value of this new technique for the chemical synthesis of protein arrays.

  1. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.

    1999-01-01

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.

  2. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.

    1999-08-24

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.

  3. Massively parallel interferometry: towards the all-integrated lambdameter

    NASA Astrophysics Data System (ADS)

    Fodor, Jozsua; Garcia-Marquez, Jorge; Surrel, Yves

    2004-08-01

    In this paper we present recent work on the application of digital phase detection to accurate wavelength measurement using two-beam interferometry (lambdametry). The advantage of two-beam interferometry is the sinusoidal fringe signal, for which precise phase detection algorithms exist. Modern algorithms can cope with different sources of errors and correct them. We recall the principle of the Michelson-type lambdameter using temporal interference and introduce the Young-type lambdameter using spatial interference. The Young-type lambdameter is based on the acquisition of the interference pattern from two point sources (e.g. the two ends of monomode optical fibers) projected onto a CCD camera. The measurement of an unknown wavelength can be achieved by comparison with a reference wavelength. Accurate interference phase maps can be calculated using spatial phase-shifting. In this way, each small group of contiguous pixels acts as a single interferometer, and the whole set of pixels corresponds to a massively parallel interferometric measurement system (up to many hundreds of thousands of units). The major advantage of our method is its structural simplicity and the possibility of full optical integration. The final goal is to achieve a relative uncertainty of the order of 10^-8 with a measurement duration of the order of a few minutes. Preliminary results are presented.
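
    Such per-pixel phase extraction can be illustrated with the classic four-step phase-shifting algorithm (a generic textbook formula, not necessarily the algorithm of the paper): with frames recorded at phase shifts of 0, 90, 180 and 270 degrees, the wrapped phase at each pixel is atan2(I4 - I2, I1 - I3).

      import numpy as np

      x = np.linspace(0, 4 * np.pi, 512)
      true_phase = np.outer(np.ones(512), x)          # synthetic tilted wavefront

      # Four frames: I_k = A + B cos(phase + k*pi/2), k = 0..3.
      I1, I2, I3, I4 = [1.0 + 0.8 * np.cos(true_phase + k * np.pi / 2)
                        for k in range(4)]
      phase = np.arctan2(I4 - I2, I1 - I3)            # wrapped to (-pi, pi]
      # Recovered phase matches the true one up to 2*pi wrapping:
      print(np.allclose(np.angle(np.exp(1j * (phase - true_phase))), 0, atol=1e-8))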

  4. Comparing current cluster, massively parallel, and accelerated systems

    SciTech Connect

    Barker, Kevin J; Davis, Kei; Hoisie, Adolfy; Kerbyson, Darren J; Pakin, Scott; Lang, Mike; Sancho Pitarch, Jose C

    2010-01-01

    Currently there is large architectural diversity in high performance computing systems. They include 'commodity' cluster systems that optimize per-node performance for small jobs, massively parallel processors (MPPs) that optimize aggregate performance for large jobs, and accelerated systems that optimize both per-node and aggregate performance but only for applications custom-designed to take advantage of such systems. Because of these dissimilarities, meaningful comparisons of achievable performance are not straightforward. In this work we utilize a methodology that combines both empirical analysis and performance modeling to compare clusters (represented by a 4,352-core InfiniBand cluster), MPPs (represented by a 147,456-core BG/P), and accelerated systems (represented by the 129,600-core Roadrunner) across a workload of four applications. Strengths of our approach include the ability to compare architectures - as opposed to specific implementations of an architecture - to attribute each application's performance bottlenecks to characteristics unique to each system, and to explore performance scenarios in advance of their availability for measurement. Our analysis illustrates that application performance is essentially unrelated to relative peak performance, but that application performance can be both predicted and explained using modeling.

  5. Three-dimensional radiative transfer on a massively parallel computer

    NASA Technical Reports Server (NTRS)

    Vath, H. M.

    1994-01-01

    We perform 3D radiative transfer calculations in non-local thermodynamic equilibrium (NLTE) in the simple two-level atom approximation on the MasPar MP-1, which contains 8192 processors and is a single instruction multiple data (SIMD) machine, an example of the new generation of massively parallel computers. On such a machine, all processors execute the same command at a given time, but on different data. To make radiative transfer calculations efficient, we must reconsider the numerical methods and the storage of data. To solve the transfer equation, we adopt the short characteristics method and examine different acceleration methods for obtaining the source function. We use the ALI method and test local and non-local operators. Furthermore, we compare the Ng and orthomin methods of acceleration. We also investigate the use of multi-grid methods to obtain fast solutions in the NLTE case. In order to test these numerical methods, we apply them to two problems, with and without periodic boundary conditions.

  6. Massively parallel support for a case-based planning system

    NASA Technical Reports Server (NTRS)

    Kettler, Brian P.; Hendler, James A.; Anderson, William A.

    1993-01-01

    Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases.

  7. Programming a massively parallel, computation universal system: Static behavior

    NASA Astrophysics Data System (ADS)

    Lapedes, Alan; Farber, Robert

    1986-08-01

    Massively parallel systems are presently the focus of intense interest for a variety of reasons. A key problem is how to control, or "program," these systems. In previous work by the authors, the "optimum finding" properties of Hopfield neural nets were applied to the nets themselves to create a "neural compiler." This was done in such a way that the problem of programming the attractors of one neural net (called the Slave net) was expressed as an optimization problem that was in turn solved by a second neural net (the Master net). The procedure is effective and efficient. In this series of papers we extend that approach to programming nets that contain interneurons (sometimes called "hidden neurons"), and thus we deal with nets capable of universal computation. Our work is closely related to recent work of Rumelhart et al. (also Parker, and LeCun), which may be viewed as a special case of this formalism and therefore of "computing with attractors." In later papers in this series, we present the theory for programming time-dependent behavior, and consider practical implementations. One may expect numerous applications in view of the computational universality of these networks.

  8. Massive Parallel Sequencing Provides New Perspectives on Bacterial Brain Abscesses

    PubMed Central

    Wilhelmsen, Marianne Thulin; Skrede, Steinar; Meisal, Roger; Jakovljev, Aleksandra; Gaustad, Peter; Hermansen, Nils Olav; Vik-Mo, Einar; Solheim, Ole; Ambur, Ole Herman; Sæbø, Øystein; Høstmælingen, Christina Teisner; Helland, Christian

    2014-01-01

    Rapid development within the field of massive parallel sequencing (MPS) is about to bring this technology within reach for diagnostic microbiology laboratories. We wanted to explore its potential for improving diagnosis and understanding of polymicrobial infections, using bacterial brain abscesses as an example. We conducted a prospective nationwide study on bacterial brain abscesses. Fifty-two surgical samples were included over a 2-year period. The samples were categorized as either spontaneous intracerebral, spontaneous subdural, or postoperative. Bacterial 16S rRNA genes were amplified directly from the specimens and sequenced using Ion Torrent technology, with an average of 500,000 reads per sample. The results were compared to those from culture- and Sanger sequencing-based diagnostics. Compared to culture, MPS allowed for triple the number of bacterial identifications. Aggregatibacter aphrophilus, Fusobacterium nucleatum, and Streptococcus intermedius or combinations of them were found in all spontaneous polymicrobial abscesses. F. nucleatum was systematically detected in samples with anaerobic flora. The increased detection rate for Actinomyces spp. and facultative Gram-negative rods further revealed several species associations. We suggest that A. aphrophilus, F. nucleatum, and S. intermedius are key pathogens for the establishment of spontaneous polymicrobial brain abscesses. In addition, F. nucleatum seems to be important for the development of anaerobic flora. MPS can accurately describe polymicrobial specimens when a sufficient number of reads is used to compensate for unequal species concentrations and principles are defined to discard contaminant bacterial DNA in the subsequent data analysis. This will contribute to our understanding of how different types of polymicrobial infections develop. PMID:24671797

  9. Massively parallel computational fluid dynamics calculations for aerodynamics and aerothermodynamics applications

    SciTech Connect

    Payne, J.L.; Hassan, B.

    1998-09-01

    Massively parallel computers have enabled the analyst to solve complicated flow fields (turbulent, chemically reacting) that were previously intractable. Calculations are presented using a massively parallel CFD code called SACCARA (Sandia Advanced Code for Compressible Aerothermodynamics Research and Analysis) currently under development at Sandia National Laboratories as part of the Department of Energy (DOE) Accelerated Strategic Computing Initiative (ASCI). Computations were made on a generic reentry vehicle in a hypersonic flowfield utilizing three different distributed parallel computers to assess the parallel efficiency of the code with increasing numbers of processors. The parallel efficiencies for the SACCARA code will be presented for cases using 1, 150, 100 and 500 processors. Computations were also made on a subsonic/transonic vehicle using both 236 and 521 processors on a grid containing approximately 14.7 million grid points. Ongoing and future plans to implement a parallel overset grid capability and couple SACCARA with other mechanics codes in a massively parallel environment are discussed.

  10. Massively parallel implementation of a high order domain decomposition equatorial ocean model

    SciTech Connect

    Ma, H.; McCaffrey, J.W.; Piacsek, S.

    1999-06-01

    The present work is about the algorithms and parallel constructs of a spectral element equatorial ocean model. It shows that high order domain decomposition ocean models can be efficiently implemented on massively parallel architectures, such as the Connection Machine Model CM5. The optimized computational efficiency of the parallel spectral element ocean model comes not only from the exponential convergence of the numerical solution, but also from the work-intensive, medium-grained, geometry-based data parallelism. The data parallelism is created to efficiently implement the spectral element ocean model on the distributed-memory massively parallel computer, which minimizes communication among processing nodes. Computational complexity analysis is given for the parallel algorithm of the spectral element ocean model, and the model's parallel performance on the CM5 is evaluated. Lastly, results from a simulation of wind-driven circulation in low-latitude Atlantic Ocean are described.

  12. Massively parallel computing at Sandia and its application to national defense

    SciTech Connect

    Dosanjh, S.S.

    1991-01-01

    Two years ago, researchers at Sandia National Laboratories showed that a massively parallel computer with 1024 processors could solve scientific problems more than 1000 times faster than a single processor. Since then, interest in massively parallel processing has increased dramatically. This review paper discusses some of the applications of this emerging technology to important problems at Sandia. Particular attention is given here to the impact of massively parallel systems on applications related to national defense. New concepts in heterogeneous programming and load balancing for MIMD computers are drastically increasing synthetic aperture radar (SAR) and SDI modeling capabilities. Also, researchers are showing that the current generation of massively parallel MIMD and SIMD computers is highly competitive with a CRAY on hydrodynamic and structural mechanics codes that are optimized for vector processors. 9 refs., 5 figs., 1 tab.

  13. SWAMP+: multiple subsequence alignment using associative massive parallelism

    SciTech Connect

    Steinfadt, Shannon Irene; Baker, Johnnie W

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.
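
    For reference, the Smith-Waterman recurrence that SWAMP+ parallelizes is compact in serial form. The sketch below uses illustrative scoring values (match +2, mismatch -1, gap -1) and returns only the best local score; the associative ASC mapping is not reproduced.

        # Serial Smith-Waterman local alignment score (O(mn) time, for contrast
        # with the O(m+n) parallel formulation described above).
        def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
            m, n = len(a), len(b)
            H = [[0] * (n + 1) for _ in range(m + 1)]
            best = 0
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    sub = match if a[i - 1] == b[j - 1] else mismatch
                    H[i][j] = max(0,                      # local alignments restart at 0
                                  H[i - 1][j - 1] + sub,  # substitution
                                  H[i - 1][j] + gap,      # deletion
                                  H[i][j - 1] + gap)      # insertion
                    best = max(best, H[i][j])
            return best

        print(smith_waterman_score("GGTTGACTA", "TGTTACGG"))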

  14. Design of a massively parallel computer using bit serial processing elements

    NASA Technical Reports Server (NTRS)

    Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

    1995-01-01

    A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.

  15. Solution of large, sparse systems of linear equations in massively parallel applications

    SciTech Connect

    Jones, M.T.; Plassmann, P.E.

    1992-01-01

    We present a general-purpose parallel iterative solver for large, sparse systems of linear equations. This solver is used in two applications, a piezoelectric crystal vibration problem and a superconductor model, that could be solved only on the largest available massively parallel machine. Results obtained on the Intel DELTA show computational rates of up to 3.25 gigaflops for these applications.
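
    As a single-node point of reference, the same class of computation, an iterative Krylov solve of a large sparse symmetric positive-definite system, can be sketched with SciPy's conjugate gradient (the paper's parallel solver and application matrices are not reproduced; the 1D Poisson-like matrix below is illustrative):

        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        n = 10_000
        # Tridiagonal SPD test matrix: 2 on the diagonal, -1 off-diagonal.
        A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
        b = np.ones(n)
        x, info = spla.cg(A, b)               # info == 0 means convergence
        print(info, np.linalg.norm(A @ x - b))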

  16. QCD on the Massively Parallel Computer AP1000

    NASA Astrophysics Data System (ADS)

    Akemi, K.; Fujisaki, M.; Okuda, M.; Tago, Y.; Hashimoto, T.; Hioki, S.; Miyamura, O.; Takaishi, T.; Nakamura, A.; de Forcrand, Ph.; Hege, C.; Stamatescu, I. O.

    We present the QCD-TARO program of calculations, which uses the Fujitsu AP1000 parallel computer. We discuss results on scaling, correlation times and the hadronic spectrum, some aspects of the implementation, and future prospects.

  17. A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures

    SciTech Connect

    Lashuk, Ilya; Chandramowlishwaran, Aparna; Langston, Harper; Nguyen, Tuan-Anh; Sampath, Rahul S; Shringarpure, Aashay; Vuduc, Richard; Ying, Lexing; Zorin, Denis; Biros, George

    2012-01-01

    We describe a parallel fast multipole method (FMM) for highly nonuniform distributions of particles. We employ both distributed memory parallelism (via MPI) and shared memory parallelism (via OpenMP and GPU acceleration) to rapidly evaluate two-body nonoscillatory potentials in three dimensions on heterogeneous high performance computing architectures. We have performed scalability tests with up to 30 billion particles on 196,608 cores on the AMD/CRAY-based Jaguar system at ORNL. On a GPU-enabled system (NSF's Keeneland at Georgia Tech/ORNL), we observed 30x speedup over a single core CPU and 7x speedup over a multicore CPU implementation. By combining GPUs with MPI, we achieve less than 10 ns/particle and six digits of accuracy for a run with 48 million nonuniformly distributed particles on 192 GPUs.

  18. Targeted single molecule mutation detection with massively parallel sequencing

    PubMed Central

    Gregory, Mark T.; Bertout, Jessica A.; Ericson, Nolan G.; Taylor, Sean D.; Mukherjee, Rithun; Robins, Harlan S.; Drescher, Charles W.; Bielas, Jason H.

    2016-01-01

    Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. Here we describe a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection. The CypherSeq methodology involves the ligation of sample DNA into circular vectors, which contain double-stranded barcodes for computational error correction and adapters for library preparation and sequencing. CypherSeq is capable of detecting rare mutations genome-wide as well as those within specific target genes via RCA-based enrichment. We demonstrate that CypherSeq is capable of correcting errors incurred during library preparation and sequencing to reproducibly detect mutations down to a frequency of 2.4 × 10⁻⁷ per base pair, and report the frequency and spectra of spontaneous and ethyl methanesulfonate-induced mutations across the Saccharomyces cerevisiae genome. PMID:26384417

  19. High density packaging and interconnect of massively parallel image processors

    NASA Technical Reports Server (NTRS)

    Carson, John C.; Indin, Ronald J.

    1991-01-01

    This paper presents conceptual designs for high density packaging of parallel processing systems. The systems fall into two categories: global memory systems where many processors are packaged into a stack, and distributed memory systems where a single processor and many memory chips are packaged into a stack. Thermal behavior and performance are discussed.

  20. Generic, hierarchical framework for massively parallel Wang-Landau sampling.

    PubMed

    Vogel, Thomas; Li, Ying Wai; Wüst, Thomas; Landau, David P

    2013-05-24

    We introduce a parallel Wang-Landau method based on the replica-exchange framework for Monte Carlo simulations. To demonstrate its advantages and general applicability for simulations of complex systems, we apply it to different spin models including spin glasses, the Ising model, and the Potts model, lattice protein adsorption, and the self-assembly process in amphiphilic solutions. Without loss of accuracy, the method gives significant speed-up and potentially scales up to petaflop machines.
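
    The serial Wang-Landau walk that the replica-exchange framework parallelizes can be sketched for a small 2D Ising model. This is a bare-bones illustration with a crude halving schedule for the modification factor and no histogram-flatness check; the parallel framework itself is not reproduced.

        import math, random

        L = 8
        spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]

        def site_energy(i, j):
            s = spins[i][j]
            return -s * (spins[(i + 1) % L][j] + spins[i - 1][j]
                         + spins[i][(j + 1) % L] + spins[i][j - 1])

        E = sum(site_energy(i, j) for i in range(L) for j in range(L)) // 2
        lng = {}       # running estimate of ln g(E), the log density of states
        f = 1.0        # modification factor
        while f > 1e-6:
            for _ in range(2_000 * L * L):
                i, j = random.randrange(L), random.randrange(L)
                Enew = E - 2 * site_energy(i, j)   # energy if spins[i][j] flips
                # accept with probability min(1, g(E)/g(Enew))
                if lng.get(E, 0.0) - lng.get(Enew, 0.0) >= math.log(random.random()):
                    spins[i][j] *= -1
                    E = Enew
                lng[E] = lng.get(E, 0.0) + f       # raise the visited level
            f /= 2.0   # a flatness check would normally gate this halving
        print(min(lng), max(lng.values()))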

  2. A generic, hierarchical framework for massively parallel Wang Landau sampling

    SciTech Connect

    Vogel, Thomas; Li, Ying Wai; Wuest, Thomas; Landau, David P

    2013-01-01

    We introduce a parallel Wang Landau method based on the replica-exchange framework for Monte Carlo simulations. To demonstrate its advantages and general applicability for simulations of complex systems, we apply it to the self-assembly process in amphiphilic solutions and to lattice protein adsorption. Without loss of accuracy, the method gives significant speed-up on small architectures like multi-core processors, and should be beneficial for petaflop machines.

  3. Performance Evaluation Methodologies and Tools for Massively Parallel Programs

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Sarukkai, Sekhar; Tucker, Deanne (Technical Monitor)

    1994-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. The recent introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CSI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance tool technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g., AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  4. Fast parallel Markov clustering in bioinformatics using massively parallel computing on GPU with CUDA and ELLPACK-R sparse format.

    PubMed

    Bustamam, Alhadi; Burrage, Kevin; Hamilton, Nicholas A

    2012-01-01

    Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the increasing amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses the CUDA toolkit to implement a massively parallel computing environment on the GPU card, is becoming a very powerful, efficient, and low-cost option to achieve substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers latency, thus circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We utilized the ELLPACK-R sparse format to allow effective and fine-grained massively parallel processing to cope with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on a CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, previously possible only on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
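
    The two kernels that CUDA-MCL accelerates, expansion (matrix squaring) and inflation (elementwise powering followed by column normalization), are easy to state in dense NumPy form. The sketch below is illustrative: the ELLPACK-R sparse storage and the CUDA kernels are not shown, and the toy graph is two triangles joined by a single edge.

        import numpy as np

        def mcl(adjacency, inflation=2.0, iters=50, self_loops=1.0):
            A = adjacency + self_loops * np.eye(len(adjacency))
            M = A / A.sum(axis=0)          # column-stochastic Markov matrix
            for _ in range(iters):
                M = M @ M                  # expansion
                M = M ** inflation         # inflation
                M /= M.sum(axis=0)         # renormalize columns
            return M                       # nonzero rows mark cluster attractors

        A = np.array([[0, 1, 1, 0, 0, 0],
                      [1, 0, 1, 0, 0, 0],
                      [1, 1, 0, 1, 0, 0],
                      [0, 0, 1, 0, 1, 1],
                      [0, 0, 0, 1, 0, 1],
                      [0, 0, 0, 1, 1, 0]], dtype=float)
        print(np.round(mcl(A), 2))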

  5. Performance effects of irregular communications patterns on massively parallel multiprocessors

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Petiton, Serge; Berryman, Harry; Rifkin, Adam

    1991-01-01

    A detailed study of the performance effects of irregular communications patterns on the CM-2 was conducted. The communications capabilities of the CM-2 were characterized under a variety of controlled conditions. In the process of carrying out the performance evaluation, extensive use was made of a parameterized synthetic mesh. In addition, timings were performed with unstructured meshes generated for aerodynamic codes and with a set of sparse matrices with banded patterns of non-zeroes. This benchmarking suite stresses the communications capabilities of the CM-2 in a range of different ways. Benchmark results demonstrate that it is possible to make effective use of much of the massive concurrency available in the communications network.

  6. A sweep algorithm for massively parallel simulation of circuit-switched networks

    NASA Technical Reports Server (NTRS)

    Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.

    1992-01-01

    A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.

  7. Massively parallel finite element computation of three dimensional flow problems

    NASA Astrophysics Data System (ADS)

    Tezduyar, T.; Aliabadi, S.; Behr, M.; Johnson, A.; Mittal, S.

    1992-12-01

    The parallel finite element computation of three-dimensional compressible and incompressible flows is presented, with emphasis on the space-time formulations, mesh moving schemes, and implementations on the Connection Machines CM-200 and CM-5. For computation of unsteady compressible and incompressible flows involving moving boundaries and interfaces, the previously developed Deformable-Spatial-Domain/Stabilized-Space-Time (DSD/SST) formulation is employed. In this approach, the stabilized finite element formulations of the governing equations are written over the space-time domain of the problem; therefore, the deformation of the spatial domain with respect to time is taken into account automatically. This approach gives the capability to solve a large class of problems involving free surfaces, moving interfaces, and fluid-structure and fluid-particle interactions. By using special mesh moving schemes, the frequency of remeshing is minimized to reduce the projection errors involved in remeshing and also to increase the parallelization ease of the computations. The implicit equation systems arising from the finite element discretizations are solved iteratively by using the GMRES update technique with diagonal and nodal-block-diagonal preconditioners. These formulations have all been implemented on the CM-200 and CM-5, and have been applied to several large-scale problems. The three-dimensional problems in this report were all computed on the CM-200 and CM-5.

  8. Performance of the Wavelet Decomposition on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; LeMoigne, Jacqueline; Zukor, Dorothy (Technical Monitor)

    2001-01-01

    Traditionally, Fourier Transforms have been utilized for performing signal analysis and representation. But although it is straightforward to reconstruct a signal from its Fourier transform, no local description of the signal is included in its Fourier representation. To alleviate this problem, Windowed Fourier transforms and then wavelet transforms have been introduced, and it has been proven that wavelets give a better localization than traditional Fourier transforms, as well as a better division of the time- or space-frequency plane than Windowed Fourier transforms. Because of these properties and after the development of several fast algorithms for computing the wavelet representation of any signal, in particular the Multi-Resolution Analysis (MRA) developed by Mallat, wavelet transforms have increasingly been applied to signal analysis problems, especially real-life problems, in which speed is critical. In this paper we present and compare efficient wavelet decomposition algorithms on different parallel architectures. We report and analyze experimental measurements, using NASA remotely sensed images. Results show that our algorithms achieve significant performance gains on current high performance parallel systems, and meet scientific applications and multimedia requirements. The extensive performance measurements collected over a number of high-performance computer systems have revealed important architectural characteristics of these systems, in relation to the processing demands of the wavelet decomposition of digital images.
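
    One level of Mallat's multi-resolution analysis, the decomposition benchmarked above, can be sketched serially with the Haar filter: pairwise averages give the low-pass band and pairwise differences the high-pass band, applied along each image axis. A NumPy sketch (the parallel implementations and the NASA imagery are not reproduced; any even-sized array works):

        import numpy as np

        def haar2d_level(x):
            def split(a, axis):
                e = np.take(a, np.arange(0, a.shape[axis], 2), axis=axis)
                o = np.take(a, np.arange(1, a.shape[axis], 2), axis=axis)
                return (e + o) / np.sqrt(2), (e - o) / np.sqrt(2)
            lo, hi = split(x, 0)
            ll, lh = split(lo, 1)    # approximation / horizontal detail
            hl, hh = split(hi, 1)    # vertical / diagonal detail
            return ll, lh, hl, hh

        img = np.random.rand(256, 256)
        ll, lh, hl, hh = haar2d_level(img)
        print(ll.shape)              # (128, 128) coarse approximation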

  9. Signal processing applications of massively parallel charge domain computing devices

    NASA Technical Reports Server (NTRS)

    Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)

    1999-01-01

    The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the Fourier transform of an incoming vector. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.
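
    In ordinary linear-algebra terms, the idea is that for a real input vector the real and imaginary parts of the Fourier transform are each a real matrix-vector product, so the two can be evaluated simultaneously. The sketch below uses plain cosine and sine operators; the patent's specific Hartley-permutation operators are not reproduced.

        import numpy as np

        N = 64
        n = np.arange(N)
        C = np.cos(2 * np.pi * np.outer(n, n) / N)  # operator for the real part
        S = np.sin(2 * np.pi * np.outer(n, n) / N)  # operator for the imaginary part

        x = np.random.rand(N)
        re, im = C @ x, -(S @ x)     # two independent real MVMs
        print(np.allclose(re + 1j * im, np.fft.fft(x)))   # True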

  10. Repartitioning Strategies for Massively Parallel Simulation of Reacting Flow

    NASA Astrophysics Data System (ADS)

    Pisciuneri, Patrick; Zheng, Angen; Givi, Peyman; Labrinidis, Alexandros; Chrysanthis, Panos

    2015-11-01

    The majority of parallel CFD simulators partition the domain into equal regions and assign the calculations for a particular region to a unique processor. This type of domain decomposition is vital to the efficiency of the solver. However, as the simulation develops, the workload among the partitions often becomes uneven (e.g., through adaptive mesh refinement or chemically reacting regions), and a new partition should be considered. The process of repartitioning adjusts the current partition to evenly distribute the load again. We compare two repartitioning tools: Zoltan, an architecture-agnostic graph repartitioner developed at the Sandia National Laboratories; and Paragon, an architecture-aware graph repartitioner developed at the University of Pittsburgh. The comparative assessment is conducted via simulation of the Taylor-Green vortex flow with chemical reaction.

  11. A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

    NASA Technical Reports Server (NTRS)

    Shia, David; Mcmanus, Hugh L.

    1995-01-01

    A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.
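
    What a fully explicit formulation buys on a massively parallel machine is that each new value depends only on old values at neighboring points, so every grid point can be updated independently. A one-dimensional heat-conduction sketch of such an update (illustrative parameters; the coupled thermoelastic/gas-flow system is not reproduced):

        import numpy as np

        nx, alpha, dx, dt = 200, 1.0e-4, 1.0e-3, 1.0e-3
        assert alpha * dt / dx**2 <= 0.5     # explicit stability limit
        T = np.zeros(nx)
        T[0] = 1.0                           # fixed hot boundary
        for _ in range(5000):
            # one explicit step, every interior point updated independently
            T[1:-1] += alpha * dt / dx**2 * (T[2:] - 2 * T[1:-1] + T[:-2])
        print(T[:5].round(3))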

  12. Massively parallel computation of RCS with finite elements

    NASA Technical Reports Server (NTRS)

    Parker, Jay

    1993-01-01

    One of the promising combinations of finite element approaches for scattering problems uses Whitney edge elements, spherical vector wave-absorbing boundary conditions, and bi-conjugate gradient solution for the frequency-domain near field. Each of these approaches may be criticized. Low-order elements require high mesh density, but also result in fast, reliable iterative convergence. Spherical wave-absorbing boundary conditions require additional space to be meshed beyond the most minimal near-space region, but result in fully sparse, symmetric matrices which keep storage and solution times low. Iterative solution is somewhat unpredictable and unfriendly to multiple right-hand sides, yet we find it to be uniformly fast on large problems to date, given the other two approaches. Implementation of these approaches on a distributed memory, message passing machine yields huge dividends, as full scalability to the largest machines appears assured and iterative solution times are well-behaved for large problems. We present times and solutions for computed RCS for a conducting cube and composite permeability/conducting sphere on the Intel iPSC/860 with up to 16 processors solving over 200,000 unknowns. We estimate problems of approximately 10 million unknowns, encompassing 1000 cubic wavelengths, may be attempted on a currently available 512 processor machine, but would be exceedingly tedious to prepare. The most severe bottlenecks are due to the slow rate of mesh generation on non-parallel machines and the large transfer time from such a machine to the parallel processor. One solution, in progress, is to create and then distribute a coarse mesh among the processors, followed by systematic refinement within each processor. Elimination of redundant node definitions at the mesh-partition surfaces, snap-to-surface post processing of the resulting mesh for good modelling of curved surfaces, and load-balancing redistribution of new elements after the refinement are auxiliary

  13. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    SciTech Connect

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU’s application-specific architecture, harnessing the GPU’s computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  14. Massively Parallel Latent Semantic Analyzes using a Graphics Processing Unit

    SciTech Connect

    Cavanagh, Joseph M; Cui, Xiaohui

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large Term-Document datasets using Singular Value Decomposition. However, with the ever expanding size of data sets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We present a parallel LSA implementation on the GPU, using NVIDIA Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms. The performance of this implementation is compared to traditional LSA implementation on CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits the GPU has for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.
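
    The computation at the core of both records above, a truncated SVD of a term-document matrix, can be sketched serially in a few lines. The toy counts are illustrative; the GPU/CUBLAS path is not reproduced.

        import numpy as np

        # rows = terms, columns = documents (toy counts)
        A = np.array([[2, 0, 1, 0],
                      [1, 1, 0, 0],
                      [0, 2, 0, 1],
                      [0, 0, 1, 2]], dtype=float)
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 2
        A_k = U[:, :k] * s[:k] @ Vt[:k, :]          # rank-k reconstruction
        doc_coords = (s[:k, None] * Vt[:k, :]).T    # documents in latent space
        print(np.round(doc_coords, 2))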

  15. Massively parallel algorithms for trace-driven cache simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

    1991-01-01

    Trace-driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C-line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the exclusive-read, exclusive-write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C-line set runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
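
    A serial reference model of the computation being parallelized, counting misses for a trace directed at a C-line fully associative LRU set, fits in a dozen lines of Python (the O(log N) EREW algorithm itself is not reproduced):

        from collections import OrderedDict

        def lru_misses(trace, C):
            cache, misses = OrderedDict(), 0
            for x in trace:
                if x in cache:
                    cache.move_to_end(x)           # x becomes most recently used
                else:
                    misses += 1
                    if len(cache) == C:
                        cache.popitem(last=False)  # evict the least recently used
                    cache[x] = None
            return misses

        print(lru_misses([1, 2, 3, 1, 4, 2, 3], C=3))   # -> 6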

  16. Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures

    NASA Technical Reports Server (NTRS)

    Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.

    1998-01-01

    In this paper a new algorithm, designated the Fast Invariant Imbedding algorithm, for the solution of the Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate-size problems.
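
    For context, a standard fast Poisson solver diagonalizes the Laplacian with FFTs; the sketch below solves u_xx + u_yy = f on a doubly periodic grid. This is not the paper's Fast Invariant Imbedding algorithm, only the conventional class of method it is compared against.

        import numpy as np

        n, L = 128, 2 * np.pi
        x = np.linspace(0, L, n, endpoint=False)
        X, Y = np.meshgrid(x, x, indexing="ij")
        f = np.sin(X) * np.cos(2 * Y)                 # right-hand side

        k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)    # angular wavenumbers
        KX, KY = np.meshgrid(k, k, indexing="ij")
        denom = -(KX**2 + KY**2)
        denom[0, 0] = 1.0                             # avoid division by zero
        u_hat = np.fft.fft2(f) / denom
        u_hat[0, 0] = 0.0                             # pin the free mean value
        u = np.real(np.fft.ifft2(u_hat))

        exact = -np.sin(X) * np.cos(2 * Y) / 5.0      # since 1**2 + 2**2 = 5
        print(np.max(np.abs(u - exact)) < 1e-10)      # True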

  17. Massively parallel computation of three-dimensional scramjet combustor

    NASA Astrophysics Data System (ADS)

    Zheng, Z. H.; Le, J. L.

    Recent progress in computational studies of scramjet combustors has been described in Refs. 1-3. However, detailed flow properties, especially the lateral properties and the sidewall effects, were not considered. In this paper, a parallel simulation of an experimental dual-mode scramjet combustor configuration is presented, considering both jet-to-jet symmetry and full-duct modeling. Turbulence is modeled with the k-ɛ two-equation turbulence model, and a 7-species, 8-equation kinetics model is used to model hydrogen/air combustion. The conservation form of the Navier-Stokes equations with finite-rate chemistry reactions is solved using a diagonal implicit finite-volume method. For the two cases, the three-dimensional flow fields with equivalence ratios Φ=0.0 and 0.35 were simulated on the COW and MPP, respectively. Wall pressure comparisons between CFD and experiments (CARDC and NAL) show fair agreement for the jet-to-jet case. For the full-duct modeling, more detailed flow properties are obtained. The fuel-penetrating heights of the injectors differ because of the effects of the sidewall boundary layer and the shock wave in the combustor. According to the numerical results, adjusting the locations of the injectors could improve the combustion efficiency.

  18. The minimal amount of starting DNA for Agilent’s hybrid capture-based targeted massively parallel sequencing

    PubMed Central

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  19. Large-eddy simulation of the Rayleigh-Taylor instability on a massively parallel computer

    SciTech Connect

    Amala, P.A.K.

    1995-03-01

    A computational model for the solution of the three-dimensional Navier-Stokes equations is developed. This model includes a turbulence model: a modified Smagorinsky eddy-viscosity with a stochastic backscatter extension. The resultant equations are solved using finite difference techniques: the second-order explicit Lax-Wendroff schemes. This computational model is implemented on a massively parallel computer. Programming models on massively parallel computers are next studied. It is desired to determine the best programming model for the developed computational model. To this end, three different codes are tested on a current massively parallel computer: the CM-5 at Los Alamos. Each code uses a different programming model: one is a data parallel code; the other two are message passing codes. Timing studies are done to determine which method is the fastest. The data parallel approach turns out to be the fastest method on the CM-5 by at least an order of magnitude. The resultant code is then used to study a current problem of interest to the computational fluid dynamics community. This is the Rayleigh-Taylor instability. The Lax-Wendroff methods handle shocks and sharp interfaces poorly. To this end, the Rayleigh-Taylor linear analysis is modified to include a smoothed interface. The linear growth rate problem is then investigated. Finally, the problem of the randomly perturbed interface is examined. Stochastic backscatter breaks the symmetry of the stationary unstable interface and generates a mixing layer growing at the experimentally observed rate. 115 refs., 51 figs., 19 tabs.

  20. Parallel optimization of pixel purity index algorithm for massive hyperspectral images in cloud computing environment

    NASA Astrophysics Data System (ADS)

    Chen, Yufeng; Wu, Zebin; Sun, Le; Wei, Zhihui; Li, Yonglong

    2016-04-01

    With the gradual increase in the spatial and spectral resolution of hyperspectral images, the size of image data becomes ever larger and the complexity of processing algorithms grows, which poses a major challenge to efficient processing of massive hyperspectral images. Cloud computing technologies distribute computing tasks to a large number of computing resources for handling large data sets without the limitation of the memory and computing resources of a single machine. This paper proposes a parallel pixel purity index (PPI) algorithm for unmixing massive hyperspectral images based on a MapReduce programming model, for the first time in the literature. According to the characteristics of hyperspectral images, we describe the design principle of the algorithm, illustrate the main cloud unmixing processes of PPI, and analyze the time complexity of the serial and parallel algorithms. Experimental results demonstrate that the parallel implementation of the PPI algorithm on the cloud can effectively process big hyperspectral data and accelerate the algorithm.

  1. Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software

    DTIC Science & Technology

    2015-08-01

    US Army Research Laboratory report by N Scott Weingarten and James P Larentzos (August 2015), approved for public release, describing the implementation of shifted periodic boundary conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) software.

  2. CODE BLUE: Three dimensional massively-parallel simulation of multi-scale configurations

    NASA Astrophysics Data System (ADS)

    Juric, Damir; Kahouadji, Lyes; Chergui, Jalel; Shin, Seungwon; Craster, Richard; Matar, Omar

    2016-11-01

    We present recent progress on BLUE, a solver for massively parallel simulations of fully three-dimensional multiphase flows which runs on a variety of computer architectures from laptops to supercomputers and on 131072 threads or more (limited only by the availability to us of more threads). The code is wholly written in Fortran 2003 and uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid Front Tracking/Level Set method designed to handle highly deforming interfaces with complex topology changes. We developed parallel GMRES and multigrid iterative solvers suited to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. Particular attention is drawn to the details and performance of the parallel Multigrid solver. EPSRC UK Programme Grant MEMPHIS (EP/K003976/1).

  3. A highly scalable massively parallel fast marching method for the Eikonal equation

    NASA Astrophysics Data System (ADS)

    Yang, Jianming; Stern, Frederick

    2017-03-01

    The fast marching method is a widely used numerical method for solving the Eikonal equation arising from a variety of scientific and engineering fields. It has long been deemed inherently sequential, and an efficient parallel algorithm applicable to large-scale practical applications has not been available in the literature. In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computations extra to a sequential run to achieve unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on six grids with up to 1 billion points using different numbers of processes ranging from 1 to 65536. Remarkable parallel speedups are achieved using tens of thousands of processes. Detailed pseudo-codes for both the sequential and parallel algorithms are provided to illustrate the simplicity of the parallel implementation and its similarity to the sequential narrow band fast marching algorithm.
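
    The serial narrow-band algorithm that the paper parallelizes can be sketched for the simplest case, |grad T| = 1 from a single source on a uniform grid, using first-order upwind updates and a binary heap (the restarted, domain-decomposed parallel version is not reproduced):

        import heapq
        import numpy as np

        def fast_march(shape, source, h=1.0):
            T = np.full(shape, np.inf)
            T[source] = 0.0
            accepted = np.zeros(shape, dtype=bool)
            heap = [(0.0, source)]
            while heap:
                t, (i, j) = heapq.heappop(heap)
                if accepted[i, j]:
                    continue
                accepted[i, j] = True                 # freeze the smallest value
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if not (0 <= ni < shape[0] and 0 <= nj < shape[1]) or accepted[ni, nj]:
                        continue
                    # upwind neighbor values along each axis
                    a = min(T[ni - 1, nj] if ni > 0 else np.inf,
                            T[ni + 1, nj] if ni < shape[0] - 1 else np.inf)
                    b = min(T[ni, nj - 1] if nj > 0 else np.inf,
                            T[ni, nj + 1] if nj < shape[1] - 1 else np.inf)
                    a, b = min(a, b), max(a, b)
                    if b - a >= h:                    # one-sided update
                        t_new = a + h
                    else:                             # two-sided quadratic update
                        t_new = 0.5 * (a + b + np.sqrt(2 * h * h - (a - b) ** 2))
                    if t_new < T[ni, nj]:
                        T[ni, nj] = t_new
                        heapq.heappush(heap, (t_new, (ni, nj)))
            return T

        T = fast_march((64, 64), (32, 32))
        print(T[32, 62])   # first-order approximation of the distance 30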

  4. Tubal ligation

    MedlinePlus

    ... tying; Tying the tubes; Hysteroscopic tubal occlusion procedure; Contraception - tubal ligation; Family planning - tubal ligation ... Tubal ligation is considered a permanent form of birth control. It is NOT recommended as a short-term ...

  5. Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

    PubMed

    Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio

    2014-07-05

    A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D-FFT) was developed and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program is limited in its degree of parallelism by the slab-type 3D-FFT; the volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on a large, fine calculation cell (2048³ grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent parallel scalability on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of a chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems.

  6. Salinas - An implicit finite element structural dynamics code developed for massively parallel platforms

    SciTech Connect

    BHARDWAJ, MANLJ K.; REESE,GARTH M.; DRIESSEN,BRIAN; ALVIN,KENNETH F.; DAY,DAVID M.

    2000-04-06

    As computational needs for structural finite element analysis increase, a robust implicit structural dynamics code is needed which can handle millions of degrees of freedom in the model and produce results with quick turnaround time. A parallel code is needed to avoid the limitations of serial platforms. Salinas is an implicit structural dynamics code specifically designed for massively parallel platforms. It computes the structural response of very large complex structures and provides solutions faster than any existing serial machine. This paper gives the current status of Salinas and uses demonstration problems to show Salinas' performance.

  7. Developing a Massively Parallel Forward Projection Radiography Model for Large-Scale Industrial Applications

    SciTech Connect

    Bauerle, Matthew

    2014-08-01

    This project utilizes Graphics Processing Units (GPUs) to compute radiograph simulations for arbitrary objects. The generation of radiographs, also known as the forward projection imaging model, is computationally intensive and not widely utilized. The goal of this research is to develop a massively parallel algorithm that can compute forward projections for objects with a trillion voxels (3D pixels). To achieve this end, the data are divided into blocks that can each fit into GPU memory. The forward projected image is also divided into segments to allow for future parallelization and to avoid needless computations.
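
    The forward projection model itself is a set of independent line integrals, which is what makes it a natural GPU target. A toy parallel-beam sketch under the Beer-Lambert law (voxel data and spacing are illustrative):

        import numpy as np

        mu = np.zeros((64, 64, 64))            # attenuation coefficient per voxel
        mu[16:48, 16:48, 16:48] = 0.05         # a dense cube inside air
        dx = 0.1                               # path length through one voxel

        path_integrals = mu.sum(axis=0) * dx   # integrate along the beam axis
        radiograph = np.exp(-path_integrals)   # transmitted intensity per pixel
        print(radiograph.min(), radiograph.max())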

  8. Analysis of gallium arsenide deposition in a horizontal chemical vapor deposition reactor using massively parallel computations

    SciTech Connect

    Salinger, A.G.; Shadid, J.N.; Hutchinson, S.A.

    1998-01-01

    A numerical analysis of the deposition of gallium from trimethylgallium (TMG) and arsine in a horizontal CVD reactor with tilted susceptor and a three inch diameter rotating substrate is performed. The three-dimensional model includes complete coupling between fluid mechanics, heat transfer, and species transport, and is solved using an unstructured finite element discretization on a massively parallel computer. The effects of three operating parameters (the disk rotation rate, inlet TMG fraction, and inlet velocity) and two design parameters (the tilt angle of the reactor base and the reactor width) on the growth rate and uniformity are presented. The nonlinear dependence of the growth rate uniformity on the key operating parameters is discussed in detail. Efficient and robust algorithms for massively parallel reacting flow simulations, as incorporated into our analysis code MPSalsa, make detailed analysis of this complicated system feasible.

  9. Chemical network problems solved on NASA/Goddard's massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Cho, Seog Y.; Carmichael, Gregory R.

    1987-01-01

    The single instruction stream, multiple data stream Massively Parallel Processor (MPP) unit consists of 16,384 bit-serial arithmetic processors configured as a 128 x 128 array whose speed can exceed that of current supercomputers (Cyber 205). The applicability of the MPP for solving reaction network problems is presented and discussed, including the mapping of the calculation to the architecture and CPU timing comparisons.

  10. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    PubMed Central

    2009-01-01

    Background: Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels?

    Results: We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties.

    Conclusion: Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling, such as phylogeographic

  11. Implementation of (omega)-k synthetic aperture radar imaging algorithm on a massively parallel supercomputer

    NASA Astrophysics Data System (ADS)

    Yerkes, Christopher R.; Webster, Eric D.

    1994-06-01

    Advanced algorithms for synthetic aperture radar (SAR) imaging have in the past required computing capabilities only available from high performance special purpose hardware. Such architectures have tended to have short life cycles with respect to development expense. Current generation Massively Parallel Processors (MPP) are offering high performance capabilities necessary for such applications with both a scalable architecture and a longer projected life cycle. In this paper we explore issues associated with implementation of a SAR imaging algorithm on a mesh configured MPP architecture.

  12. Using CLIPS in the domain of knowledge-based massively parallel programming

    NASA Technical Reports Server (NTRS)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency with respect to parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about the application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with the C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering is discussed.

  13. ASCI Red -- Experiences and lessons learned with a massively parallel teraFLOP supercomputer

    SciTech Connect

    Christon, M.A.; Crawford, D.A.; Hertel, E.S.; Peery, J.S.; Robinson, A.C.

    1997-06-01

    The Accelerated Strategic Computing Initiative (ASCI) program involves Sandia, Los Alamos and Lawrence Livermore National Laboratories. At Sandia National Laboratories, ASCI applications include large deformation transient dynamics, shock propagation, electromechanics, and abnormal thermal environments. In order to resolve important physical phenomena in these problems, it is estimated that meshes ranging from 10⁶ to 10⁹ grid points will be required. The ASCI program is relying on the use of massively parallel supercomputers initially capable of delivering over 1 TFLOPs to perform such demanding computations. The ASCI Red machine at Sandia National Laboratories consists of over 4,500 computational nodes with a peak computational rate of 1.8 TFLOPs, 567 GBytes of memory, and 2 TBytes of disk storage. Regardless of the peak FLOP rate, there are many issues surrounding the use of massively parallel supercomputers in a production environment. These issues include parallel I/O, mesh generation, visualization, archival storage, high-bandwidth networking and the development of parallel algorithms. In order to illustrate these issues and their solution with respect to ASCI Red, demonstration calculations of time-dependent buoyancy-dominated plumes, electromechanics, and shock propagation will be presented.

  14. Massively parallel Monte Carlo for many-particle simulations on GPUs

    SciTech Connect

    Anderson, Joshua A.; Jankowski, Eric; Grubb, Thomas L.; Engel, Michael; Glotzer, Sharon C.

    2013-12-01

    Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.

  15. Molecular Dynamics Simulations from SNL's Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)

    DOE Data Explorer

    Plimpton, Steve; Thompson, Aidan; Crozier, Paul

    LAMMPS (http://lammps.sandia.gov/index.html) stands for Large-scale Atomic/Molecular Massively Parallel Simulator and is a code that can be used to model atoms or, as the LAMMPS website says, as a parallel particle simulator at the atomic, meso, or continuum scale. This Sandia-based website provides a long list of animations from large simulations. These were created using different visualization packages to read LAMMPS output, and each one provides the name of the PI and a brief description of the work done or visualization package used. See also the static images produced from simulations at http://lammps.sandia.gov/pictures.html The foundation paper for LAMMPS is: S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J Comp Phys, 117, 1-19 (1995), but the website also lists other papers describing contributions to LAMMPS over the years.

  16. SPECT reconstruction using a backpropagation neural network implemented on a massively parallel SIMD computer

    SciTech Connect

    Kerr, J.P.; Bartlett, E.B.

    1992-12-31

    In this paper, the feasibility of reconstructing a single photon emission computed tomography (SPECT) image via the parallel implementation of a backpropagation neural network is shown. The MasPar, MP-1 is a single instruction multiple data (SIMD) massively parallel machine. It is composed of a 128 x 128 array of 4-bit processors. The neural network is distributed on the array by dedicating a processor to each node and each interconnection of the network. An 8 x 8 SPECT image slice section is projected into eight planes. It is shown that based on the projections, the neural network can produce the original SPECT slice image exactly. Likewise, when trained on two parallel slices, separated by one slice, the neural network is able to reproduce the center, untrained image to an RMS error of 0.001928.

  17. Fast structural design and analysis via hybrid domain decomposition on massively parallel processors

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1993-01-01

    A hybrid domain decomposition framework for static, transient and eigen finite element analyses of structural mechanics problems is presented. Its basic ingredients include physical substructuring and /or automatic mesh partitioning, mapping algorithms, 'gluing' approximations for fast design modifications and evaluations, and fast direct and preconditioned iterative solvers for local and interface subproblems. The overall methodology is illustrated with the structural design of a solar viewing payload that is scheduled to fly in March 1993. This payload has been entirely designed and validated by a group of undergraduate students at the University of Colorado using the proposed hybrid domain decomposition approach on a massively parallel processor. Performance results are reported on the CRAY Y-MP/8 and the iPSC-860/64 Touchstone systems, which represent both extreme parallel architectures. The hybrid domain decomposition methodology is shown to outperform leading solution algorithms and to exhibit an excellent parallel scalability.

  18. LDRD final report on massively-parallel linear programming : the parPCx system.

    SciTech Connect

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer).

  19. Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations.

    PubMed

    Landge, A G; Levine, J A; Bhatele, A; Isaacs, K E; Gamblin, T; Schulz, M; Langer, S H; Bremer, Peer-Timo; Pascucci, V

    2012-12-01

    The performance of massively parallel applications is often heavily impacted by the cost of communication among compute nodes. However, determining how to best use the network is a formidable task, made challenging by the ever-increasing size and complexity of modern supercomputers. This paper applies visualization techniques to aid parallel application developers in understanding the network activity by enabling a detailed exploration of the flow of packets through the hardware interconnect. In order to visualize this large and complex data, we employ two linked views of the hardware network. The first is a 2D view that represents the network structure as one of several simplified planar projections. This view is designed to allow a user to easily identify trends and patterns in the network traffic. The second is a 3D view that augments the 2D view by preserving the physical network topology and providing a context that is familiar to the application developers. Using the massively parallel multi-physics code pF3D as a case study, we demonstrate that our tool provides valuable insight that we use to explain and optimize pF3D's performance on an IBM Blue Gene/P system.

  20. On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods

    PubMed Central

    Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.

    2011-01-01

    We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276

  1. On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods.

    PubMed

    Lee, Anthony; Yau, Christopher; Giles, Michael B; Doucet, Arnaud; Holmes, Christopher C

    2010-12-01

    We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design.
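
    The population-based pattern the authors exploit can be sketched in a few lines: every chain performs the same update in lockstep, so the work maps naturally onto GPU threads. The vectorized NumPy version below stands in for the GPU kernels; the target density, chain count, and step size are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    def parallel_rw_metropolis(logpdf, n_chains=4096, n_steps=1000, step=0.5):
        """Run many independent random-walk Metropolis chains at once.

        Chains are vectorized across the first axis, mimicking a
        one-thread-per-chain GPU layout (a sketch, not the authors' code)."""
        x = rng.normal(size=n_chains)
        lp = logpdf(x)
        for _ in range(n_steps):
            prop = x + step * rng.normal(size=n_chains)
            lp_prop = logpdf(prop)
            accept = np.log(rng.uniform(size=n_chains)) < lp_prop - lp
            x = np.where(accept, prop, x)          # all chains updated in lockstep
            lp = np.where(accept, lp_prop, lp)
        return x

    # Example: sample a standard normal with 4096 simultaneous chains.
    samples = parallel_rw_metropolis(lambda x: -0.5 * x * x)
    ```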

  2. Overcoming rule-based rigidity and connectionist limitations through massively-parallel case-based reasoning

    NASA Technical Reports Server (NTRS)

    Barnden, John; Srinivas, Kankanahalli

    1990-01-01

    Symbol manipulation as used in traditional Artificial Intelligence has been criticized by neural net researchers for being excessively inflexible and sequential. On the other hand, the application of neural net techniques to the types of high-level cognitive processing studied in traditional artificial intelligence presents major problems as well. A promising way out of this impasse is to build neural net models that accomplish massively parallel case-based reasoning. Case-based reasoning, which has received much attention recently, is essentially the same as analogy-based reasoning, and avoids many of the criticisms leveled at traditional artificial intelligence. Further problems are avoided by doing many strands of case-based reasoning in parallel, and by implementing the whole system as a neural net. In addition, such a system provides an approach to some aspects of the problems of noise, uncertainty and novelty in reasoning systems. The current neural net system (Conposit), which performs standard rule-based reasoning, is being modified into a massively parallel case-based reasoning version.

  3. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    PubMed

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). To harness the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  4. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    NASA Astrophysics Data System (ADS)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). To harness the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
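
    For intuition, here is a minimal sketch of the "directly propagated in time" idea on a one-dimensional grid: a single state advanced with the unitary Crank-Nicolson scheme. Octopus's real implementation propagates many Kohn-Sham states with self-consistent potentials in parallel; the grid size, potential, and time step here are assumptions.

    ```python
    import numpy as np

    n, dx, dt = 256, 0.1, 0.002
    x = (np.arange(n) - n // 2) * dx
    V = 0.5 * x ** 2                                   # toy harmonic potential

    # Kinetic term via a finite-difference Laplacian (hbar = m = 1).
    lap = (np.diag(np.full(n - 1, 1.0), -1) - 2 * np.eye(n)
           + np.diag(np.full(n - 1, 1.0), 1)) / dx ** 2
    H = -0.5 * lap + np.diag(V)

    A = np.eye(n) + 0.5j * dt * H                      # (I + i dt H / 2)
    B = np.eye(n) - 0.5j * dt * H                      # (I - i dt H / 2)

    psi = np.exp(-0.5 * (x - 1.0) ** 2).astype(complex)
    psi /= np.sqrt((abs(psi) ** 2).sum() * dx)         # normalize on the grid

    for _ in range(500):                               # unitary time steps
        psi = np.linalg.solve(A, B @ psi)
    ```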

  5. Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

    2006-05-01

    The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is essential. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques spanning four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.
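
    As one concrete example from the algorithm classes above, linear spectral unmixing treats each pixel spectrum as a non-negative mixture of endmember spectra; because pixels are independent, the per-pixel solve parallelizes trivially. A minimal sketch follows, with random endmembers and sizes as stand-ins; real pipelines also constrain abundances to sum to one.

    ```python
    import numpy as np
    from scipy.optimize import nnls

    bands, n_end = 200, 4
    E = np.abs(np.random.default_rng(3).normal(size=(bands, n_end)))   # endmember library

    def unmix(pixel):
        """Non-negative least-squares abundance estimate for one pixel spectrum."""
        abundances, residual = nnls(E, pixel)
        return abundances

    # A massively parallel version simply maps unmix() over every pixel
    # of the hyperspectral cube, e.g. one pixel per cluster task.
    pixel = E @ np.array([0.6, 0.3, 0.1, 0.0])          # synthetic mixed pixel
    print(unmix(pixel).round(2))
    ```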

  6. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    PubMed

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi

    2014-01-01

    Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and has three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in 16 of them (94.1%), each of whom carried at least one mutation. Seven patients had MYO7A mutations (41.2%), the most common type in Japanese patients. Most of the mutations were detected only by massively parallel DNA sequencing. We report here four patients who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and of the frequency of Usher syndrome type 1 gene mutations in Japanese patients. Mutation screening using this technique has the power to quickly identify mutations in many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  7. Stochastic simulation of charged particle transport on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Earl, James A.

    1988-01-01

    Computations of cosmic-ray transport based upon finite-difference methods are afflicted by instabilities, inaccuracies, and artifacts. To avoid these problems, researchers developed a Monte Carlo formulation which is closely related not only to the finite-difference formulation, but also to the underlying physics of transport phenomena. Implementations of this approach are currently running on the Massively Parallel Processor at Goddard Space Flight Center, whose enormous computing power overcomes the poor statistical accuracy that usually limits the use of stochastic methods. These simulations have progressed to a stage where they provide a useful and realistic picture of solar energetic particle propagation in interplanetary space.

  8. Scalable load balancing for massively parallel distributed Monte Carlo particle transport

    SciTech Connect

    O'Brien, M. J.; Brantley, P. S.; Joy, K. I.

    2013-07-01

    In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log N), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory.
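
    The processor-pair-wise idea can be illustrated with a dimension-exchange sketch on a power-of-two processor count: in round d, each rank averages its load with the partner whose rank differs in bit d, reaching a globally balanced state after log2(N) rounds. This is a minimal stand-in for, not a reproduction of, the paper's algorithm.

    ```python
    import numpy as np

    def dimension_exchange(load):
        """Iterated pairwise balancing sketch for 2**k processors.

        After log2(N) rounds of pairwise averaging, every processor holds
        the global mean load (scheduling details here are assumptions)."""
        n = len(load)
        k = n.bit_length() - 1
        assert 1 << k == n, "sketch assumes a power-of-two processor count"
        load = load.astype(float).copy()
        for d in range(k):
            partner = np.arange(n) ^ (1 << d)       # flip bit d of each rank
            load = 0.5 * (load + load[partner])     # pairwise exchange/average
        return load

    print(dimension_exchange(np.array([8, 0, 4, 0, 0, 0, 2, 2])))
    # -> every processor ends up with the mean load (2.0)
    ```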

  9. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions

    DOE PAGES

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

    2016-09-18

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  10. Massive parallel simulation of phenomena in condensed matter at high energy density

    NASA Astrophysics Data System (ADS)

    Fortov, Vladimir

    2005-03-01

    This talk deals with computational hydrodynamics, advanced material properties, and phenomena in condensed matter at high energy density. New results from massively parallel 3D simulations using the particle-in-cell method have been obtained. The gas-dynamic code includes advanced physical models of matter such as multiphase equations of state, elastic-plastic behavior, spallation, optical properties, and ion-beam stopping. We investigate the influence of the equation of state, elastic-plastic behavior, and spallation on hypervelocity impact processes. We also report results of numerical modeling of the action of intense heavy-ion beams on metallic targets, in comparison with new experimental data.

  11. Estimating water flow through a hillslope using the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Devaney, Judy E.; Camillo, P. J.; Gurney, R. J.

    1988-01-01

    A new two-dimensional model of water flow in a hillslope has been implemented on the Massively Parallel Processor at the Goddard Space Flight Center. Flow in the soil both in the saturated and unsaturated zones, evaporation and overland flow are all modelled, and the rainfall rates are allowed to vary spatially. Previous models of this type had always been very limited computationally. This model takes less than a minute to model all the components of the hillslope water flow for a day. The model can now be used in sensitivity studies to specify which measurements should be taken and how accurate they should be to describe such flows for environmental studies.

  12. Degradation in forensic trace DNA samples explored by massively parallel sequencing.

    PubMed

    Hanssen, Eirik Nataas; Lyle, Robert; Egeland, Thore; Gill, Peter

    2017-03-01

    Routine forensic analysis using STRs will fail if the DNA is too degraded. The DNA degradation process in biological stain material is not well understood. In this study we sequenced old semen and blood stains by massively parallel sequencing. The sequence data coverage was used to measure degradation across the genome. The results supported the contention that degradation is uniform across the genome, showing no evidence of regions with increased or decreased resistance towards degradation. Thus the lack of genetic regions robust to degradation removes the possibility of using such regions to further optimize analysis performance for degraded DNA.

  13. Block iterative restoration of astronomical images with the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Heap, Sara R.; Lindler, Don J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 121 x 121 linear systems. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images.
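
    A minimal sketch of the block-iterative solution idea follows (block Gauss-Seidel here, which converges for diagonally dominant or symmetric positive definite systems; the paper's exact iteration may differ). The full restoration system is never solved at once, only one small diagonal block per step, matching the 121 x 121 solves mentioned above.

    ```python
    import numpy as np

    def block_gauss_seidel(A, b, block=121, sweeps=20):
        """Solve A x = b by cycling over small diagonal blocks.

        Each step solves one 'block x block' system, the structure exploited
        on the MPP; the solver and sweep count here are illustrative."""
        n = len(b)
        x = np.zeros(n)
        for _ in range(sweeps):
            for s in range(0, n, block):
                e = min(s + block, n)
                # Residual of this block excluding its own diagonal contribution.
                r = b[s:e] - A[s:e] @ x + A[s:e, s:e] @ x[s:e]
                x[s:e] = np.linalg.solve(A[s:e, s:e], r)
        return x
    ```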

  14. Genome-wide DNA methylation analysis using massively parallel sequencing technologies.

    PubMed

    Suzuki, Masako; Greally, John M

    2013-01-01

    "Epigenetics" refers to a heritable change in transcriptional status without alteration in the primary nucleotide sequence. Epigenetics provides an extra layer of transcriptional control and plays a crucial role in normal development, as well as in pathological conditions. DNA methylation is one of the best known and well-studied epigenetic modifications. Genome-wide DNA methylation profiling has become recognized as a biologically and clinically important epigenomic assay. In this review, we discuss the strengths and weaknesses of the protocols for genome-wide DNA methylation profiling using massively parallel sequencing (MPS) techniques. We will also describe recently discovered DNA modifications, and the protocols to detect these modifications.

  15. A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

    SciTech Connect

    O. Kononenko

    2015-02-17

    ACE3P is a 3D massively parallel simulation suite developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal and mechanical studies. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R&D. A new frequency domain solver to perform mechanical harmonic response analysis of accelerator components has been developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations for time-efficient, accurate analysis of large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities, in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator.
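
    The core computation is the textbook harmonic-response solve: for each excitation frequency ω, solve (K + iωC − ω²M)x = f for the complex displacement amplitudes. A dense toy sketch follows; the 2-DOF matrices and damping model are invented for illustration, whereas ACE3P assembles these from finite elements and solves them in parallel.

    ```python
    import numpy as np

    def harmonic_response(K, M, C, f, omegas):
        """Frequency response x(w) solving (K + i w C - w^2 M) x = f.

        Dense solve used only for illustration; a production code would use
        a distributed sparse solver for the FEM matrices."""
        return [np.linalg.solve(K + 1j * w * C - w ** 2 * M, f)
                for w in omegas]

    # Toy 2-DOF system with mass-proportional damping (all values assumed).
    K = np.array([[4.0, -2.0], [-2.0, 2.0]])
    M = np.eye(2)
    C = 0.05 * M
    f = np.array([1.0, 0.0])
    amplitudes = [abs(x).max()
                  for x in harmonic_response(K, M, C, f, np.linspace(0.1, 3, 50))]
    ```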

  16. Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

    1987-01-01

    A capability was developed for rapidly producing visual representations of large, complex, multi-dimensional space and earth sciences data sets via the implementation of computer graphics modeling techniques on the Massively Parallel Processor (MPP), employing techniques recently developed for typically non-scientific applications. Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with such capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS) data-independent environment for computer graphics data display to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.

  17. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    DOE PAGES

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  18. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    SciTech Connect

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  19. Massively Parallel Computation of Soil Surface Roughness Parameters on A Fermi GPU

    NASA Astrophysics Data System (ADS)

    Li, Xiaojie; Song, Changhe

    2016-06-01

    Surface roughness is a description of the random or irregular micro-topography of a surface. The standard deviation of surface height and the surface correlation length describe the statistical variation of the random component of the surface height relative to a reference surface. When the number of data points is large, calculation of surface roughness parameters is time-consuming. With the advent of Graphics Processing Unit (GPU) architectures, inherently parallel problems can be solved effectively using GPUs. In this paper we propose a GPU-based massively parallel computing method for 2D bare soil surface roughness estimation. This method was applied to data collected by a surface roughness tester based on the laser triangulation principle during a field experiment in April 2012. The total number of data points was 52,040. The computation took 47 seconds on a Fermi GTX 590 GPU, whereas the serial CPU version took 5422 seconds, a significant 115x speedup.
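
    The two parameters named above can be computed with a few lines per height profile; since profiles (or image rows) are independent, the work maps directly onto GPU threads. A sketch using a common 1/e correlation-length definition follows; the paper's exact estimator may differ.

    ```python
    import numpy as np

    def roughness(h, dx):
        """RMS height and correlation length from a 1D height profile.

        s: standard deviation of surface height about the mean plane.
        l: lag at which the normalized autocorrelation first drops below 1/e
           (a common definition, assumed here)."""
        z = h - h.mean()
        s = z.std()
        ac = np.correlate(z, z, mode="full")[len(z) - 1:]
        ac /= ac[0]                            # normalized autocorrelation
        below = np.nonzero(ac < 1.0 / np.e)[0]
        l = below[0] * dx if below.size else np.nan
        return s, l
    ```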

  20. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    NASA Astrophysics Data System (ADS)

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2016-03-01

    This work discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package authored at Oak Ridge National Laboratory. Shift has been developed to scale well from laptops to small computing clusters to advanced supercomputers and includes features such as support for multiple geometry and physics engines, hybrid capabilities for variance reduction methods such as the Consistent Adjoint-Driven Importance Sampling methodology, advanced parallel decompositions, and tally methods optimized for scalability on supercomputing architectures. The scaling studies presented in this paper demonstrate good weak and strong scaling behavior for the implemented algorithms. Shift has also been validated and verified against various reactor physics benchmarks, including the Consortium for Advanced Simulation of Light Water Reactors' Virtual Environment for Reactor Analysis criticality test suite and several Westinghouse AP1000® problems presented in this paper. These benchmark results compare well to those from other contemporary Monte Carlo codes such as MCNP5 and KENO.

  1. "Multipoint Force Feedback" Leveling of Massively Parallel Tip Arrays in Scanning Probe Lithography.

    PubMed

    Noh, Hanaul; Jung, Goo-Eun; Kim, Sukhyun; Yun, Seong-Hun; Jo, Ahjin; Kahng, Se-Jong; Cho, Nam-Joon; Cho, Sang-Joon

    2015-09-16

    Nanoscale patterning with massively parallel 2D array tips is of significant interest in scanning probe lithography. A challenging task for tip-based large area nanolithography is maintaining parallel tip arrays at the same contact point with a sample substrate in order to pattern a uniform array. Here, polymer pen lithography is demonstrated with a novel leveling method to account for the magnitude and direction of the total applied force of tip arrays by a multipoint force sensing structure integrated into the tip holder. This high-precision approach results in a 0.001° slope of feature edge length variation over 1 cm wide tip arrays. The position sensitive leveling operates in a fully automated manner and is applicable to recently developed scanning probe lithography techniques of various kinds which can enable "desktop nanofabrication."

  2. Massively parallel simulation of flow and transport in variably saturated porous and fractured media

    SciTech Connect

    Wu, Yu-Shu; Zhang, Keni; Pruess, Karsten

    2002-01-15

    This paper describes a massively parallel simulation method and its application for modeling multiphase flow and multicomponent transport in porous and fractured reservoirs. The parallel-computing method has been implemented in the TOUGH2 code and its numerical performance is tested on a Cray T3E-900 and an IBM SP. The efficiency and robustness of the parallel-computing algorithm are demonstrated by completing two simulations with more than one million gridblocks, using site-specific data obtained from a site-characterization study. The first application involves the development of a three-dimensional numerical model for flow in the unsaturated zone of Yucca Mountain, Nevada. The second application is the study of tracer/radionuclide transport through fracture-matrix rocks for the same site. The parallel-computing technique enhances modeling capabilities by achieving several-orders-of-magnitude speedup for large-scale and high-resolution modeling studies. The modeling results provide many new insights into flow and transport processes that could not be obtained from simulations using the single-CPU simulator.

  3. DGDFT: A massively parallel method for large scale density functional theory calculations

    SciTech Connect

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  4. DGDFT: A massively parallel method for large scale density functional theory calculations.

    PubMed

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  5. Seismic waves modeling with the Fourier pseudo-spectral method on massively parallel machines.

    NASA Astrophysics Data System (ADS)

    Klin, Peter

    2015-04-01

    The Fourier pseudo-spectral method (FPSM) is an approach to the 3D numerical modeling of wave propagation that is based on the discretization of the spatial domain in a structured grid and relies on global spatial differential operators for the solution of the wave equation. This last peculiarity is advantageous from the accuracy point of view but poses difficulties for an efficient implementation of the method on parallel computers with distributed memory architecture. The 1D spatial domain decomposition approach has so far been commonly adopted in parallel implementations of the FPSM, but it implies intensive data exchange among all the processors involved in the computation, which can degrade performance because of communication latencies. Moreover, the scalability of the 1D domain decomposition is limited, since the number of processors cannot exceed the number of grid points along the directions in which the domain is partitioned. This limitation inhibits efficient exploitation of computational environments with a very large number of processors. In order to overcome the limitations of the 1D domain decomposition we implemented a parallel version of the FPSM based on a 2D domain decomposition, which allows us to achieve a higher degree of parallelism and scalability on massively parallel machines with several thousands of processing elements. The parallel programming is essentially achieved using the MPI protocol, but OpenMP parts are also included in order to exploit single-processor multi-threading capabilities, when available. The developed tool is aimed at the numerical simulation of seismic wave propagation and in particular is intended for earthquake ground motion research. We show the scalability tests performed up to 16k processing elements on the IBM Blue Gene/Q computer at CINECA (Italy), as well as the application to the simulation of earthquake ground motion in the alluvial plain of the Po river (Italy).
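
    The kernel that forces all this communication is the global FFT-based spatial derivative: every output point depends on every input point along the transform direction. A 1D sketch of the FPSM derivative and an explicit wave-equation step follows; the grid, time step, and wave speed are illustrative.

    ```python
    import numpy as np

    def fourier_second_derivative(u, dx):
        """Spectrally accurate second derivative, the core FPSM operation.

        The FFT couples every grid point, which is why distributed-memory
        implementations need the global transposes discussed above."""
        k = 2 * np.pi * np.fft.fftfreq(len(u), d=dx)
        return np.fft.ifft(-k ** 2 * np.fft.fft(u)).real

    # One explicit time step of the 1D wave equation u_tt = c^2 u_xx.
    n, dx, dt, c = 512, 0.01, 1e-3, 1.0
    x = np.arange(n) * dx
    u_prev = u = np.exp(-((x - n * dx / 2) ** 2) / 0.01)
    u_next = 2 * u - u_prev + (c * dt) ** 2 * fourier_second_derivative(u, dx)
    ```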

  6. A massively parallel semi-Lagrangian algorithm for solving the transport equation

    SciTech Connect

    Manson, Russell; Wang, Dali

    2010-01-01

    The scalar transport equation underpins many models employed in science, engineering, technology and business. Application areas include, but are not restricted to, pollution transport, weather forecasting, video analysis and encoding (the optical flow equation), options and stock pricing (the Black-Scholes equation) and spatially explicit ecological models. Unfortunately, finding numerical solutions to this equation that are both fast and accurate is not trivial. Moreover, finding such numerical algorithms that can be implemented efficiently on high performance computer architectures is challenging. In this paper the authors describe a massively parallel algorithm for solving the advection portion of the transport equation. We present an approach which is different from that used in most transport models and which we have tried and tested for various scenarios. The approach employs an intelligent domain decomposition based on the vector field of the system equations, and thus automatically partitions the computational domain into algorithmically autonomous regions. The solution of a classic pure advection transport problem is shown to be conservative, monotonic and highly accurate at large time steps. Additionally, we demonstrate that the algorithm is highly efficient on high performance computer architectures and thus offers a route towards massively parallel application.
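
    For reference, here is a minimal 1D semi-Lagrangian advection step on a periodic grid with constant velocity: each grid point traces its characteristic back by v·dt and interpolates the old field at the departure point, which is what permits the large stable time steps noted above. Linear interpolation is used; the paper's scheme and decomposition are more elaborate.

    ```python
    import numpy as np

    def semi_lagrangian_step(q, v, dt, dx):
        """One semi-Lagrangian advection step on a periodic 1D grid.

        Tracing characteristics backwards keeps the scheme stable even
        when v*dt/dx exceeds 1 (linear interpolation keeps it monotonic)."""
        n = len(q)
        x = np.arange(n)
        depart = (x - v * dt / dx) % n          # departure points (grid units)
        i0 = np.floor(depart).astype(int)
        w = depart - i0
        return (1 - w) * q[i0] + w * q[(i0 + 1) % n]

    # Advect a Gaussian pulse with a Courant number well above 1.
    q = np.exp(-((np.arange(200) - 50.0) ** 2) / 40.0)
    q = semi_lagrangian_step(q, v=1.0, dt=2.5, dx=1.0)
    ```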

  7. On distributed memory MPI-based parallelization of SPH codes in massive HPC context

    NASA Astrophysics Data System (ADS)

    Oger, G.; Le Touzé, D.; Guibert, D.; de Leffe, M.; Biddiscombe, J.; Soumagne, J.; Piccinali, J.-G.

    2016-03-01

    Most particle methods share the problem of high computational cost, and in order to satisfy the demands of solvers, currently available hardware technologies must be fully exploited. Two complementary technologies are now accessible. On the one hand, CPUs can be structured into a multi-node framework, allowing massive data exchanges through a high-speed network; in this case, each node usually comprises several cores available to perform multithreaded computations. On the other hand, GPUs, derived from graphics computing technologies, are able to perform highly multi-threaded calculations with hundreds of independent threads connected together through a common shared memory. This paper is primarily dedicated to the distributed-memory parallelization of particle methods, targeting several thousands of CPU cores. The experience gained clearly shows that parallelizing a particle-based code on moderate numbers of cores can easily lead to acceptable scalability, whilst a scalable speedup on thousands of cores is much more difficult to obtain. The discussion revolves around speeding up particle methods as a whole, in a massive HPC context, by making use of the MPI library. We focus on one particular particle method, Smoothed Particle Hydrodynamics (SPH), one of the most widespread today in the literature as well as in engineering.

  8. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE PAGES

    O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; ...

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  9. Massively Parallel Dantzig-Wolfe Decomposition Applied to Traffic Flow Scheduling

    NASA Technical Reports Server (NTRS)

    Rios, Joseph Lucio; Ross, Kevin

    2009-01-01

    Optimal scheduling of air traffic over the entire National Airspace System is a computationally difficult task. To speed computation, Dantzig-Wolfe decomposition is applied to a known linear integer programming approach for assigning delays to flights. The optimization model is proven to have the block-angular structure necessary for Dantzig-Wolfe decomposition. The subproblems for this decomposition are solved in parallel via independent computation threads. Experimental evidence suggests that as the number of subproblems/threads increases (and their respective sizes decrease), the solution quality, convergence, and runtime improve. A demonstration of this is provided by using one flight per subproblem, which is the finest possible decomposition. This results in thousands of subproblems and associated computation threads. This massively parallel approach is compared to one with few threads and to standard (non-decomposed) approaches in terms of solution quality and runtime. Since this method generally provides a non-integral (relaxed) solution to the original optimization problem, two heuristics are developed to generate an integral solution. Dantzig-Wolfe followed by these heuristics can provide a near-optimal (sometimes optimal) solution to the original problem hundreds of times faster than standard (non-decomposed) approaches. In addition, when massive decomposition is employed, the solution is shown to be more likely integral, which obviates the need for an integerization step. These results indicate that nationwide, real-time, high-fidelity, optimal traffic flow scheduling is achievable for (at least) 3-hour planning horizons.

  10. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    SciTech Connect

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel

    2009-02-15

    We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to the analysis of massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
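
    The kernel being parallelized is Brandes' algorithm: one BFS plus a dependency-accumulation pass per source vertex. The serial Python sketch below shows that kernel; the lock-free parallel versions above distribute the outer loop over sources and use fine-grained concurrent data structures not reproduced here.

    ```python
    from collections import deque

    def betweenness(adj):
        """Brandes' algorithm for unweighted betweenness centrality.

        adj: dict mapping each vertex to a list of neighbours."""
        bc = {v: 0.0 for v in adj}
        for s in adj:
            sigma = {v: 0 for v in adj}; sigma[s] = 1   # shortest-path counts
            dist = {v: -1 for v in adj}; dist[s] = 0
            preds = {v: [] for v in adj}
            order, queue = [], deque([s])
            while queue:                        # BFS from source s
                v = queue.popleft(); order.append(v)
                for w in adj[v]:
                    if dist[w] < 0:
                        dist[w] = dist[v] + 1; queue.append(w)
                    if dist[w] == dist[v] + 1:
                        sigma[w] += sigma[v]; preds[w].append(v)
            delta = {v: 0.0 for v in adj}
            for w in reversed(order):           # dependency accumulation
                for v in preds[w]:
                    delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
                if w != s:
                    bc[w] += delta[w]
        return bc
    ```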

  11. Transcriptional analysis of endocrine disruption using zebrafish and massively parallel sequencing

    PubMed Central

    Baker, Michael E.; Hardiman, Gary

    2014-01-01

    Endocrine disrupting chemicals (EDCs), including plasticizers, pesticides, detergents and pharmaceuticals, affect a variety of hormone-regulated physiological pathways in humans and wildlife. Many EDCs are lipophilic molecules and bind to hydrophobic pockets in steroid receptors, such as the estrogen receptor and androgen receptor, which are important in vertebrate reproduction and development. Indeed, health effects attributed to EDCs include reproductive dysfunction (e.g., reduced fertility, reproductive tract abnormalities and skewed male/female sex ratios in fish), early puberty, various cancers and obesity. A major concern is the effect of exposure to low concentrations of endocrine disruptors in utero and post partum, which may increase the incidence of cancer and diabetes in adults. EDCs affect the transcription of hundreds and even thousands of genes, which has created the need for new tools to monitor the global effects of EDCs. The emergence of massively parallel sequencing for investigating gene transcription provides a sensitive tool for monitoring the effects of EDCs on humans and other vertebrates, as well as for elucidating the mechanism of action of EDCs. Zebrafish conserve many developmental pathways found in humans, which makes zebrafish a valuable model system for studying EDCs, especially their effects on early organ development, because zebrafish embryos are translucent. In this article we review recent advances in massively parallel sequencing approaches with a focus on zebrafish. We make the case that zebrafish exposed to EDCs at different stages of development can provide important insights on EDC effects on human health. PMID:24850832

  12. High-quality draft assemblies of mammalian genomes from massively parallel sequence data

    PubMed Central

    Gnerre, Sante; MacCallum, Iain; Przybylski, Dariusz; Ribeiro, Filipe J.; Burton, Joshua N.; Walker, Bruce J.; Sharpe, Ted; Hall, Giles; Shea, Terrance P.; Sykes, Sean; Berlin, Aaron M.; Aird, Daniel; Costello, Maura; Daza, Riza; Williams, Louise; Nicol, Robert; Gnirke, Andreas; Nusbaum, Chad; Lander, Eric S.; Jaffe, David B.

    2011-01-01

    Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd. PMID:21187386

  13. MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker

    SciTech Connect

    Cantalupo, Christopher; Borrill, Julian; Jaffe, Andrew; Kisner, Theodore; Stompor, Radoslaw

    2009-06-09

    MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data which include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(10¹¹) time samples, O(10⁸) pixels and O(10⁴) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm based around a preconditioned conjugate gradient solver, fast Fourier transforms and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.
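
    The solver at MADmap's core has the following shape: preconditioned conjugate gradients where the system matrix is only ever applied, never formed, so the pointing and noise-filter products (FFTs, sparse matrices) plug in as callables. A generic matrix-free PCG sketch follows; the interfaces are assumptions, not MADmap's API.

    ```python
    import numpy as np

    def pcg(apply_A, b, precond, tol=1e-8, maxiter=500):
        """Preconditioned conjugate gradients with matrix-free operators.

        apply_A and precond are user-supplied callables; the system matrix
        is never assembled explicitly."""
        x = np.zeros_like(b)
        r = b - apply_A(x)
        z = precond(r)
        p = z.copy()
        rz = r @ z
        for _ in range(maxiter):
            Ap = apply_A(p)
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol * np.linalg.norm(b):
                break
            z = precond(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x
    ```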

  14. MPI/OpenMP Hybrid Parallel Algorithm of Resolution of Identity Second-Order Møller-Plesset Perturbation Calculation for Massively Parallel Multicore Supercomputers.

    PubMed

    Katouda, Michio; Nakajima, Takahito

    2013-12-10

    A new algorithm for massively parallel calculations of the electron correlation energy of large molecules, based on the resolution-of-identity second-order Møller-Plesset perturbation (RI-MP2) technique, is developed and implemented in the quantum chemistry software NTChem. In this algorithm, a Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) hybrid parallel programming model is applied to attain efficient parallel performance on massively parallel supercomputers. An in-core storage scheme for the intermediate data of the three-center electron repulsion integrals, utilizing the distributed memory, is developed to eliminate input/output (I/O) overhead. The parallel performance of the algorithm is tested on massively parallel supercomputers such as the K computer (using up to 45,992 central processing unit (CPU) cores) and a commodity Intel Xeon cluster (using up to 8192 CPU cores). The parallel RI-MP2/cc-pVTZ calculation of two-layer nanographene sheets (C150H30)2 (9640 atomic orbitals) is performed using 8991 nodes and 71,288 CPU cores of the K computer.

  15. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains

    PubMed Central

    Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz

    2016-01-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734

  16. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains.

    PubMed

    Torre, Emiliano; Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz; Grün, Sonja

    2016-07-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity.
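
    Step (i)'s central object is easy to state in code: from binned spike trains, build the matrix whose (i, j) entry measures how many neurons are active in both bin i and bin j. The sketch below uses a min-normalized overlap for illustration; ASSET's probabilistic assessment and clustering steps (ii-iii) are not reproduced.

    ```python
    import numpy as np

    def intersection_matrix(spikes):
        """Normalized overlap of active neurons between all pairs of time bins.

        spikes: boolean array (n_neurons, n_bins), True where a neuron fires
        in a bin. Entry (i, j) is the co-active neuron count scaled by the
        smaller active set (one common normalization, assumed here).
        Repeated SSEs appear as diagonal structures of high values."""
        s = spikes.astype(float)
        inter = s.T @ s                        # co-active neuron counts per bin pair
        counts = s.sum(axis=0)                 # active neurons per bin
        norm = np.minimum.outer(counts, counts)
        with np.errstate(invalid="ignore", divide="ignore"):
            m = np.where(norm > 0, inter / norm, 0.0)
        return m
    ```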

  17. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    SciTech Connect

    Earth Sciences Division; Zhang, Keni; Zhang, Keni; Wu, Yu-Shu; Pruess, Karsten

    2008-05-27

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one-, two-, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on TOUGH2 Version 1.4 with the EOS3, EOS9, and T2R3D modules, software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick-start guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version of the TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of the parallel methodology, code structure, and the mathematical and numerical methods used.

  18. Compact Graph Representations and Parallel Connectivity Algorithms for Massive Dynamic Network Analysis

    SciTech Connect

    Madduri, Kamesh; Bader, David A.

    2009-02-15

    Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socioeconomic interactions, social networking web sites, communication traffic, and scientific computing can be intuitively modeled as graphs. We present the first study of novel high-performance combinatorial techniques for analyzing large-scale information networks, encapsulating dynamic interaction data on the order of billions of entities. We present new data structures to represent dynamic interaction networks, and discuss algorithms for processing parallel insertions and deletions of edges in small-world networks. With these new approaches, we achieve an average performance rate of 25 million structural updates per second and a parallel speedup of nearly 28 on a 64-way Sun UltraSPARC T2 multicore processor, for insertions and deletions to a small-world network of 33.5 million vertices and 268 million edges. We also design parallel implementations of fundamental dynamic graph kernels related to connectivity and centrality queries. Our implementations are freely distributed as part of the open-source SNAP (Small-world Network Analysis and Partitioning) complex network analysis framework.
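    For orientation only, here is a toy Python sketch of the kind of dynamic-graph update interface the abstract describes (the actual SNAP structures are compact, cache-friendly C arrays processed with fine-grained parallelism; this mirrors only the update semantics):

```python
from collections import defaultdict

class DynamicGraph:
    """Toy dynamic graph with O(1) expected-time edge insertion/deletion;
    only the interface of the paper's structures is mirrored here."""
    def __init__(self):
        self.adj = defaultdict(set)

    def apply_updates(self, updates):
        """updates: iterable of ('+', u, v) or ('-', u, v) structural updates,
        i.e., the batches a parallel implementation would process concurrently."""
        for op, u, v in updates:
            if op == '+':
                self.adj[u].add(v)
                self.adj[v].add(u)
            else:
                self.adj[u].discard(v)
                self.adj[v].discard(u)

g = DynamicGraph()
g.apply_updates([('+', 0, 1), ('+', 1, 2), ('-', 0, 1)])
print(sorted(g.adj[1]))   # [2]
```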

  19. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    NASA Technical Reports Server (NTRS)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) so that it can effectively model antenna problems; apply lessons learned from high-order/spectral solutions of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  20. Large-Scale Eigenvalue Calculations for Stability Analysis of Steady Flows on Massively Parallel Computers

    SciTech Connect

    Lehoucq, Richard B.; Salinger, Andrew G.

    1999-08-01

    We present an approach for determining the linear stability of steady states of PDEs on massively parallel computers. Linearizing the transient behavior around a steady state leads to a generalized eigenvalue problem. The eigenvalues with largest real part are calculated using Arnoldi's iteration driven by a novel implementation of the Cayley transformation to recast the problem as an ordinary eigenvalue problem. The Cayley transformation requires the solution of a linear system at each Arnoldi iteration, which must be done iteratively for the algorithm to scale with problem size. A representative model problem of 3D incompressible flow and heat transfer in a rotating disk reactor is used to analyze the effect of algorithmic parameters on the performance of the eigenvalue algorithm. Successful calculations of leading eigenvalues for matrix systems of order up to 4 million were performed, identifying the critical Grashof number for a Hopf bifurcation.
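    A serial SciPy sketch of the Cayley-transform idea (our illustration; sigma and mu are generic shift parameters, and the direct factorization here stands in for the paper's scalable iterative linear solves):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigs, splu

def rightmost_eigenvalues(A, B, sigma=1.0, mu=-1.0, k=5):
    """Solve A x = lambda B x for the eigenvalues of largest real part.
    The Cayley transform C = (A - sigma*B)^{-1} (A - mu*B) maps
    lambda -> theta = (lambda - mu) / (lambda - sigma), so eigenvalues with
    Re(lambda) > (sigma + mu)/2 land outside the unit circle and are found
    first by Arnoldi iteration ('LM' = largest magnitude)."""
    lu = splu((A - sigma * B).tocsc())          # direct solve stands in for
    C = LinearOperator(A.shape,                 # the paper's iterative solver
                       matvec=lambda x: lu.solve(A @ x - mu * (B @ x)),
                       dtype=np.float64)
    theta, _ = eigs(C, k=k, which='LM')
    return (sigma * theta - mu) / (theta - 1.0)  # map theta back to lambda

n = 500
A = sp.diags(np.linspace(-5.0, 0.5, n)).tocsr()  # toy operator, rightmost 0.5
B = sp.identity(n, format='csr')
print(np.sort(rightmost_eigenvalues(A, B).real)[-3:])  # ~ [0.478, 0.489, 0.5]
```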

  1. Simulating massively parallel electron beam inspection for sub-20 nm defects

    NASA Astrophysics Data System (ADS)

    Bunday, Benjamin D.; Mukhtar, Maseeh; Quoi, Kathy; Thiel, Brad; Malloy, Matt

    2015-03-01

    SEMATECH has initiated a program to develop massively parallel electron beam defect inspection (MPEBI). Here we use JMONSEL simulations to generate expected imaging responses for chosen test cases of patterns and defects, with the ability to vary parameters for beam energy, spot size, pixel size, and/or defect material and form factor. The patterns are representative of the design rules for an aggressively scaled FinFET-type design. From these simulated images and the resulting shot noise, a signal-to-noise framework is developed which relates to defect detection probabilities. Additionally, with this infrastructure the effects of detection-chain noise and frequency-dependent system response can be modeled, allowing the best recipe parameters for MPEBI validation experiments to be targeted and ultimately leading to insights into how such parameters will impact MPEBI tool design, including the doses necessary for defect detection and estimates of the scanning speeds needed to achieve high throughput for HVM.
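    The shot-noise argument can be made concrete in a few lines of Python (a simplification of the paper's framework, assuming Poisson electron counting, a bright defect, and a simple midpoint threshold; all numbers are illustrative):

```python
import numpy as np
from scipy.stats import norm

def defect_stats(n_ref, n_def):
    """Midpoint-threshold detection and false-alarm probabilities for a
    bright defect (n_def > n_ref mean electrons per pixel), with Poisson
    shot noise approximated as Gaussian (reasonable for counts >~ 20)."""
    thr = 0.5 * (n_ref + n_def)
    p_detect = norm.sf((thr - n_def) / np.sqrt(n_def))
    p_false = norm.sf((thr - n_ref) / np.sqrt(n_ref))
    return p_detect, p_false

# doubling the dose (mean counts) improves both error rates
print(defect_stats(100, 130))   # ~ (0.91, 0.07)
print(defect_stats(200, 260))   # ~ (0.97, 0.02)
```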

  2. Guiding the design of synthetic DNA-binding molecules with massively parallel sequencing.

    PubMed

    Meier, Jordan L; Yu, Abigail S; Korf, Ian; Segal, David J; Dervan, Peter B

    2012-10-24

    Genomic applications of DNA-binding molecules require an unbiased knowledge of their high affinity sites. We report the high-throughput analysis of pyrrole-imidazole polyamide DNA-binding specificity in a 10^12-member DNA sequence library using affinity purification coupled with massively parallel sequencing. We find that even within this broad context, the canonical pairing rules are remarkably predictive of polyamide DNA-binding specificity. However, this approach also allows identification of unanticipated high affinity DNA-binding sites in the reverse orientation for polyamides containing β/Im pairs. These insights allow the redesign of hairpin polyamides with different turn units capable of distinguishing 5'-WCGCGW-3' from 5'-WGCGCW-3'. Overall, this study displays the power of high-throughput methods to aid the optimal targeting of sequence-specific minor groove binding molecules, an essential underpinning for biological and nanotechnological applications.
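    A generic sketch of the enrichment computation that underlies this kind of selection experiment (our illustration, not the authors' pipeline; read handling and normalization are simplified):

```python
from collections import Counter

def kmer_enrichment(selected_reads, input_reads, k=6, pseudo=1.0):
    """Fold enrichment of each k-mer in affinity-selected reads relative to
    the naive input library; high-affinity binding sites rise to the top."""
    def kmer_freqs(reads):
        counts = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
        return counts, sum(counts.values())
    sel, n_sel = kmer_freqs(selected_reads)
    inp, n_inp = kmer_freqs(input_reads)
    return {m: ((sel[m] + pseudo) / n_sel) / ((inp.get(m, 0) + pseudo) / n_inp)
            for m in sel}

top = sorted(kmer_enrichment(["ATGCGCAT"] * 50,
                             ["ATGCGCAT", "GGGGGGGG"] * 25).items(),
             key=lambda kv: -kv[1])[:3]
print(top)
```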

  3. Computations on the massively parallel processor at the Goddard Space Flight Center

    NASA Technical Reports Server (NTRS)

    Strong, James P.

    1991-01-01

    Described are four significant algorithms implemented on the massively parallel processor (MPP) at the Goddard Space Flight Center. Two are in the area of image analysis. Of the other two, one is a mathematical simulation experiment and the other deals with the efficient transfer of data between distantly separated processors in the MPP array. The first algorithm presented is the automatic determination of elevations from stereo pairs. The second algorithm solves mathematical logistic equations capable of producing both ordered and chaotic (or random) solutions. This work can potentially lead to the simulation of artificial life processes. The third algorithm is the automatic segmentation of images into reasonable regions based on some similarity criterion, while the fourth is an implementation of a bitonic sort of data which significantly overcomes the nearest neighbor interconnection constraints on the MPP for transferring data between distant processors.
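    The fourth algorithm is based on the classic bitonic sorting network; a compact serial reference version is sketched below (our illustration - on the MPP, the compare-exchange operations at each stage run simultaneously across the processor array):

```python
def bitonic_sort(a, ascending=True):
    """Serial reference for the bitonic sorting network; len(a) must be a
    power of two. Each _merge level is one parallel compare-exchange stage."""
    if len(a) <= 1:
        return a
    half = len(a) // 2
    # build a bitonic sequence: first half ascending, second half descending
    first = bitonic_sort(a[:half], True)
    second = bitonic_sort(a[half:], False)
    return _merge(first + second, ascending)

def _merge(a, ascending):
    if len(a) <= 1:
        return a
    half = len(a) // 2
    for i in range(half):              # compare-exchange across the halves
        if (a[i] > a[i + half]) == ascending:
            a[i], a[i + half] = a[i + half], a[i]
    return _merge(a[:half], ascending) + _merge(a[half:], ascending)

assert bitonic_sort([7, 3, 0, 5, 2, 6, 1, 4]) == list(range(8))
```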

  4. Demonstration of EDA flow for massively parallel e-beam lithography

    NASA Astrophysics Data System (ADS)

    Brandt, P.; Belledent, J.; Tranquillin, C.; Figueiro, T.; Meunier, S.; Bayle, S.; Fay, A.; Milléquant, M.; Icard, B.; Wieland, M.

    2014-03-01

    Today's soaring complexity in pushing the limits of 193nm immersion lithography drives the development of alternative technologies. One of these alternatives is maskless massively parallel electron beam lithography (MP-EBL), a promising candidate with which future resolution needs can be met at competitive cost. MAPPER Lithography's MATRIX MP-EBL platform has now entered an advanced stage of development. The first tool in this platform, the FLX 1200, will operate using more than 1,300 beams, each one writing a stripe 2.2μm wide, with a 0.2μm stripe-to-stripe overlap allocated for stitching. Each beam is composed of 49 individual sub-beams that can be blanked independently in order to write pixels onto the wafer in a raster scan.

  5. A scalable approach to modeling groundwater flow on massively parallel computers

    SciTech Connect

    Ashby, S.F.; Falgout, R.D.; Tompson, A.F.B.

    1995-12-01

    We describe a fully scalable approach to the simulation of groundwater flow on a hierarchy of computing platforms, ranging from workstations to massively parallel computers. Specifically, we advocate the use of scalable conceptual models in which the subsurface model is defined independently of the computational grid on which the simulation takes place. We also describe a scalable multigrid algorithm for computing the groundwater flow velocities. We are thus able to leverage both the engineer's time spent developing the conceptual model and the computing resources used in the numerical simulation. We have successfully employed this approach at the LLNL site, where we have run simulations ranging in size from just a few thousand spatial zones (on workstations) to more than eight million spatial zones (on the CRAY T3D), all using the same conceptual model.

  6. Massively parallel enzyme kinetics reveals the substrate recognition landscape of the metalloprotease ADAMTS13.

    PubMed

    Kretz, Colin A; Dai, Manhong; Soylemez, Onuralp; Yee, Andrew; Desch, Karl C; Siemieniak, David; Tomberg, Kärt; Kondrashov, Fyodor A; Meng, Fan; Ginsburg, David

    2015-07-28

    Proteases play important roles in many biologic processes and are key mediators of cancer, inflammation, and thrombosis. However, comprehensive and quantitative techniques to define the substrate specificity profile of proteases are lacking. The metalloprotease ADAMTS13 regulates blood coagulation by cleaving von Willebrand factor (VWF), reducing its procoagulant activity. A mutagenized substrate phage display library based on a 73-amino acid fragment of VWF was constructed, and the ADAMTS13-dependent change in library complexity was evaluated over reaction time points, using high-throughput sequencing. Reaction rate constants (kcat/KM) were calculated for nearly every possible single amino acid substitution within this fragment. This massively parallel enzyme kinetics analysis detailed the specificity of ADAMTS13 and demonstrated the critical importance of the P1-P1' substrate residues while defining exosite binding domains. These data provided empirical evidence for the propensity for epistasis within VWF and showed strong correlation to conservation across orthologs, highlighting evolutionary selective pressures for VWF.
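    The essence of extracting rate constants from sequencing counts can be sketched as follows (our simplified model, assuming first-order depletion of each variant and a shared reference sequence; all counts are hypothetical):

```python
import numpy as np

def relative_kcat_km(counts, t, reference):
    """counts: {variant: read counts at each time point}. Assuming first-order
    cleavage, ln(variant/reference) declines linearly in time with a slope
    proportional to -(k_variant - k_reference), so the fitted slope ranks
    variants by kcat/KM relative to the reference."""
    t = np.asarray(t, dtype=float)
    ref = np.asarray(counts[reference], dtype=float)
    slopes = {}
    for variant, c in counts.items():
        y = np.log(np.asarray(c, dtype=float) / ref)
        slopes[variant] = np.polyfit(t, y, 1)[0]
    return slopes  # more negative slope => faster-cleaved variant

demo = {"WT":  [1000, 700, 500, 350],
        "P1'": [1000, 900, 820, 750]}   # hypothetical counts
print(relative_kcat_km(demo, t=[0, 1, 2, 3], reference="WT"))
```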

  7. Simultaneous digital quantification and fluorescence-based size characterization of massively parallel sequencing libraries.

    PubMed

    Laurie, Matthew T; Bertout, Jessica A; Taylor, Sean D; Burton, Joshua N; Shendure, Jay A; Bielas, Jason H

    2013-08-01

    Due to the high cost of failed runs and suboptimal data yields, quantification and determination of fragment size range are crucial steps in the library preparation process for massively parallel sequencing (or next-generation sequencing). Current library quality control methods commonly involve quantification using real-time quantitative PCR and size determination using gel or capillary electrophoresis. These methods are laborious and subject to a number of significant limitations that can make library calibration unreliable. Herein, we propose and test an alternative method for quality control of sequencing libraries using droplet digital PCR (ddPCR). By exploiting a correlation we have discovered between droplet fluorescence and amplicon size, we achieve the joint quantification and size determination of target DNA with a single ddPCR assay. We demonstrate the accuracy and precision of applying this method to the preparation of sequencing libraries.
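    For intuition, the two halves of the assay reduce to a Poisson count correction and a fluorescence-to-size calibration; a hedged sketch (the droplet volume is a typical value, and the calibration coefficients are placeholders that would be fit to instrument-specific data):

```python
import numpy as np

def ddpcr_quantify(n_positive, n_total, droplet_volume_ul=0.00085):
    """Poisson-corrected target concentration (copies/uL) from the fraction
    of positive droplets; 0.85 nL is a typical droplet volume."""
    lam = -np.log(1.0 - n_positive / n_total)   # mean copies per droplet
    return lam / droplet_volume_ul

def amplicon_size(mean_fluorescence, slope=2.0, intercept=1000.0):
    """Hypothetical linear calibration from mean positive-droplet fluorescence
    to amplicon length (bp); slope and intercept must come from calibration."""
    return (mean_fluorescence - intercept) / slope

print(ddpcr_quantify(12000, 18000))   # ~1290 copies/uL
```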

  8. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    DOEpatents

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells is disposed on a stacked CMOS chip's surface at a 45° angle from light-reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections distributing power and signals to components associated with each stacked CMOS chip layer.

  9. Phase space simulation of collisionless stellar systems on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    White, Richard L.

    1987-01-01

    A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self-gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two-dimensional phase space grid (with one space and one velocity dimension). Some results from these calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray X-MP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest-neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem.
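    A minimal semi-Lagrangian split-step update for a 1D-1V phase space grid is sketched below (our illustration of the general technique, not the MPP code; in a self-gravitating run the acceleration would come from a Poisson solve on the density):

```python
import numpy as np

def vlasov_step(f, x, v, L, accel, dt):
    """Advance f(x, v) one step of the collisionless Boltzmann equation by
    operator splitting: drift in x at each velocity, then kick in v by the
    acceleration a(x). Periodic in x; np.interp does the sub-cell shifts."""
    for j, vj in enumerate(v):              # drift: f(x,v) <- f(x - v dt, v)
        f[:, j] = np.interp((x - vj * dt) % L, x, f[:, j], period=L)
    a = accel(x)
    for i in range(len(x)):                 # kick: f(x,v) <- f(x, v - a dt)
        f[i, :] = np.interp(v - a[i] * dt, v, f[i, :], left=0.0, right=0.0)
    return f

x = np.linspace(0.0, 1.0, 64, endpoint=False)
v = np.linspace(-1.0, 1.0, 65)
f = np.exp(-((x[:, None] - 0.5) ** 2) / 0.01 - (v[None, :] ** 2) / 0.1)
f = vlasov_step(f, x, v, L=1.0,
                accel=lambda xs: -0.1 * np.sin(2 * np.pi * xs), dt=0.01)
```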

  10. Massively parallel cis-regulatory analysis in the mammalian central nervous system.

    PubMed

    Shen, Susan Q; Myers, Connie A; Hughes, Andrew E O; Byrne, Leah C; Flannery, John G; Corbo, Joseph C

    2016-02-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell-derived organoids.
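    The core MPRA readout reduces to barcode counting and a depth-normalized RNA/DNA ratio per element; a minimal sketch (our illustration - real pipelines add replicates, dispersion modeling, and barcode quality filters):

```python
from collections import Counter

def mpra_activity(rna_barcodes, dna_barcodes, barcode_to_cre, pseudo=0.5):
    """Per-CRE cis-regulatory activity as the depth-normalized RNA/DNA
    barcode-count ratio; barcode_to_cre maps barcode -> candidate element."""
    rna, dna = Counter(), Counter()
    for bc in rna_barcodes:
        if bc in barcode_to_cre:
            rna[barcode_to_cre[bc]] += 1
    for bc in dna_barcodes:
        if bc in barcode_to_cre:
            dna[barcode_to_cre[bc]] += 1
    n_rna, n_dna = max(sum(rna.values()), 1), max(sum(dna.values()), 1)
    return {cre: ((rna[cre] + pseudo) / n_rna) / ((dna[cre] + pseudo) / n_dna)
            for cre in dna}

bc_map = {"AAAA": "DHS_1", "CCCC": "DHS_2"}   # hypothetical barcodes
print(mpra_activity(["AAAA"] * 30 + ["CCCC"] * 10,
                    ["AAAA"] * 20 + ["CCCC"] * 20, bc_map))
```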

  11. Architecture for next-generation massively parallel maskless lithography system (MPML2)

    NASA Astrophysics Data System (ADS)

    Su, Ming-Shing; Tsai, Kuen-Yu; Lu, Yi-Chang; Kuo, Yu-Hsuan; Pei, Ting-Hang; Yen, Jia-Yush

    2010-03-01

    Electron-beam lithography is promising for future manufacturing technology because it does not suffer from the wavelength limits set by light sources. Since single electron-beam lithography systems share a common throughput problem, a multi-electron-beam lithography (MEBL) system is a feasible alternative using the concept of massive parallelism. In this paper, we evaluate the advantages and disadvantages of different MEBL system architectures and propose our novel Massively Parallel MaskLess Lithography System, MPML2. The MPML2 system targets cost-effective manufacturing at the 32nm node and beyond. The key structure of the proposed system is its beamlet array cells (BACs). Hundreds of BACs are uniformly arranged over the whole wafer area in the proposed system. Each BAC has a data processor and an array of beamlets, and each beamlet consists of an electron-beam source, a source controller, a set of electron lenses, a blanker, a deflector, and an electron detector. These essential parts of the beamlets are integrated using MEMS technology, which increases the density of beamlets and reduces the system cost. The data processor in each BAC processes layout information coming from off-chamber and dispatches it to the corresponding beamlet to control its ON/OFF status. Maskless lithography systems save the high manufacturing cost of masks; however, immense mask data must be handled and transmitted. Therefore, a data compression technique is applied to reduce the required transmission bandwidth. The compression algorithm is fast and efficient, so the real-time decoder can be implemented on-chip. Consequently, the proposed MPML2 can achieve 10 wafers per hour (WPH) throughput for 300mm-wafer systems.

  12. Massively parallel simulation with DOE's ASCI supercomputers : an overview of the Los Alamos Crestone project

    SciTech Connect

    Weaver, R. P.; Gittings, M. L.

    2004-01-01

    The Los Alamos Crestone Project is part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative, or ASCI Program. The main goal of this software development project is to investigate the use of continuous adaptive mesh refinement (CAMR) techniques for application to problems of interest to the Laboratory. There are many code development efforts in the Crestone Project, both unclassified and classified codes. In this overview I will discuss the unclassified SAGE and RAGE codes. The SAGE (SAIC adaptive grid Eulerian) code is a one-, two-, and three-dimensional multimaterial Eulerian massively parallel hydrodynamics code for use in solving a variety of high-deformation flow problems. The RAGE CAMR code is built from the SAGE code by adding various radiation packages, improved setup utilities, and graphics packages, and is used for problems in which radiation transport of energy is important. The goal of these massively parallel versions of the codes is to run extremely large problems in a reasonable amount of calendar time. Our target is scalable performance to ~10,000 processors on a 1 billion CAMR computational cell problem that requires hundreds of variables per cell, multiple physics packages (e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. A general description of the RAGE code has been published in [1], [2], [3], and [4]. Currently, the largest simulations we do are three-dimensional, using around 500 million computation cells and running for literally months of calendar time using ~2000 processors. Current ASCI platforms range from several 3-teraOPS supercomputers to one 12-teraOPS machine at Lawrence Livermore National Laboratory, the White machine, and one 20-teraOPS machine installed at Los Alamos, the Q machine. Each machine is a system comprised of many component parts that must perform in unity for the successful run of these simulations. Key features of any massively parallel system

  13. Integration Architecture of Content Addressable Memory and Massive-Parallel Memory-Embedded SIMD Matrix for Versatile Multimedia Processor

    NASA Astrophysics Data System (ADS)

    Kumaki, Takeshi; Ishizaki, Masakatsu; Koide, Tetsushi; Mattausch, Hans Jürgen; Kuroda, Yasuto; Gyohten, Takayuki; Noda, Hideyuki; Dosaka, Katsumi; Arimoto, Kazutami; Saito, Kazunori

    This paper presents an integration architecture of content addressable memory (CAM) and a massive-parallel memory-embedded SIMD matrix for constructing a versatile multimedia processor. The massive-parallel memory-embedded SIMD matrix has 2,048 2-bit processing elements, which are connected by a flexible switching network, and supports 2-bit 2,048-way bit-serial and word-parallel operations with a single command. The SIMD matrix architecture is verified to be well suited to the repeated arithmetic operation types found in multimedia applications. The proposed architecture additionally exploits CAM technology and therefore enables fast pipelined table-lookup coding operations. Since both arithmetic and table-lookup operations execute extremely fast, the proposed novel architecture can realize efficient and versatile multimedia data processing. Evaluation results of the proposed CAM-enhanced massive-parallel SIMD matrix processor for the frequently used JPEG image-compression application show that the necessary clock cycle count can be reduced by 86% in comparison to a conventional mobile DSP architecture. The determined performances in Mpixel/mm2 are factors of 3.3 and 4.4 better than those of a CAM-less massive-parallel memory-embedded SIMD matrix processor and a conventional mobile DSP, respectively.

  14. The divide-expand-consolidate MP2 scheme goes massively parallel

    NASA Astrophysics Data System (ADS)

    Kristensen, Kasper; Kjærgaard, Thomas; Høyvik, Ida-Marie; Ettenhuber, Patrick; Jørgensen, Poul; Jansik, Branislav; Reine, Simen; Jakowski, Jacek

    2013-07-01

    For large molecular systems, conventional implementations of second-order Møller-Plesset (MP2) theory encounter a scaling wall, both memory- and time-wise. We describe how this scaling wall can be removed. We present a massively parallel algorithm for calculating MP2 energies and densities using the divide-expand-consolidate scheme, in which a calculation on a large system is divided into many small fragment calculations employing local orbital spaces. The resulting algorithm is linear-scaling with system size, exhibits near-perfect parallel scalability, removes memory bottlenecks, and does not involve any I/O. The algorithm employs three levels of parallelisation combined via a dynamic job distribution scheme. Results on two molecular systems containing 528 and 1056 atoms (4278 and 8556 basis functions) using 47,120 and 94,240 cores are presented. The results demonstrate the scalability of the algorithm both with respect to the number of cores and with respect to system size. The presented algorithm is thus highly suited for large supercomputer architectures and allows MP2 calculations on large molecular systems to be carried out within a few hours - for example, the correlated calculation on the molecular system containing 1056 atoms took 2.37 hours using 94,240 cores.
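    The dynamic job distribution idea can be illustrated in a few lines of Python, with a process pool standing in for the MPI-level scheduler (fragment_energy is a placeholder for a local-orbital-space fragment calculation, not the DEC code):

```python
from multiprocessing import Pool

def fragment_energy(fragment):
    # placeholder for one local-orbital-space MP2 fragment calculation;
    # in DEC each fragment runs on its own group of cores
    atoms, basis = fragment
    return 1e-6 * sum(len(a) for a in atoms)   # fake work

def dec_mp2_energy(fragments, n_workers=8):
    # dynamic scheduling: workers pull the next fragment as they become
    # free, which load-balances the very uneven fragment sizes
    with Pool(n_workers) as pool:
        return sum(pool.imap_unordered(fragment_energy, fragments))

if __name__ == "__main__":
    fragments = [((f"C{i}", f"H{i}"), "cc-pVDZ") for i in range(100)]
    print(dec_mp2_energy(fragments))
```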

  15. CRISPR–Cas9-targeted fragmentation and selective sequencing enable massively parallel microsatellite analysis

    PubMed Central

    Shin, GiWon; Grimes, Susan M.; Lee, HoJoon; Lau, Billy T.; Xia, Li C.; Ji, Hanlee P.

    2017-01-01

    Microsatellites are multi-allelic and composed of short tandem repeats (STRs) whose individual motifs range from mononucleotides and dinucleotides up to hexamers. Next-generation sequencing approaches and other STR assays rely on a limited number of PCR amplicons, typically in the tens. Here, we demonstrate STR-Seq, a next-generation sequencing technology that analyses over 2,000 STRs in parallel, and provides the accurate genotyping of microsatellites. STR-Seq employs in vitro CRISPR–Cas9-targeted fragmentation to produce specific DNA molecules covering the complete microsatellite sequence. Amplification-free library preparation provides single molecule sequences without unique molecular barcodes. STR-selective primers enable massively parallel, targeted sequencing of large STR sets. Overall, STR-Seq has higher throughput, improved accuracy and provides a greater number of informative haplotypes compared with other microsatellite analysis approaches. With these new features, STR-Seq can identify a 0.1% minor genome fraction in a DNA mixture composed of different, unrelated samples. PMID:28169275

  16. GPAW - massively parallel electronic structure calculations with Python-based software.

    SciTech Connect

    Enkovaara, J.; Romero, N.; Shende, S.; Mortensen, J.

    2011-01-01

    Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for these kinds of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages such as Python can increase the efficiency of the programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity-enhancing features together with good numerical performance. We have used this approach in implementing the electronic structure simulation software GPAW using the combination of the Python and C programming languages. While the chosen approach works well on standard workstations and in Unix environments, massively parallel supercomputing systems can present challenges in porting, debugging, and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python-based software.

  17. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    DOE PAGES

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through the genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.
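    A toy version of the optimization conveys the idea (our sketch: a mutation-only evolutionary loop minimizing traffic-weighted hop distance, far simpler than the full genetic algorithm used in the paper):

```python
import random

def comm_cost(perm, traffic, hops):
    """Cost of placing task i on node perm[i]: message volume traffic[i][j]
    weighted by the hop distance between the assigned nodes."""
    return sum(traffic[i][j] * hops[perm[i]][perm[j]]
               for i in range(len(perm)) for j in range(len(perm)))

def ga_reorder(traffic, hops, pop=30, gens=300, seed=0):
    rng, n = random.Random(seed), len(traffic)
    population = [rng.sample(range(n), n) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda p: comm_cost(p, traffic, hops))
        survivors = population[:pop // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = rng.randrange(n), rng.randrange(n)
            child[i], child[j] = child[j], child[i]   # swap mutation
            children.append(child)
        population = survivors + children
    return min(population, key=lambda p: comm_cost(p, traffic, hops))

# 8 tasks with ring communication, placed on 8 nodes laid out in a line
traffic = [[1 if abs(i - j) in (1, 7) else 0 for j in range(8)] for i in range(8)]
hops = [[abs(i - j) for j in range(8)] for i in range(8)]
print(ga_reorder(traffic, hops))
```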

  19. Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Dac; Phillips, Carolyn L.; Anderson, Joshua A.; Glotzer, Sharon C.

    2011-11-01

    Molecular dynamics (MD) methods compute the trajectory of a system of point particles in response to a potential function by numerically integrating Newton's equations of motion. Extending these basic methods with rigid body constraints enables composite particles with complex shapes such as anisotropic nanoparticles, grains, molecules, and rigid proteins to be modeled. Rigid body constraints are added to the GPU-accelerated MD package, HOOMD-blue, version 0.10.0. The software can now simulate systems of particles, rigid bodies, or mixed systems in microcanonical (NVE), canonical (NVT), and isothermal-isobaric (NPT) ensembles. It can also apply the FIRE energy minimization technique to these systems. In this paper, we detail the massively parallel scheme that implements these algorithms and discuss how our design is tuned for the maximum possible performance. Two different case studies are included to demonstrate the performance attained, patchy spheres and tethered nanorods. In typical cases, HOOMD-blue on a single GTX 480 executes 2.5-3.6 times faster than LAMMPS executing the same simulation on any number of CPU cores in parallel. Simulations with rigid bodies may now be run with larger systems and for longer time scales on a single workstation than was previously even possible on large clusters.
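    For context, the core MD update the abstract refers to is velocity Verlet; the point-particle version below (our own sketch, not HOOMD-blue code) is what rigid-body support generalizes, replacing per-particle updates with a center-of-mass translation plus a quaternion orientation update per body:

```python
import numpy as np

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations of motion for point particles.
    Rigid body constraints replace this per-particle update with a
    center-of-mass plus quaternion (orientation) update per body."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass
        x += dt * v
        f = force(x)
        v += 0.5 * dt * f / mass
    return x, v

# toy harmonic oscillator: energy is conserved to O(dt^2)
x, v = velocity_verlet(np.array([1.0]), np.array([0.0]),
                       lambda x: -x, mass=1.0, dt=0.01, n_steps=1000)
print(x, v)
```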

  20. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    SciTech Connect

    Newman, G.A.; Commer, M.

    2009-06-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover, when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies that exploit computational parallelism and optimal meshing are required. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24-hour period we were able to image a large-scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with the timeframes required for practical exploration problems.

  1. GPU-accelerated Tersoff potentials for massively parallel Molecular Dynamics simulations

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Dac

    2017-03-01

    The Tersoff potential is one of the empirical many-body potentials that has been widely used in simulation studies at atomic scales. Unlike pair-wise potentials, the Tersoff potential involves three-body terms, which require much more arithmetic operations and data dependency. In this contribution, we have implemented the GPU-accelerated version of several variants of the Tersoff potential for LAMMPS, an open-source massively parallel Molecular Dynamics code. Compared to the existing MPI implementation in LAMMPS, the GPU implementation exhibits a better scalability and offers a speedup of 2.2X when run on 1000 compute nodes on the Titan supercomputer. On a single node, the speedup ranges from 2.0 to 8.0 times, depending on the number of atoms per GPU and hardware configurations. The most notable features of our GPU-accelerated version include its design for MPI/accelerator heterogeneous parallelism, its compatibility with other functionalities in LAMMPS, its ability to give deterministic results and to support both NVIDIA CUDA- and OpenCL-enabled accelerators. Our implementation is now part of the GPU package in LAMMPS and accessible for public use.

  2. Parametric Study of CO2 Sequestration in Geologic Media Using the Massively Parallel Computer Code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Lu, C.; Lichtner, P. C.; Tsimpanogiannis, I. N.

    2005-12-01

    Uncontrolled release of CO2 to the atmosphere has been identified as a major contributing source to the global warming problem. Significant research efforts from the international scientific community are targeted towards stabilization/reduction of CO2 concentrations in the atmosphere while attempting to satisfy our continuously increasing needs for energy. CO2 sequestration (capture, separation, and long-term storage) in various media (e.g. geologic, such as depleted oil reservoirs and saline aquifers; oceanic, at different depths) has been considered as a possible solution to reduce greenhouse gas emissions. In this study we utilize the PFLOTRAN simulator to investigate geologic sequestration of CO2. PFLOTRAN is a massively parallel 3-D reservoir simulator for modeling supercritical CO2 sequestration in geologic formations based on continuum-scale mass and energy conservation. The mass and energy equations are sequentially coupled to reactive transport equations describing multi-component chemical reactions within the formation, including aqueous speciation and the precipitation and dissolution of minerals, to describe aqueous and mineral CO2 sequestration. The effect of the injected CO2 on pH, CO2 concentration within the aqueous phase, mineral stability, and other factors can be evaluated with this model. Parallelization is carried out using the PETSc parallel library package based on MPI, providing high parallel efficiency and allowing simulations with several tens of millions of degrees of freedom to be carried out - ideal for large-scale field applications involving multi-component chemistry. In this work, our main focus is a parametric examination of the effects of reservoir and fluid properties on the sequestration process, such as permeability and capillary pressure functions (e.g. linear, van Genuchten, etc.), diffusion coefficients in a multiphase system, and the sensitivity of component solubility to pressure, temperature, and mole fractions. Several
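    Of the capillary pressure models named above, the van Genuchten form is standard; a hedged sketch follows (parameter values are illustrative, not tied to any particular PFLOTRAN input):

```python
import numpy as np

def van_genuchten_pc(s_eff, alpha=1e-4, n=2.0):
    """Capillary pressure (Pa) versus effective saturation in the van
    Genuchten form Pc = (1/alpha) * (Se**(-1/m) - 1)**(1/n), m = 1 - 1/n.
    alpha (1/Pa) and n are formation-specific fitting parameters."""
    m = 1.0 - 1.0 / n
    s = np.clip(s_eff, 1e-6, 1.0)   # guard the dry-end singularity
    return (s ** (-1.0 / m) - 1.0) ** (1.0 / n) / alpha

print(van_genuchten_pc(np.array([0.2, 0.5, 0.9])))   # Pc rises as Se falls
```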

  3. Switching dynamics of thin film ferroelectric devices - a massively parallel phase field study

    NASA Astrophysics Data System (ADS)

    Ashraf, Md. Khalid

    In this thesis, we investigate the switching dynamics in thin film ferroelectrics. Ferroelectric materials are of inherent interest for low-power and multi-functional devices. However, possible device applications of these materials have been limited due to the poorly understood electromagnetic and mechanical response at the nanoscale in arbitrary device structures. The difficulty in understanding switching dynamics arises mainly from the presence of features at multiple length scales and the nonlinearity associated with the strongly coupled states. For example, in a ferroelectric material, the domain walls are of nm size whereas the domain pattern forms at micron scale. The switching is determined by coupled chemical, electrostatic, mechanical and thermal interactions. Thus, computational understanding of switching dynamics in thin film ferroelectrics, and direct comparison with experiment, pose a significant numerical challenge. We have developed a phase field model that describes the physics of polarization dynamics at the microscopic scale. A number of efficient numerical methods have been applied to achieve massive parallelization of all the calculation steps. Conformally mapped elements, node-wise assembly and prevention of dynamic loading minimized the communication between processors and increased the parallelization efficiency. With these improvements, we have reached the experimental scale - a significant step forward compared to the state-of-the-art thin film ferroelectric switching dynamics models. Using this model, we elucidated the switching dynamics on multiple surfaces of the multiferroic material BFO. We also calculated the switching energy of scaled BFO islands. Finally, we studied the interaction of domain wall propagation with misfit dislocations in the thin film. We believe that the model will be useful in understanding the switching dynamics in many different experimental setups incorporating thin film ferroelectrics.
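    The flavor of a phase-field polarization model can be conveyed with a 1-D time-dependent Ginzburg-Landau cartoon (our sketch only - the thesis model is 3-D and couples electrostatics, elasticity, and thermal effects):

```python
import numpy as np

def tdgl_switch(P, E, dt=0.01, steps=2000, a=-1.0, b=1.0, kappa=1.0):
    """Relax dP/dt = -dF/dP for F = a/2 P^2 + b/4 P^4 - E*P + kappa/2 |grad P|^2.
    a < 0 gives two polarization wells; the applied field E tilts them and
    drives switching. Periodic 1-D grid; all parameters are illustrative."""
    for _ in range(steps):
        lap = np.roll(P, 1) + np.roll(P, -1) - 2.0 * P
        P += dt * (-(a * P + b * P**3) + E + kappa * lap)
    return P

P0 = -np.ones(128) + 0.01 * np.random.default_rng(0).standard_normal(128)
print(tdgl_switch(P0, E=0.6).mean())   # relaxes from -P toward the +P well
```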

  4. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    NASA Technical Reports Server (NTRS)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail, in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton et al. describe an approach for producing hierarchical segmentations (called HSEG) and give a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g. Horowitz and Pavlidis [3]). In addition, HSEG optionally interjects, between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG's computational requirements. To counteract this, a computationally efficient, recursive, divide-and-conquer implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG's recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic
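    One HSWO-style merge step can be sketched compactly (our toy for scalar-valued regions; the increase in total squared error from merging two regions is the Ward-style n1*n2/(n1+n2)*(m1-m2)^2 term):

```python
def hswo_step(regions, adjacency):
    """One hierarchical stepwise-optimization merge. regions maps region id ->
    (pixel_count, mean); adjacency is a set of frozenset({id1, id2}) pairs.
    Merge the adjacent pair whose merge least increases total squared error."""
    def cost(pair):
        (n1, m1), (n2, m2) = (regions[r] for r in pair)
        return n1 * n2 / (n1 + n2) * (m1 - m2) ** 2
    keep, drop = min(adjacency, key=cost)
    n1, m1 = regions[keep]
    n2, m2 = regions.pop(drop)
    regions[keep] = (n1 + n2, (n1 * m1 + n2 * m2) / (n1 + n2))
    # rewire the dropped region's neighbors to the surviving region
    adjacency = {frozenset(keep if r == drop else r for r in p)
                 for p in adjacency} - {frozenset({keep})}
    return regions, adjacency

regions = {1: (10, 5.0), 2: (20, 5.2), 3: (4, 9.0)}
adjacency = {frozenset({1, 2}), frozenset({2, 3})}
print(hswo_step(regions, adjacency))   # merges 1 and 2, the cheapest pair
```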

  5. PORTA: A Massively Parallel Code for 3D Non-LTE Polarized Radiative Transfer

    NASA Astrophysics Data System (ADS)

    Štěpán, J.

    2014-10-01

    The interpretation of the Stokes profiles of the solar (stellar) spectral line radiation requires solving a non-LTE radiative transfer problem that can be very complex, especially when the main interest lies in modeling the linear polarization signals produced by scattering processes and their modification by the Hanle effect. One of the main difficulties is due to the fact that the plasma of a stellar atmosphere can be highly inhomogeneous and dynamic, which implies the need to solve the non-equilibrium problem of generation and transfer of polarized radiation in realistic three-dimensional stellar atmospheric models. Here we present PORTA, a computer program we have developed for solving, in three-dimensional (3D) models of stellar atmospheres, the problem of the generation and transfer of spectral line polarization, taking into account anisotropic radiation pumping and the Hanle and Zeeman effects in multilevel atoms. The numerical method of solution is based on a highly convergent iterative algorithm, whose convergence rate is insensitive to the grid size, and on an accurate short-characteristics formal solver of the Stokes-vector transfer equation which uses monotonic Bézier interpolation. In addition to the iterative method and the 3D formal solver, another important feature of PORTA is a novel parallelization strategy suitable for taking advantage of massively parallel computers. Linear scaling of the solution with the number of processors makes it possible to reduce the solution time by several orders of magnitude. We present useful benchmarks and a few illustrations of applications using a 3D model of the solar chromosphere resulting from MHD simulations. Finally, we present our conclusions with a view to future research. For more details see Štěpán & Trujillo Bueno (2013).

  6. High-order accurate solution of the incompressible Navier-Stokes equations on massively parallel computers

    NASA Astrophysics Data System (ADS)

    Henniger, R.; Obrist, D.; Kleiser, L.

    2010-05-01

    The emergence of "petascale" supercomputers requires us to replace today's simulation codes for (incompressible) flows with codes that use numerical schemes and methods better able to exploit the offered computational power. In that spirit, we present a massively parallel high-order Navier-Stokes solver for large incompressible flow problems in three dimensions. The governing equations are discretized with finite differences in space and a semi-implicit time integration scheme. This discretization leads to a large linear system of equations which is solved with a cascade of iterative solvers. The iterative solver for the pressure uses a highly efficient commutation-based preconditioner which is robust with respect to grid stretching. The efficiency of the implementation is further enhanced by carefully setting the (adaptive) termination criteria for the different iterative solvers. The computational work is distributed to different processing units by a geometric data decomposition in all three dimensions. This decomposition scheme ensures a low communication overhead and excellent scaling capabilities. The discretization is thoroughly validated. First, we verify the convergence orders of the spatial and temporal discretizations for a forced channel flow. Second, we analyze the iterative solution technique by investigating the absolute accuracy of the implementation with respect to the different termination criteria. Third, Orr-Sommerfeld and Squire eigenmodes for plane Poiseuille flow are simulated and compared to analytical results. Fourth, the practical applicability of the implementation is tested for transitional and turbulent channel flow. The results are compared to solutions from a pseudospectral solver. Subsequently, the performance of the commutation-based preconditioner for the pressure iteration is demonstrated. Finally, the excellent parallel scalability of the proposed method is demonstrated with a weak and a strong scaling test on up to

  7. Massively parallel computation of absolute binding free energy with well-equilibrated states

    NASA Astrophysics Data System (ADS)

    Fujitani, Hideaki; Tanida, Yoshiaki; Matsuura, Azuma

    2009-02-01

    A force field formulator for organic molecules (FF-FOM) was developed to assign bond, angle, and dihedral parameters to arbitrary organic molecules in a unified manner, including proteins and nucleic acids. With the unified force field parametrization we performed massively parallel computations of absolute binding free energies for pharmaceutical target proteins and ligands. Compared with a previous calculation using the ff99 force field in the Amber simulation package (Amber99) and ligand charges produced by the Austin Model 1 bond charge correction (AM1-BCC), the unified parametrization gave better absolute binding energies for the FK506 binding protein (FKBP) and ligand system. Our method is based on extensive measurement of the work done between thermodynamic states to calculate the free energy difference, and is equivalent to traditional free energy perturbation. There are important requirements for accurate calculations. The first is a well-equilibrated bound structure, including the conformational change of the protein induced by the binding of the ligand. The second requirement is convergence of the work distribution, with a sufficient number of trajectories and dense spacing of the coupling constant between the ligand and the rest of the system. Finally, the most important requirement is the force field parametrization.
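    The free energy estimator at the core of such work-based methods is the exponential average; a hedged sketch (a Zwanzig/Jarzynski-type estimator of the same family as the paper's approach, with a log-sum-exp guard; units are kcal/mol):

```python
import numpy as np

def free_energy_from_work(work_kcal_mol, temperature=300.0):
    """dF = -kT * ln < exp(-W/kT) > over measured work values, evaluated
    stably via log-sum-exp. The average converges only when the low-work
    tail is well sampled, hence the paper's emphasis on many trajectories
    and dense spacing of the coupling constant."""
    kT = 0.0019872041 * temperature            # Boltzmann constant, kcal/(mol K)
    w = np.asarray(work_kcal_mol) / kT
    return -kT * (np.log(np.mean(np.exp(-(w - w.min())))) - w.min())

# Gaussian work samples: dF should approach mean - var/(2 kT) ~ 1.16 kcal/mol
samples = np.random.default_rng(1).normal(2.0, 1.0, 50000)
print(free_energy_from_work(samples))
```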

  8. Unique archaeal assemblages in the Arctic Ocean unveiled by massively parallel tag sequencing.

    PubMed

    Galand, Pierre E; Casamayor, Emilio O; Kirchman, David L; Potvin, Marianne; Lovejoy, Connie

    2009-07-01

    The Arctic Ocean plays a critical role in controlling nutrient budgets between the Pacific and Atlantic Ocean. Archaea are key players in the nitrogen cycle and in cycling nutrients, but their community composition has been little studied in the Arctic Ocean. Here, we characterize archaeal assemblages from surface and deep Arctic water masses using massively parallel tag sequencing of the V6 region of the 16S rRNA gene. This approach gave very high coverage of the natural communities, allowing a precise description of archaeal assemblages. This first taxonomic description of archaeal communities by tag sequencing shows that it is possible to assign an identity below phylum level to most (95%) of the archaeal V6 tags, and that tag sequencing is a powerful tool for resolving the diversity and distribution of specific microbes in the environment. Marine group I Crenarchaeota was overall the most abundant group in the Arctic Ocean and comprised between 27% and 63% of all tags. Group III Euryarchaeota were more abundant in deep-water masses and represented the largest archaeal group in the deep Atlantic layer of the central Arctic Ocean. Coastal surface waters, in turn, harbored more group II Euryarchaeota. Moreover, the group II sequences that dominated surface waters were different from the group II sequences detected in deep waters, suggesting functional differences between closely related groups. Our results unveil for the first time an archaeal community dominated by group III Euryarchaeota and show biogeographical traits for marine Arctic Archaea.

  9. Photo-patterned free-standing hydrogel microarrays for massively parallel protein analysis

    NASA Astrophysics Data System (ADS)

    Duncombe, Todd A.; Herr, Amy E.

    2015-03-01

    Microfluidic technologies have largely been realized within enclosed microchannels. While powerful, a principal limitation of closed-channel microfluidics is the difficulty of sample extraction and downstream processing. To address this limitation and expand the utility of microfluidic analytical separation tools, we developed an open-channel hydrogel architecture for rapid protein analysis. Designed for compatibility with slab-gel polyacrylamide gel electrophoresis (PAGE) reagents and instruments, we detail the development of free-standing polyacrylamide gel (fsPAG) microstructures supporting electrophoretic performance rivalling that of microfluidic platforms. Owing to its open architecture, the platform can be easily interfaced with automated robotic controllers and downstream processing (e.g., sample spotters, immunological probing, mass spectrometry). The fsPAG devices are directly photopatterned atop, and covalently attached to, planar polymer or glass surfaces. The fast (< 1 hr) design-prototype-test cycle - significantly faster than mold-based fabrication techniques - makes fsPAG microstructures a powerful tool for developing custom analytical assays. Leveraging these rapid prototyping benefits, we up-scale from a unit separation to an array of 96 concurrent fsPAGE assays with a 10 min run time driven by one electrode pair. The fsPAGE platform is uniquely well-suited for massively parallelized proteomics, a major unrealized goal of bioanalytical technology.

  10. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing

    PubMed Central

    Just, Rebecca S.; Irwin, Jodi A.; Parson, Walther

    2015-01-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well-vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10–20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256

  11. Breast cancer genomics from microarrays to massively parallel sequencing: paradigms and new insights.

    PubMed

    Ng, Charlotte K Y; Schultheis, Anne M; Bidard, Francois-Clement; Weigelt, Britta; Reis-Filho, Jorge S

    2015-02-23

    Rapid advancements in massively parallel sequencing methods have enabled the analysis of breast cancer genomes at an unprecedented resolution, which have revealed the remarkable heterogeneity of the disease. As a result, we now accept that despite originating in the breast, estrogen receptor (ER)-positive and ER-negative breast cancers are completely different diseases at the molecular level. It has become apparent that there are very few highly recurrently mutated genes such as TP53, PIK3CA, and GATA3, that no two breast cancers display an identical repertoire of somatic genetic alterations at base-pair resolution and that there might not be a single highly recurrently mutated gene that defines each of the "intrinsic" subtypes of breast cancer (ie, basal-like, HER2-enriched, luminal A, and luminal B). Breast cancer heterogeneity, however, extends beyond the diversity between tumors. There is burgeoning evidence to demonstrate that at least some primary breast cancers are composed of multiple, genetically diverse clones at diagnosis and that metastatic lesions may differ in their repertoire of somatic genetic alterations when compared with their respective primary tumors. Several biological phenomena may shape the reported intratumor genetic heterogeneity observed in breast cancers, including the different mutational processes and multiple types of genomic instability. Harnessing the emerging concepts of the diversity of breast cancer genomes and the phenomenon of intratumor genetic heterogeneity will be essential for the development of optimal methods for diagnosis, disease monitoring, and the matching of patients to the drugs that would benefit them the most.

  12. Massively parallel network architectures for automatic recognition of visual speech signals. Final technical report

    SciTech Connect

    Sejnowski, T.J.; Goldstein, M.

    1990-01-01

    This research sought to produce a massively parallel network architecture that could interpret speech signals from video recordings of human talkers. This report summarizes the project's results: (1) A corpus of video recordings from two human speakers was analyzed with image processing techniques and used as the data for this study; (2) We demonstrated that a feedforward network could be trained to categorize vowels from these talkers. The performance was comparable to that of nearest-neighbor techniques and to trained humans on the same data; (3) We developed a novel approach to sensory fusion by training a network to transform facial images into short-time spectral amplitude envelopes. This information can be used to increase the signal-to-noise ratio and hence the performance of acoustic speech recognition systems in noisy environments; (4) We explored the use of recurrent networks to perform the same mapping for continuous speech. The results of this project demonstrate the feasibility of adding a visual speech recognition component to enhance existing speech recognition systems. Such a combined system could be used in noisy environments, such as cockpits, where improved communication is needed. This demonstration of presymbolic fusion of visual and acoustic speech signals is consistent with our current understanding of human speech perception.

  13. GRay: A MASSIVELY PARALLEL GPU-BASED CODE FOR RAY TRACING IN RELATIVISTIC SPACETIMES

    SciTech Connect

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOPS (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae for their dependence on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as in M87, with the Event Horizon Telescope.

  14. Climate systems modeling on massively parallel processing computers at Lawrence Livermore National Laboratory

    SciTech Connect

    Wehner, W.F.; Mirin, A.A.; Bolstad, J.H.

    1996-09-01

    A comprehensive climate system model is under development at Lawrence Livermore National Laboratory. The basis for this model is a consistent coupling of multiple complex subsystem models, each describing a major component of the Earth's climate. Among these are general circulation models of the atmosphere and ocean, a dynamic and thermodynamic sea ice model, and models of the chemical processes occurring in the air, sea water, and near-surface land. The computational resources necessary to carry out simulations at adequate spatial resolutions for durations of climatic time scales exceed those currently available. Distributed memory massively parallel processing (MPP) computers promise to affordably scale to the computational rates required by directing large numbers of relatively inexpensive processors onto a single problem. We have developed a suite of routines designed to exploit current generation MPP architectures via domain and functional decomposition strategies. These message passing techniques have been implemented in each of the component models and in their coupling interfaces. Production runs of the atmospheric and oceanic components performed on the National Environmental Supercomputing Center (NESC) Cray T3D are described.

  15. Targeted massively parallel sequencing provides comprehensive genetic diagnosis for patients with disorders of sex development.

    PubMed

    Arboleda, V A; Lee, H; Sánchez, F J; Délot, E C; Sandberg, D E; Grody, W W; Nelson, S F; Vilain, E

    2013-01-01

    Disorders of sex development (DSD) are rare disorders in which there is discordance between chromosomal, gonadal, and phenotypic sex. Only a minority of patients clinically diagnosed with DSD obtain a molecular diagnosis, leaving a large gap in our understanding of the prevalence, management, and outcomes in affected patients. We created a novel DSD genetic diagnostic tool, in which sex development genes are captured using RNA probes and undergo massively parallel sequencing. In the pilot group of 14 patients, we determined sex chromosome dosage, copy number variation, and gene mutations. In the patients with a known genetic diagnosis (obtained either on a clinical or research basis), this test identified the molecular cause in 100% (7/7) of patients. In patients in whom no molecular diagnosis had been made, this tool identified a genetic diagnosis in two of seven patients. Targeted sequencing of genes representing a specific spectrum of disorders can result in a higher rate of genetic diagnoses than current diagnostic approaches. Our DSD diagnostic tool provides, for the first time in a single blood test, a comprehensive genetic diagnosis in patients presenting with a wide range of urogenital anomalies.

  16. CuBIC: cumulant based inference of higher-order correlations in massively parallel spike trains

    PubMed Central

    Rotter, Stefan; Grün, Sonja

    2009-01-01

    Recent developments in electrophysiological and optical recording techniques enable the simultaneous observation of large numbers of neurons. A meaningful interpretation of the resulting multivariate data, however, presents a serious challenge. In particular, the estimation of higher-order correlations that characterize the cooperative dynamics of groups of neurons is impeded by the combinatorial explosion of the parameter space. The resulting requirements with respect to sample size and recording time have rendered the detection of coordinated neuronal groups exceedingly difficult. Here we describe a novel approach to infer higher-order correlations in massively parallel spike trains that is less susceptible to these problems. Based on the superimposed activity of all recorded neurons, the cumulant-based inference of higher-order correlations (CuBIC) presented here exploits the fact that the absence of higher-order correlations also imposes strong constraints on correlations of lower order. Thus, estimates of only a few lower-order cumulants suffice to infer higher-order correlations in the population. As a consequence, CuBIC is much more compatible with the constraints of in vivo recordings than previous approaches, as shown by a systematic analysis of its parameter dependence. PMID:19862611
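
    To make the core idea concrete, here is a hedged sketch (not the full CuBIC test) of the quantity it builds on: low-order cumulants of the superimposed population activity, estimated with SciPy's unbiased k-statistics. For independent Poisson-like spike trains, all cumulants of the bin-wise population count are approximately equal, so an excess third cumulant beyond what first- and second-order structure allows is a signature of higher-order correlation.

    ```python
    # Sketch of CuBIC's central quantity (illustrative, not the published test):
    # estimate the first three cumulants of the summed population spike count.
    import numpy as np
    from scipy.stats import kstat

    rng = np.random.default_rng(0)
    n_neurons, n_bins, p_spike = 100, 10_000, 0.02

    spikes = rng.random((n_neurons, n_bins)) < p_spike  # independent surrogate
    z = spikes.sum(axis=0).astype(float)                # superimposed activity

    k1, k2, k3 = (kstat(z, n) for n in (1, 2, 3))       # unbiased k-statistics
    print(f"k1={k1:.3f}  k2={k2:.3f}  k3={k3:.3f}")     # near-equal if Poisson
    ```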

  17. An atlas of human gene expression from massively parallel signature sequencing (MPSS)

    PubMed Central

    Jongeneel, C. Victor; Delorenzi, Mauro; Iseli, Christian; Zhou, Daixing; Haudenschild, Christian D.; Khrebtukova, Irina; Kuznetsov, Dmitry; Stevenson, Brian J.; Strausberg, Robert L.; Simpson, Andrew J.G.; Vasicek, Thomas J.

    2005-01-01

    We have used massively parallel signature sequencing (MPSS) to sample the transcriptomes of 32 normal human tissues to an unprecedented depth, thus documenting the patterns of expression of almost 20,000 genes with high sensitivity and specificity. The data confirm the widely held belief that differences in gene expression between cell and tissue types are largely determined by transcripts derived from a limited number of tissue-specific genes, rather than by combinations of more promiscuously expressed genes. Expression of a little more than half of all known human genes seems to account for both the common requirements and the specific functions of the tissues sampled. A classification of tissues based on patterns of gene expression largely reproduces classifications based on anatomical and biochemical properties. The unbiased sampling of the human transcriptome achieved by MPSS supports the idea that most human genes have been mapped, if not functionally characterized. This data set should prove useful for the identification of tissue-specific genes, for the study of global changes induced by pathological conditions, and for the definition of a minimal set of genes necessary for basic cell maintenance. The data are available on the Web at http://mpss.licr.org and http://sgb.lynxgen.com. PMID:15998913

  18. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    SciTech Connect

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels, such as ethanol, and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions, and 18 larger deletions leading to the loss of more than 100 kb of genomic DNA. From these events we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides the first genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus, and suggests new areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.

  19. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing.

    PubMed

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K; Culley, David E; Collett, James R; Martin, Joel; Druzhinina, Irina S; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P; Baker, Scott E; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.

  20. Massively parallel sequencing-based survey of eukaryotic community structures in Hiroshima Bay and Ishigaki Island.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urusizaki, Shingo; Takano, Yoshihito; Hongo, Yuki; Kameda, Takahiko; Abe, Kazuo

    2016-02-01

    In this study, we compared eukaryote biodiversity between Hiroshima Bay and Ishigaki Island in Japanese coastal waters by using a massively parallel sequencing (MPS)-based technique to collect preliminary data. The relative abundance of Alveolata was highest in both localities, and the second most abundant group was Stramenopiles, Opisthokonta, or Hacrobia, depending on the sample considered. For microalgal phyla, the relative abundance of operational taxonomic units (OTUs) and the number of MPS reads were highest for Dinophyceae in both localities, followed by Bacillariophyceae in Hiroshima Bay, and by Bacillariophyceae or Chlorophyceae in Ishigaki Island. The numbers of OTUs detected in Hiroshima Bay and Ishigaki Island were 645 and 791, respectively, and 15.3% and 12.5% of the OTUs were common between the two localities. In the non-metric multidimensional scaling analysis, the samples from the two localities were plotted in different positions. In the dendrogram developed using similarity indices, the samples clustered into different nodes by locality, with high multiscale bootstrap values, reflecting geographic differences in biodiversity. Thus, we succeeded in demonstrating biodiversity differences between the two localities, although the MPS read numbers were not especially high. The corresponding analysis showed a clear seasonal change in the biodiversity of Hiroshima Bay, but no clear seasonality in Ishigaki Island. The MPS-based technique thus offers high performance, detecting several hundred OTUs from a single sample, and appears well suited to routine monitoring programs.
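
    As a hedged illustration of the ordination step reported above (the authors' actual pipeline is not reproduced here), the sketch below computes Bray-Curtis dissimilarities between per-sample OTU read-count vectors and embeds them with non-metric MDS; the sample and OTU counts are invented stand-ins.

    ```python
    # Illustrative NMDS ordination of an OTU count table (assumed workflow).
    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from sklearn.manifold import MDS

    rng = np.random.default_rng(0)
    counts = rng.poisson(5.0, size=(12, 700))        # 12 samples x 700 OTUs

    d = squareform(pdist(counts, metric="braycurtis"))
    nmds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
               n_init=10, random_state=0)
    coords = nmds.fit_transform(d)   # samples from one locality should cluster
    print(coords.shape)
    ```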

  1. Clinical and ethical considerations of massively parallel sequencing in transplantation science

    PubMed Central

    Scherer, Andreas

    2013-01-01

    Massively parallel sequencing (MPS), alias next-generation sequencing, is making its way from research laboratories into applied sciences and clinics. MPS is a framework of experimental procedures offering possibilities for genome research and genetics that could only be dreamed of until around 2005, when these technologies became available. Sequencing of a transcriptome, an exome, or even an entire genome is now possible within a time frame and at a precision that we could only hope for 10 years ago. Linking other experimental procedures with MPS enables researchers to study secondary DNA modifications across the entire genome and protein binding sites, to name a few applications. How the advancements of sequencing technologies can contribute to transplantation science is the subject of this discussion: immediate applications are in graft matching via human leukocyte antigen sequencing, as part of systems biology approaches that shed light on gene expression processes during immune response, as biomarkers of graft rejection, and in exploring changes of microbiomes as a result of transplantation. Of considerable importance is the socio-ethical aspect of data ownership, privacy, informed consent, and the reporting of results to study participants. While the technology is advancing rapidly, legislation is lagging behind due to the globalisation of data requisition, banking and sharing. PMID:24392310

  2. Radiation hydrodynamics using characteristics on adaptive decomposed domains for massively parallel star formation simulations

    NASA Astrophysics Data System (ADS)

    Buntemeyer, Lars; Banerjee, Robi; Peters, Thomas; Klassen, Mikhail; Pudritz, Ralph E.

    2016-02-01

    We present an algorithm for solving the radiative transfer problem on massively parallel computers using adaptive mesh refinement and domain decomposition. The solver is based on the method of characteristics, which requires an adaptive raytracer that integrates the equation of radiative transfer. The radiation field is split into local and global components which are handled separately to overcome the non-locality problem. The solver is implemented in the framework of the magneto-hydrodynamics code FLASH and is coupled via an operator splitting step. The goal is the study of radiation in the context of star formation simulations, with a focus on early disc formation and evolution. This requires a proper treatment of radiation physics that covers both the optically thin and the optically thick regimes, and the transition region in particular. We successfully show the accuracy and feasibility of our method in a series of standard radiative transfer problems and two 3D collapse simulations resembling the early stages of protostar and disc formation.
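
    The cell-by-cell update applied along each characteristic can be stated compactly: with optical depth increment Δτ through a cell and a piecewise-constant source function S, the formal solution gives I ← I e^(−Δτ) + S (1 − e^(−Δτ)). The sketch below marches one ray under those assumptions; it illustrates the method of characteristics only and is unrelated to the FLASH implementation.

    ```python
    # March specific intensity along one ray, cell by cell (formal solution
    # with a piecewise-constant source function; illustrative only).
    import numpy as np

    def integrate_ray(I0, kappa, rho, S, ds):
        I = I0
        for k in range(len(S)):
            dtau = kappa[k] * rho[k] * ds[k]    # optical depth of cell k
            att = np.exp(-dtau)
            I = I * att + S[k] * (1.0 - att)    # I <- I*e^-dtau + S*(1-e^-dtau)
        return I

    n = 64
    I_out = integrate_ray(0.0, np.full(n, 0.1), np.ones(n), np.ones(n),
                          np.full(n, 0.5))
    print(I_out)   # tends to S = 1 as the path becomes optically thick
    ```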

  3. Adaptive Flow Simulation of Turbulence in Subject-Specific Abdominal Aortic Aneurysm on Massively Parallel Computers

    NASA Astrophysics Data System (ADS)

    Sahni, Onkar; Jansen, Kenneth; Shephard, Mark; Taylor, Charles

    2007-11-01

    Flow within the healthy human vascular system is typically laminar, but diseased conditions can alter the geometry sufficiently to produce transitional/turbulent flows in focal regions of (and immediately downstream of) the diseased section. The mean unsteadiness (pulsatile or respiratory cycle) further complicates the situation, making traditional turbulence simulation techniques (e.g., Reynolds-averaged Navier-Stokes simulations (RANSS)) suspect. At the other extreme, direct numerical simulation (DNS), while fully appropriate, can lead to large computational expense, particularly when the simulations must be done quickly since they are intended to affect the outcome of a medical treatment (e.g., virtual surgical planning). Producing simulations in a clinically relevant time frame requires: 1) adaptive meshing techniques that closely match the desired local mesh resolution in all three directions to the highly anisotropic physical length scales in the flow; 2) efficient solution algorithms; and 3) excellent scaling on massively parallel computers. In this presentation we will demonstrate results for a subject-specific simulation of an abdominal aortic aneurysm using a stabilized finite element method on anisotropically adapted meshes consisting of O(10^8) elements over O(10^4) processors.

  4. Comparison of Two Massively Parallel Sequencing Platforms using 83 Single Nucleotide Polymorphisms for Human Identification.

    PubMed

    Apaga, Dame Loveliness T; Dennis, Sheila E; Salvador, Jazelyn M; Calacal, Gayvelline C; De Ungria, Maria Corazon A

    2017-03-24

    The potential of Massively Parallel Sequencing (MPS) technology to vastly expand the capabilities of human identification led to the emergence of different MPS platforms that use forensically relevant genetic markers. Two of the MPS platforms that are currently available are the MiSeq® FGx™ Forensic Genomics System (Illumina) and the HID-Ion Personal Genome Machine (PGM™) (Thermo Fisher Scientific). These are coupled with the ForenSeq™ DNA Signature Prep kit (Illumina) and the HID-Ion AmpliSeq™ Identity Panel (Thermo Fisher Scientific), respectively. In this study, we compared the genotyping performance of the two MPS systems based on 83 SNP markers that are present in both MPS marker panels. Results show that the MiSeq® FGx™ has greater sample-to-sample variation than the HID-Ion PGM™ in terms of read counts for all the 83 SNP markers. Allele coverage ratio (ACR) values show generally balanced heterozygous reads for both platforms. Two and four SNP markers from the MiSeq® FGx™ and HID-Ion PGM™, respectively, have average ACR values lower than the recommended value of 0.67. Comparison of genotype calls showed 99.7% concordance between the two platforms.
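
    The allele coverage ratio used above has a simple definition: for a heterozygous locus, the lower of the two allele read counts divided by the higher, so 1.0 is perfectly balanced and values below the recommended 0.67 flag imbalance. A minimal sketch follows; the locus names and read counts are invented for illustration.

    ```python
    # Allele coverage ratio (ACR) for heterozygous SNP calls.
    def allele_coverage_ratio(reads_a: int, reads_b: int) -> float:
        if min(reads_a, reads_b) == 0:
            return 0.0                       # apparent dropout / homozygote
        return min(reads_a, reads_b) / max(reads_a, reads_b)

    locus_reads = {"snp_001": (412, 388), "snp_002": (520, 203)}  # invented
    for snp, (a, b) in locus_reads.items():
        acr = allele_coverage_ratio(a, b)
        flag = "  (imbalanced)" if acr < 0.67 else ""
        print(f"{snp}: ACR = {acr:.2f}{flag}")
    ```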

  5. Massively parallel energy space exploration for uncluttered visualization of vascular structures.

    PubMed

    Jeon, Yongkweon; Won, Joong-Ho; Yoon, Sungroh

    2013-01-01

    Images captured using computed tomography and magnetic resonance angiography are used in the examination of the abdominal aorta and its branches. The examination of all clinically relevant branches simultaneously in a single 2-D image without any misleading overlaps facilitates the diagnosis of vascular abnormalities. This problem is called uncluttered single-image visualization (USIV). We can solve the USIV problem by assigning energy-based scores to visualization candidates and then finding the candidate that optimizes the score; this approach is similar to the manner in which the protein side-chain placement problem has been solved. To obtain near-optimum images, we need to explore the energy space extensively, which is often time consuming. This paper describes a method for exploring the energy space in a massively parallel fashion using graphics processing units. According to our experiments, in which we used 30 images obtained from five patients, the proposed method can reduce the total visualization time substantially. We believe that the proposed method can make a significant contribution to the effective visualization of abdominal vascular structures and precise diagnosis of related abnormalities.
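
    The search pattern described above can be sketched in a few lines: generate many candidate layouts, score each with an energy function, and keep the argmin. The energy below is a toy overlap penalty, not the paper's scoring model, and on a GPU each candidate would be scored concurrently rather than via NumPy vectorization.

    ```python
    # Massively parallel candidate scoring, sketched with NumPy (toy energy).
    import numpy as np

    rng = np.random.default_rng(0)
    n_candidates, n_branches = 100_000, 12
    # Hypothetical per-branch rotation angles for each candidate layout.
    params = rng.uniform(-np.pi, np.pi, size=(n_candidates, n_branches))

    def energy(p):
        # Toy penalty: layouts whose branch angles crowd together score worse.
        gaps = np.diff(np.sort(p, axis=1), axis=1)
        return -gaps.min(axis=1)            # larger minimum gap -> lower energy

    scores = energy(params)
    best = params[np.argmin(scores)]        # least-cluttered candidate layout
    print(scores.min())
    ```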

  6. Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing.

    PubMed

    Frampton, Garrett M; Fichtenholtz, Alex; Otto, Geoff A; Wang, Kai; Downing, Sean R; He, Jie; Schnall-Levin, Michael; White, Jared; Sanford, Eric M; An, Peter; Sun, James; Juhn, Frank; Brennan, Kristina; Iwanik, Kiel; Maillet, Ashley; Buell, Jamie; White, Emily; Zhao, Mandy; Balasubramanian, Sohail; Terzic, Selmira; Richards, Tina; Banning, Vera; Garcia, Lazaro; Mahoney, Kristen; Zwirko, Zac; Donahue, Amy; Beltran, Himisha; Mosquera, Juan Miguel; Rubin, Mark A; Dogan, Snjezana; Hedvat, Cyrus V; Berger, Michael F; Pusztai, Lajos; Lechner, Matthias; Boshoff, Chris; Jarosz, Mirna; Vietz, Christine; Parker, Alex; Miller, Vincent A; Ross, Jeffrey S; Curran, John; Cronin, Maureen T; Stephens, Philip J; Lipson, Doron; Yelensky, Roman

    2013-11-01

    As more clinically relevant cancer genes are identified, comprehensive diagnostic approaches are needed to match patients to therapies, raising the challenge of optimization and analytical validation of assays that interrogate millions of bases of cancer genomes altered by multiple mechanisms. Here we describe a test based on massively parallel DNA sequencing to characterize base substitutions, short insertions and deletions (indels), copy number alterations and selected fusions across 287 cancer-related genes from routine formalin-fixed and paraffin-embedded (FFPE) clinical specimens. We implemented a practical validation strategy with reference samples of pooled cell lines that model key determinants of accuracy, including mutant allele frequency, indel length and amplitude of copy change. Test sensitivity achieved was 95-99% across alteration types, with high specificity (positive predictive value >99%). We confirmed accuracy using 249 FFPE cancer specimens characterized by established assays. Application of the test to 2,221 clinical cases revealed clinically actionable alterations in 76% of tumors, three times the number of actionable alterations detected by current diagnostic tests.

  7. The use of targeted genomic capture and massively parallel sequencing in diagnosis of Chinese Leukoencephalopathies

    PubMed Central

    Wang, Xiaole; He, Fang; Yin, Fei; Chen, Chao; Wu, Liwen; Yang, Lifen; Peng, Jing

    2016-01-01

    Leukoencephalopathies are diseases with high clinical heterogeneity, and in clinical practice it is difficult for doctors to make a definite etiological diagnosis. Here, we designed a custom probe library containing the known pathogenic genes reported to be associated with leukoencephalopathies, performed targeted gene capture and massively parallel sequencing (MPS) in 49 Chinese patients with white matter damage as the main imaging change, and validated the findings in the probands' parents by Sanger sequencing. As a result, pathogenic mutations were identified in 40.8% (20/49) of the patients, including four associated with metachromatic leukodystrophy, three associated with vanishing white matter leukoencephalopathy, three associated with mitochondrial complex I deficiency, one associated with globoid cell leukodystrophy (Krabbe disease), three associated with megalencephalic leukoencephalopathy with subcortical cysts, two associated with Pelizaeus-Merzbacher disease, two associated with X-linked adrenoleukodystrophy, one associated with Zellweger syndrome, and one associated with Alexander disease. Targeted capture and MPS enable the identification of mutations of all classes causing leukoencephalopathy. Our study combines targeted capture and MPS technology with clinical and genetic diagnosis and highlights its usefulness for rapid and comprehensive genetic testing in the clinical setting. This method will also expand our knowledge of the genetic and clinical spectra of leukoencephalopathy. PMID:27779215

  8. Evaluation of A549 as a new vaccine cell substrate: digging deeper with massively parallel sequencing.

    PubMed

    Shabram, Paul; Kolman, John L

    2014-01-01

    In the past three decades, the use of tumorigenic cell substrates has been the topic of five Vaccine and Related Biological Products Advisory Committee (VRBPAC) meetings, including a review of the A549 cell line in September 2012. Over that period, major technological advances in biotechnology have improved our ability to assess the risk associated with using a tumorigenic cell line. As part of the September 2012 review, we assessed the history of A549 cells and evaluated the probable transforming event based on patterns of mutations to cancer genes. In addition, massively parallel sequencing was used first to screen and then to augment the characterization of A549 cells, searching for hidden viral threats by sequencing the entire cellular transcriptome and comparing sequences to a curated viral sequence database. Based upon the combined results of next-generation sequencing technology along with standard cell characterization as outlined in published regulatory guidances, we believe that A549 cells pose no more risk than any other cell substrate for the manufacture of vaccines.

  9. ALEGRA -- A massively parallel h-adaptive code for solid dynamics

    SciTech Connect

    Summers, R.M.; Wong, M.K.; Boucheron, E.A.; Weatherby, J.R.

    1997-12-31

    ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach, the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high-precision simulation capabilities for ALEGRA, without the computational cost of using a global, highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.

  10. Whole genome characterization of hepatitis B virus quasispecies with massively parallel pyrosequencing.

    PubMed

    Li, F; Zhang, D; Li, Y; Jiang, D; Luo, S; Du, N; Chen, W; Deng, L; Zeng, C

    2015-03-01

    Viral quasispecies analysis is important for basic and clinical research. This study was designed to profile genome-wide hepatitis B virus (HBV) mutations with detailed variant composition in individual patients, and especially to examine quasispecies evolution correlating with liver disease progression. We characterized viral populations by massively parallel pyrosequencing at the whole HBV genome level in 17 patients with advanced liver disease (ALD) and 30 chronic carriers (CC). Average sequencing coverages of 2047× and 687× were achieved in the ALD and CC groups, respectively. Deep sequencing data resolved the landscapes of HBV substitutions and a more complicated quasispecies composition than previously observed. The substitution frequencies in quasispecies clustered as either more than 80% or less than 20%, forming a unique U-shaped distribution pattern in both clinical groups. Furthermore, quantitative comparison of mutation frequencies at each site between the two groups yielded a spectrum of substitutions associated with liver disease progression, among which C2288A/T, C2304A, and A/G2525C/T were novel candidates. Moreover, distinct deletion patterns in the preS, X, and C regions were found between the two groups. In conclusion, pyrosequencing of the whole HBV genome revealed a panorama of viral quasispecies composition, the characteristics of the substitution distribution, and mutations correlating with severe liver disease.
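
    As a hedged sketch of how per-site substitution frequencies like those above are obtained (illustrative, not the study's pipeline): at each reference position, count the aligned read bases and take the fraction that disagree with the reference. Plotting these fractions across the genome reproduces the kind of U-shaped distribution the authors describe.

    ```python
    # Per-site substitution frequency from a toy pileup of read base counts.
    from collections import Counter

    reference = "ACGT"
    pileup = [Counter(A=95, C=5),    # position 0: 5% substitution
              Counter(C=88, A=12),   # position 1: 12%
              Counter(G=100),        # position 2: 0%
              Counter(T=15, C=85)]   # position 3: 85% (near-fixed variant)

    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        mismatches = depth - counts[reference[pos]]
        print(f"pos {pos}: substitution frequency = {mismatches / depth:.1%}")
    ```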

  11. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    PubMed Central

    Warshauer, David H.; Churchill, Jennifer D.; Novroski, Nicole; King, Jonathan L.; Budowle, Bruce

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite the limited sampling, two of the most frequently observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for population characterization of STR alleles. PMID:26391384

  12. Massively parallel LES of azimuthal thermo-acoustic instabilities in annular gas turbines

    NASA Astrophysics Data System (ADS)

    Wolf, P.; Staffelbach, G.; Roux, A.; Gicquel, L.; Poinsot, T.; Moureau, V.

    2009-06-01

    Increasingly stringent regulations and the need to tackle rising fuel prices have placed great emphasis on the design of aeronautical gas turbines, which are unfortunately more and more prone to combustion instabilities. In the particular field of annular combustion chambers, these instabilities often take the form of azimuthal modes. To predict these modes, one must compute the full combustion chamber, which remained out of reach until the very recent development of massively parallel computers. In this article, full annular Large Eddy Simulations (LES) of two helicopter combustors, which differ only in the swirlers' design, are performed. In both computations, LES captures self-established rotating azimuthal modes. However, the two cases exhibit different thermo-acoustic responses and the resulting limit-cycles are different. With the first design, a self-excited strong instability develops, leading to pulsating flames and local flashback. In the second case, the flames are much less affected by the azimuthal mode and remain stable, allowing acceptable operation. Hence, this study highlights the potential of LES for discriminating between injection system designs. To cite this article: P. Wolf et al., C. R. Mecanique 337 (2009).

  13. Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing

    PubMed Central

    Yassour, Moran; Kaplan, Tommy; Fraser, Hunter B.; Levin, Joshua Z.; Pfiffner, Jenna; Adiconis, Xian; Schroth, Gary; Luo, Shujun; Khrebtukova, Irina; Gnirke, Andreas; Nusbaum, Chad; Thompson, Dawn-Anne; Friedman, Nir; Regev, Aviv

    2009-01-01

    Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of ESTs or cDNA libraries, are expensive and labor-intensive. Here, we present a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalog. Our approach automatically and fully defines 86% of the genes expressed under the given conditions, and discovers 160 previously undescribed transcription units of 250 bp or longer. It correctly demarcates the 5′ and 3′ UTR boundaries of 86% and 77% of expressed genes, respectively. The method further identifies 83% of known splice junctions in expressed genes, and discovers 25 previously uncharacterized introns, including 2 cases of condition-dependent intron retention. Our framework is applicable to poorly understood organisms, and can lead to greater understanding of the transcribed elements in an explored genome. PMID:19208812

  14. Nanopantography: A new method for massively parallel nanopatterning over large areas

    NASA Astrophysics Data System (ADS)

    Xu, Lin

    Nanopantography, a radically new method for the versatile fabrication of sub-20 nm features in a massively parallel fashion, represents a breakthrough in nanotechnology. The concept of this technique is to focus ion "beamlets" in parallel to write identical, arbitrary nano-patterns. Depending on the ion species, nanopatterns can be either etched or deposited by nanopantography. An array of electrostatic lenses and a broad-area, directional, monoenergetic ion beam are required to implement nanopantography. This dissertation is dedicated to extracting an ion beam with the desired properties from a plasma source and realizing nanopantography using this beam. A novel ion extraction strategy has been used to extract a nearly monoenergetic, energy-specified ion beam from a capacitively coupled or an inductively coupled, pulsed Ar plasma. The electron temperature decayed rapidly in the afterglow, resulting in a uniform plasma potential and minimal energy spread for ions extracted in the afterglow. Ion energy was controlled by a DC bias, or alternatively by a high-voltage pulse, on the ring electrode surrounding the plasma. Langmuir probe measurements indicated that this bias raised the plasma potential without heating the electrons in the afterglow. The energy spread was 3.4 eV (FWHM) for a peak ion beam energy of 102.0 eV. Similar results were obtained in an inductively coupled pulsed plasma when the acceleration ring was pulsed exclusively during the afterglow. To achieve Ni deposition by nanopantography, higher Ni atom and ion densities are desired in the plasma source. An ionized physical vapor deposition (IPVD) system with a Ni internal RF coil and Ni target was used to introduce Ni atoms, and a fraction of the atoms becomes ionized in the high-density plasma. Optical emission spectroscopy (OES) and optical absorption spectroscopy (OAS), in combination with global models, were used to determine the Ni atom and ion density. For a pressure of 8-20 mTorr and coil power of 40

  15. Tubal Ligation

    MedlinePlus

    ... The insert causes scar tissue to form and seal off the tubes. Tubal ligation is an abdominal ... instruments passed through the abdominal wall, your doctor seals the fallopian tubes by destroying segments of the ...

  16. Massively parallel computation of 3D flow and reactions in chemical vapor deposition reactors

    SciTech Connect

    Salinger, A.G.; Shadid, J.N.; Hutchinson, S.A.; Hennigan, G.L.; Devine, K.D.; Moffat, H.K.

    1997-12-01

    Computer modeling of Chemical Vapor Deposition (CVD) reactors can greatly aid in the understanding, design, and optimization of these complex systems. Modeling is particularly attractive in these systems since the costs of experimentally evaluating many design alternatives can be prohibitively expensive, time consuming, and even dangerous when working with toxic chemicals like arsine (AsH₃). Until now, predictive modeling has not been possible for most systems since the behavior is three-dimensional and governed by complex reaction mechanisms. In addition, CVD reactors often exhibit large thermal gradients, large changes in physical properties over regions of the domain, and significant thermal diffusion for gas mixtures with widely varying molecular weights. As a result, significant simplifications have been made in the models, which erode the accuracy of the models' predictions. In this paper, the authors will demonstrate how the vast computational resources of massively parallel computers can be exploited to make possible the analysis of models that include coupled fluid flow and detailed chemistry in three-dimensional domains. For the most part, models have either simplified the reaction mechanisms and concentrated on the fluid flow, or simplified the fluid flow and concentrated on rigorous reactions. An important CVD research thrust has been detailed modeling of fluid flow and heat transfer in the reactor vessel, treating transport and reaction of chemical species either very simply or as a totally decoupled problem. Using the analogy between heat transfer and mass transfer, and the fact that deposition is often diffusion limited, much can be learned from these calculations; however, the effects of thermal diffusion, the change in physical properties with composition, and the incorporation of surface reaction mechanisms are not included in this model, nor can transitions to three-dimensional flows be detected.

  17. Three-dimensional electromagnetic modeling and inversion on massively parallel computers

    SciTech Connect

    Newman, G.A.; Alumbaugh, D.L.

    1996-03-01

    This report has demonstrated techniques that can be used to construct solutions to the 3-D electromagnetic inverse problem using full wave equation modeling. To this point, great progress has been made in developing an inverse solution using the method of conjugate gradients, which employs a 3-D finite difference solver to construct model sensitivities and predicted data. The forward modeling code has been developed to incorporate absorbing boundary conditions for high frequency solutions (radar), as well as complex electrical properties, including electrical conductivity, dielectric permittivity and magnetic permeability. In addition, both forward and inverse codes have been ported to a massively parallel computer architecture, which allows for more realistic solutions than can be achieved with serial machines. While the inversion code has been demonstrated on field data collected at the Richmond field site, techniques for appraising the quality of the reconstructions still need to be developed. Here it is suggested that, rather than employing direct matrix inversion to construct the model covariance matrix (which would be impossible because of the size of the problem), one can linearize about the 3-D model achieved in the inversion and use Monte Carlo simulations to construct it. Using these appraisal and construction tools, it is now necessary to demonstrate 3-D inversion for a variety of EM data sets that span the frequency range from induction sounding to radar: below 100 kHz to 100 MHz. Appraised 3-D images of the Earth's electrical properties can provide researchers opportunities to infer the flow paths, flow rates, and perhaps the chemistry of fluids in geologic media. It also offers a means to study the frequency-dependent behavior of the properties in situ. This is of significant relevance to the Department of Energy, being paramount to the characterization and monitoring of environmental waste sites and to oil and gas exploration.
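
    The inversion machinery rests on a conjugate-gradient iteration; below is a hedged, self-contained sketch of plain CG for a symmetric positive-definite system, with a small random matrix standing in for the operator that the report's 3-D finite-difference solver would supply.

    ```python
    # Minimal conjugate-gradient solver (SPD stand-in for the real problem).
    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
        x = np.zeros_like(b)
        r = b - A @ x                  # initial residual
        p = r.copy()
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p  # new conjugate search direction
            rs = rs_new
        return x

    M = np.random.default_rng(0).normal(size=(50, 50))
    A = M @ M.T + 50.0 * np.eye(50)    # symmetric positive definite
    b = np.ones(50)
    x = conjugate_gradient(A, b)
    print(np.linalg.norm(A @ x - b))   # ~0
    ```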

  18. Lattice gauge theory on a massively parallel computing facility. Final report

    SciTech Connect

    Sugar, R.

    1998-08-07

    This grant provided access to the massively parallel computing facilities at Oak Ridge National Laboratory for the study of lattice gauge theory. The major project was a calculation of the weak decay constants of pseudoscalar mesons with one light and one heavy quark. A number of these constants have not yet been measured, so the calculations constituted a set of predictions which will be tested by future experiments. More importantly, f_B and f_{B_s}, the decay constants of the B and B_s mesons, are crucial inputs for extracting information regarding the CKM matrix element V_{td} from experimental measurements of B-anti-B mixing, and future measurements of B_s-anti-B_s mixing planned for the B-factory currently under construction at the Stanford Linear Accelerator Center. V_{td} is one of the least well determined parameters of the Standard Model of High Energy Physics. It does not appear likely that f_B and f_{B_s} will be measured experimentally in the near future, so lattice calculations such as this will play a crucial role in extracting information about the Standard Model from the B-factory experiments. The author has carried out the most accurate calculations of the heavy-light decay constants to date within the quenched approximation, that is, ignoring the effects of sea quarks. Furthermore, his was the only group to have estimated the errors in the decay constants associated with the quenched approximation.

  19. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation

    NASA Astrophysics Data System (ADS)

    Venev, Sergey V.; Zeldovich, Konstantin B.

    2015-08-01

    Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, the physics-based prediction of the types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of the energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
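
    The matrix-product formulation in the abstract translates directly into code: row s of SEQ holds, for every residue pair (i, j), the pair-potential energy of sequence s, and column m of CON is the 0/1 contact map of structure m over the same pair ordering, so SEQ @ CON yields every sequence-structure energy at once. The sketch below uses invented sizes and a random pair potential; the published CUDA version is linked in the abstract.

    ```python
    # Batched lattice-protein energies as one matrix product (toy parameters).
    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(0)
    L, n_seq, n_struct, n_aa = 27, 512, 128, 20
    pairs = list(combinations(range(L), 2))            # all residue pairs

    U = rng.normal(size=(n_aa, n_aa))
    U = (U + U.T) / 2                                  # symmetric pair potential

    seqs = rng.integers(0, n_aa, size=(n_seq, L))
    SEQ = np.array([[U[s[i], s[j]] for (i, j) in pairs] for s in seqs])

    CON = (rng.random((len(pairs), n_struct)) < 0.1).astype(float)  # contact maps
    E = SEQ @ CON                                      # E[s, m], all at once
    print(E.shape)                                     # (512, 128)
    ```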

  20. Implementation of a Message Passing Interface into a Cloud-Resolving Model for Massively Parallel Computing

    NASA Technical Reports Server (NTRS)

    Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve

    2004-01-01

    The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for MPP with MPI uses the concept of maintaining a similar code structure between the whole domain and the portions after decomposition. Hence the model follows the same integration for single and multiple tasks (CPUs). It also requires minimal changes to the original code, so it is easily modified and/or managed by the model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime, which is overlaid on the partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model to solve the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions achieve parallel efficiencies of about 99% up to 256 tasks, but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computation than the compressible version; 3) equal or approximately equal numbers of slices in the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation during computation.
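
    The halo update described above is the heart of the decomposition; here is a hedged mpi4py sketch of a one-dimensional slice decomposition with periodic neighbours (the array sizes and module are stand-ins, not the GCE model's code).

    ```python
    # 1-D slice decomposition with ghost-row (halo) exchange via MPI.
    # Run under MPI, e.g.:  mpiexec -n 4 python halo_demo.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    left, right = (rank - 1) % size, (rank + 1) % size  # periodic neighbours

    nx_local, ny = 64, 128
    u = np.zeros((nx_local + 2, ny))     # two extra ghost rows: the halo
    u[1:-1, :] = rank                    # interior owned by this task

    # Refresh ghosts before the next computational stage: send first/last
    # interior rows, receive the neighbours' rows into the ghost slots.
    comm.Sendrecv(u[1, :], dest=left, recvbuf=u[-1, :], source=right)
    comm.Sendrecv(u[-2, :], dest=right, recvbuf=u[0, :], source=left)
    # The computational stage can now proceed without cross-task fetches.
    ```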

  1. Massively parallel computation of lattice associative memory classifiers on multicore processors

    NASA Astrophysics Data System (ADS)

    Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric T.

    2011-09-01

    Over the past quarter century, concepts and theory derived from neural networks (NNs) have featured prominently in the literature of pattern recognition. Implementationally, classical NNs based on the linear inner product can present performance challenges due to the use of multiplication operations. In contrast, NNs having nonlinear kernels based on Lattice Associative Memories (LAM) theory tend to concentrate primarily on addition and maximum/minimum operations. More generally, the emergence of LAM-based NNs, with their superior information storage capacity, fast convergence and training due to relatively lower computational cost, as well as noise-tolerant classification has extended the capabilities of neural networks far beyond the limited applications potential of classical NNs. This paper explores theory and algorithmic approaches for the efficient computation of LAM-based neural networks, in particular lattice neural nets and dendritic lattice associative memories. Of particular interest are massively parallel architectures such as multicore CPUs and graphics processing units (GPUs). Originally developed for video gaming applications, GPUs hold the promise of high computational throughput without compromising numerical accuracy. Unfortunately, currently-available GPU architectures tend to have idiosyncratic memory hierarchies that can produce unacceptably high data movement latencies for relatively simple operations, unless careful design of theory and algorithms is employed. Advantageously, some GPUs (e.g., the Nvidia Fermi GPU) are optimized for efficient streaming computation (e.g., concurrent multiply and add operations). As a result, the linear or nonlinear inner product structures of NNs are inherently suited to multicore GPU computational capabilities. In this paper, the authors' recent research in lattice associative memories and their implementation on multicores is overviewed, with results that show utility for a wide variety of pattern
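
    To ground the contrast with inner-product networks, here is a hedged sketch of lattice associative memory recall in the max-plus algebra: the "matrix-vector product" uses only additions and maxima, and the min-memory W recalls every stored pattern exactly when given an undistorted input.

    ```python
    # Lattice (morphological) associative memory recall: max-plus algebra.
    import numpy as np

    rng = np.random.default_rng(0)
    patterns = rng.integers(0, 256, size=(5, 16)).astype(float)  # 5 patterns

    # Min memory: w_ij = min over stored patterns of (x_i - x_j).
    W = np.min(patterns[:, :, None] - patterns[:, None, :], axis=0)

    def recall(x):
        # y_i = max_j (w_ij + x_j): additions and maxima only, no multiplies.
        return np.max(W + x[None, :], axis=1)

    assert all(np.allclose(recall(p), p) for p in patterns)  # perfect recall
    print("all stored patterns recalled exactly")
    ```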

  2. A SNP panel for identity and kinship testing using massive parallel sequencing.

    PubMed

    Grandell, Ida; Samara, Raed; Tillmar, Andreas O

    2016-07-01

    Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases involving both identity testing and complex kinship issues. One major disadvantage of current capillary electrophoresis (CE) methods is the limited DNA marker multiplex capability. By utilizing massively parallel sequencing (MPS) technology, this capability can, however, be increased. We designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published, forensically relevant autosomal identity SNPs for analysis using MPS. A single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and the heterozygote balance had fewer than 0.5% outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng of DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of performance on degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful for gaining information in forensic casework, both for identity testing and for solving complex kinship issues.

  3. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  4. A Massively Parallel Sequencing Approach Uncovers Ancient Origins and High Genetic Variability of Endangered Przewalski's Horses

    PubMed Central

    Goto, Hiroki; Ryder, Oliver A.; Fisher, Allison R.; Schultz, Bryant; Nekrutenko, Anton; Makova, Kateryna D.

    2011-01-01

    The endangered Przewalski's horse is the closest relative of the domestic horse and is the only true wild horse species surviving today. The question of whether Przewalski's horse is the direct progenitor of domestic horse has been hotly debated. Studies of DNA diversity within Przewalski's horses have been sparse but are urgently needed to ensure their successful reintroduction to the wild. In an attempt to resolve the controversy surrounding the phylogenetic position and genetic diversity of Przewalski's horses, we used massively parallel sequencing technology to decipher the complete mitochondrial and partial nuclear genomes for all four surviving maternal lineages of Przewalski's horses. Unlike single-nucleotide polymorphism (SNP) typing usually affected by ascertainment bias, the present method is expected to be largely unbiased. Three mitochondrial haplotypes were discovered—two similar ones, haplotypes I/II, and one substantially divergent from the other two, haplotype III. Haplotypes I/II versus III did not cluster together on a phylogenetic tree, rejecting the monophyly of Przewalski's horse maternal lineages, and were estimated to split 0.117–0.186 Ma, significantly preceding horse domestication. In the phylogeny based on autosomal sequences, Przewalski's horses formed a monophyletic clade, separate from the Thoroughbred domestic horse lineage. Our results suggest that Przewalski's horses have ancient origins and are not the direct progenitors of domestic horses. The analysis of the vast amount of sequence data presented here suggests that Przewalski's and domestic horse lineages diverged at least 0.117 Ma but since then have retained ancestral genetic polymorphism and/or experienced gene flow. PMID:21803766

  5. Entire Mitochondrial DNA Sequencing on Massively Parallel Sequencing for the Korean Population

    PubMed Central

    2017-01-01

    Mitochondrial DNA (mtDNA) genome analysis has been a potent tool in forensic practice as well as in understanding human phylogeny in the maternal lineage. Traditional mtDNA analysis focuses on the control region, but the introduction of massively parallel sequencing (MPS) has made typing of the entire mtDNA genome (mtGenome) more accessible for routine analysis. Complete mtDNA information can provide large amounts of novel genetic data for diverse populations as well as improved discrimination power for identification. The genetic diversity of the mtDNA sequence in different ethnic populations has been revealed through MPS analysis, but not only does the Korean population have limited MPS data for the entire mtGenome, the existing data are mainly focused on the control region. In this study, the complete mtGenome data for 186 Koreans, obtained using Ion Torrent Personal Genome Machine (PGM) technology and drawn from rather common mtDNA haplogroups based on the control-region sequence, are described. The results showed that 24 haplogroups, determined with hypervariable regions only, branched into 47 subhaplogroups, and that point heteroplasmy was more frequent in the coding regions. In addition, sequence variations in the coding regions observed in this study were compared with those presented in other reports on different populations, and similar features were observed in the sequence variants of the haplogroups predominant among East Asian populations, such as haplogroup D and macrohaplogroups M9, G, and D. This study is expected to trigger the development of Korean-specific mtGenome data and to be followed by numerous future studies. PMID:28244283

  6. Massively parallel neural circuits for stereoscopic color vision: encoding, decoding and identification.

    PubMed

    Lazar, Aurel A; Slutskiy, Yevgeniy B; Zhou, Yiyin

    2015-03-01

    Past work demonstrated how monochromatic visual stimuli could be faithfully encoded and decoded under Nyquist-type rate conditions. Color visual stimuli were then traditionally encoded and decoded in multiple separate monochromatic channels. The brain, however, appears to mix information about color channels at the earliest stages of the visual system, including the retina itself. If information about color is mixed and encoded by a common pool of neurons, how can colors be demixed and perceived? We present Color Video Time Encoding Machines (Color Video TEMs) for encoding color visual stimuli that take into account a variety of color representations within a single neural circuit. We then derive a Color Video Time Decoding Machine (Color Video TDM) algorithm for color demixing and reconstruction of color visual scenes from spikes produced by a population of visual neurons. In addition, we formulate Color Video Channel Identification Machines (Color Video CIMs) for functionally identifying color visual processing performed by a spiking neural circuit. Furthermore, we derive a duality between TDMs and CIMs that unifies the two and leads to a general theory of neural information representation for stereoscopic color vision. We provide examples demonstrating that a massively parallel color visual neural circuit can be first identified with arbitrary precision and its spike trains can be subsequently used to reconstruct the encoded stimuli. We argue that evaluation of the functional identification methodology can be effectively and intuitively performed in the stimulus space. In this space, a signal reconstructed from spike trains generated by the identified neural circuit can be compared to the original stimulus.

  7. Investigations on the usefulness of the Massively Parallel Processor for study of electronic properties of atomic and condensed matter systems

    NASA Technical Reports Server (NTRS)

    Das, T. P.

    1988-01-01

    The usefulness of the Massively Parallel Processor (MPP) for investigation of electronic structures and hyperfine properties of atomic and condensed matter systems was explored. The major effort was directed towards the preparation of algorithms for parallelization of the computational procedure being used on serial computers for electronic structure calculations in condensed matter systems. Detailed descriptions of investigations and results are reported, including MPP adaptation of self-consistent charge extended Hueckel (SCCEH) procedure, MPP adaptation of the first-principles Hartree-Fock cluster procedure for electronic structures of large molecules and solid state systems, and MPP adaptation of the many-body procedure for atomic systems.

  8. Targeted-capture massively-parallel sequencing enables robust detection of clinically informative mutations from formalin-fixed tumours

    PubMed Central

    Wong, Stephen Q.; Li, Jason; Salemi, Renato; Sheppard, Karen E.; Do, Hongdo; Tothill, Richard W.; McArthur, Grant A.; Dobrovic, Alexander

    2013-01-01

    Massively parallel sequencing offers the ability to interrogate a tumour biopsy for multiple mutational changes. For clinical samples, methodologies must enable maximal extraction of available sequence information from formalin-fixed and paraffin-embedded (FFPE) material. We assessed the use of targeted capture for mutation detection in FFPE DNA. The capture probes targeted the coding regions of all known kinase genes and selected oncogenes and tumour suppressor genes. Seven melanoma cell lines and matching FFPE xenograft DNAs were sequenced. An informatics pipeline was developed to identify variants and contaminating mouse reads. Concordance of 100% was observed between unfixed and formalin-fixed samples for reported COSMIC variants, including BRAF V600E. Mutations in genes not conventionally screened, including ERBB4, ATM, STK11 and CDKN2A, were readily detected. All regions were adequately covered with independent reads regardless of GC content. This study indicates that hybridisation capture is a robust approach for massively parallel sequencing of FFPE samples. PMID:24336498

  9. Massively parallel hypercube FFTs: CM-2 implementation and error analysis of a parallel trigonometric factor generation method

    SciTech Connect

    Tong, C.H. ); Swarztrauber, P.N. )

    1991-08-01

    On parallel computers, the way the data elements are mapped to the processors may have a large effect on the timing performance of a given algorithm. In our previous paper, we examined a few mapping strategies for the ordered radix-2 DIF (decimation-in-frequency) Fast Fourier Transform. In particular, we showed how communication can be reduced by combining the ordering and computational phases through the use of i-cycles. A parallel method was also presented for computing the trigonometric factors which requires neither trigonometric function evaluation nor interprocessor communication. This paper first reviews some of the experimental results on the Connection Machine to demonstrate the importance of reducing communication in a parallel algorithm. The emphasis of this paper, however, is on analyzing the numerical stability of the proposed method for generating the trigonometric factors and showing how the error can be improved. 16 refs., 12 tabs.
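
    To illustrate the numerical issue being analyzed, the toy comparison below contrasts direct twiddle-factor evaluation with a trig-free recursive generation by repeated complex multiplication (a simple stand-in, not necessarily the authors' exact scheme); the recursion needs only one trigonometric seed, but its rounding error grows with the index.

    ```python
    import numpy as np

    def twiddles_direct(n):
        # Reference: one sine/cosine evaluation per factor.
        k = np.arange(n)
        return np.exp(-2j * np.pi * k / n)

    def twiddles_recursive(n):
        # Trig-free alternative: w^k = w^(k-1) * w, from a single seed value.
        # Rounding errors compound with k, which motivates the error analysis.
        w = np.exp(-2j * np.pi / n)
        out = np.empty(n, dtype=complex)
        out[0] = 1.0
        for k in range(1, n):
            out[k] = out[k - 1] * w
        return out

    n = 1 << 16
    err = np.abs(twiddles_recursive(n) - twiddles_direct(n)).max()
    print(f"max error for n={n}: {err:.2e}")  # grows roughly with n
    ```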

  10. Optimization of the deflated Conjugate Gradient algorithm for the solving of elliptic equations on massively parallel machines

    NASA Astrophysics Data System (ADS)

    Malandain, Mathias; Maheu, Nicolas; Moureau, Vincent

    2013-04-01

    The discretization of Partial Differential Equations often leads to the need to solve large symmetric linear systems. In the case of the Navier-Stokes equations for incompressible flows, solving the elliptic pressure Poisson equation can represent the largest part of the computational time required for the massively parallel simulation of the flow. The need for efficiency that this issue induces is accompanied by a need for stability, in particular when dealing with unstructured meshes. Here, a stable and efficient variant of the Deflated Preconditioned Conjugate Gradient (DPCG) solver is first presented. This two-level method uses an arbitrary coarse grid to reduce the computational cost of the solve. However, in the massively parallel implementation of this technique for very large linear systems, the coarse grids generated can contain up to millions of cells, which makes direct solves on the coarse level impossible. The coarse-grid solve, performed with a Preconditioned Conjugate Gradient (PCG) solver for this reason, may involve a large number of communications, which dramatically reduces performance on massively parallel machines. To this effect, two methods developed to reduce the number of iterations on the coarse level are introduced: the creation of improved initial guesses and the adaptation of the convergence criterion. The design of these methods makes them easy to implement in any existing DPCG solver. The structural requirements for an efficient massively parallel unstructured solver and the implementation of this solver are described. The novel DPCG method is assessed for applications involving turbulence, heat transfer and two-phase flows, with grids of up to 17.8 billion elements. Numerical results show a two- to 12-fold reduction of the number of iterations on the coarse level, which implies a reduction of the computational time of the Poisson solver of up to 71% and a global reduction of the proportion
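
    A minimal serial sketch of the deflation machinery behind a DPCG-type solver, assuming a dense SPD matrix and a user-chosen coarse space Z (here, piecewise-constant vectors over coarse blocks); the paper's distributed implementation, preconditioner, and iterative coarse-level PCG are all omitted, and the coarse system is solved directly.

    ```python
    import numpy as np

    def deflated_cg(A, b, Z, tol=1e-10, maxit=500):
        """Deflated CG: run CG on the projected system P A x = P b,
        with P = I - A Z E^{-1} Z^T and E = Z^T A Z the coarse operator."""
        E = Z.T @ A @ Z
        Q = lambda v: Z @ np.linalg.solve(E, Z.T @ v)   # coarse-level solve
        P = lambda v: v - A @ Q(v)                      # deflation projector
        xt = np.zeros_like(b)
        r = P(b)
        p = r.copy()
        rs = r @ r
        for it in range(maxit):
            w = P(A @ p)
            alpha = rs / (p @ w)
            xt += alpha * p
            r -= alpha * w
            rs_new = r @ r
            if np.sqrt(rs_new) < tol * np.linalg.norm(b):
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return Q(b) + xt - Q(A @ xt), it + 1            # x = Q b + P^T x_tilde

    # Toy problem: 1-D Poisson matrix, piecewise-constant coarse space of 8 blocks.
    n = 256
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    Z = np.kron(np.eye(8), np.ones((n // 8, 1)))
    b = np.random.default_rng(0).standard_normal(n)
    x, iters = deflated_cg(A, b, Z)
    print(iters, np.linalg.norm(A @ x - b))
    ```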

  11. A Massive Parallel Variational Multiscale FEM Scheme Applied to Nonhydrostatic Atmospheric Dynamics

    NASA Astrophysics Data System (ADS)

    Vazquez, Mariano; Marras, Simone; Moragues, Margarida; Jorba, Oriol; Houzeaux, Guillaume; Aubry, Romain

    2010-05-01

    The solution of the fully compressible Euler equations of stratified flows is approached from the point of view of Computational Fluid Dynamics techniques. Specifically, the main aim of this contribution is the introduction of a compressible Variational Multiscale Finite Element (CVMS-FE) approach to solve dry atmospheric dynamics effectively on massively parallel architectures with more than 1000 processors. The conservation form of the equations of motion is discretized in all directions with a Galerkin scheme, with stabilization given by the compressible counterpart of the variational multiscale technique of Hughes [1] and Houzeaux et al. [2]. The justification of this effort is twofold. First, the search for optimal parallelization characteristics and linear scalability trends on petascale machines: the development of a numerical algorithm whose local nature helps keep communication among the processors minimal implies a large leap towards efficient parallel computing. Second, the rising trend towards global models and models of higher spatial resolution naturally suggests the use of adaptive grids that resolve only the zones of larger gradients while keeping the computational mesh properly coarse elsewhere (thus keeping the computational cost low). With these two hypotheses in mind, the finite element scheme presented here is an open option for the development of next-generation Numerical Weather Prediction (NWP) codes. This methodology is as new in Computational Fluid Dynamics for compressible flows at low Mach number as it is in NWP. We however mean to show its ability to maintain stability in the solution of thermal, gravity-driven flows in a stratified environment in the specific context of dry atmospheric dynamics. Standard two-dimensional benchmarks are implemented and compared against the reference literature. In the context of thermal and gravity-driven flows in a neutral atmosphere, we present: (1) the density current

  12. A massively parallel adaptive scheme for melt migration in geodynamics computations

    NASA Astrophysics Data System (ADS)

    Dannberg, Juliane; Heister, Timo; Grove, Ryan

    2016-04-01

    Melt generation and migration are important processes for the evolution of the Earth's interior and impact the global convection of the mantle. While they have been the subject of numerous investigations, the typical time and length scales of melt transport are vastly different from those of global mantle convection, which determines where melt is generated. This makes it difficult to study mantle convection and melt migration in a unified framework. In addition, modelling magma dynamics poses the challenge of highly non-linear and spatially variable material properties, in particular the viscosity. We describe our extension of the community mantle convection code ASPECT that adds equations describing the behaviour of silicate melt percolating through and interacting with a viscously deforming host rock. We use the original compressible formulation of the McKenzie equations, augmented by an equation for the conservation of energy. This approach includes both melt migration and melt generation with the accompanying latent heat effects, and it incorporates the individual compressibilities of the solid and the fluid phase. For this, we derive an accurate and stable finite element scheme that can be combined with adaptive mesh refinement. This is particularly advantageous for this type of problem, as the resolution can be increased in mesh cells where melt is present and viscosity gradients are high, whereas a lower resolution is sufficient in regions without melt. Together with a high-performance, massively parallel implementation, this allows for high-resolution, 3D, compressible, global mantle convection simulations coupled with melt migration. Furthermore, scalable iterative linear solvers are required to solve the large linear systems arising from the discretized system. Finally, we present benchmarks and scaling tests of our solver up to tens of thousands of cores, show the effectiveness of adaptive mesh refinement when applied to melt migration and compare the

  13. Investigation into the sequence structure of 23 Y chromosomal STR loci using massively parallel sequencing.

    PubMed

    Kwon, So Yeun; Lee, Hwan Young; Kim, Eun Hye; Lee, Eun Young; Shin, Kyoung-Jin

    2016-11-01

    Next-generation sequencing (NGS) can produce massively parallel sequencing (MPS) data for many targeted regions with a high depth of coverage, suggesting its successful application to the amplicons of forensic genetic markers. In the present study, we evaluated the practical utility of MPS in Y-chromosomal short tandem repeat (Y-STR) analysis using a multiplex polymerase chain reaction (PCR) system. The multiplex PCR system simultaneously amplified 24 Y-chromosomal markers, including the PowerPlex® Y23 loci (DYS19, DYS385ab, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS481, DYS533, DYS549, DYS570, DYS576, DYS635, DYS643, and YGATAH4) and the M175 marker, with small amplicons ranging from 85 to 253 bp. The barcoded libraries for the amplicons of the 24 Y-chromosomal markers were produced using a simplified PCR-based library preparation method and successfully sequenced by MPS on a MiSeq® System with samples from 250 unrelated Korean males. The genotyping concordance between MPS and the capillary electrophoresis (CE) method, as well as the sequence structure of the 23 Y-STRs, was investigated. Three samples exhibited discordance between the MPS and CE results, at DYS385, DYS439, and DYS576. Twelve Y-STR loci showed sequence variation among alleles of the same fragment size, and the most varied alleles occurred in DYS389II, with a different sequence structure in the repeat region. The largest increase in gene diversity between the CE and MPS results was in DYS437, at +34.41%. Single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) were observed in the flanking regions of DYS481, DYS576, and DYS385, respectively. Stutter and noise ratios of the 23 Y-STRs using the developed MPS system were also investigated. Based on these results, the MPS analysis system used in this study could facilitate the investigation into the sequences of the 23 Y-STRs in forensic
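
    As a toy illustration of why sequence structure matters beyond fragment length, the sketch below renders a read's repeat structure; the motif and the two same-length alleles are invented, not taken from the study's loci.

    ```python
    import re

    def repeat_structure(read, motif="TAGA"):
        """Summarize runs of a repeat motif in a read, e.g. [TAGA]5 TACA [TAGA]5."""
        pattern = re.compile(f"(?:{motif})+")
        parts, last = [], 0
        for m in pattern.finditer(read):
            if m.start() > last:
                parts.append(read[last:m.start()])          # interrupting sequence
            parts.append(f"[{motif}]{len(m.group()) // len(motif)}")
            last = m.end()
        if last < len(read):
            parts.append(read[last:])
        return " ".join(parts)

    # Two alleles of identical length (CE cannot separate them) but with
    # different internal structure (MPS can):
    a = "TAGA" * 11
    b = "TAGA" * 5 + "TACA" + "TAGA" * 5
    print(len(a), repeat_structure(a))  # 44 [TAGA]11
    print(len(b), repeat_structure(b))  # 44 [TAGA]5 TACA [TAGA]5
    ```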

  14. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per-nucleotide and indeed on a per-sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results: 24 samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth-of-coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false-deletion rate greater than 0.15 in one or more individuals. There were 31-98 SNP mtGenome variants observed per sample for the 24 samples analyzed. In total, the 1,237 SNP variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%. Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth-of-coverage variation and strand bias were identified but generally were infrequent and did not impact the reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.
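
    A minimal sketch of a strand-balance summary consistent with the ratio described above, assuming per-position forward and reverse read counts have already been extracted from the alignments; the counts below are invented.

    ```python
    import numpy as np

    def strand_balance(fwd, rev):
        """Per-position strand balance: min(fwd, rev) / max(fwd, rev).

        1.0 means perfectly balanced coverage; values near 0 flag positions
        covered almost entirely by one strand (candidate artifacts).
        """
        fwd = np.asarray(fwd, dtype=float)
        rev = np.asarray(rev, dtype=float)
        hi = np.maximum(fwd, rev)
        lo = np.minimum(fwd, rev)
        with np.errstate(invalid="ignore", divide="ignore"):
            return np.where(hi > 0, lo / hi, np.nan)

    fwd = [120, 10, 95, 0]
    rev = [110, 200, 90, 0]
    print(strand_balance(fwd, rev))   # [0.917 0.05 0.947 nan]
    ```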

  15. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray Y-MP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray Y-MP with a single processor. The comparison indicates that the parallel CM-FORTRAN code performs close to or better than the equivalent serial FORTRAN code for some cases.

  16. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

    DOE PAGES

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

    2016-09-18

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  17. Massively parallel dual control volume grand canonical molecular dynamics with LADERA II. Gradient driven diffusion through polymers

    NASA Astrophysics Data System (ADS)

    Ford, David M.; Heffelfinger, Grant S.

    This paper, the second in a series, extends the capabilities of the LADERA FORTRAN code for massively parallel dual control volume grand canonical molecular dynamics (DCV-GCMD). DCV-GCMD is a hybrid of two more common molecular simulation techniques (grand canonical Monte Carlo and molecular dynamics) which allows the direct molecular-level modelling of diffusion under a chemical potential gradient. The present version of the code, LADERA-B, can model systems with explicit intramolecular interactions such as bonds, angles, and dihedral rotations. The utility of the new code for studying gradient-driven diffusion of small molecules through polymers is demonstrated by applying it to two model systems. LADERA-B includes another new feature, the use of neighbour lists in the force calculations. This feature increases the speed of the code but presents several challenges in the parallel hybrid algorithm; we discuss how these challenges were addressed and how our implementation results in a significant increase in speed over the original LADERA. Scaling results are presented for LADERA-B on two massively parallel message-passing machines.

  18. Massively parallel dual control volume grand canonical molecular dynamics with LADERA I. Gradient driven diffusion in Lennard-Jones fluids

    NASA Astrophysics Data System (ADS)

    Heffelfinger, Grant S.; Ford, David M.

    A new algorithm to enable the implementation of dual control volume grand canonical molecular dynamics (DCV-GCMD) on massively parallel (MP) architectures is presented. DCV-GCMD can be thought of as a hybridization of molecular dynamics (MD) and grand canonical Monte Carlo (GCMC) and was developed recently to make possible the simulation of gradient-driven diffusion. The method has broad application to such problems as membrane separations, drug delivery systems, and diffusion in polymers and zeolites. The massively parallel algorithm for the DCV-GCMD method has been implemented in a code named LADERA, which employs the short-range Lennard-Jones potential for pure fluids and multicomponent mixtures, including bulk and confined (single-pore as well as amorphous solid material) systems. Like DCV-GCMD itself, LADERA's MP algorithm can be thought of as a hybridization of two different algorithms, spatial MD and spatial GCMC. The DCV-GCMD method is described fully, followed by the DCV-GCMD parallel algorithm employed in LADERA. The scaling characteristics of the new MP algorithm are presented, together with the results of the application of LADERA to ternary and quaternary Lennard-Jones mixtures.

  19. A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

    SciTech Connect

    Stankovski, Z.

    1995-12-31

    The collision probability method in neutron transport, as applied to 2D geometries, consumes a great amount of computer time; for a typical 2D assembly calculation, about 90% of the computing time is spent in the collision probability evaluations. Consequently, RZ or 3D calculations become prohibitive. In this paper the author presents a simple but efficient parallel algorithm based on the message-passing host/node programming model. Parallelization was applied to the energy-group treatment. Such an approach permits parallelization of the existing code with only limited modifications. Sequential/parallel computer portability is preserved, which is a necessary condition for an industrial code. Sequential performance is also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128-processor T3D computer, a 16-processor IBM SP1, and a network of workstations, using the public domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best being obtained with the IBM SP1. Because of the heterogeneity of the workstation network, the author did not expect high performance from that architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) on a massively parallel computer using several hundred processors.
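
    The host/node pattern over energy groups is essentially a task farm, since each group's collision probabilities can be evaluated independently. Below is a schematic stand-in using Python's multiprocessing in place of PVM; the per-group cost function is invented and only mimics an expensive evaluation.

    ```python
    from multiprocessing import Pool

    def collision_probabilities(group):
        # Stand-in for the expensive per-energy-group collision probability
        # evaluation; groups are independent, which is what makes the
        # host/node (master/worker) decomposition natural.
        return group, sum((group + 1) * 1e-6 for _ in range(1000))

    if __name__ == "__main__":
        groups = range(99)                      # one task per energy group
        with Pool(processes=8) as pool:         # "nodes"; the host farms out work
            results = dict(pool.imap_unordered(collision_probabilities, groups))
        print(len(results), "groups computed")
    ```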

  20. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes

    SciTech Connect

    Lichtner, Peter C.; Hammond, Glenn E.; Lu, Chuan; Karra, Satish; Bisht, Gautam; Andre, Benjamin; Mills, Richard; Kumar, Jitendra

    2015-01-20

    PFLOTRAN solves a system of generally nonlinear partial differential equations describing multiphase, multicomponent and multiscale reactive flow and transport in porous materials. The code is designed to run on massively parallel computing architectures as well as workstations and laptops (e.g. Hammond et al., 2011). Parallelization is achieved through domain decomposition using the PETSc (Portable, Extensible Toolkit for Scientific Computation) libraries for the parallelization framework (Balay et al., 1997). PFLOTRAN has been developed from the ground up for parallel scalability and has been run on up to 2^18 processor cores with problem sizes up to 2 billion degrees of freedom. Written in object-oriented Fortran 90, the code requires the latest compilers compatible with Fortran 2003. At the time of this writing this requires gcc 4.7.x, Intel 12.1.x and PGI compilers. As a requirement of running problems with a large number of degrees of freedom, PFLOTRAN allows reading input data that is too large to fit into the memory allotted to a single processor core. The current limitation to the problem size PFLOTRAN can handle is the limitation of the HDF5 file format used for parallel IO to 32-bit integers. Noting that 2^32 = 4,294,967,296, this gives an estimate of the maximum problem size that can currently be run with PFLOTRAN. Hopefully this limitation will be remedied in the near future.

  1. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.

  2. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.

  3. A Precision Dose Control Circuit for Maskless E-Beam Lithography With Massively Parallel Vertically Aligned Carbon Nanofibers

    SciTech Connect

    Eliza, Sazia A.; Islam, Syed K; Rahman, Touhidur; Bull, Nora D; Blalock, Benjamin; Baylor, Larry R; Ericson, Milton Nance; Gardner, Walter L

    2011-01-01

    This paper describes a highly accurate dose control circuit (DCC) for the emission of a desired number of electrons from vertically aligned carbon nanofibers (VACNFs) in a massively parallel maskless e-beam lithography system. The parasitic components within the VACNF device cause a premature termination of the electron emission, resulting in underexposure of the photoresist. In this paper, we compensate for the effects of the parasitic components and noise while reducing the area of the chip and achieving a precise count of emitted electrons from the VACNFs to obtain the optimum dose for the e-beam lithography.

  4. Method and apparatus for obtaining stack traceback data for multiple computing nodes of a massively parallel computer system

    DOEpatents

    Gooding, Thomas Michael; McCarthy, Patrick Joseph

    2010-03-02

    A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction addresses being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
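
    A schematic of the grouping step, assuming each node reports a single instruction address as its partial traceback; the node IDs and addresses here are invented for illustration.

    ```python
    from collections import defaultdict

    def group_by_partial_traceback(partial):
        """Group node IDs by their reported instruction address, so that full
        stack tracebacks need only be pulled from one representative per group."""
        groups = defaultdict(list)
        for node_id, address in partial.items():
            groups[address].append(node_id)
        return dict(groups)

    partial = {0: 0x40123C, 1: 0x40123C, 2: 0x401A10, 3: 0x40123C}
    groups = group_by_partial_traceback(partial)
    representatives = [nodes[0] for nodes in groups.values()]
    print(groups)           # {4198972: [0, 1, 3], 4200976: [2]}
    print(representatives)  # fetch full tracebacks only from these nodes
    ```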

  5. Integer-encoded massively parallel processing of fast-learning fuzzy ARTMAP neural networks

    NASA Astrophysics Data System (ADS)

    Bahr, Hubert A.; DeMara, Ronald F.; Georgiopoulos, Michael

    1997-04-01

    In this paper we develop techniques suitable for the parallel implementation of Fuzzy ARTMAP networks. Speedup and learning performance results are provided for execution on a DECmpp/Sx-1208 parallel processor consisting of a DEC RISC workstation front end and a MasPar MP-1 back end with 8,192 processors. Experiments with the parallel implementation were conducted on the Letters benchmark database developed by Frey and Slate. The results indicate a speedup on the order of 1000-fold, which allows a combined training and testing time of under four minutes.

  6. Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations

    SciTech Connect

    Detrixhe, Miles; Gibou, Frédéric

    2016-10-01

    The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.
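
    A minimal serial sketch of fast sweeping for the eikonal special case |grad u| = f on a uniform 2-D grid; the paper's hybrid fine/coarse-grained parallelism is omitted. Four Gauss-Seidel sweeps with alternating orderings propagate the solution along each family of characteristics.

    ```python
    import numpy as np

    def fast_sweep_eikonal(f, h, sources, n_cycles=4):
        """Solve |grad u| = f on a uniform 2-D grid by Gauss-Seidel sweeps
        over the four grid orderings (the serial kernel of fast sweeping)."""
        ny, nx = f.shape
        u = np.full((ny, nx), np.inf)
        for i, j in sources:
            u[i, j] = 0.0
        orders = [(range(ny), range(nx)),
                  (range(ny), range(nx - 1, -1, -1)),
                  (range(ny - 1, -1, -1), range(nx)),
                  (range(ny - 1, -1, -1), range(nx - 1, -1, -1))]
        for _ in range(n_cycles):
            for rows, cols in orders:
                for i in rows:
                    for j in cols:
                        if u[i, j] == 0.0:
                            continue            # source node stays fixed
                        a = min(u[i - 1, j] if i > 0 else np.inf,
                                u[i + 1, j] if i < ny - 1 else np.inf)
                        b = min(u[i, j - 1] if j > 0 else np.inf,
                                u[i, j + 1] if j < nx - 1 else np.inf)
                        if min(a, b) == np.inf:
                            continue            # no information here yet
                        if abs(a - b) >= f[i, j] * h:    # one-sided update
                            cand = min(a, b) + f[i, j] * h
                        else:                            # two-sided update
                            cand = 0.5 * (a + b + np.sqrt(
                                2.0 * (f[i, j] * h) ** 2 - (a - b) ** 2))
                        u[i, j] = min(u[i, j], cand)
        return u

    u = fast_sweep_eikonal(np.ones((64, 64)), 1.0, [(32, 32)])
    print(u[32, 0])   # ~32: distance from the central source to the edge
    ```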

  7. Efficient Extraction of Regional Subsets from Massive Climate Datasets using Parallel IO

    NASA Astrophysics Data System (ADS)

    Daily, J.; Schuchardt, K.; Palmer, B. J.

    2010-12-01

    The size of datasets produced by current climate models is increasing rapidly to the scale of petabytes. To handle data at this scale parallel analysis tools are required, however the majority of climate analysis software is serial and remains at the scale of workstations. Further, many climate analysis tools are designed to process regularly gridded data but lack sufficient features to handle unstructured grids. This paper presents a data-parallel subsetter capable of correctly handling unstructured grids while scaling to over 2000 cores. The approach is based on the partitioned global address space (PGAS) parallel programming model and one-sided communication. The paper demonstrates that parallel analysis of climate data succeeds in practice, although IO remains the single greatest bottleneck.

  8. Hybrid massively parallel fast sweeping method for static Hamilton-Jacobi equations

    NASA Astrophysics Data System (ADS)

    Detrixhe, Miles; Gibou, Frédéric

    2016-10-01

    The fast sweeping method is a popular algorithm for solving a variety of static Hamilton-Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.

  9. Massively parallel implementation of the multi-reference Brillouin-Wigner CCSD method

    SciTech Connect

    Brabec, Jiri; Krishnamoorthy, Sriram; van Dam, Hubertus JJ; Kowalski, Karol; Pittner, Jiri

    2011-10-06

    This paper reports the parallel implementation of the Brillouin-Wigner multireference coupled cluster method with single and double excitations (BW-MRCCSD). Preliminary tests for systems composed of 304 and 440 correlated orbitals demonstrate the performance of our implementation across 1000 cores and clearly indicate the advantages of using improved task scheduling. Possible ways to further improve the parallel performance are also delineated.

  10. A parallel computing tool for large-scale simulation of massive fluid injection in thermo-poro-mechanical systems

    NASA Astrophysics Data System (ADS)

    Karrech, Ali; Schrank, Christoph; Regenauer-Lieb, Klaus

    2015-10-01

    Massive fluid injections into the Earth's upper crust are commonly used to stimulate permeability in geothermal reservoirs, enhance recovery in oil reservoirs, store carbon dioxide, and so forth. Currently used models for reservoir simulation are limited to small perturbations and/or hydraulic aspects that are insufficient to describe the complex thermal-hydraulic-mechanical behaviour of natural geomaterials. Comprehensive approaches, which take into account the non-linear mechanical deformations of rock masses, fluid flow in percolating pore spaces, and changes of temperature due to heat transfer, are necessary to predict the behaviour of deep geomaterials subjected to high pressure and temperature changes. In this paper, we introduce a thermodynamically consistent poromechanics formulation which includes coupled thermal, hydraulic and mechanical processes. Moreover, we propose a numerical integration strategy based on massively parallel computing. The proposed formulations and numerical integration are validated using analytical solutions of simple multi-physics problems. As a representative application, we investigate the massive injection of fluids within a deep formation to mimic the conditions of reservoir stimulation. The model showed, for instance, the effects of pre-existing stress fields on the orientations of stimulation-induced failures.

  11. Optical, mechanical, and electro-optical design of an interferometric test station for massive parallel inspection of MEMS and MOEMS

    NASA Astrophysics Data System (ADS)

    Gastinger, Kay; Haugholt, Karl Henrik; Kujawinska, Malgorzata; Jozwik, Michal; Schaeffel, Christoph; Beer, Stephan

    2009-06-01

    The paper presents the optical, mechanical, and electro-optical design of an interferometric inspection system for massively parallel inspection of micro-electro-mechanical systems (MEMS) and micro-opto-electro-mechanical systems (MOEMS). The basic idea is to adapt a micro-optical probing wafer to the M(O)EMS wafer under test. The probing wafer is exchangeable and contains a micro-optical interferometer array. A low-coherence and a laser interferometer array are developed. Two preliminary interferometer designs are presented: a low-coherence interferometer array based on a Mirau configuration and a laser interferometer array based on a Twyman-Green configuration. The optical design focuses on the illumination and imaging concept for the interferometer array. The mechanical design concentrates on the scanning system and the integration into a standard test station for micro-fabrication. Models of single-channel low-coherence and laser interferometers and preliminary measurement results are presented. The smart-pixel approach for massively parallel electro-optical detection and data reduction is discussed.

  12. Smart pixel camera based signal processing in an interferometric test station for massive parallel inspection of MEMS and MOEMS

    NASA Astrophysics Data System (ADS)

    Styk, Adam; Lambelet, Patrick; Røyset, Arne; Kujawińska, Małgorzata; Gastinger, Kay

    2010-09-01

    The paper presents the electro-optical design of an interferometric inspection system for massively parallel inspection of Micro(Opto)ElectroMechanical Systems (M(O)EMS). The basic idea is to adapt a micro-optical probing wafer to the M(O)EMS wafer under test. The probing wafer is exchangeable and contains a micro-optical interferometer array: a low-coherence interferometer (LCI) array based on a Mirau configuration and a laser interferometer (LI) array based on a Twyman-Green configuration. The interference signals are generated in the micro-optical interferometers and are used for M(O)EMS shape and deformation measurements by means of the LCI, and for M(O)EMS vibration analysis (resonance frequency and spatial mode distribution) by means of the LI. A distributed array of 5×5 smart-pixel imagers detects the interferometric signals. The signal processing is based on the "on-pixel" processing capacity of the smart-pixel camera array, which can be utilised for phase shifting, signal demodulation or envelope maximum determination. Each micro-interferometer image is detected by a 140 × 146 pixel sub-array distributed in the imaging plane. In the paper the architecture of cameras based on the smart-pixel approach is described, and their application to massively parallel electro-optical detection and data reduction is discussed. The full data-processing paths for the laser interferometer and the low-coherence interferometer are presented.

  13. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    NASA Astrophysics Data System (ADS)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
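
    A compact sketch of the boosting step described above, with decision stumps standing in for the paper's 15 BP neural networks and the MapReduce/Hadoop layer omitted; the data, round count, and weighting rule (binary AdaBoost) are illustrative assumptions, not the paper's exact setup.

    ```python
    import numpy as np

    def train_stump(X, y, w):
        """Best threshold stump (feature, threshold, polarity) under weights w."""
        best = (0, 0.0, 1, np.inf)
        for f in range(X.shape[1]):
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, f] <= thr, 1, -1)
                    err = w[pred != y].sum()
                    if err < best[3]:
                        best = (f, thr, pol, err)
        return best

    def adaboost(X, y, n_rounds=15):
        """Binary AdaBoost: y in {-1, +1}; returns a weighted ensemble."""
        w = np.ones(len(y)) / len(y)
        ensemble = []
        for _ in range(n_rounds):
            f, thr, pol, err = train_stump(X, y, w)
            err = max(err, 1e-12)
            alpha = 0.5 * np.log((1 - err) / err)      # weak learner's vote weight
            pred = pol * np.where(X[:, f] <= thr, 1, -1)
            w *= np.exp(-alpha * y * pred)             # re-weight toward mistakes
            w /= w.sum()
            ensemble.append((alpha, f, thr, pol))
        return ensemble

    def predict(ensemble, X):
        score = sum(a * p * np.where(X[:, f] <= t, 1, -1)
                    for a, f, t, p in ensemble)
        return np.sign(score)

    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))
    y = np.where(X[:, 0] + 0.5 * X[:, 2] > 0, 1, -1)
    model = adaboost(X, y)
    print((predict(model, X) == y).mean())   # training accuracy of the ensemble
    ```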

  14. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    PubMed Central

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520

  15. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  16. Massively parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada

    SciTech Connect

    Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

    2001-08-31

    This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented in the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better-resolved results and reveal some flow patterns that cannot be obtained using coarser-grid models.

  17. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
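
    A serial sketch of the annealing core for random 3-SAT; the paper's Generalized Speculative Computation, which reproduces the sequential decision sequence in parallel, is not shown, and the cooling schedule and parameters are invented.

    ```python
    import math, random

    def sat_anneal(n_vars, clauses, t0=2.0, cooling=0.95, sweeps=50):
        """Maximize satisfied clauses by annealed single-variable flips."""
        assign = [random.random() < 0.5 for _ in range(n_vars + 1)]  # 1-indexed
        def count():
            return sum(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)
        cur, t = count(), t0
        for _ in range(sweeps):
            for _ in range(n_vars):
                v = random.randint(1, n_vars)
                assign[v] = not assign[v]          # propose a flip
                new = count()
                if new >= cur or random.random() < math.exp((new - cur) / t):
                    cur = new                      # accept (better, or uphill by luck)
                else:
                    assign[v] = not assign[v]      # reject: undo the flip
            t *= cooling
        return cur, assign

    random.seed(0)
    n, m = 100, 425                                 # the paper's smallest instance size
    clauses = [[random.choice([-1, 1]) * random.randint(1, n) for _ in range(3)]
               for _ in range(m)]
    best, _ = sat_anneal(n, clauses)
    print(f"{best}/{m} clauses satisfied")
    ```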

  18. A Novel Algorithm for Solving the Multidimensional Neutron Transport Equation on Massively Parallel Architectures

    SciTech Connect

    Azmy, Yousry

    2014-06-10

    We employ the Integral Transport Matrix Method (ITMM) as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iteration (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells' fluxes and between the cells' and boundary surfaces' fluxes. The main goals of this work are to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and parallel performance of the developed methods with increasing number of processes, P. The fastest observed parallel solution method, Parallel Gauss-Seidel (PGS), was used in a weak scaling comparison with the PARTISN transport code, which uses the source iteration (SI) scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method, even without acceleration/preconditioning, is competitive for optically thick problems as P is increased to the tens-of-thousands range. For the most optically thick cells tested, PGS reduced execution time by an approximate factor of three for problems with more than 130 million computational cells on P = 32,768. Moreover, the SI-DSA execution time trend generally rises more steeply with increasing P than the PGS trend. Furthermore, the PGS method outperforms SI for the periodic heterogeneous layers (PHL) configuration problems. The PGS method outperforms SI and SI-DSA on as few as P = 16 for PHL problems and reduces execution time by a factor of ten or more for all problems considered with more than 2 million computational cells on P = 4,096.

  19. Massively Parallel, Three-Dimensional Transport Solutions for the k-Eigenvalue Problem

    SciTech Connect

    Davidson, Gregory G; Evans, Thomas M; Jarrell, Joshua J; Pandya, Tara M; Slaybaugh, R

    2014-01-01

    We have implemented a new multilevel parallel decomposition in the Denovo discrete ordinates radiation transport code. In concert with Krylov subspace iterative solvers, the multilevel decomposition allows concurrency over energy in addition to space-angle, enabling scalability beyond the limits imposed by the traditional KBA space-angle partitioning. Furthermore, a new Arnoldi-based k-eigenvalue solver has been implemented. The added phase-space concurrency combined with the high-performance Krylov and Arnoldi solvers has enabled weak scaling to O(100K) cores on the Jaguar XK6 supercomputer. The multilevel decomposition provides sufficient parallelism to scale to exascale computing and beyond.

  20. Ab Initio Study of the SILICON(111)-(7 by 7) Surface Reconstruction: a Challenge for Massively Parallel Computation

    NASA Astrophysics Data System (ADS)

    Brommer, Karl Daniel

    This thesis presents the first ab initio calculation of the Si(111)-(7 x 7) surface reconstruction, perhaps the most complex and widely studied surface of a solid. The large number of atoms in the unit cell has up to now defied any complete and realistic treatment of its properties. In this thesis, we exploit the power of massively parallel computation to investigate the surface reconstruction with a supercell geometry containing 700 effective atoms. These calculations predict the fully relaxed atomic geometry of this system; allow construction of theoretical STM images as a function of bias voltages; and predict the energy difference between the (7 x 7) and (2 x 1) reconstructions. The diversity of dangling bond sites on the (7 x 7) surface provides an optimal system for investigating chemical reactivity. A detailed study of the electronic surface states is presented, showing that the interpretation of the surface chemical reactivity in terms of newly developed theories of local softness is consistent with chemisorption experiments. We conclude with predictions of results for surface reactions involving a large variety of atoms and molecules. The method of computing electronic structure on a massively parallel computer is fully described, including a discussion of how the calculations would be improved through implementation on a more modern parallel computer. The results demonstrate that the state of the art in ab initio quantum-mechanical computation of electronic structure has been raised to a new echelon as the study of systems involving thousands of atoms is now possible.

  1. Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

    NASA Astrophysics Data System (ADS)

    Hohl, A.; Delmelle, E. M.; Tang, W.

    2015-07-01

    Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data points that are within the spatial and temporal kernel bandwidths. Then, we quantify the computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable to other space-time analytical tests.
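
    A serial sketch of the space-time kernel density estimate being decomposed, assuming Epanechnikov kernels and invented bandwidths; the octree decomposition and subdomain buffers are not shown, but the finite kernel supports below are what make buffering by one bandwidth sufficient for exact parallel evaluation.

    ```python
    import numpy as np

    def stkde(grid_xy, grid_t, pts_xy, pts_t, hs, ht):
        """Space-time kernel density at one (x, y, t) evaluation point.

        Epanechnikov kernels in space (2-D, radial) and time; points farther
        than the bandwidths contribute nothing."""
        ds = np.linalg.norm(pts_xy - grid_xy, axis=1) / hs   # scaled distances
        dt = np.abs(pts_t - grid_t) / ht
        ks = np.where(ds <= 1, 2 / np.pi * (1 - ds**2), 0.0)
        kt = np.where(dt <= 1, 0.75 * (1 - dt**2), 0.0)
        return (ks * kt).sum() / (len(pts_t) * hs**2 * ht)

    rng = np.random.default_rng(0)
    pts_xy = rng.uniform(0, 10, size=(500, 2))     # event locations
    pts_t = rng.uniform(0, 30, size=500)           # event times (days)
    print(stkde(np.array([5.0, 5.0]), 15.0, pts_xy, pts_t, hs=2.0, ht=7.0))
    ```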

  2. O(N) tight-binding molecular dynamics on massively parallel computers: an orbital decomposition approach

    NASA Astrophysics Data System (ADS)

    Canning, A.; Galli, G.; Mauri, F.; De Vita, A.; Car, R.

    1996-04-01

    The implementation of an O(N) tight-binding molecular dynamics code on the Cray T3D parallel computer is discussed. The O(N) energy functional depends on non-orthogonal, localised orbitals and a chemical potential parameter which determines the number of electrons in the system. The localisation introduces a sparse nature to the orbital data and Hamiltonian matrix, greatly changing the coding on parallel machines compared to non-localised systems. The data distribution, communication routines and dynamic load-balancing scheme of the program are presented in detail, together with the speed and scaling of the code on various homogeneous and inhomogeneous physical systems. Performance results are presented for systems of 2048 to 32768 atoms on 32 to 512 processors. We discuss the relevance to quantum molecular dynamics simulations with localised orbitals of techniques used for programming short-range classical molecular dynamics simulations on parallel machines. The absence of global communications and the localised nature of the orbitals make these algorithms extremely scalable in terms of memory and speed on parallel systems with fast communications. The main aim of this article is to present in detail all the new concepts and programming techniques that localisation of the orbitals introduces, which scientists coming from a background in non-localised quantum molecular dynamics simulations may be unfamiliar with.

  3. Optical binary de Bruijn networks for massively parallel computing: design methodology and feasibility study

    NASA Astrophysics Data System (ADS)

    Louri, Ahmed; Sung, Hongki

    1995-10-01

    The interconnection network structure can be the deciding and limiting factor in the cost and the performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it a very attractive choice for parallel computers. Unfortunately, the hypercube possesses a major drawback: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently received much attention. The BdB not only provides a logarithmic diameter, fault tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB is that the number of edges per node is independent of the network size. This makes it very desirable for large-scale parallel systems. However, because of its asymmetrical nature and global connectivity, it poses a major challenge for VLSI technology. Optics, owing to its three-dimensional and global-connectivity nature, seems to be very suitable for implementing BdB networks. We present an implementation methodology for optical BdB networks. The distinctive feature of the proposed implementation methodology is partitionability of the network into a few primitive operations that can be implemented efficiently. We further show feasibility of the
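
    The constant node degree is easy to see in code: in a binary de Bruijn network on 2^n nodes, a node's out-neighbours are obtained by shifting its n-bit label and appending a bit, giving exactly two links per node at any network size. A small sketch with an arbitrarily chosen network size:

    ```python
    def debruijn_neighbors(v, n):
        """Out-neighbours of node v in the binary de Bruijn graph on 2**n nodes:
        left-shift the n-bit label and append 0 or 1 (degree is always 2)."""
        mask = (1 << n) - 1
        return ((v << 1) & mask, ((v << 1) | 1) & mask)

    def debruijn_route(src, dst, n):
        """Simple shift-in routing: feed in dst's bits from the most significant;
        the path length is at most n = log2(network size)."""
        path, v = [src], src
        for i in range(n - 1, -1, -1):
            bit = (dst >> i) & 1
            v = ((v << 1) | bit) & ((1 << n) - 1)
            path.append(v)
        return path

    n = 4                                  # 16 nodes, labels 0..15
    print(debruijn_neighbors(0b1011, n))   # (6, 7), i.e. 0110 and 0111
    print(debruijn_route(0b1011, 0b0010, n))
    ```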

  4. Massively-Parallel Architectures for Automatic Recognition of Visual Speech Signals

    DTIC Science & Technology

    1988-10-12

    ... sometimes the tongue), others are not. The visible articulators' contribution to the acoustic signal results in speech sounds that are much more ...

  5. An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at the watershed scale

    NASA Astrophysics Data System (ADS)

    Orgogozo, L.; Renon, N.; Soulaine, C.; Hénon, F.; Tomer, S. K.; Labat, D.; Pokrovsky, O. S.; Sekhar, M.; Ababou, R.; Quintard, M.

    2014-12-01

    In this paper we present a massively parallel open source solver for Richards equation, named RichardsFOAM. This solver has been developed in the framework of the open source generalist computational fluid dynamics toolbox OpenFOAM® and is capable of dealing with large-scale problems in both space and time. The source code for RichardsFOAM may be downloaded from the CPC program library website. It exhibits good parallel performance (up to ~90% parallel efficiency with 1024 processors in both strong and weak scaling), and the conditions required for obtaining such performance are analysed and discussed. This performance enables the mechanistic modelling of water fluxes at the scale of experimental watersheds (up to a few square kilometres of surface area) and on time scales of decades to a century. Such a solver can be useful in various applications, such as environmental engineering for the long-term transport of pollutants in soils, water engineering for assessing the impact of land settlement on water resources, or the study of weathering processes on watersheds.

  6. Massively parallel classification of single-trial EEG signals using a min-max modular neural network.

    PubMed

    Lu, Bao-Liang; Shin, Jonghan; Ichikawa, Michinori

    2004-03-01

    This paper presents a method for classifying single-trial electroencephalogram (EEG) signals using min-max modular neural networks implemented in a massively parallel way. The method has three main steps. First, a large-scale, complex EEG classification problem is divided into a reasonable number of two-class subproblems, as small as needed. Second, the two-class subproblems are learned by individual smaller network modules in parallel. Finally, all the individually trained network modules are integrated into a hierarchical, parallel, and modular classifier according to two module combination laws. To demonstrate the effectiveness of the method, we perform simulations on fifteen different four-class EEG classification tasks, each of which consists of 1,491 training and 636 test samples. These EEG classification tasks were created using a set of non-averaged, single-trial hippocampal EEG signals recorded from rats; the features of the EEG signals were extracted using wavelet transform techniques. The experimental results indicate that the proposed method has several attractive features: 1) it is appreciably faster than the existing approach based on conventional multilayer perceptrons; 2) complete learning of complex EEG classification problems can be easily realized, and better generalization performance can be achieved; 3) the method scales up to large-scale, complex EEG classification problems.

  7. Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

    NASA Astrophysics Data System (ADS)

    Zerr, Robert Joseph

    2011-12-01

    The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for an increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with its own transport matrices, and coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have been investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI DSA, namely highly scattering and optically thick. SI DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed it cannot be concluded that SI DSA does not outperform the ITMM with PGS even on several thousand or tens of
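
    The parallel block Jacobi (PBJ) idea mentioned above can be sketched generically: each sub-domain solves its local system using only neighbour values from the previous sweep, so all sub-domains can be updated concurrently. This is a hedged illustration of block Jacobi in general, not of the ITMM operators themselves.

    ```python
    import numpy as np

    def block_jacobi(A_blocks, b_blocks, n_sweeps=200):
        """Generic block-Jacobi iteration over sub-domain systems.

        A_blocks[i][j] is the coupling matrix from sub-domain j into
        sub-domain i; A_blocks[i][i] is the local system matrix.
        Each sweep uses only previous-sweep data from other blocks,
        so the inner loop over i is trivially parallelizable.
        """
        n = len(b_blocks)
        x = [np.zeros_like(b) for b in b_blocks]
        for _ in range(n_sweeps):
            x_new = []
            for i in range(n):
                rhs = b_blocks[i].copy()
                for j in range(n):
                    if j != i:
                        rhs -= A_blocks[i][j] @ x[j]   # old-iterate coupling only
                x_new.append(np.linalg.solve(A_blocks[i][i], rhs))
            x = x_new
        return x
    ```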

  8. Nonlinear structural response using adaptive dynamic relaxation on a massively-parallel-processing system

    NASA Technical Reports Server (NTRS)

    Oakley, David R.; Knight, Norman F., Jr.

    1994-01-01

    A parallel adaptive dynamic relaxation (ADR) algorithm has been developed for nonlinear structural analysis. This algorithm has minimal memory requirements, is easily parallelizable and scalable to many processors, and is generally very reliable and efficient for highly nonlinear problems. Performance evaluations on single-processor computers have shown that the ADR algorithm is reliable and highly vectorizable, and that it is competitive with direct solution methods for the highly nonlinear problems considered. The present algorithm is implemented on the 512-processor Intel Touchstone DELTA system at Caltech, and it is designed to minimize the extent and frequency of interprocessor communication. The algorithm has been used to solve for the nonlinear static response of two- and three-dimensional hyperelastic systems involving contact. Impressive relative speedups have been achieved and demonstrate the high scalability of the ADR algorithm. For the class of problems addressed, the ADR algorithm represents a very promising approach for parallel-vector processing.
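
    Dynamic relaxation itself reduces to a damped pseudo-dynamic iteration toward static equilibrium. The loop below is a minimal sketch of that core idea with fixed parameters; the ADR algorithm of the paper adapts the time step and damping automatically, which is omitted here.

    ```python
    import numpy as np

    def dynamic_relaxation(residual, x0, dt=0.1, damping=0.9,
                           tol=1e-8, max_iter=100_000):
        """Damped pseudo-dynamic iteration toward residual(x) == 0.

        residual(x) returns the out-of-balance force vector.  Only
        vector operations appear in the loop, which is why the method
        vectorizes and parallelizes so easily.
        """
        x = x0.copy()
        v = np.zeros_like(x)
        for _ in range(max_iter):
            r = residual(x)
            if np.linalg.norm(r) < tol:
                break
            v = damping * v + dt * r   # damped pseudo-velocity update
            x = x + dt * v             # pseudo-time position update
        return x

    # Example: linear stiffness K and load f, residual = f - K x.
    # x = dynamic_relaxation(lambda x: f - K @ x, np.zeros(len(f)))
    ```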

  9. Analysis and selection of optimal function implementations in massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Ratterman, Joseph D.

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
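
    A toy version of this idea, collecting timings per input regime and generating a dispatcher that calls the fastest implementation, might look like the sketch below. The key scheme and timing protocol are assumptions for illustration, not details taken from the patent.

    ```python
    import time

    def build_dispatcher(implementations, sample_inputs):
        """Benchmark each implementation over sample inputs, then return
        a dispatcher that routes each input key to the fastest one.

        implementations: dict of name -> callable
        sample_inputs:   dict of key -> args tuple representative of that regime
        """
        best = {}
        for key, args in sample_inputs.items():
            timings = {}
            for name, fn in implementations.items():
                t0 = time.perf_counter()
                fn(*args)
                timings[name] = time.perf_counter() - t0
            best[key] = min(timings, key=timings.get)   # fastest per regime

        def dispatch(key, *args):
            return implementations[best[key]](*args)
        return dispatch
    ```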

  10. Parallel HOP: A Scalable Halo Finder for Massive Cosmological Data Sets

    NASA Astrophysics Data System (ADS)

    Skory, Stephen; Turk, Matthew J.; Norman, Michael L.; Coil, Alison L.

    2010-11-01

    Modern N-body cosmological simulations contain billions (10^9) of dark matter particles. These simulations require hundreds to thousands of gigabytes of memory and employ hundreds to tens of thousands of processing cores on many compute nodes. In order to study the distribution of dark matter in a cosmological simulation, the dark matter halos must be identified using a halo finder, which establishes the halo membership of every particle in the simulation. The resources required for halo finding are similar to the requirements for the simulation itself. In particular, simulations have become too extensive to use commonly employed halo finders, such that the computational requirements to identify halos must now be spread across multiple nodes and cores. Here, we present a scalable-parallel halo finding method called Parallel HOP for large-scale cosmological simulation data. Based on the halo finder HOP, it utilizes message passing interface and domain decomposition to distribute the halo finding workload across multiple compute nodes, enabling analysis of much larger data sets than is possible with the strictly serial or previous parallel implementations of HOP. We provide a reference implementation of this method as a part of the toolkit "yt", an analysis toolkit for adaptive mesh refinement data that includes complementary analysis modules. Additionally, we discuss a suite of benchmarks that demonstrate that this method scales well up to several hundred tasks and data sets in excess of 2000^3 particles. The Parallel HOP method and our implementation can be readily applied to any kind of N-body simulation data and is therefore widely applicable.

  11. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn Ruth

    2011-12-13

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  12. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn

    2006-10-17

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  13. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn Ruth

    2010-11-23

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  14. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn

    2003-05-27

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  15. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2010-02-23

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  16. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2011-04-12

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  17. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2011-05-10

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  18. Massive and parallel expression profiling using microarrayed single-cell sequencing

    PubMed Central

    Vickovic, Sanja; Ståhl, Patrik L.; Salmén, Fredrik; Giatrellis, Sarantis; Westholm, Jakub Orzechowski; Mollbrink, Annelie; Navarro, José Fernández; Custodio, Joaquin; Bienko, Magda; Sutton, Lesley-Ann; Rosenquist, Richard; Frisén, Jonas; Lundeberg, Joakim

    2016-01-01

    Single-cell transcriptome analysis overcomes problems inherently associated with averaging gene expression measurements in bulk analysis. However, single-cell analysis is currently challenging in terms of cost, throughput and robustness. Here, we present a method enabling massive microarray-based barcoding of expression patterns in single cells, termed MASC-seq. This technology enables both imaging and high-throughput single-cell analysis, characterizing thousands of single-cell transcriptomes per day at a low cost (0.13 USD/cell), which is two orders of magnitude less than commercially available systems. Our novel approach provides data in a rapid and simple way. Therefore, MASC-seq has the potential to accelerate the study of subtle clonal dynamics and help provide critical insights into disease development and other biological processes. PMID:27739429

  19. Massive and parallel expression profiling using microarrayed single-cell sequencing.

    PubMed

    Vickovic, Sanja; Ståhl, Patrik L; Salmén, Fredrik; Giatrellis, Sarantis; Westholm, Jakub Orzechowski; Mollbrink, Annelie; Navarro, José Fernández; Custodio, Joaquin; Bienko, Magda; Sutton, Lesley-Ann; Rosenquist, Richard; Frisén, Jonas; Lundeberg, Joakim

    2016-10-14

    Single-cell transcriptome analysis overcomes problems inherently associated with averaging gene expression measurements in bulk analysis. However, single-cell analysis is currently challenging in terms of cost, throughput and robustness. Here, we present a method enabling massive microarray-based barcoding of expression patterns in single cells, termed MASC-seq. This technology enables both imaging and high-throughput single-cell analysis, characterizing thousands of single-cell transcriptomes per day at a low cost (0.13 USD/cell), which is two orders of magnitude less than commercially available systems. Our novel approach provides data in a rapid and simple way. Therefore, MASC-seq has the potential to accelerate the study of subtle clonal dynamics and help provide critical insights into disease development and other biological processes.

  20. Advances in time-domain electromagnetic simulation capabilities through the use of overset grids and massively parallel computing

    NASA Astrophysics Data System (ADS)

    Blake, Douglas Clifton

    A new methodology is presented for conducting numerical simulations of electromagnetic scattering and wave-propagation phenomena on massively parallel computing platforms. A process is constructed which is rooted in the Finite-Volume Time-Domain (FVTD) technique to create a simulation capability that is both versatile and practical. In terms of versatility, the method is platform independent, is easily modifiable, and is capable of solving a large number of problems with no alterations. In terms of practicality, the method is sophisticated enough to solve problems of engineering significance and is not limited to mere academic exercises. In order to achieve this capability, techniques are integrated from several scientific disciplines including computational fluid dynamics, computational electromagnetics, and parallel computing. The end result is the first FVTD solver capable of utilizing the highly flexible overset-gridding process in a distributed-memory computing environment. In the process of creating this capability, work is accomplished to conduct the first study designed to quantify the effects of domain-decomposition dimensionality on the parallel performance of hyperbolic partial differential equations solvers; to develop a new method of partitioning a computational domain comprised of overset grids; and to provide the first detailed assessment of the applicability of overset grids to the field of computational electromagnetics. Using these new methods and capabilities, results from a large number of wave propagation and scattering simulations are presented. The overset-grid FVTD algorithm is demonstrated to produce results of comparable accuracy to single-grid simulations while simultaneously shortening the grid-generation process and increasing the flexibility and utility of the FVTD technique. Furthermore, the new domain-decomposition approaches developed for overset grids are shown to be capable of producing partitions that are better load balanced and

  1. Identification of the Bovine Arachnomelia Mutation by Massively Parallel Sequencing Implicates Sulfite Oxidase (SUOX) in Bone Development

    PubMed Central

    Drögemüller, Cord; Tetens, Jens; Sigurdsson, Snaevar; Gentile, Arcangelo; Testoni, Stefania; Lindblad-Toh, Kerstin; Leeb, Tosso

    2010-01-01

    Arachnomelia is a monogenic recessive defect of skeletal development in cattle. The causative mutation was previously mapped to a ∼7 Mb interval on chromosome 5. Here we show that array-based sequence capture and massively parallel sequencing technology, combined with the typical family structure in livestock populations, facilitates the identification of the causative mutation. We re-sequenced the entire critical interval in a healthy partially inbred cow carrying one copy of the critical chromosome segment in its ancestral state and one copy of the same segment with the arachnomelia mutation, and we detected a single heterozygous position. The genetic makeup of several partially inbred cattle provides extremely strong support for the causality of this mutation. The mutation represents a single base insertion leading to a premature stop codon in the coding sequence of the SUOX gene and is perfectly associated with the arachnomelia phenotype. Our findings suggest an important role for sulfite oxidase in bone development. PMID:20865119

  2. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types.

    PubMed

    Jaitin, Diego Adhemar; Kenigsberg, Ephraim; Keren-Shaul, Hadas; Elefant, Naama; Paul, Franziska; Zaretsky, Irina; Mildner, Alexander; Cohen, Nadav; Jung, Steffen; Tanay, Amos; Amit, Ido

    2014-02-14

    In multicellular organisms, biological function emerges when heterogeneous cell types form complex organs. Nevertheless, dissection of tissues into mixtures of cellular subpopulations is currently challenging. We introduce an automated massively parallel single-cell RNA sequencing (RNA-seq) approach for analyzing in vivo transcriptional states in thousands of single cells. Combined with unsupervised classification algorithms, this facilitates ab initio cell-type characterization of splenic tissues. Modeling single-cell transcriptional states in dendritic cells and additional hematopoietic cell types uncovers rich cell-type heterogeneity and gene-module activity in steady state and after pathogen activation. Cellular diversity is thereby approached through inference of variable and dynamic pathway activity rather than a fixed preprogrammed cell-type hierarchy. These data demonstrate single-cell RNA-seq as an effective tool for comprehensive cellular decomposition of complex tissues.

  3. USH2 caused by GPR98 mutation diagnosed by massively parallel sequencing in advance of the occurrence of visual symptoms

    PubMed Central

    Moteki, Hideaki; Yoshimura, Hidekane; Azaiez, Hela; Booth, Kevin T.; Shearer, A Eliot; Sloan, Christina M.; Kolbe, Diana L.; Murata, Toshinori; Smith, Richard J. H.; Usami, Shin-ichi

    2015-01-01

    Objective We present two patients who were identified with mutations in the GPR98 gene that causes Usher syndrome type 2 (USH2). Methods One hundred ninety-four (194) Japanese subjects from unrelated families were enrolled in the study. Targeted genomic enrichment and massively parallel sequencing of all known non-syndromic hearing loss genes were used to identify the genetic causes of hearing loss. Results We identified causative mutations in the GPR98 gene in one family (two siblings). The patients had moderate sloping hearing loss, and no progression was observed over a period of 10 years. Fundus examinations were normal. However, electroretinogram revealed impaired responses in both patients. Conclusion Early diagnosis of Usher syndrome has many advantages for patients and their families. This study supports the use of comprehensive genetic diagnosis for Usher syndrome, especially prior to the onset of visual symptoms, to provide the highest chance of diagnostic success in early life stages. PMID:25743181

  4. Genome-Wide Footprints of Pig Domestication and Selection Revealed through Massive Parallel Sequencing of Pooled DNA

    PubMed Central

    Amaral, Andreia J.; Ferretti, Luca; Megens, Hendrik-Jan; Crooijmans, Richard P. M. A.; Nie, Haisheng; Ramos-Onsins, Sebastian E.; Perez-Enciso, Miguel; Schook, Lawrence B.; Groenen, Martien A. M.

    2011-01-01

    Background Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can contribute to uncover the genetic basis of phenotypic diversity. Methodology/Main Findings Genome-wide footprints of pig domestication and selection were identified using massive parallel sequencing of pooled reduced representation libraries (RRL) representing ∼2% of the genome from wild boar and four domestic pig breeds (Large White, Landrace, Duroc and Pietrain) which have been under strong selection for muscle development, growth, behavior and coat color. Using specifically developed statistical methods that account for DNA pooling, low mean sequencing depth, and sequencing errors, we provide genome-wide estimates of nucleotide diversity and genetic differentiation in pig. Widespread signals suggestive of positive and balancing selection were found and the strongest signals were observed in Pietrain, one of the breeds most intensively selected for muscle development. Most signals were population-specific but affected genomic regions which harbored genes for common biological categories including coat color, brain development, muscle development, growth, metabolism, olfaction and immunity. Genetic differentiation in regions harboring genes related to muscle development and growth was higher between breeds than between a given breed and the wild boar. Conclusions/Significance These results suggest that although domesticated breeds have experienced similar selective pressures, selection has acted upon different genes. This might reflect the multiple domestication events of European breeds or could be the result of subsequent introgression of Asian alleles. Overall, it was estimated that approximately 7% of the porcine genome has been affected by selection events. This study illustrates that the massive parallel sequencing of genomic pools is a cost-effective approach to identify footprints of selection.

  5. Inter-laboratory evaluation of SNP-based forensic identification by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Santos, C; de la Puente, M; Gross, T E; Fondevila, M; Strobl, C; Sobrino, B; Ballard, D; Schneider, P M; Carracedo, Á; Lareu, M V; Parson, W; Phillips, C

    2015-07-01

    Next generation sequencing (NGS) offers the opportunity to analyse forensic DNA samples and obtain massively parallel coverage of targeted short sequences with the variants they carry. We evaluated the levels of sequence coverage, genotyping precision, sensitivity and mixed DNA patterns of a prototype version of the first commercial forensic NGS kit: the HID-Ion AmpliSeq™ Identity Panel with 169 markers designed for the Ion PGM™ system. Evaluations were made between three laboratories following closely matched Ion PGM™ protocols and a simple validation framework of shared DNA controls. The sequence coverage obtained was extensive for the bulk of SNPs targeted by the HID-Ion AmpliSeq™ Identity Panel. Sensitivity studies showed 90-95% of SNP genotypes could be obtained from 25 to 100 pg of input DNA. Genotyping concordance tests included Coriell cell-line control DNA analyses checked against whole-genome sequencing data from 1000 Genomes and Complete Genomics, indicating a very high concordance rate of 99.8%. Discordant genotypes detected in rs1979255, rs1004357, rs938283, rs2032597 and rs2399332 indicate these loci should be excluded from the panel. Therefore, the HID-Ion AmpliSeq™ Identity Panel and Ion PGM™ system provide a sensitive and accurate forensic SNP genotyping assay. However, low-level DNA produced much more varied sequence coverage and in forensic use the Ion PGM™ system will require careful calibration of the total samples loaded per chip to preserve the genotyping reliability seen in routine forensic DNA. Furthermore, assessments of mixed DNA indicate the user's control of sequence analysis parameter settings is necessary to ensure mixtures are detected robustly. Given the sensitivity of Ion PGM™, this aspect of forensic genotyping requires further optimisation before massively parallel sequencing is applied to routine casework.

  6. Massively parallel DNA sequencing successfully identifies new causative mutations in deafness genes in patients with cochlear implantation and EAS.

    PubMed

    Miyagawa, Maiko; Nishio, Shin-ya; Ikeda, Takuo; Fukushima, Kunihiro; Usami, Shin-ichi

    2013-01-01

    Genetic factors, the most common etiology in severe to profound hearing loss, are one of the key determinants of Cochlear Implantation (CI) and Electric Acoustic Stimulation (EAS) outcomes. Satisfactory auditory performance after receiving a CI/EAS in patients with certain deafness gene mutations indicates that genetic testing would be helpful in predicting CI/EAS outcomes and deciding treatment choices. However, because of the extreme genetic heterogeneity of deafness, clinical application of genetic information still entails difficulties. Target exon sequencing using massively parallel DNA sequencing is a new powerful strategy to discover rare causative genes in Mendelian disorders such as deafness. We used massive sequencing of the exons of 58 target candidate genes to analyze 8 (4 early-onset, 4 late-onset) Japanese CI/EAS patients, who did not have mutations in commonly found genes including GJB2, SLC26A4, or mitochondrial 1555A>G or 3243A>G mutations. We successfully identified four rare causative mutations in the MYO15A, TECTA, TMPRSS3, and ACTG1 genes in four patients who showed relatively good auditory performance with CI including EAS, suggesting that genetic testing may be able to predict the performance after implantation.

  7. Extended computational kernels in a massively parallel implementation of the Trotter-Suzuki approximation

    NASA Astrophysics Data System (ADS)

    Wittek, Peter; Calderaro, Luca

    2015-12-01

    We extended a parallel and distributed implementation of the Trotter-Suzuki algorithm for simulating quantum systems to study a wider range of physical problems and to make the library easier to use. The new release allows periodic boundary conditions, many-body simulations of non-interacting particles, arbitrary stationary potential functions, and imaginary time evolution to approximate the ground state energy. The new release is more resilient to the computational environment: a wider range of compiler chains and more platforms are supported. To ease development, we provide a more extensive command-line interface, an application programming interface, and wrappers from high-level languages.

  8. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    NASA Astrophysics Data System (ADS)

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. We

  9. Parallel group independent component analysis for massive fMRI data sets.

    PubMed

    Chen, Shaojie; Huang, Lei; Qiu, Huitong; Nebel, Mary Beth; Mostofsky, Stewart H; Pekar, James J; Lindquist, Martin A; Eloyan, Ani; Caffo, Brian S

    2017-01-01

    Independent component analysis (ICA) is widely used in the field of functional neuroimaging to decompose data into spatio-temporal patterns of co-activation. In particular, ICA has found wide usage in the analysis of resting state fMRI (rs-fMRI) data. Recently, a number of large-scale data sets have become publicly available that consist of rs-fMRI scans from thousands of subjects. As a result, efficient ICA algorithms that scale well to the increased number of subjects are required. To address this problem, we propose a two-stage likelihood-based algorithm for performing group ICA, which we denote Parallel Group Independent Component Analysis (PGICA). By utilizing the sequential nature of the algorithm and parallel computing techniques, we are able to efficiently analyze data sets from large numbers of subjects. We illustrate the efficacy of PGICA, which has been implemented in R and is freely available through the Comprehensive R Archive Network, through simulation studies and application to rs-fMRI data from two large multi-subject data sets, consisting of 301 and 779 subjects respectively.
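
    A generic two-stage group ICA can be sketched in a few lines: reduce each subject independently (the embarrassingly parallel stage), then run one group-level ICA on the concatenated reductions. The sketch uses scikit-learn for brevity and is not the authors' likelihood-based PGICA estimator.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    def two_stage_group_ica(subject_data, n_components=20):
        """subject_data: list of (time, voxel) arrays, one per subject.

        Stage 1: per-subject PCA reduction (parallelizable across subjects).
        Stage 2: temporal concatenation and a single group-level ICA.
        Returns group spatial component maps, shape (n_components, voxel).
        """
        reduced = [PCA(n_components=n_components).fit_transform(d.T).T
                   for d in subject_data]            # (n_components, voxel) each
        grouped = np.vstack(reduced)                 # stack along component axis
        ica = FastICA(n_components=n_components, max_iter=1000)
        return ica.fit_transform(grouped.T).T        # group spatial maps
    ```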

  10. A massively parallel method of characteristic neutral particle transport code for GPUs

    SciTech Connect

    Boyd, W. R.; Smith, K.; Forget, B.

    2013-07-01

    Over the past 20 years, parallel computing has enabled computers to grow ever larger and more powerful while scientific applications have advanced in sophistication and resolution. This trend is being challenged, however, as the power consumption for conventional parallel computing architectures has risen to unsustainable levels and memory limitations have come to dominate compute performance. Heterogeneous computing platforms, such as Graphics Processing Units (GPUs), are an increasingly popular paradigm for solving these issues. This paper explores the applicability of GPUs for deterministic neutron transport. A 2D method of characteristics (MOC) code - OpenMOC - has been developed with solvers for both shared memory multi-core platforms as well as GPUs. The multi-threading and memory locality methodologies for the GPU solver are presented. Performance results for the 2D C5G7 benchmark demonstrate 25-35× speedup for MOC on the GPU. The lessons learned from this case study will provide the basis for further exploration of MOC on GPUs as well as design decisions for hardware vendors exploring technologies for the next generation of machines for scientific computing. (authors)

  11. Parallel group independent component analysis for massive fMRI data sets

    PubMed Central

    Huang, Lei; Qiu, Huitong; Nebel, Mary Beth; Mostofsky, Stewart H.; Pekar, James J.; Lindquist, Martin A.; Eloyan, Ani; Caffo, Brian S.

    2017-01-01

    Independent component analysis (ICA) is widely used in the field of functional neuroimaging to decompose data into spatio-temporal patterns of co-activation. In particular, ICA has found wide usage in the analysis of resting state fMRI (rs-fMRI) data. Recently, a number of large-scale data sets have become publicly available that consist of rs-fMRI scans from thousands of subjects. As a result, efficient ICA algorithms that scale well to the increased number of subjects are required. To address this problem, we propose a two-stage likelihood-based algorithm for performing group ICA, which we denote Parallel Group Independent Component Analysis (PGICA). By utilizing the sequential nature of the algorithm and parallel computing techniques, we are able to efficiently analyze data sets from large numbers of subjects. We illustrate the efficacy of PGICA, which has been implemented in R and is freely available through the Comprehensive R Archive Network, through simulation studies and application to rs-fMRI data from two large multi-subject data sets, consisting of 301 and 779 subjects respectively. PMID:28278208

  12. Harnessing the killer micros: Applications from LLNL's massively parallel computing initiative

    SciTech Connect

    Belak, J.F.

    1991-07-01

    Recent developments in microprocessor technology have led to performance on scalar applications exceeding traditional supercomputers. This suggests that coupling hundreds or even thousands of these "killer-micros" (all working on a single physical problem) may lead to performance on vector applications in excess of vector supercomputers. Also, future generation killer-micros are expected to have vector floating point units as well. The purpose of this paper is to present an overview of the parallel computing environment at Lawrence Livermore National Laboratory. However, the perspective is necessarily quite narrow and most of the examples are taken from the author's implementation of a large scale molecular dynamics code on the BBN-TC2000 at LLNL. Parallelism is achieved through a geometric domain decomposition -- each processor is assigned a distinct region of space and all atoms contained therein. As the atomic positions evolve, the processors must exchange ownership of specific atoms. This geometric domain decomposition proves to be quite general and we highlight its application to image processing and hydrodynamics simulations as well. 10 refs., 6 figs.
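
    The geometric domain decomposition described here boils down to mapping each atom's coordinates to the processor that owns its region of space, and re-mapping after every step. A serial numpy sketch of the bookkeeping (the box and grid conventions are assumptions):

    ```python
    import numpy as np

    def assign_owners(positions, box, grid):
        """Map atoms to sub-domain owners on a regular processor grid.

        positions: (n_atoms, 3) coordinates, each axis in [0, box[axis]).
        box:       (3,) simulation box lengths.
        grid:      (3,) number of sub-domains along each axis.
        Returns a flat processor id per atom.
        """
        box = np.asarray(box, dtype=float)
        grid = np.asarray(grid)
        cell = (positions / box * grid).astype(int)   # per-axis domain index
        cell = np.clip(cell, 0, grid - 1)             # guard boundary atoms
        return np.ravel_multi_index(cell.T, grid)     # flat owner id

    # After each MD step, atoms whose owner id changed are handed off to
    # the neighbouring processor -- the "exchange of ownership" above.
    ```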

  13. Library preparation and multiplex capture for massive parallel sequencing applications made efficient and easy.

    PubMed

    Neiman, Mårten; Sundling, Simon; Grönberg, Henrik; Hall, Per; Czene, Kamila; Lindberg, Johan; Klevebring, Daniel

    2012-01-01

    During the recent years, rapid development of sequencing technologies and a competitive market has enabled researchers to perform massive sequencing projects at a reasonable cost. As the price for the actual sequencing reactions drops, enabling more samples to be sequenced, the relative price for preparing libraries gets larger and the practical laboratory work becomes complex and tedious. We present a cost-effective strategy for simplified library preparation compatible with both whole genome- and targeted sequencing experiments. An optimized enzyme composition and reaction buffer reduces the number of required clean-up steps and allows for usage of bulk enzymes which makes the whole process cheap, efficient and simple. We also present a two-tagging strategy, which allows for multiplex sequencing of targeted regions. To prove our concept, we have prepared libraries for low-pass sequencing from 100 ng DNA, performed 2-, 4- and 8-plex exome capture and a 96-plex capture of a 500 kb region. In all samples we see a high concordance (>99.4%) of SNP calls when comparing to commercially available SNP-chip platforms.

  14. Deep mutational scanning of an antibody against epidermal growth factor receptor using mammalian cell display and massively parallel pyrosequencing.

    PubMed

    Forsyth, Charles M; Juan, Veronica; Akamatsu, Yoshiko; DuBridge, Robert B; Doan, Minhtam; Ivanov, Alexander V; Ma, Zhiyuan; Polakoff, Dixie; Razo, Jennifer; Wilson, Keith; Powers, David B

    2013-01-01

    We developed a method for deep mutational scanning of antibody complementarity-determining regions (CDRs) that can determine in parallel the effect of every possible single amino acid CDR substitution on antigen binding. The method uses libraries of full length IgGs containing more than 1000 CDR point mutations displayed on mammalian cells, sorted by flow cytometry into subpopulations based on antigen affinity and analyzed by massively parallel pyrosequencing. Higher, lower and neutral affinity mutations are identified by their enrichment or depletion in the FACS subpopulations. We applied this method to a humanized version of the anti-epidermal growth factor receptor antibody cetuximab, generated a near comprehensive data set for 1060 point mutations that recapitulates previously determined structural and mutational data for these CDRs and identified 67 point mutations that increase affinity. The large-scale, comprehensive sequence-function data sets generated by this method should have broad utility for engineering properties such as antibody affinity and specificity and may advance theoretical understanding of antibody-antigen recognition.
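
    The sorting-plus-sequencing readout reduces to a counting analysis: a mutation's effect is scored by its enrichment in the high-affinity bin relative to the low-affinity bin. The following is a minimal sketch of that idea; the exact statistic used in the paper is not reproduced here.

    ```python
    import numpy as np

    def log2_enrichment(counts_high, counts_low, pseudocount=1.0):
        """Per-variant log2 enrichment between two FACS-sorted bins.

        counts_high, counts_low: read counts per variant in the
        high- and low-affinity subpopulations.  Positive scores flag
        affinity-enhancing mutations, negative scores deleterious ones.
        """
        p_high = counts_high + pseudocount
        p_high = p_high / p_high.sum()        # normalize to bin frequencies
        p_low = counts_low + pseudocount
        p_low = p_low / p_low.sum()
        return np.log2(p_high / p_low)
    ```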

  15. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    NASA Astrophysics Data System (ADS)

    Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter

    2015-12-01

    AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
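
    The Shared Event Queue strategy can be imitated in a few lines of Python: fork worker processes (sharing pages copy-on-write on Linux, as AthenaMP does) and let each pull event ids from one shared queue. This is a toy illustration rather than ATLAS code; process_event is a placeholder.

    ```python
    import multiprocessing as mp

    def process_event(event_id):
        return event_id * event_id        # stand-in for real reconstruction work

    def worker(events, results):
        """Pull event ids from the shared queue until the stop sentinel."""
        while True:
            event = events.get()
            if event is None:
                break
            results.put((event, process_event(event)))

    if __name__ == "__main__":
        events, results = mp.Queue(), mp.Queue()
        workers = [mp.Process(target=worker, args=(events, results))
                   for _ in range(4)]
        for w in workers:
            w.start()                     # fork: memory shared copy-on-write
        for event_id in range(100):
            events.put(event_id)
        for _ in workers:
            events.put(None)              # one stop sentinel per worker
        for w in workers:
            w.join()
    ```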

  16. Massively parallel kinetic Monte Carlo simulations of charge carrier transport in organic semiconductors

    NASA Astrophysics Data System (ADS)

    van der Kaap, N. J.; Koster, L. J. A.

    2016-02-01

    A parallel, lattice-based kinetic Monte Carlo simulation is developed that runs on a GPGPU board and includes Coulomb-like particle-particle interactions. The performance of this computationally expensive problem is improved by modifying the interaction potential due to nearby particle moves, instead of fully recalculating it. This modification is achieved by adding dipole correction terms that represent the particle move. Exact evaluation of these terms is guaranteed by representing all interactions as 32-bit floating-point numbers, where only the integers between -2^22 and 2^22 are used. We validate our method by modelling the charge transport in disordered organic semiconductors, including Coulomb interactions between charges. Performance is mainly governed by the particle density in the simulation volume, and improves for increasing densities. Our method allows calculations on large volumes including particle-particle interactions, which is important in the field of organic semiconductors.
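
    The key performance trick is avoiding a full O(N^2) recomputation of the interaction potential after each move. The sketch below shows a simpler subtract-and-add variant of that idea for a bare Coulomb pair potential; the paper's actual scheme uses dipole correction terms and restricts values to integers exactly representable in 32-bit floats, neither of which is reproduced here.

    ```python
    import numpy as np

    def move_particle(potential, positions, moved, new_pos, k=1.0):
        """Incrementally update per-particle Coulomb sums after one move.

        potential[j] holds the summed pair interaction felt by particle j.
        Only the moved particle's old and new contributions are touched,
        so the update is O(N) rather than O(N^2).
        """
        old_pos = positions[moved].copy()
        for j in range(len(positions)):
            if j == moved:
                continue
            r_old = np.linalg.norm(positions[j] - old_pos)
            r_new = np.linalg.norm(positions[j] - new_pos)
            potential[j] += k * (1.0 / r_new - 1.0 / r_old)
        positions[moved] = new_pos
        potential[moved] = sum(k / np.linalg.norm(positions[j] - new_pos)
                               for j in range(len(positions)) if j != moved)
        return potential
    ```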

  17. Massively parallel recording of unit and local field potentials with silicon-based electrodes.

    PubMed

    Csicsvari, Jozsef; Henze, Darrell A; Jamieson, Brian; Harris, Kenneth D; Sirota, Anton; Barthó, Péter; Wise, Kensall D; Buzsáki, György

    2003-08-01

    Parallel recording of neuronal activity in the behaving animal is a prerequisite for our understanding of neuronal representation and storage of information. Here we describe the development of micro-machined silicon microelectrode arrays for unit and local field recordings. The two-dimensional probes with 96 or 64 recording sites provided high-density recording of unit and field activity with minimal tissue displacement or damage. The on-chip active circuit eliminated movement and other artifacts and greatly reduced the weight of the headgear. The precise geometry of the recording tips allowed for the estimation of the spatial location of the recorded neurons and for high-resolution estimation of extracellular current source density. Action potentials could be simultaneously recorded from the soma and dendrites of the same neurons. Silicon technology is a promising approach for high-density, high-resolution sampling of neuronal activity in both basic research and prosthetic devices.

  18. Efficient Massively-Parallel Approach for Solving the Time-Dependent Schrödinger Equation

    NASA Astrophysics Data System (ADS)

    Schneider, B. I.; Hu, S. X.; Collins, L. A.

    2006-05-01

    A variety of problems in physics and chemistry require the solution of the time-dependent Schrödinger equation (TDSE), including atoms and molecules in oscillating electromagnetic fields, atomic collisions, ultracold systems, and materials subjected to external forces. We describe an approach in which the Finite Element Discrete Variable Representation (FEDVR) is combined with the Real-Space Product (RSP) algorithm to generate an efficient and highly accurate method for the solution of both the linear and nonlinear TDSE. The FEDVR provides a highly-accurate spatial representation using a minimum number of grid points (N) while the RSP algorithm propagates the wavefunction in O(N) operations per time step. Parallelization of the method is transparent and is implemented by distributing one or two spatial dimensions across the available processors within the Message-Passing-Interface (MPI) scheme. The complete formalism and a number of three-dimensional (3D) examples are given.
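
    The RSP propagation step referred to above is a split-operator factorization of the short-time propagator; in its standard symmetric second-order form (a textbook result, with \hat{T} and \hat{V} the kinetic and potential operators):

    ```latex
    \psi(t + \Delta t) \approx
      e^{-i \hat{V} \Delta t / 2}\, e^{-i \hat{T} \Delta t}\,
      e^{-i \hat{V} \Delta t / 2}\, \psi(t) + \mathcal{O}(\Delta t^{3})
    ```

    Each factor is applied in the representation where it is cheap, which is how the cost stays at O(N) operations per time step in the FEDVR basis.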

  19. The transition to massively parallel computing within a production environment at a DOE access center

    SciTech Connect

    McCoy, M.G.

    1993-04-01

    In contemplating the transition from sequential to MP computing, the National Energy Research Supercomputer Center (NERSC) is faced with the frictions inherent in the duality of its mission. There have been two goals: the first has been to provide a stable, serviceable production environment to the user base; the second, to bring the most capable early serial supercomputers to the Center to make possible the leading-edge simulations. This seeming conundrum has in reality been a source of strength. The task of meeting both goals was faced before with the CRAY 1 which, as delivered, was all iron; so the problems associated with the advent of parallel computers are not entirely new, but they are serious. Current vector supercomputers, such as the C90, offer mature production environments, including software tools, a large applications base, and generality; these machines can be used to attack the spectrum of scientific applications by a large user base knowledgeable in programming techniques for this architecture. Parallel computers to date have offered less developed, even rudimentary, working environments, a sparse applications base, and forced specialization. They have been specialized in terms of programming models, and specialized in terms of the kinds of applications which would do well on the machines. Given this context, why do many service computer centers feel that now is the time to cease or slow the procurement of traditional vector supercomputers in favor of MP systems? What are some of the issues that NERSC must face to engineer a smooth transition? The answers to these questions are multifaceted and by no means completely clear. However, a route exists as a result of early efforts at the Laboratories combined with research within the HPCC Program. One can begin with an analysis of why the hardware and software appearing shortly should be made available to the mainstream, and then address what would be required in an initial production environment.

  20. De novo assembly and validation of planaria transcriptome by massive parallel sequencing and shotgun proteomics.

    PubMed

    Adamidi, Catherine; Wang, Yongbo; Gruen, Dominic; Mastrobuoni, Guido; You, Xintian; Tolle, Dominic; Dodt, Matthias; Mackowiak, Sebastian D; Gogol-Doering, Andreas; Oenal, Pinar; Rybak, Agnieszka; Ross, Eric; Sánchez Alvarado, Alejandro; Kempa, Stefan; Dieterich, Christoph; Rajewsky, Nikolaus; Chen, Wei

    2011-07-01

    Freshwater planaria are a very attractive model system for stem cell biology, tissue homeostasis, and regeneration. The genome of the planarian Schmidtea mediterranea has recently been sequenced and is estimated to contain >20,000 protein-encoding genes. However, the characterization of its transcriptome is far from complete. Furthermore, not a single proteome of the entire phylum has been assayed on a genome-wide level. We devised an efficient sequencing strategy that allowed us to de novo assemble a major fraction of the S. mediterranea transcriptome. We then used independent assays and massive shotgun proteomics to validate the authenticity of transcripts. In total, our de novo assembly yielded 18,619 candidate transcripts with a mean length of 1118 nt after filtering. A total of 17,564 candidate transcripts could be mapped to 15,284 distinct loci on the current genome reference sequence. RACE confirmed complete or almost complete 5' and 3' ends for 22/24 transcripts. The frequencies of frame shifts, fusion, and fission events in the assembled transcripts were computationally estimated to be 4.2%-13%, 0%-3.7%, and 2.6%, respectively. Our shotgun proteomics produced 16,135 distinct peptides that validated 4200 transcripts (FDR ≤1%). The catalog of transcripts assembled in this study, together with the identified peptides, dramatically expands and refines planarian gene annotation, demonstrated by validation of several previously unknown transcripts with stem cell-dependent expression patterns. In addition, our robust transcriptome characterization pipeline could be applied to other organisms without genome assembly. All of our data, including homology annotation, are freely available at SmedGD, the S. mediterranea genome database.

  1. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves

    PubMed Central

    Adams, Rhys M; Mora, Thierry; Walczak, Aleksandra M; Kinney, Justin B

    2016-01-01

    Despite the central role that antibodies play in the adaptive immune system and in biotechnology, much remains unknown about the quantitative relationship between an antibody’s amino acid sequence and its antigen binding affinity. Here we describe a new experimental approach, called Tite-Seq, that is capable of measuring binding titration curves and corresponding affinities for thousands of variant antibodies in parallel. The measurement of titration curves eliminates the confounding effects of antibody expression and stability that arise in standard deep mutational scanning assays. We demonstrate Tite-Seq on the CDR1H and CDR3H regions of a well-studied scFv antibody. Our data shed light on the structural basis for antigen binding affinity and suggest a role for secondary CDR loops in establishing antibody stability. Tite-Seq fills a large gap in the ability to measure critical aspects of the adaptive immune system, and can be readily used for studying sequence-affinity landscapes in other protein systems. DOI: http://dx.doi.org/10.7554/eLife.23156.001 PMID:28035901
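
    The titration-curve idea reduces to fitting a single-site binding isotherm per variant. Below is a minimal sketch of recovering a dissociation constant from one such curve, with toy numbers; Tite-Seq infers these curves from sorted-and-sequenced read counts rather than direct per-clone fluorescence as assumed here.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def isotherm(c, kd, amplitude, background):
        """Single-site binding: signal rises with antigen concentration c
        and is half-maximal at the dissociation constant kd."""
        return background + amplitude * c / (c + kd)

    conc = np.array([1e-9, 1e-8, 1e-7, 1e-6, 1e-5])   # antigen concentrations (M)
    signal = np.array([0.11, 0.45, 1.9, 3.4, 3.8])    # toy mean binding signal

    (kd, amp, bg), _ = curve_fit(isotherm, conc, signal, p0=(1e-7, 4.0, 0.1))
    print(f"estimated KD = {kd:.2e} M")
    ```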

  2. Massively parallel first-principles simulation of electron dynamics in materials

    DOE PAGES

    Draeger, Erik W.; Andrade, Xavier; Gunnels, John A.; ...

    2017-03-04

    Here we present a highly scalable, parallel implementation of first-principles electron dynamics coupled with molecular dynamics (MD). By using optimized kernels, network topology aware communication, and by fully distributing all terms in the time-dependent Kohn–Sham equation, we demonstrate unprecedented time to solution for disordered aluminum systems of 2000 atoms (22,000 electrons) and 5400 atoms (59,400 electrons), with wall clock time as low as 7.5 s per MD time step. Despite a significant amount of non-local communication required in every iteration, we achieved excellent strong scaling and sustained performance on the Sequoia Blue Gene/Q supercomputer at LLNL. We obtained up to 59% of the theoretical sustained peak performance on 16,384 nodes and performance of 8.75 Petaflop/s (43% of theoretical peak) on the full 98,304 node machine (1,572,864 cores). Lastly, scalable explicit electron dynamics allows for the study of phenomena beyond the reach of standard first-principles MD, in particular, materials subject to strong or rapid perturbations, such as pulsed electromagnetic radiation, particle irradiation, or strong electric currents.

  3. Massively parallel manipulation of single cells and microparticles using optical images.

    PubMed

    Chiou, Pei Yu; Ohta, Aaron T; Wu, Ming C

    2005-07-21

    The ability to manipulate biological cells and micrometre-scale particles plays an important role in many biological and colloidal science applications. However, conventional manipulation techniques--including optical tweezers, electrokinetic forces (electrophoresis, dielectrophoresis, travelling-wave dielectrophoresis), magnetic tweezers, acoustic traps and hydrodynamic flows--cannot achieve high resolution and high throughput at the same time. Optical tweezers offer high resolution for trapping single particles, but have a limited manipulation area owing to tight focusing requirements; on the other hand, electrokinetic forces and other mechanisms provide high throughput, but lack the flexibility or the spatial resolution necessary for controlling individual cells. Here we present an optical image-driven dielectrophoresis technique that permits high-resolution patterning of electric fields on a photoconductive surface for manipulating single particles. It requires 100,000 times less optical intensity than optical tweezers. Using an incoherent light source (a light-emitting diode or a halogen lamp) and a digital micromirror spatial light modulator, we have demonstrated parallel manipulation of 15,000 particle traps on a 1.3 × 1.0 mm² area. With direct optical imaging control, multiple manipulation functions are combined to achieve complex, multi-step manipulation protocols.

  4. Massively-parallel electrical-conductivity imaging of hydrocarbons using the Blue Gene/L supercomputer

    SciTech Connect

    Commer, M.; Newman, G.A.; Carazzone, J.J.; Dickens, T.A.; Green,K.E.; Wahrmund, L.A.; Willen, D.E.; Shiu, J.

    2007-05-16

    Large-scale controlled source electromagnetic (CSEM) three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. To cope with the typically large computational requirements of the 3D CSEM imaging problem, our strategies exploit computational parallelism and optimized finite-difference meshing. We report on an imaging experiment, utilizing 32,768 tasks/processors on the IBM Watson Research Blue Gene/L (BG/L) supercomputer. Over a 24-hour period, we were able to image a large scale marine CSEM field data set that previously required over four months of computing time on distributed clusters utilizing 1024 tasks on an Infiniband fabric. The total initial data misfit could be decreased by 67 percent within 72 completed inversion iterations, indicating an electrically resistive region in the southern survey area below a depth of 1500 m below the seafloor. The major part of the residual misfit stems from transmitter parallel receiver components that have an offset from the transmitter sail line (broadside configuration). Modeling confirms that improved broadside data fits can be achieved by considering anisotropic electrical conductivities. While delivering a satisfactory gross scale image for the depths of interest, the experiment provides important evidence for the necessity of discriminating between horizontal and vertical conductivities for maximally consistent 3D CSEM inversions.

  5. Estimating genome-wide gene networks using nonparametric Bayesian network models on massively parallel computers.

    PubMed

    Tamada, Yoshinori; Imoto, Seiya; Araki, Hiromitsu; Nagasaki, Masao; Print, Cristin; Charnock-Jones, D Stephen; Miyano, Satoru

    2011-01-01

    We present a novel algorithm to estimate genome-wide gene networks consisting of more than 20,000 genes from gene expression data using nonparametric Bayesian networks. Due to the difficulty of learning Bayesian network structures, existing algorithms cannot be applied to more than a few thousand genes. Our algorithm overcomes this limitation by repeatedly estimating subnetworks in parallel for genes selected by neighbor node sampling. Through numerical simulation, we confirmed that our algorithm outperformed a heuristic algorithm in a shorter time. We applied our algorithm to microarray data from human umbilical vein endothelial cells (HUVECs) treated with siRNAs, to construct a human genome-wide gene network, which we compared to a small gene network estimated for the genes extracted using a traditional bioinformatics method. The results showed that our genome-wide gene network contains many features of the small network, as well as others that could not be captured during the small network estimation. The results also revealed master-regulator genes that are not in the small network but that control many of the genes in the small network. These analyses were impossible to realize without our proposed algorithm.

  6. Massively parallel manipulation of single cells and microparticles using optical images

    NASA Astrophysics Data System (ADS)

    Chiou, Pei Yu; Ohta, Aaron T.; Wu, Ming C.

    2005-07-01

    The ability to manipulate biological cells and micrometre-scale particles plays an important role in many biological and colloidal science applications. However, conventional manipulation techniques--including optical tweezers, electrokinetic forces (electrophoresis, dielectrophoresis, travelling-wave dielectrophoresis), magnetic tweezers, acoustic traps and hydrodynamic flows--cannot achieve high resolution and high throughput at the same time. Optical tweezers offer high resolution for trapping single particles, but have a limited manipulation area owing to tight focusing requirements; on the other hand, electrokinetic forces and other mechanisms provide high throughput, but lack the flexibility or the spatial resolution necessary for controlling individual cells. Here we present an optical image-driven dielectrophoresis technique that permits high-resolution patterning of electric fields on a photoconductive surface for manipulating single particles. It requires 100,000 times less optical intensity than optical tweezers. Using an incoherent light source (a light-emitting diode or a halogen lamp) and a digital micromirror spatial light modulator, we have demonstrated parallel manipulation of 15,000 particle traps on a 1.3 × 1.0 mm² area. With direct optical imaging control, multiple manipulation functions are combined to achieve complex, multi-step manipulation protocols.

  7. DFT-Based Electronic Structure Calculations on Hybrid and Massively Parallel Computer Architectures

    NASA Astrophysics Data System (ADS)

    Briggs, Emil; Hodak, Miroslav; Lu, Wenchang; Bernholc, Jerry

    2014-03-01

    The latest generation of supercomputers is capable of multi-petaflop peak performance, achieved by using thousands of multi-core CPUs and often coupled with thousands of GPUs. However, efficient utilization of this computing power for electronic structure calculations presents significant challenges. We describe adaptations of the Real-Space Multigrid (RMG) code that enable it to scale well to thousands of nodes. A hybrid technique that uses one MPI process per node, rather than one per core, was adopted, with OpenMP and POSIX threads used for intra-node parallelization. This reduces the number of MPI processes by an order of magnitude or more and improves individual node memory utilization. GPU accelerators are also becoming common and are capable of extremely high performance for vector workloads. However, they typically have much lower scalar performance than CPUs, so achieving good performance requires that the workload is carefully partitioned and data transfer between CPU and GPU is optimized. We have used a hybrid approach utilizing MPI/OpenMP/POSIX threads and GPU accelerators to reach excellent scaling to over 100,000 cores on a Cray XE6 platform as well as a factor of three performance improvement when using a Cray XK7 system with CPU-GPU nodes.
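    The node-level decomposition generalizes beyond RMG. As a rough illustration (not the RMG code itself, which is C/C++ with OpenMP/POSIX threads), the sketch below uses mpi4py for the one-process-per-node level and a Python thread pool standing in for the intra-node threads:

      # Hybrid layout sketch: one MPI rank per node, a thread pool per
      # rank; launched with e.g. "mpiexec -n 4 python hybrid.py".
      from concurrent.futures import ThreadPoolExecutor
      from mpi4py import MPI
      import os

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()
      threads = os.cpu_count() or 1

      def work(chunk):
          # Per-thread computation on an intra-node slice of rank data.
          return sum(x * x for x in chunk)

      # Each rank owns one contiguous slice of the global index space...
      data = list(range(rank * 100_000, (rank + 1) * 100_000))
      chunks = [data[i::threads] for i in range(threads)]

      with ThreadPoolExecutor(threads) as pool:
          local = sum(pool.map(work, chunks))

      # ...and one reduction per node replaces per-core MPI traffic.
      total = comm.reduce(local, op=MPI.SUM, root=0)
      if rank == 0:
          print("global sum:", total)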

  8. Efficient massively parallel simulation of dynamic channel assignment schemes for wireless cellular communications

    NASA Technical Reports Server (NTRS)

    Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.

    1994-01-01

    Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.
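    At the heart of the scheme is a simple admission test: a channel may be granted to a cell only if every other cell currently using it is far enough away. The Python sketch below is a hypothetical illustration of that test (planar cell coordinates, the channel set, and the reuse distance are invented), not the MasPar simulator:

      # Toy admission test for dynamic channel assignment: grant a
      # channel only if all co-channel cells satisfy the reuse distance.
      import math

      def assign_channel(cell, in_use, channels, reuse_distance):
          """in_use maps channel -> set of cells currently using it."""
          for ch in channels:
              if all(math.dist(cell, other) >= reuse_distance
                     for other in in_use.get(ch, ())):
                  in_use.setdefault(ch, set()).add(cell)
                  return ch
          return None  # every channel conflicts: the call is blocked

      in_use = {}
      print(assign_channel((0, 0), in_use, range(3), 2.0))  # -> 0
      print(assign_channel((1, 0), in_use, range(3), 2.0))  # -> 1 (too close to reuse 0)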

  9. Sassena — X-ray and neutron scattering calculated from molecular dynamics trajectories using massively parallel computers

    NASA Astrophysics Data System (ADS)

    Lindner, Benjamin; Smith, Jeremy C.

    2012-07-01

    Massively parallel computers now permit the molecular dynamics (MD) simulation of multi-million atom systems on time scales up to the microsecond. However, the subsequent analysis of the resulting simulation trajectories has now become a high performance computing problem in itself. Here, we present software for calculating X-ray and neutron scattering intensities from MD simulation data that scales well on massively parallel supercomputers. The calculation and data staging schemes used maximize the degree of parallelism and minimize the IO bandwidth requirements. The strong scaling tested on the Jaguar Petaflop Cray XT5 at Oak Ridge National Laboratory exhibits virtually linear scaling up to 7000 cores for most benchmark systems. Since both MPI and thread parallelism are supported, the software is flexible enough to cover scaling demands for different types of scattering calculations. The result is a high performance tool capable of unifying large-scale supercomputing and a wide variety of neutron/synchrotron technology.
    Catalogue identifier: AELW_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELW_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU General Public License, version 3
    No. of lines in distributed program, including test data, etc.: 1 003 742
    No. of bytes in distributed program, including test data, etc.: 798
    Distribution format: tar.gz
    Programming language: C++, OpenMPI
    Computer: Distributed Memory, Cluster of Computers with high performance network, Supercomputer
    Operating system: UNIX, LINUX, OSX
    Has the code been vectorized or parallelized?: Yes, the code has been parallelized using MPI directives. Tested with up to 7000 processors
    RAM: Up to 1 Gbytes/core
    Classification: 6.5, 8
    External routines: Boost Library, FFTW3, CMAKE, GNU C++ Compiler, OpenMPI, LibXML, LAPACK
    Nature of problem: Recent developments in supercomputing allow molecular dynamics simulations to

  10. Modeling cardiovascular hemodynamics using the lattice Boltzmann method on massively parallel supercomputers

    NASA Astrophysics Data System (ADS)

    Randles, Amanda Elizabeth

    the modeling of fluids in vessels with smaller diameters and a method for introducing the deformational forces exerted on the arterial flows from the movement of the heart by borrowing concepts from cosmodynamics are presented. These additional forces have a great impact on the endothelial shear stress. Third, the fluid model is extended to not only recover Navier-Stokes hydrodynamics, but also a wider range of Knudsen numbers, which is especially important in micro- and nano-scale flows. The tradeoffs of many optimization methods, such as the use of deep halo level ghost cells that, alongside hybrid programming models, reduce the impact of such higher-order models and enable efficient modeling of extreme regimes of computational fluid dynamics, are discussed. Fourth, the extension of these models to other research questions like clogging in microfluidic devices and determining the severity of coarctation of the aorta is presented. Through this work, these methods are validated by taking real patient data and the measured pressure value before the narrowing of the aorta and predicting the pressure drop across the coarctation. Comparison with the measured pressure drop in vivo highlights the accuracy and potential impact of such patient-specific simulations. Finally, a method to enable the simulation of longer trajectories in time by discretizing both spatially and temporally is presented. In this method, a serial coarse iterator is used to initialize data at discrete time steps for a fine model that runs in parallel. This coarse solver is based on a larger time step and typically a coarser discretization in space. Iterative refinement enables the compute-intensive fine iterator to be modeled with temporal parallelization. The algorithm consists of a series of predictor-corrector iterations completing when the results have converged within a certain tolerance. Combined, these developments allow large fluid models to be simulated for longer time durations
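    The coarse/fine scheme in the final paragraph follows the general parareal pattern. The sketch below shows that pattern on a scalar model ODE; it is a generic illustration under invented parameters, not the dissertation's fluid solver:

      # Generic parareal iteration for dy/dt = f(y): a cheap coarse solver
      # predicts slice-boundary states; an expensive fine solver runs
      # independently on each slice; a correction sweep combines the two.
      import numpy as np

      def f(y):
          return -y  # model right-hand side; stands in for the fluid solver

      def step(y, t0, t1, n):
          # Forward-Euler integrator with n substeps (coarse: n=1, fine: n large).
          h = (t1 - t0) / n
          for _ in range(n):
              y = y + h * f(y)
          return y

      def parareal(y0, T, slices=8, iters=5):
          t = np.linspace(0.0, T, slices + 1)
          U = np.empty(slices + 1)
          U[0] = y0
          for j in range(slices):                   # serial coarse prediction
              U[j + 1] = step(U[j], t[j], t[j + 1], 1)
          for _ in range(iters):
              F = [step(U[j], t[j], t[j + 1], 100)  # fine solves: independent,
                   for j in range(slices)]          # hence parallel over slices
              Uo = U.copy()
              for j in range(slices):               # serial correction sweep
                  U[j + 1] = (step(U[j], t[j], t[j + 1], 1)
                              + F[j] - step(Uo[j], t[j], t[j + 1], 1))
          return U

      print(parareal(1.0, T=2.0))  # approaches exp(-t) at slice boundaries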

  11. Massively Parallelized Pollen Tube Guidance and Mechanical Measurements on a Lab-on-a-Chip Platform.

    PubMed

    Shamsudhin, Naveen; Laeubli, Nino; Atakan, Huseyin Baris; Vogler, Hannes; Hu, Chengzhi; Haeberle, Walter; Sebastian, Abu; Grossniklaus, Ueli; Nelson, Bradley J

    2016-01-01

    Pollen tubes are used as a model in the study of plant morphogenesis, cellular differentiation, cell wall biochemistry, biomechanics, and intra- and intercellular signaling. For a "systems-understanding" of the bio-chemo-mechanics of tip-polarized growth in pollen tubes, a versatile experimental assay platform for quantitative data collection and analysis is critical. We introduce a Lab-on-a-Chip (LoC) concept for high-throughput pollen germination and pollen tube guidance for parallelized optical and mechanical measurements. The LoC localizes a large number of growing pollen tubes on a single plane of focus with unidirectional tip-growth, enabling high-resolution quantitative microscopy. This species-independent LoC platform can be integrated with micro-/nano-indentation systems, such as the cellular force microscope (CFM) or the atomic force microscope (AFM), allowing for rapid measurements of cell wall stiffness of growing tubes. As a demonstrative example, we show the growth and directional guidance of hundreds of lily (Lilium longiflorum) and Arabidopsis (Arabidopsis thaliana) pollen tubes on a single LoC microscopy slide. Combining the LoC with the CFM, we characterized the cell wall stiffness of lily pollen tubes. Using the stiffness statistics and finite-element-method (FEM)-based approaches, we computed an effective range of the linear elastic moduli of the cell wall spanning the variability space of physiological parameters including internal turgor, cell wall thickness, and tube diameter. We propose the LoC device as a versatile and high-throughput phenomics platform for plant reproductive and developmental biology using the pollen tube as a model.

  12. Massively Parallelized Pollen Tube Guidance and Mechanical Measurements on a Lab-on-a-Chip Platform

    PubMed Central

    Laeubli, Nino; Atakan, Huseyin Baris; Vogler, Hannes; Hu, Chengzhi; Haeberle, Walter; Sebastian, Abu; Grossniklaus, Ueli; Nelson, Bradley J.

    2016-01-01

    Pollen tubes are used as a model in the study of plant morphogenesis, cellular differentiation, cell wall biochemistry, biomechanics, and intra- and intercellular signaling. For a “systems-understanding” of the bio-chemo-mechanics of tip-polarized growth in pollen tubes, a versatile experimental assay platform for quantitative data collection and analysis is critical. We introduce a Lab-on-a-Chip (LoC) concept for high-throughput pollen germination and pollen tube guidance for parallelized optical and mechanical measurements. The LoC localizes a large number of growing pollen tubes on a single plane of focus with unidirectional tip-growth, enabling high-resolution quantitative microscopy. This species-independent LoC platform can be integrated with micro-/nano-indentation systems, such as the cellular force microscope (CFM) or the atomic force microscope (AFM), allowing for rapid measurements of cell wall stiffness of growing tubes. As a demonstrative example, we show the growth and directional guidance of hundreds of lily (Lilium longiflorum) and Arabidopsis (Arabidopsis thaliana) pollen tubes on a single LoC microscopy slide. Combining the LoC with the CFM, we characterized the cell wall stiffness of lily pollen tubes. Using the stiffness statistics and finite-element-method (FEM)-based approaches, we computed an effective range of the linear elastic moduli of the cell wall spanning the variability space of physiological parameters including internal turgor, cell wall thickness, and tube diameter. We propose the LoC device as a versatile and high-throughput phenomics platform for plant reproductive and developmental biology using the pollen tube as a model. PMID:27977748

  13. Massively Parallel Geostatistical Inversion of Coupled Processes in Heterogeneous Porous Media

    NASA Astrophysics Data System (ADS)

    Ngo, A.; Schwede, R. L.; Li, W.; Bastian, P.; Ippisch, O.; Cirpka, O. A.

    2012-04-01

    another level of parallelization has been added.

  14. Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities

    PubMed Central

    Stoeck, Thorsten; Behnke, Anke; Christen, Richard; Amaral-Zettler, Linda; Rodriguez-Mora, Maria J; Chistoserdov, Andrei; Orsi, William; Edgcomb, Virginia P

    2009-01-01

    Background Recent advances in sequencing strategies make possible unprecedented depth and scale of sampling for molecular detection of microbial diversity. Two major paradigm-shifting discoveries include the detection of bacterial diversity that is one to two orders of magnitude greater than previous estimates, and the discovery of an exciting 'rare biosphere' of molecular signatures ('species') of poorly understood ecological significance. We applied a high-throughput parallel tag sequencing (454 sequencing) protocol adapted for eukaryotes to investigate protistan community complexity in two contrasting anoxic marine ecosystems (Framvaren Fjord, Norway; Cariaco deep-sea basin, Venezuela). Both sampling sites have previously been scrutinized for protistan diversity by traditional clone library construction and Sanger sequencing. By comparing these clone library data with 454 amplicon library data, we assess the efficiency of high-throughput tag sequencing strategies. Here we present a novel, highly conservative bioinformatic analysis pipeline for the processing of large tag sequence data sets. Results The analyses of ca. 250,000 sequence reads revealed that the number of detected Operational Taxonomic Units (OTUs) far exceeded previous richness estimates from the same sites based on clone libraries and Sanger sequencing. More than 90% of this diversity was represented by OTUs with less than 10 sequence tags. We detected a substantial number of taxonomic groups like Apusozoa, Chrysomerophytes, Centroheliozoa, Eustigmatophytes, hyphochytriomycetes, Ichthyosporea, Oikomonads, Phaeothamniophytes, and rhodophytes which remained undetected by previous clone library-based diversity surveys of the sampling sites. The most important innovations in our newly developed bioinformatics pipeline employ (i) BLASTN with query parameters adjusted for highly variable domains and a complete database of public ribosomal RNA (rRNA) gene sequences for taxonomic assignments of tags; (ii

  15. Ring polymer chains confined in a slit geometry of two parallel walls: the massive field theory approach

    NASA Astrophysics Data System (ADS)

    Usatenko, Z.; Halun, J.

    2017-01-01

    The investigation of a dilute solution of phantom ideal ring polymer chains confined in a slit geometry of two parallel repulsive walls, two inert walls, and the mixed case of one inert and one repulsive wall was performed. Taking into account the well-known correspondence between the field-theoretical φ⁴ O(n)-vector model in the limit n → 0 and the behaviour of long flexible polymer chains in a good solvent, the investigation of a dilute solution of long flexible ring polymer chains with the excluded volume interaction (EVI) confined in a slit geometry of two parallel repulsive walls was performed in the framework of the massive field theory approach at fixed space dimension d = 3 up to one-loop order. For all the above-mentioned cases, the corresponding depletion interaction potentials, the depletion forces, and the forces which the phantom ideal ring polymers and the ring polymers with the EVI exert on the walls were calculated. The obtained results indicate that the phantom ideal ring polymer chains and the ring polymer chains with the EVI, due to the complexity of chain topology and for entropic reasons, demonstrate completely different behaviour in confined geometries than linear polymer chains. For example, the phantom ideal ring polymers prefer to escape from the space not only between two repulsive walls but also in the case of two inert walls, which leads to attractive depletion forces. Ring polymer chains with less complex knot types (with the bigger radius of gyration) exert higher forces on the confining repulsive walls in the wide slit region. The depletion force in the case of mixed boundary conditions becomes repulsive, in contrast to the case of linear polymer chains.

  16. Massively Parallel Implementation of Explicitly Correlated Coupled-Cluster Singles and Doubles Using TiledArray Framework.

    PubMed

    Peng, Chong; Calvin, Justus A; Pavošević, Fabijan; Zhang, Jinmei; Valeev, Edward F

    2016-12-29

    A new distributed-memory massively parallel implementation of standard and explicitly correlated (F12) coupled-cluster singles and doubles (CCSD) with canonical O(N⁶) computational complexity is described. The implementation is based on the TiledArray tensor framework. Novel features of the implementation include (a) all data larger than O(N) are distributed in memory, and (b) a mixed use of density-fitting and integral-driven formulations that optionally avoids storage of tensors with three and four unoccupied indices. Excellent strong scaling is demonstrated on a multicore shared-memory computer, a commodity distributed-memory computer, and a national-scale supercomputer. The performance on a shared-memory computer is competitive with the popular CCSD implementations in ORCA and Psi4. Moreover, the CCSD performance on a commodity-size cluster significantly improves on the state-of-the-art package NWChem. The large-scale parallel explicitly correlated coupled-cluster implementation makes accurate estimation of the coupled-cluster basis-set limit routine for molecules with 20 or more atoms. Thus, it can provide valuable benchmarks for emerging reduced-scaling coupled-cluster approaches. The new implementation allowed us to revisit the basis set limit for the CCSD contribution to the binding energy of π-stacked uracil dimer, a challenging paradigm of π-stacking interactions from the S66 benchmark database. The revised value for the CCSD correlation binding energy obtained with the help of quadruple-ζ CCSD computations, -8.30 ± 0.02 kcal/mol, is significantly different from the S66 reference value, -8.50 kcal/mol, as well as other CBS limit estimates in the recent literature.

  17. Massively parallel E-beam inspection: enabling next-generation patterned defect inspection for wafer and mask manufacturing

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-03-01

    SEMATECH aims to identify and enable disruptive technologies to meet the ever-increasing demands of semiconductor high volume manufacturing (HVM). As such, a program was initiated in 2012 focused on high-speed e-beam defect inspection as a complement, and eventual successor, to bright field optical patterned defect inspection [1]. The primary goal is to enable a new technology to overcome the key gaps that are limiting modern day inspection in the fab; primarily, throughput and sensitivity to detect ultra-small critical defects. The program specifically targets revolutionary solutions based on massively parallel e-beam technologies, as opposed to incremental improvements to existing e-beam and optical inspection platforms. Wafer inspection is the primary target, but attention is also being paid to next generation mask inspection. During the first phase of the multi-year program multiple technologies were reviewed, a down-selection was made to the top candidates, and evaluations began on proof of concept systems. A champion technology has been selected and as of late 2014 the program has begun to move into the core technology maturation phase in order to enable eventual commercialization of an HVM system. Performance data from early proof of concept systems will be shown along with roadmaps to achieving HVM performance. SEMATECH's vision for moving from early-stage development to commercialization will be shown, including plans for development with industry leading technology providers.

  18. A bumpy ride on the diagnostic bench of massive parallel sequencing, the case of the mitochondrial genome.

    PubMed

    Vancampenhout, Kim; Caljon, Ben; Spits, Claudia; Stouffs, Katrien; Jonckheere, An; De Meirleir, Linda; Lissens, Willy; Vanlander, Arnaud; Smet, Joël; De Paepe, Boel; Van Coster, Rudy; Seneca, Sara

    2014-01-01

    The advent of massive parallel sequencing (MPS) has revolutionized the field of human molecular genetics, including the diagnostic study of mitochondrial (mt) DNA dysfunction. The analysis of the complete mitochondrial genome using MPS platforms is now common and will soon outrun conventional sequencing. However, the development of a robust and reliable protocol is rather challenging. A previous pilot study for the re-sequencing of human mtDNA revealed an uneven coverage, affecting predominantly part of the plus strand. In an attempt to address this problem, we undertook a comparative study of standard and modified protocols for the Ion Torrent PGM system. We could not improve strand representation by altering the recommended shearing methodology of the standard workflow or omitting the DNA polymerase amplification step from the library construction process. However, we were able to associate coverage bias of the plus strand with a specific sequence motif. Additionally, we compared coverage and variant calling across technologies. The same samples were also sequenced on a MiSeq device which showed that coverage and heteroplasmic variant calling were much improved.

  19. External Quality Assessment for Detection of Fetal Trisomy 21, 18, and 13 by Massively Parallel Sequencing in Clinical Laboratories.

    PubMed

    Zhang, Rui; Zhang, Hongyun; Li, Yulong; Han, Yanxi; Xie, Jiehong; Li, Jinming

    2016-03-01

    An external quality assessment for detection of trisomy 21, 18, and 13 by massively parallel sequencing was implemented by the National Center for Clinical Laboratories of the People's Republic of China in 2014. Simulated samples were prepared by mixing fragmented abnormal DNA with plasma from non-pregnant women. The external quality assessment panel, comprising 5 samples from healthy pregnant women, 2 samples with sex chromosome aneuploidies, and 13 samples with different concentrations of fetal fractions positive for trisomy 21, 18, and 13, was then distributed to participating laboratories. In total, 55.6% (47 of 84) of respondents correctly identified each of the samples in the panel. Seventeen false-negative and 87 gray zone results were reported, most [102 of 104 (98.1%)] of which were derived from trisomy samples with effective fetal fractions <4%. No laboratories generated false-positive results. In addition, we observed varied diagnostic capabilities of different assays: the assay based on the NextSeq CN500 performed better than others, whereas Z values generated by the BGISEQ-100 fluctuated greatly. There were no significant correlations between the numbers of unique sequence reads and Z values for any trisomy sample generated by the BGISEQ-100. Overall, most clinical laboratories detected samples containing effective fetal fractions >4%. Our study shows the need for further laboratory training in the management of samples with low fetal fractions. For some assays, the precision of Z values needs to be improved.
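    For readers unfamiliar with the Z values referred to above: in outline, a trisomy call compares the fraction of reads mapping to the chromosome of interest against a euploid reference distribution. A simplified worked example in Python (all numbers invented for illustration):

      # Simplified trisomy-21 Z score: compare the fraction of reads
      # mapping to chr21 in a test sample against euploid references.
      # All values below are invented for illustration only.
      ref = [0.01310, 0.01296, 0.01305, 0.01299, 0.01302,
             0.01308, 0.01293, 0.01300, 0.01304, 0.01298]

      mean = sum(ref) / len(ref)
      sd = (sum((x - mean) ** 2 for x in ref) / (len(ref) - 1)) ** 0.5

      test_fraction = 0.01345      # hypothetical test sample
      z = (test_fraction - mean) / sd
      print(f"Z = {z:.2f}")        # Z > 3 is a common call threshold; a low
                                   # fetal fraction shrinks the shift toward
                                   # the gray zone described in the abstract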

  20. Feasibility of using the Massively Parallel Processor for large eddy simulations and other Computational Fluid Dynamics applications

    NASA Technical Reports Server (NTRS)

    Bruno, John

    1984-01-01

    The results of an investigation into the feasibility of using the MPP for direct and large eddy simulations of the Navier-Stokes equations are presented. A major part of this study was devoted to the implementation of two of the standard numerical algorithms for CFD. These implementations were not run on the Massively Parallel Processor (MPP) since the machine delivered to NASA Goddard does not have sufficient capacity. Instead, a detailed implementation plan was designed, and from this were derived estimates of the time and space requirements of the algorithms on a suitably configured MPP. In addition, other issues related to the practical implementation of these algorithms on an MPP-like architecture were considered; namely, adaptive grid generation, zonal boundary conditions, the table lookup problem, and the software interface. Performance estimates show that the architectural components of the MPP, the Staging Memory and the Array Unit, appear to be well suited to the numerical algorithms of CFD. This, combined with the prospect of building a faster and larger MPP-like machine, holds the promise of achieving the sustained gigaflop rates that are required for numerical simulations in CFD.

  1. Large-scale modeling of the Antarctic ice sheet using a massively-parallelized finite element model (CIELO).

    NASA Astrophysics Data System (ADS)

    Larour, E.; Rignot, E.; Morlighem, M.; Seroussi, H.

    2008-12-01

    We implemented a fully three-dimensional, thermo-mechanical, finite element model of the Antarctic Ice Sheet with a spatial resolution varying from 10 km inland to 2 km along the coast on a massively-parallelized architecture named CIELO, developed at JPL. The model is based on a "Pattyn" type formulation for ice sheets and a "MacAyeal shelf-stream" formulation for ice shelves. Both formulations are coupled using penalty methods, which enables a considerable reduction of the computational load. Using a simple law of basal friction (based on locally computed balanced velocities), the model is able to replicate the location and order-of-magnitude speed of major ice streams and ice shelves. We then coupled the model with observations of ice motion from SAR interferometry to refine the pattern of basal friction using an inverse control method (MacAyeal 1993). The result provides an excellent agreement with observations and a first complete mapping of the pattern of basal friction along the coast of Antarctica at a resolution compatible with the size of its glaciers and ice streams.

  2. Combined fragment molecular orbital cluster in molecule approach to massively parallel electron correlation calculations for large systems.

    PubMed

    Findlater, Alexander D; Zahariev, Federico; Gordon, Mark S

    2015-04-16

    The local correlation "cluster-in-molecule" (CIM) method is combined with the fragment molecular orbital (FMO) method, providing a flexible, massively parallel, and near-linear scaling approach to the calculation of electron correlation energies for large molecular systems. Although the computational scaling of the CIM algorithm is already formally linear, previous knowledge of the Hartree-Fock (HF) reference wave function and subsequent localized orbitals is required; therefore, extending the CIM method to arbitrarily large systems requires the aid of low-scaling/linear-scaling approaches to HF and orbital localization. Through fragmentation, the combined FMO-CIM method linearizes the scaling, with respect to system size, of the HF reference and orbital localization calculations, achieving near-linear scaling at both the reference and electron correlation levels. For the 20-residue alanine α helix, the preliminary implementation of the FMO-CIM method captures 99.6% of the MP2 correlation energy, requiring 21% of the MP2 wall time. The new method is also applied to solvated adamantine to illustrate the multilevel capability of the FMO-CIM method.

  3. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
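    The standard remedy is to subsample (rarefy) every library to a common depth before computing estimators. A minimal Python sketch, with invented OTU data, of rarefying to the smallest library and then applying the bias-corrected Chao1 estimator:

      # Rarefy each library to the size of the smallest before comparing
      # richness, so estimators are computed on equal read counts.
      import random
      from collections import Counter

      def chao1(counts):
          # Bias-corrected Chao1: S_obs + f1*(f1-1) / (2*(f2+1)).
          f1 = sum(1 for c in counts.values() if c == 1)  # singletons
          f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
          return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

      def rarefy(reads, depth, seed=0):
          # Subsample reads without replacement to a common depth.
          return Counter(random.Random(seed).sample(reads, depth))

      rng = random.Random(42)
      libraries = {  # invented OTU labels drawn from a 300-OTU pool
          "site_A": [f"otu{rng.randint(1, 300)}" for _ in range(5000)],
          "site_B": [f"otu{rng.randint(1, 300)}" for _ in range(900)],
      }
      depth = min(len(r) for r in libraries.values())
      for name, reads in libraries.items():
          print(f"{name}: Chao1 = {chao1(rarefy(reads, depth)):.1f} at depth {depth}")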

  4. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels.

    PubMed

    Choi, Jung-Woo; Liao, Xiaoping; Park, Sairom; Jeon, Heoyn-Jeong; Chung, Won-Hyong; Stothard, Paul; Park, Yeon-Soo; Lee, Jeong-Koo; Lee, Kyung-Tai; Kim, Sang-Hwan; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2013-09-01

    Since the completion of the bovine sequencing projects, a substantial number of genetic variations such as single nucleotide polymorphisms have become available across the cattle genome. Recently, cataloguing such genetic variations has been accelerated using massively parallel sequencing technology. However, most of the recent studies have been concentrated on European Bos taurus cattle breeds, resulting in a severe lack of knowledge for valuable native cattle genetic resources worldwide. Here, we present the first whole-genome sequencing results for an endangered Korean native cattle breed, Chikso, using the Illumina HiSeq 2000 sequencing platform. The genome of a Chikso bull was sequenced to approximately 25.3-fold coverage with 98.8% of the bovine reference genome sequence (UMD 3.1) covered. In total, 5,874,026 single nucleotide polymorphisms and 551,363 insertion/deletions were identified across all 29 autosomes and the X-chromosome, of which 45% and 75%, respectively, were previously unknown. Most of the variations (92.7% of single nucleotide polymorphisms and 92.9% of insertion/deletions) were located in intergenic and intron regions. A total of 16,273 single nucleotide polymorphisms causing missense mutations were detected in 7,111 genes throughout the genome, which could potentially contribute to variation in economically important traits in Chikso. This study provides a valuable resource for further investigations of the genetic mechanisms underlying traits of interest in cattle, and for the development of improved genomics-based breeding tools.

  5. Massively Parallel Sequencing of Chikso (Korean Brindle Cattle) to Discover Genome-Wide SNPs and InDels

    PubMed Central

    Choi, Jung-Woo; Liao, Xiaoping; Park, Sairom; Jeon, Heoyn-Jeong; Chung, Won-Hyong; Stothard, Paul; Park, Yeon-Soo; Lee, Jeong-Koo; Lee, Kyung-Tai; Kim, Sang-Hwan; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2013-01-01

    Since the completion of the bovine sequencing projects, a substantial number of genetic variations such as single nucleotide polymorphisms have become available across the cattle genome. Recently, cataloguing such genetic variations has been accelerated using massively parallel sequencing technology. However, most of the recent studies have been concentrated on European Bos taurus cattle breeds, resulting in a severe lack of knowledge for valuable native cattle genetic resources worldwide. Here, we present the first whole-genome sequencing results for an endangered Korean native cattle breed, Chikso, using the Illumina HiSeq 2000 sequencing platform. The genome of a Chikso bull was sequenced to approximately 25.3-fold coverage with 98.8% of the bovine reference genome sequence (UMD 3.1) covered. In total, 5,874,026 single nucleotide polymorphisms and 551,363 insertion/deletions were identified across all 29 autosomes and the X-chromosome, of which 45% and 75%, respectively, were previously unknown. Most of the variations (92.7% of single nucleotide polymorphisms and 92.9% of insertion/deletions) were located in intergenic and intron regions. A total of 16,273 single nucleotide polymorphisms causing missense mutations were detected in 7,111 genes throughout the genome, which could potentially contribute to variation in economically important traits in Chikso. This study provides a valuable resource for further investigations of the genetic mechanisms underlying traits of interest in cattle, and for the development of improved genomics-based breeding tools. PMID:23912596

  6. Non-CAR resists and advanced materials for Massively Parallel E-Beam Direct Write process integration

    NASA Astrophysics Data System (ADS)

    Pourteau, Marie-Line; Servin, Isabelle; Lepinay, Kévin; Essomba, Cyrille; Dal'Zotto, Bernard; Pradelles, Jonathan; Lattard, Ludovic; Brandt, Pieter; Wieland, Marco

    2016-03-01

    The emerging Massively Parallel Electron Beam Direct Write (MP-EBDW) is an attractive high-resolution, high-throughput lithography technology. As previously shown, Chemically Amplified Resists (CARs) meet process/integration specifications in terms of dose-to-size, resolution, contrast, and energy latitude. However, they are still limited by their line width roughness. To overcome this issue, we tested an alternative advanced non-CAR and showed that it brings a substantial gain in sensitivity compared to CARs. We also implemented and assessed in-line post-lithographic treatments for roughness mitigation. For outgassing-reduction purposes, a top-coat layer is added to the total process stack. A new-generation top-coat was tested and showed improved printing performance compared to the previous product, especially in avoiding dark erosion: SEM cross-sections showed a straight pattern profile. A spin-coatable charge dissipation layer based on conductive polyaniline was also tested for conductivity and lithographic performance, and compatibility experiments revealed that the underlying resist type has to be carefully chosen when using this product. Finally, the Process Of Reference (POR) trilayer stack defined for 5 kV multi-e-beam lithography was successfully etched with well-opened and straight patterns, and no lithography-etch bias.

  7. HLA-F coding and regulatory segments variability determined by massively parallel sequencing procedures in a Brazilian population sample.

    PubMed

    Lima, Thálitta Hetamaro Ayala; Buttura, Renato Vidal; Donadi, Eduardo Antônio; Veiga-Castelli, Luciana Caricati; Mendes-Junior, Celso Teixeira; Castelli, Erick C

    2016-10-01

    Human Leucocyte Antigen F (HLA-F) is a non-classical HLA class I gene distinguished from its classical counterparts by low allelic polymorphism and distinctive expression patterns. Its exact function remains unknown. It is believed that HLA-F has tolerogenic and immune modulatory properties. Currently, there is little information regarding HLA-F allelic variation among human populations, and the available studies have evaluated only a fraction of the HLA-F gene segment and/or have searched for known alleles only. Here we present a strategy to evaluate the complete HLA-F variability, including its 5' upstream, coding and 3' downstream segments, by using massively parallel sequencing procedures. HLA-F variability was surveyed in 196 individuals from the Brazilian Southeast. The results indicate that the HLA-F gene is indeed conserved at the protein level: thirty coding haplotypes or coding alleles were detected, encoding only four different HLA-F full-length protein molecules. Moreover, the same protein molecule is encoded by 82.45% of all coding alleles detected in this Brazilian population sample. However, HLA-F nucleotide and haplotype variability is much higher than previously reported, both in Brazilians and in the 1000 Genomes Project data. This protein conservation is probably a consequence of the key role of HLA-F in immune system physiology.

  8. Probing the kinetic landscape of Hox transcription factor-DNA binding in live cells by massively parallel Fluorescence Correlation Spectroscopy.

    PubMed

    Papadopoulos, Dimitrios K; Krmpot, Aleksandar J; Nikolić, Stanko N; Krautz, Robert; Terenius, Lars; Tomancak, Pavel; Rigler, Rudolf; Gehring, Walter J; Vukojević, Vladana

    2015-11-01

    Hox genes encode transcription factors that control the formation of body structures, segment-specifically along the anterior-posterior axis of metazoans. Hox transcription factors bind nuclear DNA pervasively and regulate a plethora of target genes, deploying various molecular mechanisms that depend on the developmental and cellular context. To analyze quantitatively the dynamics of their DNA-binding behavior we have used confocal laser scanning microscopy (CLSM), single-point fluorescence correlation spectroscopy (FCS), fluorescence cross-correlation spectroscopy (FCCS) and bimolecular fluorescence complementation (BiFC). We show that the Hox transcription factor Sex combs reduced (Scr) forms dimers that strongly associate with its specific fork head binding site (fkh250) in live salivary gland cell nuclei. In contrast, dimers of a constitutively inactive, phospho-mimicking variant of Scr show weak, non-specific DNA-binding. Our studies reveal that nuclear dynamics of Scr is complex, exhibiting a changing landscape of interactions that is difficult to characterize by probing one point at a time. Therefore, we also provide mechanistic evidence using massively parallel FCS (mpFCS). We found that Scr dimers are predominantly formed on the DNA and are equally abundant at the chromosomes and an introduced multimeric fkh250 binding-site, indicating different mobilities, presumably reflecting transient binding with different affinities on the DNA. Our proof-of-principle results emphasize the advantages of mpFCS for quantitative characterization of fast dynamic processes in live cells.

  9. A massively parallel track-finding system for the LEVEL 2 trigger in the CLAS detector at CEBAF

    SciTech Connect

    Doughty, D.C. Jr.; Collins, P.; Lemon, S.; Bonneau, P.

    1994-02-01

    The track segment finding subsystem of the LEVEL 2 trigger in the CLAS detector has been designed and prototyped. Track segments will be found in the 35,076 wires of the drift chambers using a massively parallel array of 768 Xilinx XC-4005 FPGAs. These FPGAs are located on daughter cards attached to the front-end boards distributed around the detector. Each chip is responsible for finding tracks passing through a 4 × 6 slice of an axial superlayer, and reports two segment-found bits, one for each pair of cells. The algorithm used finds segments even when one or two layers or cells along the track are missing (this number is programmable), while being highly resistant to false segments arising from noise hits. Adjacent chips share data to find tracks crossing cell and board boundaries. For maximum speed, fully combinatorial logic is used inside each chip, with the result that all segments in the detector are found within 150 ns. Segment collection boards gather track segments from each axial superlayer and pass them via a high speed link to the segment linking subsystem in an additional 400 ns for typical events. The Xilinx chips are RAM-based and therefore reprogrammable, allowing for future upgrades and algorithm enhancements.
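    In software terms, the segment criterion amounts to matching hits against road templates while tolerating a programmable number of missing layers. The toy Python sketch below illustrates the logic only; the road shapes are invented, and the trigger itself evaluates all roads combinatorially in the FPGAs:

      # Toy analogue of the combinatorial segment finder: a segment fires
      # when hits in a 6-layer road match a template in all but at most
      # `max_missing` layers. Road shapes here are invented examples.
      ROADS = [
          (0, 0, 0, 0, 0, 0),   # straight track: same cell in every layer
          (0, 0, 1, 1, 2, 2),   # inclined track drifting across cells
      ]

      def segment_found(hits, cell, max_missing=2):
          """hits[layer] is the set of struck cells in that layer."""
          for road in ROADS:
              missing = sum(1 for layer, offset in enumerate(road)
                            if (cell + offset) not in hits[layer])
              if missing <= max_missing:
                  return True
          return False

      hits = [{5}, {5}, {6}, set(), {7}, {7}]  # one layer empty (inefficiency)
      print(segment_found(hits, cell=5))       # True: tolerant road match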

  10. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    NASA Astrophysics Data System (ADS)

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-01

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated to domain poling of ferroelectric materials.

  11. Massively parallel molecular-dynamics simulation of ice crystallisation and melting: the roles of system size, ensemble, and electrostatics.

    PubMed

    English, Niall J

    2014-12-21

    Ice crystallisation and melting was studied via massively parallel molecular dynamics under periodic boundary conditions, using approximately spherical ice nano-particles (both "isolated" and as a series of heterogeneous "seeds") of varying size, surrounded by liquid water and at a variety of temperatures. These studies were performed for a series of systems ranging in size from ∼1 × 10⁶ to 8.6 × 10⁶ molecules, in order to establish system-size effects upon the nano-clusters' crystallisation and dissociation kinetics. Both "traditional" four-site and "single-site" water models were used, with and without formal point charges, dipoles, and electrostatics, respectively. Simulations were carried out in the microcanonical and isothermal-isobaric ensembles, to assess the influence of "artificial" thermo- and baro-statting, and important disparities were observed, which declined upon using larger systems. It was found that there was a dependence upon system size for both ice growth and dissociation, in that larger systems favoured slower growth and more rapid melting, given the lower extent of "communication" of ice nano-crystallites with their periodic replicas in neighbouring boxes. Although the single-site model exhibited less variation with system size vis-à-vis the multiple-site representation with explicit electrostatics, its crystallisation-dissociation kinetics was artificially fast.

  12. Massively parallel molecular-dynamics simulation of ice crystallisation and melting: The roles of system size, ensemble, and electrostatics

    NASA Astrophysics Data System (ADS)

    English, Niall J.

    2014-12-01

    Ice crystallisation and melting was studied via massively parallel molecular dynamics under periodic boundary conditions, using approximately spherical ice nano-particles (both "isolated" and as a series of heterogeneous "seeds") of varying size, surrounded by liquid water and at a variety of temperatures. These studies were performed for a series of systems ranging in size from ˜1 × 10⁶ to 8.6 × 10⁶ molecules, in order to establish system-size effects upon the nano-clusters' crystallisation and dissociation kinetics. Both "traditional" four-site and "single-site" water models were used, with and without formal point charges, dipoles, and electrostatics, respectively. Simulations were carried out in the microcanonical and isothermal-isobaric ensembles, to assess the influence of "artificial" thermo- and baro-statting, and important disparities were observed, which declined upon using larger systems. It was found that there was a dependence upon system size for both ice growth and dissociation, in that larger systems favoured slower growth and more rapid melting, given the lower extent of "communication" of ice nano-crystallites with their periodic replicas in neighbouring boxes. Although the single-site model exhibited less variation with system size vis-à-vis the multiple-site representation with explicit electrostatics, its crystallisation-dissociation kinetics was artificially fast.

  13. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    SciTech Connect

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-15

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated to domain poling of ferroelectric materials.

  14. Enabling inspection solutions for future mask technologies through the development of massively parallel E-Beam inspection

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Jindal, Vibhu; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-09-01

    The new device architectures and materials being introduced for sub-10nm manufacturing, combined with the complexity of multiple patterning and the need for improved hotspot detection strategies, have pushed current wafer inspection technologies to their limits. In parallel, gaps in mask inspection capability are growing as new generations of mask technologies are developed to support these sub-10nm wafer manufacturing requirements. In particular, the challenges associated with nanoimprint and extreme ultraviolet (EUV) mask inspection require new strategies that enable fast inspection at high sensitivity. The tradeoffs between sensitivity and throughput for optical and e-beam inspection are well understood. Optical inspection offers the highest throughput and is the current workhorse of the industry for both wafer and mask inspection. E-beam inspection offers the highest sensitivity but has historically lacked the throughput required for widespread adoption in the manufacturing environment. It is unlikely that continued incremental improvements to either technology will meet tomorrow's requirements, and therefore a new inspection technology approach is required; one that combines the high-throughput performance of optical with the high-sensitivity capabilities of e-beam inspection. To support the industry in meeting these challenges, SUNY Poly SEMATECH has evaluated disruptive technologies that can meet the requirements for high volume manufacturing (HVM), for both the wafer fab [1] and the mask shop. High-speed massively parallel e-beam defect inspection has been identified as the leading candidate for addressing the key gaps limiting today's patterned defect inspection techniques. As of late 2014 SUNY Poly SEMATECH completed a review, system analysis, and proof of concept evaluation of multiple e-beam technologies for defect inspection. A champion approach has been identified based on a multibeam technology from Carl Zeiss. This paper includes a discussion on the

  15. Detection of CYP2C19 Genetic Variants in Malaysian Orang Asli from Massively Parallel Sequencing Data

    PubMed Central

    Ang, Geik Yong; Yu, Choo Yee; Subramaniam, Vinothini; Abdul Khalid, Mohd Ikhmal Hanif; Tuan Abdu Aziz, Tuan Azlin; Johari James, Richard; Ahmad, Aminuddin; Abdul Rahman, Thuhairah; Mohd Nor, Fadzilah; Ismail, Adzrool Idzwan; Md. Isa, Kamarudzaman; Salleh, Hood; Teh, Lay Kek; Salleh, Mohd Zaki

    2016-01-01

    The human cytochrome P450 (CYP) is a superfamily of enzymes that have been a focus of research for decades due to their prominent role in drug metabolism. CYP2C is one of the major subfamilies, metabolizing more than 10% of all clinically used drugs. In the context of CYP2C19, several key genetic variations that alter the enzyme’s activity have been identified and catalogued in the CYP allele nomenclature database. In this study, we investigated the presence of well-established variants as well as novel polymorphisms in the CYP2C19 gene of 62 Orang Asli from Peninsular Malaysia. A total of 449 genetic variants were detected, including 70 novel polymorphisms; 417 SNPs were located in introns, 23 in upstream regions, 7 in exons, and 2 in downstream regions. Five alleles and seven genotypes were inferred based on the polymorphisms that were found. Null alleles that were observed include CYP2C19*3 (6.5%), *2 (5.7%) and *35 (2.4%), whereas the increased-function allele *17 was detected at a frequency of 4.8%. The normal metabolizer genotype was the most predominant (66.1%), followed by the intermediate metabolizer (19.4%), rapid metabolizer (9.7%) and poor metabolizer (4.8%) genotypes. Findings from this study provide further insights into the CYP2C19 genetic profile of the Orang Asli, as previously unreported variant alleles were detected through the use of a massively parallel sequencing platform. The systematic and comprehensive analysis of CYP2C19 will allow uncharacterized variants that are present in the Orang Asli to be included in genotyping panels in the future. PMID:27798644
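    The metabolizer categories reported above follow from a conventional mapping of allele function onto diplotypes. A simplified Python illustration of that inference (the allele-function assignments follow the usual convention for *2, *3, *35 and *17, but the condensed mapping below is illustrative and is not the study's pipeline):

      # Simplified CYP2C19 diplotype -> predicted metabolizer phenotype;
      # illustrative only, not the study's genotype-calling pipeline.
      FUNCTION = {"*1": "normal", "*2": "none", "*3": "none",
                  "*17": "increased", "*35": "none"}

      def metabolizer(a1, a2):
          funcs = {FUNCTION[a1], FUNCTION[a2]}
          if funcs == {"none"}:
              return "poor"             # two no-function alleles
          if "none" in funcs:
              return "intermediate"     # one no-function allele
          if "increased" in funcs:
              return "rapid"            # e.g. *1/*17
          return "normal"

      for d in [("*1", "*1"), ("*1", "*2"), ("*2", "*3"), ("*1", "*17")]:
          print("/".join(d), "->", metabolizer(*d))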

  16. My-Forensic-Loci-queries (MyFLq) framework for analysis of forensic STR data generated by massive parallel sequencing.

    PubMed

    Van Neste, Christophe; Vandewoestyne, Mado; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2014-03-01

    Forensic scientists are currently investigating how to transition from capillary electrophoresis (CE) to massive parallel sequencing (MPS) for analysis of forensic DNA profiles. MPS offers several advantages over CE, such as virtually unlimited multiplexing of loci, combining both short tandem repeat (STR) and single nucleotide polymorphism (SNP) loci, small amplicons without constraints of size separation, more discrimination power, deep mixture resolution, and sample multiplexing. We present our bioinformatic framework My-Forensic-Loci-queries (MyFLq) for analysis of MPS forensic data. For allele calling, the framework uses a MySQL reference allele database with regions of interest (ROIs) determined automatically by a generic maximal flanking algorithm, which makes it possible to use any STR or SNP forensic locus. Python scripts were designed to automatically make allele calls starting from raw MPS data. We also present a method to assess the usefulness and overall performance of a forensic locus with respect to MPS, as well as methods to estimate whether an unknown allele, whose sequence is not present in the MySQL database, is in fact a new allele or a sequencing error. The MyFLq framework was applied to an Illumina MiSeq dataset of a forensic Illumina amplicon library, generated from multilocus STR polymerase chain reaction (PCR) on both single-contributor samples and multiple-person DNA mixtures. Although the multilocus PCR was not yet optimized for MPS in terms of amplicon length or locus selection, the results were excellent for most loci, showing a high signal-to-noise ratio, correct allele calls, and a low limit of detection for minor DNA contributors in mixed DNA samples. Technically, forensic MPS affords great promise for routine implementation in forensic genomics. The method is also applicable to adjacent disciplines, such as molecular autopsy in legal medicine and mitochondrial DNA research.
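    The flanking-region idea behind the allele-calling step can be sketched compactly: find the locus's conserved flanks in a read and report the region of interest between them. The Python toy below uses invented flank sequences and omits the database-driven logic of the real framework:

      # Toy version of flank-based allele calling: locate a locus's
      # conserved flanks in a read and report the region of interest
      # (ROI) between them. Flank sequences are invented examples.
      LOCI = {
          "locus1": ("ACGTAC", "TTGCAA"),   # (left flank, right flank)
      }

      def call_allele(read, locus):
          left, right = LOCI[locus]
          i = read.find(left)
          j = read.find(right, i + len(left)) if i >= 0 else -1
          if i < 0 or j < 0:
              return None                    # flank not found: no call
          roi = read[i + len(left):j]
          return roi, len(roi)               # sequence- and length-based call

      read = "GGACGTACAGATAGATAGATTTGCAAGG"
      print(call_allele(read, "locus1"))     # ('AGATAGATAGAT', 12)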

  17. Assessing the clinical value of targeted massively parallel sequencing in a longitudinal, prospective population-based study of cancer patients

    PubMed Central

    Wong, S Q; Fellowes, A; Doig, K; Ellul, J; Bosma, T J; Irwin, D; Vedururu, R; Tan, A Y-C; Weiss, J; Chan, K S; Lucas, M; Thomas, D M; Dobrovic, A; Parisot, J P; Fox, S B

    2015-01-01

    Introduction: Recent discoveries in cancer research have revealed a plethora of clinically actionable mutations that provide therapeutic, prognostic and predictive benefit to patients. The feasibility of screening mutations as part of the routine clinical care of patients remains relatively unexplored, as demonstration of massively parallel sequencing (MPS) of tumours in the general population is required to assess its value to the health-care system. Methods: The Cancer 2015 study is a large-scale, prospective, multisite cohort of newly diagnosed cancer patients from Victoria, Australia, with 1094 patients recruited. MPS was performed using the Illumina TruSeq Amplicon Cancer Panel. Results: Overall, 854 patients were successfully sequenced for 48 common cancer genes. Accurate determination of clinically relevant mutations was possible, including in less characterised cancer types; however, technical limitations, including formalin-induced sequencing artefacts, were uncovered. Applying strict filtering criteria, clinically relevant mutations were identified in 63% of patients, with 26% of patients displaying a mutation with therapeutic implications. A subset of patients was validated for canonical mutations using the Agena Bioscience MassARRAY system, with 100% concordance. Whereas the prevalence of mutations was consistent with other institutionally based series for some tumour streams (breast carcinoma and colorectal adenocarcinoma), others were different (lung adenocarcinoma and head and neck squamous cell carcinoma), which has significant implications for health economic modelling of particular targeted agents. Actionable mutations in tumours not usually thought to harbour such genetic changes were also identified. Conclusions: Reliable delivery of a diagnostic assay able to screen for a range of actionable mutations in this cohort was achieved, opening unexpected avenues for investigation and treatment of cancer patients. PMID:25742471

  18. Evaluation of cells and biological reagents for adventitious agents using degenerate primer PCR and massively parallel sequencing.

    PubMed

    McClenahan, Shasta D; Uhlenhaut, Christine; Krause, Philip R

    2014-12-12

    We employed a massively parallel sequencing (MPS)-based approach to test reagents and model cell substrates including Chinese hamster ovary (CHO), Madin-Darby canine kidney (MDCK), African green monkey kidney (Vero), and High Five insect cell lines for adventitious agents. RNA and DNA were extracted either directly from the samples or from viral capsid-enriched preparations, and then subjected to MPS-based non-specific virus detection with degenerate oligonucleotide primer (DOP) PCR. MPS by 454, Illumina MiSeq, and Illumina HiSeq was compared on independent samples. Virus detection using these methods was reproducibly achieved. Unclassified sequences from CHO cells represented cellular sequences not yet submitted to the databases typically used for sequence identification. The sensitivity of MPS-based virus detection was consistent with theoretically expected limits based on dilution of virus in cellular nucleic acids. Capsid preparation increased the number of viral sequences detected. Potential viral sequences were detected in several samples; in each case, these sequences were either artifactual or (based on additional studies) shown not to be associated with replication-competent viruses. Virus-like sequences were more likely to be identified in BLAST searches using virus-specific databases that did not contain cellular sequences. Detected viral sequences included previously described retrovirus and retrovirus-like sequences in CHO, Vero, MDCK and High Five cells, and nodavirus and endogenous bracovirus sequences in High Five insect cells. Bovine viral diarrhea virus, bovine hokovirus, and porcine circovirus sequences were detected in some reagents. A recently described parvo-like virus present in some nucleic acid extraction resins was also identified in cells and extraction controls from some samples. The present study helps to illustrate the potential for MPS-based strategies in evaluating the presence of viral nucleic acids in various sample types

  19. Monte Carlo standardless approach for laser induced breakdown spectroscopy based on massive parallel graphic processing unit computing

    NASA Astrophysics Data System (ADS)

    Demidov, A.; Eschlböck-Fuchs, S.; Kazakov, A. Ya.; Gornushkin, I. B.; Kolmhofer, P. J.; Pedarnig, J. D.; Huber, N.; Heitz, J.; Schmid, T.; Rössler, R.; Panne, U.

    2016-11-01

    The improved Monte Carlo (MC) method for standardless analysis in laser-induced breakdown spectroscopy (LIBS) is presented. Concentrations in MC LIBS are found by fitting model-generated synthetic spectra to experimental spectra. The current version of MC LIBS is based on graphics processing unit (GPU) computation and reduces the analysis time to several seconds per spectrum/sample. The previous version of MC LIBS, which was based on central processing unit (CPU) computation, required unacceptably long analysis times of tens of minutes per spectrum/sample. The reduction of the computational time is achieved through massively parallel computing on the GPU, which embeds thousands of co-processors. It is shown that the number of iterations on the GPU exceeds that on the CPU by a factor > 1000 for the 5-dimensional parameter space, and yet requires a > 10-fold shorter computational time. The improved GPU-MC LIBS outperforms the CPU-MC LIBS in terms of accuracy, precision, and analysis time. The performance is tested on LIBS spectra obtained from pelletized powders of metal oxides consisting of CaO, Fe2O3, MgO, and TiO2 that simulate by-products of the steel industry, steel slags. It is demonstrated that GPU-based MC LIBS is capable of rapid multi-element analysis with relative errors between one and a few tens of percent, which is sufficient for industrial applications (e.g., steel slag analysis). The results of the improved GPU-based MC LIBS compare favourably with those of the CPU-based MC LIBS as well as with the results of standard calibration-free (CF) LIBS based on the Boltzmann plot method.
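
    The evaluate-many-candidates-in-parallel structure described above is easy to illustrate. The following is a minimal, hypothetical Python sketch in which NumPy broadcasting stands in for the GPU's thousands of co-processors; the three-parameter Gaussian "forward model" is an illustrative stand-in for a real synthetic plasma spectrum calculation, and all names and parameter ranges are assumptions.

      import numpy as np

      def synthetic_spectra(params, wavelengths):
          # Toy forward model: each candidate parameter set yields one Gaussian
          # emission line. A real MC LIBS model computes a full plasma spectrum.
          center, width, amplitude = params[:, 0:1], params[:, 1:2], params[:, 2:3]
          return amplitude * np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

      def mc_fit(experimental, wavelengths, n_candidates=100000, seed=0):
          # Draw many random parameter sets, evaluate all synthetic spectra at
          # once (the step that benefits from GPU parallelism), and keep the
          # candidate that minimizes the misfit to the experimental spectrum.
          rng = np.random.default_rng(seed)
          params = rng.uniform([400.0, 0.1, 0.5], [600.0, 5.0, 2.0],
                               size=(n_candidates, 3))
          spectra = synthetic_spectra(params, wavelengths)        # (n_candidates, n_points)
          misfit = np.sum((spectra - experimental) ** 2, axis=1)  # one residual each
          return params[np.argmin(misfit)]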

  20. Genome Evolution and Meiotic Maps by Massively Parallel DNA Sequencing: Spotted Gar, an Outgroup for the Teleost Genome Duplication

    PubMed Central

    Amores, Angel; Catchen, Julian; Ferrara, Allyse; Fontenot, Quenton; Postlethwait, John H.

    2011-01-01

    Genomic resources for hundreds of species of evolutionary, agricultural, economic, and medical importance are unavailable due to the expense of well-assembled genome sequences and difficulties with multigenerational studies. Teleost fish provide many models for human disease but possess anciently duplicated genomes that sometimes obfuscate connectivity. Genomic information representing a fish lineage that diverged before the teleost genome duplication (TGD) would provide an outgroup for exploring the mechanisms of evolution after whole-genome duplication. We exploited massively parallel DNA sequencing to develop meiotic maps with thrift and speed by genotyping F1 offspring of a single female and a single male spotted gar (Lepisosteus oculatus) collected directly from nature utilizing only polymorphisms existing in these two wild individuals. Using Stacks, software that automates the calling of genotypes from polymorphisms assayed by Illumina sequencing, we constructed a map containing 8406 markers. RNA-seq on two map-cross larvae provided a reference transcriptome that identified nearly 1000 mapped protein-coding markers and allowed genome-wide analysis of conserved synteny. Results showed that the gar lineage diverged from teleosts before the TGD and its genome is organized more similarly to that of humans than teleosts. Thus, spotted gar provides a critical link between medical models in teleost fish, to which gar is biologically similar, and humans, to which gar is genomically similar. Application of our F1 dense mapping strategy to species with no prior genome information promises to facilitate comparative genomics and provide a scaffold for ordering the numerous contigs arising from next generation genome sequencing. PMID:21828280

  1. Three-dimensional gyrokinetic particle-in-cell simulation of plasmas on a massively parallel computer: Final report on LDRD Core Competency Project, FY 1991--FY 1993

    SciTech Connect

    Byers, J.A.; Williams, T.J.; Cohen, B.I.; Dimits, A.M.

    1994-04-27

    One of the programs of the Magnetic Fusion Energy (MFE) Theory and Computations Program is studying the anomalous transport of thermal energy across the field lines in the core of a tokamak. We use the method of gyrokinetic particle-in-cell simulation in this study. For this LDRD project we employed massively parallel processing, new algorithms, and new formal techniques to improve this research. Specifically, we sought to take steps toward: researching experimentally relevant parameters in our simulations, learning parallel computing to have as a resource for our group, and achieving a 100x speedup over our starting-point Cray-2 simulation code's performance.

  2. Massively Parallel Rogue Cell Detection Using Serial Time-Encoded Amplified Microscopy of Inertially Ordered Cells in High-Throughput Flow

    DTIC Science & Technology

    2012-08-01

    Grant number: W81XWH-10-1-0519. The report describes massively parallel rogue cell detection using serial time-encoded amplified microscopy of inertially ordered cells in high-throughput flow, demonstrated for blur-free imaging of breast cancer cells and blood cells in flow with and without microparticle labels targeting EpCAM on the cell surfaces.

  3. Relationship Between Faults Oriented Parallel and Oblique to Bedding in Neogene Massive Siliceous Mudstones at The Horonobe Underground Research Laboratory, Japan

    NASA Astrophysics Data System (ADS)

    Hayano, Akira; Ishii, Eiichi

    2016-10-01

    This study investigates the mechanical relationship between bedding-parallel and bedding-oblique faults in a Neogene massive siliceous mudstone at the site of the Horonobe Underground Research Laboratory (URL) in Hokkaido, Japan, on the basis of observations of drill-core recovered from pilot boreholes and fracture mapping on shaft and gallery walls. Four bedding-parallel faults with visible fault gouge, named respectively the MM Fault, the Last MM Fault, the S1 Fault, and the S2 Fault (stratigraphically, from the highest to the lowest), were observed in two pilot boreholes (PB-V01 and SAB-1). The distribution of the bedding-parallel faults at 350 m depth in the Horonobe URL indicates that these faults extend at least several tens of meters along a bedding plane. The observation that a bedding-oblique fault displaces the Last MM Fault is consistent with the previous interpretation that the bedding-oblique faults formed after the bedding-parallel faults. In addition, the bedding-oblique faults terminate near the MM and S1 faults, indicating that the bedding-parallel faults with visible fault gouge act to terminate the propagation of younger bedding-oblique faults. In particular, the MM and S1 faults, which have a relatively thick fault gouge, appear to have had a stronger control on the propagation of bedding-oblique faults than did the Last MM Fault, which has a relatively thin fault gouge.

  4. Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

    1999-01-01

    A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal-to-Interannual Prediction Project (NSIPP) quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
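
    For readers unfamiliar with the EnKF analysis step, a minimal sketch follows. It is a generic textbook formulation in Python/NumPy, not the NSIPP code, and all array names are hypothetical; in a parallel implementation the columns of the ensemble matrix would be integrated concurrently and the update partitioned over regions of physical space, as described above.

      import numpy as np

      def enkf_analysis(ensemble, observations, H, obs_error_var, seed=0):
          # `ensemble` is (n_state, n_members): each column is one concurrently
          # integrated model instance. `H` (n_obs, n_state) maps state to
          # observation space; observation errors are assumed uncorrelated.
          n_obs, n_members = len(observations), ensemble.shape[1]
          anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
          P_HT = anomalies @ (H @ anomalies).T / (n_members - 1)  # Pf H^T (sample estimate)
          S = H @ P_HT + obs_error_var * np.eye(n_obs)            # innovation covariance
          K = P_HT @ np.linalg.inv(S)                             # Kalman gain
          # Perturb the observations so the analysis ensemble keeps proper spread.
          rng = np.random.default_rng(seed)
          perturbed = observations[:, None] + rng.normal(
              0.0, np.sqrt(obs_error_var), (n_obs, n_members))
          return ensemble + K @ (perturbed - H @ ensemble)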

  5. Complete data preparation flow for Massively Parallel E-Beam lithography on 28nm node full-field design

    NASA Astrophysics Data System (ADS)

    Fay, Aurélien; Browning, Clyde; Brandt, Pieter; Chartoire, Jacky; Bérard-Bergery, Sébastien; Hazart, Jérôme; Chagoya, Alexandre; Postnikov, Sergei; Saib, Mohamed; Lattard, Ludovic; Schavione, Patrick

    2016-03-01

    Massively parallel mask-less electron beam lithography (MP-EBL) offers a large intrinsic flexibility at a low cost of ownership in comparison to conventional optical lithography tools. This attractive direct-write technique needs a dedicated data preparation flow to correct both electronic and resist processes. Moreover, data preparation has to be completed in a short enough time to preserve the flexibility advantage of MP-EBL. While MP-EBL tools have currently entered an advanced stage of development, this paper focuses on the data preparation side of the work, specifically for the MAPPER Lithography FLX-1200 tool [1]-[4], using the ASELTA Nanographics Inscale software. The complete flow, as well as the methodology used to achieve full-field layout data preparation within an acceptable cycle time, will be presented. The layout used for data preparation evaluation was a 28 nm technology node Metal1 chip with a field size of 26x33 mm2, compatible with typical stepper/scanner field sizes and wafer stepping plans. Proximity Effect Correction (PEC) was applied to the entire field, which was then exported as a single file in MAPPER Lithography's machine format, containing fractured shapes and dose assignments. The soft-edge beam-to-beam stitching method was also employed in the specific overlap regions defined by the machine format. In addition to PEC, verification of the correction was included as part of the overall data preparation cycle time. This verification step was executed on the machine file format to ensure pattern fidelity and accuracy as late in the flow as possible. Verification over the full chip, involving billions of evaluation points, is performed both at nominal conditions and at process-window corners in order to ensure proper exposure and process latitude. The complete MP-EBL data preparation flow was demonstrated for a 28 nm node Metal1 layout in 37 hours. The final verification step shows that the Edge Placement Error (EPE) is kept below 2.25 nm.

  6. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the resolution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the resolution of a large sparse system of linear equations which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, this latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.

  7. MPI/OpenMP hybrid parallel algorithm for resolution of identity second-order Møller-Plesset perturbation calculation of analytical energy gradient for massively parallel multicore supercomputers.

    PubMed

    Katouda, Michio; Nakajima, Takahito

    2017-03-30

    A massively parallel algorithm for analytical energy gradient calculations based on the resolution-of-identity second-order Møller-Plesset perturbation (RI-MP2) method from the restricted Hartree-Fock reference is presented for geometry optimization and one-electron property calculations of large molecules. This algorithm is designed for massively parallel computation on multicore supercomputers using the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) hybrid parallel programming model. In this algorithm, a two-dimensional hierarchical MP2 parallelization scheme is applied, using a huge number of MPI processes (more than 1000) to accelerate the computationally demanding O(N^5) steps such as the calculation of the occupied-occupied and virtual-virtual blocks of the MP2 one-particle density matrix and the MP2 two-particle density matrices. The performance of the new parallel algorithm is assessed using test calculations of several large molecules such as the buckycatcher C60@C60H28 (144 atoms; 1820 atomic orbitals (AOs) for the def2-SVP basis set and 3888 AOs for def2-TZVP), a nanographene dimer (C96H24)2 (240 atoms; 2928 AOs for def2-SVP and 6432 AOs for cc-pVTZ), and the trp-cage protein 1L2Y (304 atoms and 2906 AOs for def2-SVP) using up to 32,768 nodes and 262,144 central processing unit (CPU) cores of the K computer. The results of geometry optimization calculations of the trp-cage protein 1L2Y at the RI-MP2/def2-SVP level using 3072 nodes and 24,576 cores of the K computer are presented and discussed to assess the efficiency of the proposed algorithm. © 2017 Wiley Periodicals, Inc.
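
    The two-level hierarchical parallelism described above can be sketched in a few lines. The Python analogue below uses mpi4py for the outer (MPI) level and a thread pool standing in for the inner OpenMP threads; the work items and the `block_contribution` function are hypothetical stand-ins for the density-matrix blocks of the O(N^5) step.

      from concurrent.futures import ThreadPoolExecutor
      from mpi4py import MPI

      def block_contribution(block_id):
          # Hypothetical stand-in for one occupied-occupied or
          # virtual-virtual density-matrix block.
          return float(block_id) ** 0.5

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      blocks = range(10000)
      my_blocks = [b for b in blocks if b % size == rank]   # outer (MPI) level

      with ThreadPoolExecutor(max_workers=8) as pool:       # inner (thread) level
          local_sum = sum(pool.map(block_contribution, my_blocks))

      total = comm.reduce(local_sum, op=MPI.SUM, root=0)    # combine partial sums
      if rank == 0:
          print("total =", total)

    Run under an MPI launcher, e.g. "mpiexec -n 16 python script.py", each process takes every 16th block and fans it out to its local threads.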

  8. Real-time parallel implementation of Pulse-Doppler radar signal processing chain on a massively parallel machine based on multi-core DSP and Serial RapidIO interconnect

    NASA Astrophysics Data System (ADS)

    Klilou, Abdessamad; Belkouch, Said; Elleaume, Philippe; Le Gall, Philippe; Bourzeix, François; Hassani, Moha M'Rabet

    2014-12-01

    Pulse-Doppler radars require high computing power. A massively parallel machine is developed in this paper to implement a Pulse-Doppler radar signal processing chain in real time. The proposed machine consists of two C6678 digital signal processors (DSPs), each with eight DSP cores, interconnected by a Serial RapidIO (SRIO) bus. In this study, each individual core is considered as the basic processing element; hence, the proposed parallel machine contains 16 processing elements. A straightforward model has been adopted to distribute the Pulse-Doppler radar signal processing chain. This model provides low latency, but communication inefficiency limits system performance. This paper proposes several optimizations that greatly reduce the inter-processor communication of the straightforward model and improve the parallel efficiency of the system. A use case of the Pulse-Doppler radar signal processing chain is used to illustrate and validate the concept of the proposed mapping model. Experimental results show that the parallel efficiency of the proposed parallel machine is about 90%.

  9. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by routing through transporter nodes

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a destination. Some packets are constrained to be routed through respective designated transporter nodes, the automated routing strategy determining a path from a respective source node to a respective transporter node, and from a respective transporter node to a respective destination node. Preferably, the source node chooses a routing policy from among multiple possible choices, and that policy is followed by all intermediate nodes. The use of transporter nodes allows greater flexibility in routing.

  10. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by employing bandwidth shells at areas of overutilization

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-04-27

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a final destination. The default routing strategy is altered responsive to detection of overutilization of a particular path of one or more links, and at least some traffic is re-routed by distributing the traffic among multiple paths (which may include the default path). An alternative path may require a greater number of link traversals to reach the destination node.

  11. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamic global mapping of contended links

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2011-10-04

    A massively parallel nodal computer system periodically collects and broadcasts usage data for an internal communications network. A node sending data over the network makes a global routing determination using the network usage data. Preferably, network usage data comprises an N-bit usage value for each output buffer associated with a network link. An optimum routing is determined by summing the N-bit values associated with each link through which a data packet must pass, and comparing the sums associated with different possible routes.
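
    The route-selection rule in this record reduces to summing the broadcast N-bit usage values along each candidate route and taking the minimum. A minimal Python sketch under that reading, with hypothetical data structures:

      def route_cost(route, usage):
          # `route` is a sequence of links; `usage` maps a link to the
          # broadcast N-bit usage value of its output buffer.
          return sum(usage[link] for link in route)

      def choose_route(candidate_routes, usage):
          return min(candidate_routes, key=lambda route: route_cost(route, usage))

      # Example: two possible routes from node A to node D.
      usage = {("A", "B"): 3, ("B", "D"): 7, ("A", "C"): 2, ("C", "D"): 1}
      routes = [[("A", "B"), ("B", "D")], [("A", "C"), ("C", "D")]]
      print(choose_route(routes, usage))   # A-C-D (cost 3) beats A-B-D (cost 10)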

  12. Method and apparatus for analyzing error conditions in a massively parallel computer system by identifying anomalous nodes within a communicator set

    DOEpatents

    Gooding, Thomas Michael

    2011-04-19

    An analytical mechanism for a massively parallel computer system automatically analyzes data retrieved from the system, and identifies nodes which exhibit anomalous behavior in comparison to their immediate neighbors. Preferably, anomalous behavior is determined by comparing call-return stack tracebacks for each node, grouping like nodes together, and identifying neighboring nodes which do not themselves belong to the group. A node, not itself in the group, having a large number of neighbors in the group, is a likely locality of error. The analyzer preferably presents this information to the user by sorting the neighbors according to number of adjoining members of the group.
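
    The grouping heuristic lends itself to a compact illustration. The sketch below is a hedged reading of the record: nodes sharing the dominant call-return stack traceback form the group, and outsiders are ranked by their number of adjoining group members; all data structures are hypothetical.

      from collections import Counter

      def likely_error_nodes(tracebacks, neighbors):
          # `tracebacks` maps node -> call-return stack traceback (a string);
          # `neighbors` maps node -> adjacent nodes in the communicator set.
          majority, _ = Counter(tracebacks.values()).most_common(1)[0]
          group = {n for n, tb in tracebacks.items() if tb == majority}
          outsiders = [n for n in tracebacks if n not in group]
          # Rank outsiders by how many of their neighbors sit inside the
          # group: the top entries are the likely localities of error.
          return sorted(outsiders,
                        key=lambda n: sum(m in group for m in neighbors[n]),
                        reverse=True)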

  13. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-03-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.

  14. Development of a MEMS electrostatic condenser lens array for nc-Si surface electron emitters of the Massive Parallel Electron Beam Direct-Write system

    NASA Astrophysics Data System (ADS)

    Kojima, A.; Ikegami, N.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Yoshida, S.; Totsu, K.; Koshida, N.; Esashi, M.

    2016-03-01

    The development of a Micro Electro-Mechanical System (MEMS) electrostatic Condenser Lens Array (CLA) for a Massively Parallel Electron Beam Direct Write (MPEBDW) lithography system is described. The CLA converges parallel electron beams for fine patterning. The structure of the CLA was designed on the basis of finite element method (FEM) simulations. The lens was fabricated with precise machining and assembled with a nanocrystalline silicon (nc-Si) electron emitter array as the electron source of the MPEBDW system. The nc-Si electron emitter has the advantage that a vertically emitted surface electron beam can be obtained without any extractor electrodes. FEM simulation of the electron optics characteristics showed that the size of the electron beam emitted from the emitter was reduced to 15% in the radial direction, and the divergence angle was reduced to 1/18.

  15. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    PubMed

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-05

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions to perform large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program: a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc.
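
    The Fermi-level determination mentioned above reduces to a one-dimensional root search: find the chemical potential at which the occupations summed over all subsystems equal the target electron count. A minimal serial sketch in Python/NumPy follows; it uses plain bisection, whereas the paper's algorithm is interpolation-based and evaluates the per-subsystem sums in parallel, and all names and parameter values here are assumptions.

      import numpy as np

      def electron_count(fermi, orbital_energies, beta=200.0):
          # Fermi-Dirac occupations summed over the orbital energies gathered
          # from all subsystems (factor 2 for spin); clipping avoids overflow.
          x = np.clip(beta * (orbital_energies - fermi), -40.0, 40.0)
          return 2.0 * np.sum(1.0 / (1.0 + np.exp(x)))

      def find_fermi(orbital_energies, n_electrons, lo=-2.0, hi=2.0, tol=1e-10):
          # Bisection on the monotone count function: raising the Fermi level
          # can only increase the electron count.
          while hi - lo > tol:
              mid = 0.5 * (lo + hi)
              if electron_count(mid, orbital_energies) < n_electrons:
                  lo = mid
              else:
                  hi = mid
          return 0.5 * (lo + hi)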

  16. Massively Parallel Sequencing Detected a Mutation in the MFN2 Gene Missed by Sanger Sequencing Due to a Primer Mismatch on an SNP Site.

    PubMed

    Neupauerová, Jana; Grečmalová, Dagmar; Seeman, Pavel; Laššuthová, Petra

    2016-05-01

    We describe a patient with early-onset severe axonal Charcot-Marie-Tooth disease (CMT2) with dominant inheritance, in whom Sanger sequencing failed to detect a mutation in the mitofusin 2 (MFN2) gene because of a single nucleotide polymorphism (rs2236057) under the PCR primer sequence. The severe early-onset phenotype and the family history of a severely affected mother (who died after delivery) were very suggestive of CMT2A, and this suspicion was finally confirmed by an MFN2 mutation. The mutation p.His361Tyr was later detected in the patient by massively parallel sequencing with a gene panel for hereditary neuropathies. On the basis of this information, new primers for amplification and sequencing were designed which bind away from the polymorphic sites of the patient's DNA. Sanger sequencing with these new primers then confirmed the heterozygous mutation in the MFN2 gene in this patient. This case report shows that massively parallel sequencing may in some rare cases be more sensitive than Sanger sequencing and highlights the importance of accurate primer design, which requires special attention.

  17. Mapping unstructured grid computations to massively parallel computers. Ph.D. Thesis - Rensselaer Polytechnic Inst., Feb. 1992

    NASA Technical Reports Server (NTRS)

    Hammond, Steven Warren

    1992-01-01

    Investigated here is this mapping problem: assign the tasks of a parallel program to the processors of a parallel computer such that the execution time is minimized. First, a taxonomy of objective functions and heuristics used to solve the mapping problem is presented. Next, we develop a highly parallel heuristic mapping algorithm, called Cyclic Pairwise Exchange (CPE), and discuss its place in the taxonomy. CPE uses local pairwise exchanges of processor assignments to iteratively improve an initial mapping. A variety of initial mapping schemes are tested and recursive spectral bipartitioning (RSB) followed by CPE is shown to result in the best mappings. For the test cases studied here, problems arising in computational fluid dynamics and structural mechanics on unstructured triangular and tetrahedral meshes, RSB and CPE outperform methods based on simulated annealing. Much less time is required to do the mapping and the results obtained are better. Compared with random and naive mappings, RSB and CPE reduce the communication time two fold for the test problems used. Finally, we use CPE in two applications on a CM-2. The first application is a data parallel mesh-vertex upwind finite volume scheme for solving the Euler equations on 2-D triangular unstructured meshes. CPE is used to map grid points to processors. The performance of this code is compared with a similar code on a Cray-YMP and an Intel iPSC/860. The second application is parallel sparse matrix-vector multiplication used in the iterative solution of large sparse linear systems of equations. We map rows of the matrix to processors and use an inner-product based matrix-vector multiplication. We demonstrate that this method is an order of magnitude faster than methods based on scan operations for our test cases.
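
    The CPE heuristic admits a compact illustration. Below is a hedged Python sketch under simplifying assumptions: tasks and processors are abstract labels, the communication cost simply counts cut edges of a hypothetical task graph, and the cyclic sweep over pairs is reduced to a plain double loop.

      def communication_cost(mapping, edges):
          # Count task-graph edges whose endpoints land on different processors.
          return sum(1 for a, b in edges if mapping[a] != mapping[b])

      def cpe(mapping, edges, sweeps=10):
          # Iteratively improve an initial task-to-processor mapping by
          # trial pairwise exchanges, keeping only swaps that lower the cost.
          tasks = list(mapping)
          for _ in range(sweeps):
              improved = False
              for i in range(len(tasks)):
                  for j in range(i + 1, len(tasks)):
                      a, b = tasks[i], tasks[j]
                      if mapping[a] == mapping[b]:
                          continue
                      before = communication_cost(mapping, edges)
                      mapping[a], mapping[b] = mapping[b], mapping[a]  # trial swap
                      if communication_cost(mapping, edges) >= before:
                          mapping[a], mapping[b] = mapping[b], mapping[a]  # revert
                      else:
                          improved = True
              if not improved:
                  break
          return mapping

    In the thesis the initial mapping would come from recursive spectral bipartitioning; here any dict from task to processor serves as a starting point.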

  18. Massively parallel and linear-scaling algorithm for second-order Møller-Plesset perturbation theory applied to the study of supramolecular wires

    NASA Astrophysics Data System (ADS)

    Kjærgaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawłowski, Filip; Vose, Aaron; Wang, Yang Min; Jørgensen, Poul

    2017-03-01

    We present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide-Expand-Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide-Expand-Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller-Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  19. Massively parallel and linear-scaling algorithm for second-order Moller–Plesset perturbation theory applied to the study of supramolecular wires

    DOE PAGES

    Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; ...

    2016-11-16

    Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the “resolution of the identity second-order Moller–Plesset perturbation theory” (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  20. Massively parallel and linear-scaling algorithm for second-order Moller–Plesset perturbation theory applied to the study of supramolecular wires

    SciTech Connect

    Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawlowski, Filip; Vose, Aaron; Wang, Yang Min; Jorgensen, Poul

    2016-11-16

    Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the “resolution of the identity second-order Moller–Plesset perturbation theory” (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  1. Advances in Time-Domain Electromagnetic Simulation Capabilities Through the Use of Overset Grids and Massively Parallel Computing

    DTIC Science & Technology

    1997-03-01

    ...to construct their computer codes (written largely in FORTRAN) to exploit this type of architecture. Towards the end of the 1980s, however, vector... for exploiting parallel architectures using both single and overset grids in conjunction with typical grid-based PDE solvers in general and FVTD... Furthermore, it naturally exploits the means by which the electric and magnetic fields are related through the curl operators. Unfortunately, although stag...

  2. Mutations causing medullary cystic kidney disease type 1 (MCKD1) lie in a large VNTR in MUC1 missed by massively parallel sequencing

    PubMed Central

    Kirby, Andrew; Gnirke, Andreas; Jaffe, David B.; Barešová, Veronika; Pochet, Nathalie; Blumenstiel, Brendan; Ye, Chun; Aird, Daniel; Stevens, Christine; Robinson, James T.; Cabili, Moran N.; Gat-Viks, Irit; Kelliher, Edward; Daza, Riza; DeFelice, Matthew; Hůlková, Helena; Sovová, Jana; Vylet’al, Petr; Antignac, Corinne; Guttman, Mitchell; Handsaker, Robert E.; Perrin, Danielle; Steelman, Scott; Sigurdsson, Snaevar; Scheinman, Steven J.; Sougnez, Carrie; Cibulskis, Kristian; Parkin, Melissa; Green, Todd; Rossin, Elizabeth; Zody, Michael C.; Xavier, Ramnik J.; Pollak, Martin R.; Alper, Seth L.; Lindblad-Toh, Kerstin; Gabriel, Stacey; Hart, P. Suzanne; Regev, Aviv; Nusbaum, Chad; Kmoch, Stanislav; Bleyer, Anthony J.; Lander, Eric S.; Daly, Mark J.

    2014-01-01

    While genetic lesions responsible for some Mendelian disorders can be rapidly discovered through massively parallel sequencing (MPS) of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple Mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago to a 2-Mb region on chromosome 1. Ultimately, only by cloning, capillary sequencing, and de novo assembly did we find that each of six MCKD1 families harbors an equivalent, but apparently independently arising, mutation in sequence dramatically underrepresented in MPS data: the insertion of a single C in one copy (but a different copy in each family) of the repeat unit comprising the extremely long (~1.5-5 kb), GC-rich (>80%), coding VNTR in the mucin 1 gene. The results provide a cautionary tale about the challenges in identifying genes responsible for Mendelian, let alone more complex, disorders through MPS. PMID:23396133

  3. Use of Massively Parallel Pyrosequencing to Evaluate the Diversity of and Selection on Plasmodium falciparum csp T-Cell Epitopes in Lilongwe, Malawi

    PubMed Central

    Bailey, Jeffrey A.; Mvalo, Tisungane; Aragam, Nagesh; Weiser, Matthew; Congdon, Seth; Kamwendo, Debbie; Martinson, Francis; Hoffman, Irving; Meshnick, Steven R.; Juliano, Jonathan J.

    2012-01-01

    The development of an effective malaria vaccine has been hampered by the genetic diversity of commonly used target antigens. This diversity has led to concerns about allele-specific immunity limiting the effectiveness of vaccines. Despite extensive genetic diversity of circumsporozoite protein (CS), the most successful malaria vaccine is RTS,S, a monovalent CS vaccine. By use of massively parallel pyrosequencing, we evaluated the diversity of CS haplotypes across the T-cell epitopes in parasites from Lilongwe, Malawi. We identified 57 unique parasite haplotypes from 100 participants. By use of ecological and molecular indexes of diversity, we saw no difference in the diversity of CS haplotypes between adults and children. We saw evidence of weak variant-specific selection within this region of CS, suggesting that naturally acquired immunity does induce variant-specific selection on CS. Therefore, the impact of CS vaccines on variant frequencies with widespread implementation of vaccination requires further study. PMID:22551816

  4. An open-source, massively parallel code for non-LTE synthesis and inversion of spectral lines and Zeeman-induced Stokes profiles

    NASA Astrophysics Data System (ADS)

    Socas-Navarro, H.; de la Cruz Rodríguez, J.; Asensio Ramos, A.; Trujillo Bueno, J.; Ruiz Cobo, B.

    2015-05-01

    With the advent of a new generation of solar telescopes and instrumentation, interpreting chromospheric observations (in particular, spectropolarimetry) requires new, suitable diagnostic tools. This paper describes a new code, NICOLE, that has been designed for Stokes non-LTE radiative transfer, for synthesis and inversion of spectral lines and Zeeman-induced polarization profiles, spanning a wide range of atmospheric heights from the photosphere to the chromosphere. The code offers a number of unique capabilities and has been built from scratch with a powerful parallelization scheme that makes it suitable for application to massive datasets using large supercomputers. The source code is written entirely in Fortran 90/2003 and complies strictly with the ANSI standards to ensure maximum compatibility and portability. It is being publicly released, with the idea of facilitating future branching by other groups to augment its capabilities. The source code is currently hosted at the following repository: https://github.com/hsocasnavarro/NICOLE

  5. Performance of a TthPrimPol-based whole genome amplification kit for copy number alteration detection using massively parallel sequencing

    PubMed Central

    Deleye, Lieselot; De Coninck, Dieter; Dheedene, Annelies; De Sutter, Petra; Menten, Björn; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    Starting from only a few cells, current whole genome amplification (WGA) methods provide enough DNA to perform massively parallel sequencing (MPS). Unfortunately, all current WGA methods introduce representation bias which limits detection of copy number aberrations (CNAs) smaller than 3 Mb. A recent WGA method, called TruePrime single cell WGA, uses a recently discovered DNA primase, TthPrimPol, instead of artificial primers to initiate DNA amplification. This method could lead to a lower representation bias, and consequently to a better detection of CNAs. The enzyme requires no complementarity and thus should generate random primers, equally distributed across the genome. The performance of TruePrime WGA was assessed for aneuploidy screening and CNA analysis after MPS, starting from 1, 3 or 5 cells. Although the method looks promising, the single cell TruePrime WGA kit v1 is not suited for high resolution CNA detection after MPS because too much representation bias is introduced. PMID:27546482

  6. Computational methods for epigenetic analysis: the protocol of computational analysis for modified methylation-specific digital karyotyping based on massively parallel sequencing.

    PubMed

    Li, Jian; Zhao, Qian; Bolund, Lars

    2011-01-01

    Massively parallel sequencing technology opens new possibilities for epigenetic research. Many methods have been developed based on the new sequencing platforms, allowing an ultra-deep mapping of epigenetic variants in a fast and cost-effective way. However, handling millions of short reads produced by these sequencing platforms is a huge challenge for many laboratories. Thus, there is a need for the development of accurate and fast computational tools for epigenetic studies in the new era of genomic sequencing. Modified methylation-specific digital karyotyping (MMSDK) is an improved method for genome-wide DNA methylation profiling based on the combination of traditional MSDK and Illumina/Solexa sequencing. Here, we introduce the computational tools used in the MMSDK analysis process, from experimental design to statistical analysis. We have developed a mapping process based on in silico simulation of combined enzyme cutting and tag extraction of the reference genome. Subsequently, the 20-21-nucleotide (nt) tags obtained by sequencing are mapped to the simulated library using the open-source software Mapping and Assembly with Qualities (MAQ). Our computational methods include trimming, annotation, normalization, and counting the reads to obtain digital DNA methylation profiles. We present the complete protocol and discuss some important issues that should be considered by readers, such as handling of repeat sequences, SNPs, and normalization. The core part of this protocol (mapping and annotation of tags) is suitable for any tag-profiling-based method, and it could also be modified to analyze results from other types of epigenetic studies based on massively parallel sequencing.
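
    The in silico step described above is easy to sketch: cut the reference genome at a mapping-enzyme recognition site and extract the short tag adjacent to each cut, building the simulated library that sequenced tags are later mapped against. The recognition site and tag length in this Python sketch are illustrative, not the protocol's actual choices.

      import re

      def simulate_tag_library(genome, site="CATG", tag_len=21):
          # Return (position, tag) pairs: a `tag_len`-nt tag immediately
          # downstream of every recognition site on the forward strand.
          library = []
          for match in re.finditer(site, genome):
              start = match.end()
              tag = genome[start:start + tag_len]
              if len(tag) == tag_len:
                  library.append((start, tag))
          return library

    Tags sequenced from the sample would then be looked up in this simulated library (the protocol uses the MAQ aligner) and counted to yield the digital methylation profiles.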

  7. Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers.

    PubMed

    Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito

    2016-11-15

    A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Some improvements from the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been performed: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.

  8. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    PubMed

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; D'Aliberti, Deborah; Venza, Mario; Borgogni, Erica; Castellino, Flora; Biondo, Carmelo; D'Andrea, Daniel; Grassi, Luigi; Tramontano, Anna; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2014-01-01

    There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next-generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire, or PROFILER) provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method is also likely to find important applications in other fields covered by traditional quantitative serology.

  9. RH 1.5D: a massively parallel code for multi-level radiative transfer with partial frequency redistribution and Zeeman polarisation

    NASA Astrophysics Data System (ADS)

    Pereira, Tiago M. D.; Uitenbroek, Han

    2015-02-01

    The emergence of three-dimensional magneto-hydrodynamic simulations of stellar atmospheres has sparked a need for efficient radiative transfer codes to calculate detailed synthetic spectra. We present RH 1.5D, a massively parallel code based on the RH code and capable of performing Zeeman polarised multi-level non-local thermodynamical equilibrium calculations with partial frequency redistribution for an arbitrary number of chemical species. The code calculates spectra from 3D, 2D or 1D atmospheric models on a column-by-column basis (or 1.5D). While the 1.5D approximation breaks down in the cores of very strong lines in an inhomogeneous environment, it is nevertheless suitable for a large range of scenarios and allows for faster convergence with finer control over the iteration of each simulation column. The code scales well to at least tens of thousands of CPU cores, and is publicly available. In the present work we briefly describe its inner workings, strategies for convergence optimisation, its parallelism, and some possible applications.
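
    The column-by-column decomposition described above is embarrassingly parallel, which a few lines of Python can illustrate. This is a hedged sketch, not RH 1.5D's actual Fortran/MPI scheme; `synthesize_column` is a hypothetical stand-in for the per-column non-LTE solver.

      from multiprocessing import Pool

      def synthesize_column(column):
          # Placeholder: a real solver would iterate the non-LTE radiative
          # transfer equations for this single column until converged.
          return sum(column) / len(column)

      if __name__ == "__main__":
          atmosphere = [[1.0, 2.0, 3.0]] * 1024   # one entry per (x, y) column
          with Pool() as pool:
              spectra = pool.map(synthesize_column, atmosphere)
          print(len(spectra), "columns synthesized")

    Because every column converges independently, the only coordination needed is distributing the columns and gathering the results, which is why the approach scales to tens of thousands of cores.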

  10. Partition-of-unity finite-element method for large scale quantum molecular dynamics on massively parallel computational platforms

    SciTech Connect

    Pask, J E; Sukumar, N; Guney, M; Hu, W

    2011-02-28

    Over the course of the past two decades, quantum mechanical calculations have emerged as a key component of modern materials research. However, the solution of the required quantum mechanical equations is a formidable task and this has severely limited the range of materials systems which can be investigated by such accurate, quantum mechanical means. The current state of the art for large-scale quantum simulations is the planewave (PW) method, as implemented in now ubiquitous VASP, ABINIT, and QBox codes, among many others. However, since the PW method uses a global Fourier basis, with strictly uniform resolution at all points in space, and in which every basis function overlaps every other at every point, it suffers from substantial inefficiencies in calculations involving atoms with localized states, such as first-row and transition-metal atoms, and requires substantial nonlocal communications in parallel implementations, placing critical limits on scalability. In recent years, real-space methods such as finite-differences (FD) and finite-elements (FE) have been developed to address these deficiencies by reformulating the required quantum mechanical equations in a strictly local representation. However, while addressing both resolution and parallel-communications problems, such local real-space approaches have been plagued by one key disadvantage relative to planewaves: excessive degrees of freedom (grid points, basis functions) needed to achieve the required accuracies. And so, despite critical limitations, the PW method remains the standard today. In this work, we show for the first time that this key remaining disadvantage of real-space methods can in fact be overcome: by building known atomic physics into the solution process using modern partition-of-unity (PU) techniques in finite element analysis. Indeed, our results show order-of-magnitude reductions in basis size relative to state-of-the-art planewave based methods. The method developed here is

  11. N-Body Classical Systems and Neural Networks on a 3d SIMD Massive Parallel Processor:. APE100/QUADRICS

    NASA Astrophysics Data System (ADS)

    Paolucci, P. S.

    A number of physical systems (e.g., N-body Newtonian, Coulombian or Lennard-Jones systems) can be described by N^2 interaction terms. Completely connected neural networks are characterised by the same kind of connections: each neuron sends signals to all the other neurons via synapses. The APE100/Quadrics massively parallel architecture, with processing power in excess of 100 Gigaflops and a central memory of 8 Gigabytes, seems to have processing power and memory adequate to simulate systems formed by more than 1 billion synapses or interaction terms. On the other hand, the processing nodes of APE100/Quadrics are organised in a tridimensional cubic lattice; each processing node has a direct communication path only toward its first-neighboring nodes. Here we describe a convenient way to map systems with global connectivity onto the first-neighbors connectivity of the APE100/Quadrics architecture. Some numeric criteria, which are useful for matching SIMD tridimensional architectures with globally connected simulations, are introduced.
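
    A standard mapping of this kind is the systolic shift: keep one block of data resident per node and circulate a second copy through nearest-neighbor links until every pair has met. The Python sketch below is a one-dimensional, hypothetical analogue (the APE100/Quadrics lattice is three-dimensional, and the toy inverse-square "interaction" merely stands in for a real force or synaptic term).

      def all_pairs_on_ring(node_particles):
          # `node_particles[k]` holds the positions owned by node k. Each node
          # keeps its own particles resident and accumulates interaction
          # magnitudes from a travelling copy shifted to the neighbor each step.
          n_nodes = len(node_particles)
          travelling = list(node_particles)
          forces = [[0.0] * len(p) for p in node_particles]
          for _ in range(n_nodes):
              for node in range(n_nodes):
                  for i, xi in enumerate(node_particles[node]):
                      for xj in travelling[node]:
                          if xi != xj:                       # skip self-interaction
                              forces[node][i] += 1.0 / (xi - xj) ** 2
              travelling = travelling[1:] + travelling[:1]   # nearest-neighbor shift
          return forces

      print(all_pairs_on_ring([[0.0, 1.0], [2.0], [3.5, 5.0]]))

    After as many shifts as there are nodes, every resident block has met every travelling block exactly once, so all N^2 terms are accumulated using only first-neighbor communication.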

  12. User's guide of TOUGH2-EGS-MP: A Massively Parallel Simulator with Coupled Geomechanics for Fluid and Heat Flow in Enhanced Geothermal Systems VERSION 1.0

    SciTech Connect

    Xiong, Yi; Fakcharoenphol, Perapon; Wang, Shihao; Winterfeld, Philip H.; Zhang, Keni; Wu, Yu-Shu

    2013-12-01

    TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporated into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.

  13. Implementation of a flexible and scalable particle-in-cell method for massively parallel computations in the mantle convection code ASPECT

    NASA Astrophysics Data System (ADS)

    Gassmöller, Rene; Bangerth, Wolfgang

    2016-04-01

    Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. Finally, we show some relevant particle application cases, compare our implementation to a

  14. Rubber band ligation of hemorrhoids: A guide for complications

    PubMed Central

    Albuquerque, Andreia

    2016-01-01

    Rubber band ligation is one of the most important, cost-effective and commonly used treatments for internal hemorrhoids. Different technical approaches have been developed, mainly to improve efficacy and safety. The technique can be employed using an endoscope with forward-view or retroflexion or without an endoscope, using a suction elastic band ligator or a forceps ligator. Single or multiple ligations can be performed in a single session. Local anaesthetic after ligation can also be used to reduce the post-procedure pain. Mild bleeding, pain, vaso-vagal symptoms, slippage of bands, priapism, difficulty in urination, anal fissure, and chronic longitudinal ulcers are normally considered minor complications and are more frequently encountered. Massive bleeding, thrombosed hemorrhoids, severe pain, urinary retention needing catheterization, pelvic sepsis and death are uncommon major complications. Mild pain after rubber band ligation is the most common complication, with a high frequency in some studies. Secondary bleeding normally occurs 10 to 14 d after banding, and patients taking anti-platelet and/or anti-coagulant medication have a higher risk, with some reports of massive life-threatening haemorrhage. Several infectious complications have also been reported, including pelvic sepsis, Fournier’s gangrene, liver abscesses, tetanus and bacterial endocarditis. To date, seven deaths due to these infectious complications have been described. Early recognition and immediate treatment of complications are fundamental for a favourable prognosis. PMID:27721924

  15. Massive parallel insertion site sequencing of an arrayed Sinorhizobium meliloti signature-tagged mini-Tn5 transposon mutant library.

    PubMed

    Serrania, Javier; Johner, Tobias; Rupp, Oliver; Goesmann, Alexander; Becker, Anke

    2017-02-21

    Transposon mutagenesis in conjunction with identification of genomic transposon insertion sites is a powerful tool for gene function studies. We have implemented a protocol for parallel determination of transposon insertion sites by Illumina sequencing involving a hierarchical barcoding method that allowed for tracking insertion sites back to individual clones of an arrayed signature-tagged transposon mutant library. This protocol was applied to further characterize a signature-tagged mini-Tn5 mutant library comprising about 12,000 mutants of the symbiotic nitrogen-fixing alphaproteobacterium Sinorhizobium meliloti (Pobigaylo et al., 2006; Appl. Environ. Microbiol. 72, 4329-4337). Previously, insertion sites had been determined for 5000 mutants of this library. Combining an adapter-free, inverse PCR method for sequencing library preparation with next generation sequencing, we identified 4473 novel insertion sites, increasing the total number of transposon mutants with known insertion sites to 9562. The number of protein-coding genes hit at least once by a transposon increased by 1231, to a total of 3673 disrupted genes, representing 59% of the predicted protein-coding genes in S. meliloti.

  16. The grid-based fast multipole method--a massively parallel numerical scheme for calculating two-electron interaction energies.

    PubMed

    Toivanen, Elias A; Losilla, Sergio A; Sundholm, Dage

    2015-12-21

    Algorithms and working expressions for a grid-based fast multipole method (GB-FMM) have been developed and implemented. The computational domain is divided into cubic subdomains, organized in a hierarchical tree. The contribution to the electrostatic interaction energies from pairs of neighboring subdomains is computed using numerical integration, whereas the contributions from further apart subdomains are obtained using multipole expansions. The multipole moments of the subdomains are obtained by numerical integration. Linear scaling is achieved by translating and summing the multipoles according to the tree structure, such that each subdomain interacts with a number of subdomains that are almost independent of the size of the system. To compute electrostatic interaction energies of neighboring subdomains, we employ an algorithm which performs efficiently on general purpose graphics processing units (GPGPU). Calculations using one CPU for the FMM part and 20 GPGPUs consisting of tens of thousands of execution threads for the numerical integration algorithm show the scalability and parallel performance of the scheme. For calculations on systems consisting of Gaussian functions (α = 1) distributed as fullerenes from C20 to C720, the total computation time and relative accuracy (ppb) are independent of the system size.
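
    The division of labor described in this abstract - direct evaluation for neighboring subdomains, multipole expansions for well-separated ones - can be illustrated at lowest order: for two distant charge clusters, a monopole placed at each cluster's charge centroid already reproduces the direct pairwise Coulomb sum closely. The toy below uses point charges rather than the numerical grid integration of the GB-FMM, and omits the hierarchical tree entirely.

      import numpy as np

      rng = np.random.default_rng(1)
      # Two well-separated charge clusters (stand-ins for distant subdomains)
      a = rng.normal([0.0, 0.0, 0.0], 0.1, (200, 3)); qa = rng.random(200)
      b = rng.normal([5.0, 0.0, 0.0], 0.1, (200, 3)); qb = rng.random(200)

      # Direct pairwise interaction energy: sum_ij qa_i * qb_j / r_ij
      r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
      e_direct = np.sum(np.outer(qa, qb) / r)

      # Monopole (lowest multipole) approximation: total charge at the centroid
      ca = (qa[:, None] * a).sum(axis=0) / qa.sum()
      cb = (qb[:, None] * b).sum(axis=0) / qb.sum()
      e_mono = qa.sum() * qb.sum() / np.linalg.norm(ca - cb)

      print(f"direct={e_direct:.6f}  monopole={e_mono:.6f}  "
            f"rel.err={abs(e_mono - e_direct) / e_direct:.2e}")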

  17. Tubal ligation - slideshow

    MedlinePlus

    Tubal ligation - Series: Normal anatomy (3-slide illustrated overview). URL: //medlineplus.gov/ency/presentations/100044.htm

  18. Tubal Ligation Reversal

    MedlinePlus

    ... depending on a woman's age and other factors. Success rates may be as high as 80 percent or ... details of the procedure; discuss the likelihood of success and your ability to get pregnant after the procedure; discuss other options for pregnancy, such as IVF. A tubal ligation reversal can be done as ...

  19. Massively Parallel Signal Processing using the Graphics Processing Unit for Real-Time Brain-Computer Interface Feature Extraction.

    PubMed

    Wilson, J Adam; Williams, Justin C

    2009-01-01

    The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card [graphics processing unit (GPU)] was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a central processing unit-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 250 ms of data from 1000 channels in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.
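
    The first two pipeline stages named above can be sketched on the CPU with NumPy/SciPy; swapping the arrays for a GPU array library is the essence of the CUDA port. The common-average-reference spatial filter, model order and sampling rate below are illustrative choices, not the study's exact configuration.

      import numpy as np
      from scipy.linalg import solve_toeplitz

      fs, n_ch, n_s, order = 1000, 16, 250, 8         # illustrative parameters
      rng = np.random.default_rng(0)
      data = rng.standard_normal((n_ch, n_s))         # channels x samples

      # Stage 1: spatial filter as a matrix-matrix product (common average ref.)
      W = np.eye(n_ch) - np.full((n_ch, n_ch), 1.0 / n_ch)
      filtered = W @ data

      def ar_psd(x, order, n_freq=64):
          """Stage 2: Yule-Walker autoregressive spectrum of one channel."""
          x = x - x.mean()
          r = np.array([x[:len(x) - k] @ x[k:] for k in range(order + 1)]) / len(x)
          a = solve_toeplitz(r[:order], r[1:order + 1])   # AR coefficients
          sigma2 = r[0] - a @ r[1:order + 1]              # innovation variance
          f = np.arange(n_freq) / (2.0 * n_freq)          # normalized freq [0, 0.5)
          k = np.arange(1, order + 1)
          denom = np.abs(1.0 - np.exp(-2j * np.pi * f[:, None] * k) @ a) ** 2
          return f * fs, sigma2 / denom

      freqs, psd = ar_psd(filtered[0], order)             # repeat per channel
      print(psd[:5])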

  20. Simulation of Ionospheric E-Region Plasma Turbulence with a Massively Parallel Hybrid PIC/Fluid Code

    NASA Astrophysics Data System (ADS)

    Young, M.; Oppenheim, M. M.; Dimant, Y. S.

    2015-12-01

    The Farley-Buneman (FB) and gradient drift (GD) instabilities are plasma instabilities that occur at roughly 100 km in the equatorial E-region ionosphere. They develop when ion-neutral collisions dominate ion motion while electron motion is affected by both electron-neutral collisions and the background magnetic field. GD drift waves grow when the background density gradient and electric field are aligned; FB waves grow when the background electric field causes electrons to E × B drift with a speed slightly larger than the ion acoustic speed. Theory predicts that FB and GD turbulence should develop in the same plasma volume when GD waves create a perturbation electric field that exceeds the threshold value for FB turbulence. However, ionospheric radars, which regularly observe meter-scale irregularities associated with FB turbulence, must infer kilometer-scale GD dynamics rather than observe them directly. Numerical simulations have been unable to simultaneously resolve GD and FB structure. We present results from a parallelized hybrid simulation that uses a particle-in-cell (PIC) method for ions while modeling electrons as an inertialess, quasi-neutral fluid. This approach allows us to reach length scales of hundreds of meters to kilometers with sub-meter resolution, but requires solving a large linear system derived from an elliptic PDE that depends on plasma density, ion flux, and electron parameters. We solve the resultant linear system at each time step via the Portable Extensible Toolkit for Scientific Computing (PETSc). We compare results of simulated FB turbulence from this model to results from a thoroughly tested PIC code and describe progress toward the first simultaneous simulations of FB and GD instabilities. This model has immediate applications to radar observations of the E-region ionosphere, as well as potential applications to the F-region ionosphere and the chromosphere of the Sun.
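
    As a stand-in for the elliptic solve described above (performed with PETSc in the actual code), the sketch below assembles a 2D five-point Laplacian and solves it directly with SciPy; in the simulation, the operator's coefficients additionally depend on the evolving plasma density and electron parameters, and the system is far larger.

      import numpy as np
      import scipy.sparse as sp
      from scipy.sparse.linalg import spsolve

      n = 64                                   # grid points per side (toy size)
      h = 1.0 / (n + 1)

      # 2D five-point Laplacian with Dirichlet boundaries (Kronecker assembly)
      T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
      A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))) / h**2

      # Hypothetical right-hand side standing in for density/flux source terms
      x, y = np.meshgrid(np.linspace(h, 1 - h, n), np.linspace(h, 1 - h, n))
      rhs = np.exp(-50.0 * ((x - 0.5) ** 2 + (y - 0.5) ** 2)).ravel()

      phi = spsolve(A.tocsr(), rhs)            # potential on the grid
      print(phi.reshape(n, n)[n // 2, n // 2])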

  1. Scientific development of a massively parallel ocean climate model. Progress report for 1992--1993 and Continuing request for 1993--1994 to CHAMMP (Computer Hardware, Advanced Mathematics, and Model Physics)

    SciTech Connect

    Semtner, A.J. Jr.; Chervin, R.M.

    1993-05-01

    During the second year of CHAMMP funding to the principal investigators, progress has been made in the proposed areas of research, as follows: investigation of the physics of the thermohaline circulation; examination of resolution effects on ocean general circulation; and development of a massively parallel ocean climate model.

  2. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    SciTech Connect

    Verma, Prakash; Morales, Jorge A.; Perera, Ajith

    2013-11-07

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N^7-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate

  3. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    NASA Astrophysics Data System (ADS)

    Verma, Prakash; Perera, Ajith; Morales, Jorge A.

    2013-11-01

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N^7-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate experimental ESR spectra, to interpret spin-density distributions, and to

  4. An assessment of a massively parallel sequencing approach for the identification of individuals from mass graves of the Spanish Civil War (1936-1939).

    PubMed

    Calafell, Francesc; Anglada, Roger; Bonet, Núria; González-Ruiz, Mercedes; Prats-Muñoz, Gemma; Rasal, Raquel; Lalueza-Fox, Carles; Bertranpetit, Jaume; Malgosa, Assumpció; Casals, Ferran

    2016-10-01

    Next-generation sequencing technologies have opened new opportunities in forensic genetics. Here, we assess the applicability and performance of the MiSeq FGx™ & ForenSeq™ DNA Signature Prep Kit (Illumina) for the identification of individuals from the mass graves of the Spanish Civil War (1936-1939). The main limitations for individual identification are the low number of possible first-degree living relatives and the high levels of DNA degradation reported in previous studies. Massively parallel sequencing technologies, which enable the analysis of hundreds of regions and prioritize short-length amplicons, constitute a promising tool for this kind of approach. In this study, we first explore the power of this new technology to detect first- and second-degree kinship under different scenarios of DNA degradation. Second, we specifically assess its performance in a set of low-DNA-input samples previously analyzed with CE technologies. We conclude that this methodology will allow identification of up to second-degree relatives, even in situations with low sequencing performance and substantial levels of allele drop-out; it is thus a technology that resolves previous drawbacks and will allow a successful approach to the identification of remains.

  5. Massive parallel sequencing of human whole mitochondrial genomes with Ion Torrent technology: an optimized workflow for Anthropological and Population Genetics studies.

    PubMed

    De Fanti, Sara; Vianello, Dario; Giuliani, Cristina; Quagliariello, Andrea; Cherubini, Anna; Sevini, Federica; Iaquilano, Nicoletta; Franceschi, Claudio; Sazzini, Marco; Luiselli, Donata

    2016-11-08

    Investigation of human mitochondrial DNA variation patterns and phylogeny has been extensively used in Anthropological and Population Genetics studies, and sequencing the whole mitochondrial genome is progressively becoming the gold standard. Among the currently available massive parallel sequencing technologies, Ion Torrent™ semiconductor sequencing represents a promising approach for such studies. Nevertheless, an experimental protocol designed to achieve both the highest possible yield and the most homogeneous sequence coverage across the whole mitochondrial genome is still not available. The present work was thus aimed at improving the overall performance of whole mitochondrial genome Ion Torrent™ sequencing, with special focus on the capability to obtain robust coverage and highly reliable variant calling. For this purpose, a series of cost-effective modifications to standard laboratory workflows was fine-tuned to optimize them for medium- and large-scale population studies. A total of 54 human samples were thus subjected to sequencing of the whole mitochondrial genome with the Ion Personal Genome Machine™ System in four distinct experiments using Ion 314 chips. Seven of the selected samples were also characterized by means of conventional Sanger sequencing for the sake of comparison. The results demonstrated that the implemented optimizations definitely improved sequencing outputs in terms of both variant-calling efficiency and coverage uniformity, making it possible to set up an effective and accurate protocol for whole mitochondrial genome sequencing with a considerable reduction in experimental time and sequencing costs.

  6. Towards Anatomic Scale Agent-Based Modeling with a Massively Parallel Spatially Explicit General-Purpose Model of Enteric Tissue (SEGMEnT_HPC)

    PubMed Central

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic-scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are prime examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic-scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis. PMID:25806784
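
    The cell-for-cell update loop that such models scale up can be caricatured in a few lines: each lattice site holds an agent whose state changes by local stochastic rules each tick. The states and rates below are hypothetical and unrelated to SEGMEnT_HPC's actual rule set; HPC versions distribute exactly this kind of loop across many processors.

      import numpy as np

      rng = np.random.default_rng(7)
      HEALTHY, INFLAMED = 0, 1                    # hypothetical cell states
      grid = np.full((64, 64), HEALTHY)
      grid[28:36, 28:36] = INFLAMED               # seed a lesion

      def step(grid, p_heal=0.05, p_spread=0.03):
          # count inflamed von Neumann neighbors via shifted copies
          n_inf = sum(np.roll(grid == INFLAMED, s, axis=ax)
                      for ax in (0, 1) for s in (-1, 1))
          new = grid.copy()
          spread = (grid == HEALTHY) & (rng.random(grid.shape) < p_spread * n_inf)
          heal = (grid == INFLAMED) & (rng.random(grid.shape) < p_heal)
          new[spread], new[heal] = INFLAMED, HEALTHY
          return new

      for _ in range(50):
          grid = step(grid)
      print((grid == INFLAMED).sum(), "inflamed sites after 50 ticks")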

  7. Forensic massively parallel sequencing data analysis tool: Implementation of MyFLq as a standalone web- and Illumina BaseSpace®-application.

    PubMed

    Van Neste, Christophe; Gansemans, Yannick; De Coninck, Dieter; Van Hoofstat, David; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2015-03-01

    Routine use of massively parallel sequencing (MPS) for forensic genomics is on the horizon. In the last few years, several algorithms and workflows have been developed to analyze forensic MPS data. However, none have yet been tailored to the needs of the forensic analyst who does not possess an extensive bioinformatics background. We developed our previously published forensic MPS data analysis framework MyFLq (My-Forensic-Loci-queries) into an open-source, user-friendly, web-based application. It can be installed as a standalone web application, or run directly from the Illumina BaseSpace environment. In the former, laboratories can keep their data on-site, while in the latter, data from forensic samples that are sequenced on an Illumina sequencer can be uploaded to BaseSpace during acquisition and subsequently analyzed using the published MyFLq BaseSpace application. Additional features were implemented, such as an interactive graphical report of the results, an interactive threshold selection bar, and an allele length-based analysis in addition to the sequence-based analysis. Practical use of the application is demonstrated through the analysis of four 16-plex short tandem repeat (STR) samples, showing the complementarity between the sequence- and length-based analyses of the same MPS data.

  8. A myriad of miRNA variants in control and Huntington’s disease brain regions detected by massively parallel sequencing

    PubMed Central

    Martí, Eulàlia; Pantano, Lorena; Bañez-Coronel, Mónica; Llorens, Franc; Miñones-Moyano, Elena; Porta, Sílvia; Sumoy, Lauro; Ferrer, Isidre; Estivill, Xavier

    2010-01-01

    Huntington disease (HD) is a neurodegenerative disorder that predominantly affects neurons of the forebrain. We have applied Illumina massively parallel sequencing to deeply analyze the small RNA populations of two different forebrain areas, the frontal cortex (FC) and the striatum (ST), of healthy individuals and individuals with HD. More than 80% of the small RNAs were annotated as microRNAs (miRNAs) in all samples. Deep sequencing revealed length and sequence heterogeneity (isomiRs) for the vast majority of miRNAs. Around 80–90% of the miRNAs presented modifications in the 3′-terminus, mainly in the form of trimming and/or nucleotide-addition variants, while the 5′-terminus of the miRNAs was especially protected from changes. Expression profiling showed strong miRNA and isomiR expression deregulation in HD, most of it common to both FC and ST. The analysis of the upstream regulatory regions of co-regulated miRNAs suggests a role for RE1-Silencing Transcription Factor (REST) and P53 in miRNA downregulation in HD. The putative targets of deregulated miRNAs and seed-region isomiRs strongly suggest that their altered expression contributes to the aberrant gene expression in HD. Our results show that miRNA variability is a ubiquitous phenomenon in the adult human brain, which may influence gene expression in physiological and pathological conditions. PMID:20591823

  9. High Incidence of Unrecognized Visceral/Neurological Late-onset Niemann-Pick Disease, type C1 Predicted by Analysis of Massively Parallel Sequencing Data Sets

    PubMed Central

    Wassif, Christopher A.; Cross, Joanna L.; Iben, James; Sanchez-Pulido, Luis; Cougnoux, Antony; Platt, Frances M.; Ory, Daniel S.; Ponting, Chris P.; Bailey-Wilson, Joan E.; Biesecker, Leslie G.; Porter, Forbes D.

    2015-01-01

    Purpose: Niemann-Pick disease, type C (NPC) is a recessive, neurodegenerative, lysosomal storage disease caused by mutations in either NPC1 or NPC2. The diagnosis is difficult and frequently delayed. Ascertainment is likely incomplete due both to these factors and to the fact that the full phenotypic spectrum may not have been fully delineated. Given the recent development of a blood-based diagnostic test and the development of potential therapies, it is important to understand the incidence of NPC and to define at-risk patient populations. Method: We evaluated data from four large massively parallel exome sequencing data sets. Variant sequences were identified and classified as pathogenic or non-pathogenic based on a combination of literature review and bioinformatic analysis. This methodology provided an unbiased approach to determining the allele frequency. Results: Our data suggest an incidence rate for NPC1 and NPC2 of 1/92,104 and 1/2,858,998, respectively. However, evaluation of common NPC1 variants suggests that there may be a late-onset NPC1 phenotype with a markedly higher incidence, on the order of 1/20,000–39,000. Conclusions: We determined a combined incidence of classical NPC of 1/89,229, or 1.12 affected patients per 100,000 conceptions, but predict incomplete ascertainment of a late-onset phenotype of NPC1. This finding strongly supports the need for increased screening of potential patients. PMID:25764212
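
    The step from allele frequencies observed in sequencing data sets to an incidence figure follows Hardy-Weinberg expectations for a recessive disease: if q is the summed pathogenic allele frequency, the expected incidence is q^2 and the carrier frequency is roughly 2q. A quick back-of-envelope check against the reported NPC1 figure:

      # Hardy-Weinberg back-of-envelope for a recessive disease.
      # The reported NPC1 incidence of ~1/92,104 implies a summed pathogenic
      # allele frequency q with q**2 = 1/92,104.
      incidence = 1 / 92_104
      q = incidence ** 0.5                       # pathogenic allele frequency
      carrier = 2 * q * (1 - q)                  # heterozygous carrier frequency

      print(f"q ~= {q:.5f}")                     # ~0.0033
      print(f"carriers ~= 1 in {1 / carrier:.0f}")   # roughly 1 in 150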

  10. HLA-G variability and haplotypes detected by massively parallel sequencing procedures in the geographically distinct population samples of Brazil and Cyprus.

    PubMed

    Castelli, Erick C; Gerasimou, Petroula; Paz, Michelle A; Ramalho, Jaqueline; Porto, Iane O P; Lima, Thálitta H A; Souza, Andréia S; Veiga-Castelli, Luciana C; Collares, Cristhianna V A; Donadi, Eduardo A; Mendes-Junior, Celso T; Costeas, Paul

    2017-03-01

    The HLA-G molecule presents immunomodulatory properties that might inhibit immune responses when interacting with specific Natural Killer and T cell receptors, such as KIR2DL4, ILT2 and ILT4. Thus, HLA-G might influence the outcome of situations in which fine immune system modulation is required, such as autoimmune diseases, transplants, cancer and pregnancy. Most studies of HLA-G gene variability have so far been restricted to a specific gene segment (i.e., promoter, coding or 3' untranslated region) and were performed using Sanger sequencing and probabilistic models to infer haplotypes. Here we propose a massively parallel sequencing (NGS) approach with a bioinformatics strategy to evaluate the entire HLA-G regulatory and coding segments, with haplotypes inferred relying more on the straightforward haplotyping capabilities of NGS and less on probabilistic models. HLA-G variability was then surveyed in two admixed population samples from distinct geographical regions and demographic backgrounds, Cyprus and Brazil. Most haplotypes (promoter, coding, 3'UTR and extended ones) were detected in both Brazil and Cyprus and were identical to those already described by probabilistic models, indicating that these haplotypes are quite old and may be present worldwide.

  11. Non-invasive prenatal testing using massively parallel sequencing of maternal plasma DNA: from molecular karyotyping to fetal whole-genome sequencing.

    PubMed

    Lo, Y M Dennis

    2013-12-01

    The discovery of cell-free fetal DNA in maternal plasma in 1997 has stimulated a rapid development of non-invasive prenatal testing. The recent advent of massively parallel sequencing has allowed the analysis of circulating cell-free fetal DNA to be performed with unprecedented sensitivity and precision. Fetal trisomies 21, 18 and 13 are now robustly detectable in maternal plasma and such analyses have been available clinically since 2011. Fetal genome-wide molecular karyotyping and whole-genome sequencing have now been demonstrated in a number of proof-of-concept studies. Genome-wide and targeted sequencing of maternal plasma has been shown to allow the non-invasive prenatal testing of β-thalassaemia and can potentially be generalized to other monogenic diseases. It is thus expected that plasma DNA-based non-invasive prenatal testing will play an increasingly important role in future obstetric care. It is thus timely and important that the ethical, social and legal issues of non-invasive prenatal testing be discussed actively by all parties involved in prenatal care.
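
    A common counting approach behind the trisomy tests mentioned here (a generic sketch, not any vendor's exact pipeline) scores the fraction of maternal plasma reads mapping to the chromosome of interest against a euploid reference set; a large positive z-score flags over-representation of, for example, chromosome 21. The reference values below are invented for illustration.

      import numpy as np

      # Fraction of plasma reads mapping to chr21 in euploid reference
      # pregnancies (hypothetical values) and in one test sample.
      ref_chr21_frac = np.array([0.01350, 0.01342, 0.01361, 0.01348,
                                 0.01355, 0.01340, 0.01352, 0.01347])
      test_chr21_frac = 0.01410

      z = (test_chr21_frac - ref_chr21_frac.mean()) / ref_chr21_frac.std(ddof=1)
      print(f"z = {z:.1f}")          # z > 3 is a typical flag for trisomy 21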

  12. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture-dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within PFLOTRAN, a massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that is the world's fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.
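
    For orientation, interacting-continuum formulations typically couple the matrix (m) and fracture (f) continua through first-order exchange terms; a generic single-phase mass-exchange form (a sketch only - PFLOTRAN's Dual Continuum Disconnected Matrix formulation differs in detail) is

      \Gamma_{mf} \;=\; \sigma\,\frac{\rho\,k_m}{\mu}\,\left(p_m - p_f\right),

    where \sigma is a geometric shape factor, k_m the matrix permeability, \mu the fluid viscosity, \rho its density and p_m, p_f the continuum pressures; analogous terms drive heat and solute exchange along temperature and concentration gradients.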

  13. SDHA loss-of-function mutations in KIT-PDGFRA wild-type gastrointestinal stromal tumors identified by massively parallel sequencing.

    PubMed

    Pantaleo, Maria A; Astolfi, Annalisa; Indio, Valentina; Moore, Richard; Thiessen, Nina; Heinrich, Michael C; Gnocchi, Chiara; Santini, Donatella; Catena, Fausto; Formica, Serena; Martelli, Pier Luigi; Casadio, Rita; Pession, Andrea; Biasco, Guido

    2011-06-22

    Approximately 10%-15% of gastrointestinal stromal tumors (GISTs) in adults do not harbor any mutation in the KIT or PDGFRA genes (ie, KIT/PDGFRA wild-type GISTs). Recently, mutations in SDHB and SDHC (which encode succinate dehydrogenase subunits B and C, respectively) but not in SDHA and SDHD (which encode subunits A and D, respectively) were identified in KIT/PDGFRA wild-type GISTs. To search for novel pathogenic mutations, we sequenced the tumor transcriptome of two young adult patients who developed sporadic KIT/PDGFRA wild-type GISTs by using a massively parallel sequencing approach. The only variants identified as disease related by computational analysis were in SDHA. One patient carried the homozygous nonsense mutation p.Ser384X; the other patient was a compound heterozygote harboring a p.Arg31X nonsense mutation and a p.Arg589Trp missense mutation. The heterozygous nonsense mutations in both patients were present in germline DNA isolated from peripheral blood. Protein structure analysis indicates that all three mutations lead to functional inactivation of the protein. This is the first report, to our knowledge, that identifies SDHA inactivation as a common oncogenic event in GISTs that lack a mutation in KIT and PDGFRA.

  14. Performance of the UCAN2 Gyrokinetic Particle In Cell (PIC) Code on Two Massively Parallel Mainframes with Intel ``Sandy Bridge'' Processors

    NASA Astrophysics Data System (ADS)

    Leboeuf, Jean-Noel; Decyk, Viktor; Newman, David; Sanchez, Raul

    2013-10-01

    The massively parallel, 2D domain-decomposed, nonlinear, 3D, toroidal, electrostatic, gyrokinetic, Particle in Cell (PIC), Cartesian geometry UCAN2 code, with particle ions and adiabatic electrons, has been ported to two emerging mainframes. These two computers, one at NERSC in the US, built by Cray and named Edison, and the other at the Barcelona Supercomputer Center (BSC) in Spain, built by IBM and named MareNostrum III (MNIII), share the same Intel ``Sandy Bridge'' processors. The successful port of UCAN2 to MNIII, which came online first, has enabled us to be up and running efficiently in record time on Edison. Overall, the performance of UCAN2 on Edison is superior to that on MNIII, particularly at large numbers of processors (>1024) for the same Intel IFORT compiler. This appears to be due to different MPI modules (OpenMPI on MNIII and MPICH2 on Edison) and different interconnection networks (Infiniband on MNIII and Cray's Aries on Edison) on the two mainframes. Details of these ports and comparative benchmarks are presented. Work supported by OFES, USDOE, under contract no. DE-FG02-04ER54741 with the University of Alaska at Fairbanks.

  15. Touch imprint cytology with massively parallel sequencing (TIC-seq): a simple and rapid method to snapshot genetic alterations in tumors.

    PubMed

    Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao

    2016-12-01

    Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively parallel sequencing (TIC-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA was higher than that of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data from TIC DNA. Most (92%) somatic mutations in primary lung tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides.

  16. Helena, the hidden beauty: Resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.

    PubMed

    Bodner, Martin; Iuvaro, Alessandra; Strobl, Christina; Nagl, Simone; Huber, Gabriela; Pelotti, Susi; Pettener, Davide; Luiselli, Donata; Parson, Walther

    2015-03-01

    The analysis of mitochondrial (mt)DNA is a powerful tool in forensic genetics when nuclear markers fail to give results or maternal relatedness is investigated. The mtDNA control region (CR) contains highly condensed variation and is therefore routinely typed. Some samples exhibit an identical haplotype in this restricted range. Thus, they convey only weak evidence in forensic queries and limited phylogenetic information. However, a CR match does not imply that the mtDNA coding regions are also identical or that samples belong to the same phylogenetic lineage. This is especially the case for the most frequent West Eurasian CR haplotype 263G 315.1C 16519C, which is observed in various clades within haplogroup H and occurs at a frequency of 3-4% in many European populations. In this study, we investigated the power of massively parallel complete mtGenome sequencing in 29 Italian samples displaying the most common West Eurasian CR haplotype - and found unexpectedly high diversity. Twenty-eight different haplotypes falling into 19 described sub-clades of haplogroup H were revealed in the samples with identical CR sequences. This study demonstrates the benefit of complete mtGenome sequencing for forensic applications to enforce maximum discrimination, more comprehensive heteroplasmy detection, as well as highest phylogenetic resolution.

  17. Dielectrophoresis-assisted massively parallel cell pairing and fusion based on field constriction created by a micro-orifice array sheet.

    PubMed

    Kimura, Yuji; Gel, Murat; Techaumnat, Boonchai; Oana, Hidehiro; Kotera, Hidetoshi; Washizu, Masao

    2011-09-01

    In this paper, we present a novel electrofusion device that enables massive parallelism, using an electrically insulating sheet with a two-dimensional micro-orifice array. The sheet is sandwiched between a pair of micro-chambers with immersed electrodes, and each chamber is filled with a suspension of one of the two types of cells to be fused. Dielectrophoresis, assisted by sedimentation, is used to position the cells in the upper chamber down onto the orifices; the device is then flipped over to position the cells on the other side, so that cell pairs making contact in the orifices are formed. When a pulse voltage is applied to the electrodes, most of the voltage drop occurs around the orifice and is impressed on the cell membrane in the orifice. This makes it possible to apply a size-independent voltage that fuses the two cells in contact at every orifice, exclusively in a 1:1 manner. In the experiment, the cytoplasm of one of the cells is stained with a fluorescent dye, and the transfer of the fluorescence to the other cell is used as the indication of fusion events. The two-dimensional orifice arrangement at a pitch of 50 μm realizes simultaneous fusion of 6 × 10³ cells on a 4 mm diameter chip, and a fusion yield of 78-90% is achieved for various sizes and types of cells.

  18. Molecular diagnostic profiling of lung cancer specimens with a semiconductor-based massive parallel sequencing approach: feasibility, costs, and performance compared with conventional sequencing.

    PubMed

    Endris, Volker; Penzel, Roland; Warth, Arne; Muckenhuber, Alexander; Schirmacher, Peter; Stenzinger, Albrecht; Weichert, Wilko

    2013-11-01

    In the context of personalized oncology, screening for somatic tumor mutations is crucial for prediction of an individual patient's response to therapy. Massive parallel sequencing (MPS) has been suggested for routine diagnostics, but this technology has not been sufficiently evaluated with respect to feasibility, reliability, and cost effectiveness with routine diagnostic formalin-fixed, paraffin-embedded material. We performed ultradeep targeted semiconductor-based MPS (190 amplicons covering hotspot mutations in 46 genes) in a variety of formalin-fixed, paraffin-embedded diagnostic samples of lung adenocarcinoma tissue with known EGFR mutations (n = 28). The samples reflected the typical spectrum of tissue material for diagnostics, including small biopsies and samples with low tumor-cell content. Using MPS, we successfully sequenced all samples, with a mean read depth of 2947 reads per amplicon. High-quality sequence reads were obtained from samples containing ≥10% tumor material. In all but one sample, variant calling identified the same EGFR mutations as were detected by conventional Sanger sequencing. Moreover, we identified 43 additional mutations in 17 genes and detected amplifications in the EGFR and ERBB2 genes. MPS performance was reliable and independent of the type of material, as well as of the fixation and extraction methods, but was influenced by tumor-cell content and the degree of DNA degradation. Using sample multiplexing, focused MPS approached diagnostically acceptable cost rates.

  19. Development of ballistic hot electron emitter and its applications to parallel processing: active-matrix massive direct-write lithography in vacuum and thin films deposition in solutions

    NASA Astrophysics Data System (ADS)

    Koshida, N.; Kojima, A.; Ikegami, N.; Suda, R.; Yagi, M.; Shirakashi, J.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Nishino, H.; Yoshida, S.; Sugata, M.; Totsu, K.; Esashi, M.

    2015-03-01

    Making the best use of the characteristic features of the nanocrystalline Si (nc-Si) ballistic hot electron source, an alternative lithographic technology is presented based on two approaches: physical excitation in vacuum and chemical reduction in solutions. The nc-Si cold cathode is a kind of metal-insulator-semiconductor (MIS) diode, composed of a thin metal film, an nc-Si layer, an n+-Si substrate, and an ohmic back contact. Under a biased condition, energetic electrons are uniformly and directionally emitted through the thin surface electrodes. In vacuum, this emitter is available for active-matrix-driven massively parallel lithography. Arrayed 100×100 emitters (each size: 10×10 μm2, pitch: 100 μm) are fabricated on a silicon substrate by a conventional planar process, and every emitter is then bonded to an integrated complementary metal-oxide-semiconductor (CMOS) driver using through-silicon-via (TSV) interconnect technology. Electron multi-beams emitted from selected devices are focused by a micro-electro-mechanical system (MEMS) condenser lens array and introduced into an accelerating system with a demagnification factor of 100. The electron accelerating voltage is 5 kV. The designed size of each beam landing on the target is 10×10 nm2. Here we discuss the fabrication process of the emitter array with TSV holes, the implementation of the integrated active-matrix driver circuit, the bonding of these components, the construction of the electron optics, and the overall operation of the exposure system including the correction of possible aberrations. The experimental results of this mask-less parallel pattern transfer are shown in terms of simple 1:1 projection and parallel lithography under an active-matrix drive scheme. Another application is the use of this emitter as an active electrode supplying highly reducing electrons into solutions. A very small amount of metal-salt solution is dripped onto the nc-Si emitter surface, and the emitter is driven without

  20. Parallel inversion of a massive ERT data set to characterize deep vadose zone contamination beneath former nuclear waste infiltration galleries at the Hanford Site B-Complex (Invited)

    NASA Astrophysics Data System (ADS)

    Johnson, T.; Rucker, D. F.; Wellman, D.

    2013-12-01

    revealed the general footprint of vadose zone contamination beneath infiltration galleries. In 2011, the USDOE commissioned an effort to re-invert the B-Complex ERT data as a whole using a recently developed massively parallel 3D ERT inversion code. The computational mesh included approximately 1.085 million elements and closely honored the 37m of topographic relief as determined by LiDAR imaging. The water table and tank boundaries were also incorporated into the mesh to facilitate regularization disconnects, enabling sharp conductivity contrasts where they occur naturally without penalty. The data were inverted using 1024 processors, requiring 910 Gb of memory and 11.5 hours of computation time. The imaging results revealed previously unrealized detail concerning the distribution and behavior of contaminants migrating through the vadose zone, and are currently being used by site cleanup operators and regulators to understand the origin of a groundwater nitrate plume emerging from one of the infiltration galleries. The results overall demonstrate the utility of high performance computing, unstructured meshing, and custom regularization constraints for optimal processing of massive ERT data sets enabled by modern ERT survey hardware.

  1. Oestrogen deficiency after tubal ligation.

    PubMed

    Cattanach, J

    1985-04-13

    4 of 7 women who had undergone tubal ligation within the past seven years were found to have oestrogen excretion concentrations at ovulation below the tenth percentile. A disturbance in the oestrogen/progesterone ratio as a consequence of localised hypertension at the ovary, when the utero-ovarian arterial loop is occluded at tubal ligation, is proposed as a possible cause of oestrogen deficiency syndrome, dysfunctional uterine bleeding, and menorrhagia after tubal ligation. Similar pathophysiology may occur after hysterectomy with ovarian conservation.

  2. Influences of diurnal sampling bias on fixed-point monitoring of plankton biodiversity determined using a massively parallel sequencing-based technique.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urushizaki, Shingo; Onitsuka, Goh; Yasuike, Motoshige; Nakamura, Yoji; Fujiwara, Atushi; Tajimi, Seisuke; Kimoto, Katsunori; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

    2016-02-01

    In this study, we investigated the influence of diurnal sampling bias on the community structure of plankton by comparing the biodiversity among seawater samples (n=9) obtained every 3 h for 24 h using massively parallel sequencing (MPS)-based plankton monitoring at a fixed point at Himedo seaport in Yatsushiro Sea, Japan. The number of raw operational taxonomic units (OTUs) and OTUs after re-sampling was 507-658 (558 ± 104, mean ± standard deviation) and 448-544 (467 ± 81), respectively, indicating high plankton biodiversity at the sampling location. The relative abundance of the top 20 OTUs in the samples from Himedo seaport was 48.8-67.7% (58.0 ± 5.8%), and the highest-ranked OTU was a Pseudo-nitzschia species (Bacillariophyta) with a relative abundance of 17.3-39.2%, followed by Oithona sp. 1 and Oithona sp. 2 (Arthropoda). During seawater sampling, the semidiurnal tidal current with an amplitude of 0.3 m/s was dominant, and a westward residual current driven by the northeasterly wind was continuously observed during the 24-h monitoring. Therefore, the relative abundance of plankton species apparently fluctuated among the samples, but no significant difference was noted according to the G-test (p>0.05). Significant differences were observed for samples obtained from a different locality (Kusuura in Yatsushiro Sea) and at different dates, suggesting that the influence of diurnal sampling bias on plankton diversity, as determined using the MPS-based survey, was not significant and is acceptable.
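
    The G-test referred to above is the log-likelihood-ratio variant of the chi-square test applied to an OTU count table; in Python it is available through SciPy's lambda_="log-likelihood" option. The counts below are invented for illustration.

      import numpy as np
      from scipy.stats import chi2_contingency

      # Hypothetical OTU read counts (rows: OTUs, columns: two sampling times)
      counts = np.array([[1730, 1815],    # e.g. Pseudo-nitzschia sp.
                         [ 640,  598],    # Oithona sp. 1
                         [ 412,  455],    # Oithona sp. 2
                         [ 980, 1012]])   # remaining OTUs pooled

      g, p, dof, expected = chi2_contingency(counts, lambda_="log-likelihood")
      print(f"G = {g:.2f}, dof = {dof}, p = {p:.3f}")  # p > 0.05: no sig. diff.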

  3. Uncooled digital IRFPA-family with 17μm pixel-pitch based on amorphous silicon with massively parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, D.; Hochschulz, F.; Würfel, D.; Lerch, R.; Geruschke, T.; Wall, S.; Heß, J.; Wang, Q.; Vogt, H.

    2014-06-01

    This paper presents the results of an advanced digital IRFPA-family developed by Fraunhofer IMS. The IRFPA-family comprises two different optical resolutions, VGA (640 × 480 pixels) and QVGA (320 × 240 pixels), using a pin-compatible detector board. The uncooled IRFPAs are designed for thermal imaging applications in the LWIR (8 .. 14 μm) range with a full-frame frequency of 30 Hz and a high thermal sensitivity. The microbolometer, with a pixel pitch of 17 μm, uses amorphous silicon as the sensing layer. By scaling and optimizing our previous microbolometer technology with a pixel pitch of 25 μm, we enhance the thermal sensitivity of the microbolometer. The microbolometers are read out by a novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs. This results in a direct digital conversion of the resistance change of the microbolometer induced by incident infrared radiation. To reduce production costs, a chip-scale package is used as the vacuum package. This vacuum package consists of an IR-transparent window with an antireflection coating and a soldering frame which is fixed by a wafer-to-chip process directly on top of the CMOS substrate. The chip-scale package is placed onto a detector board by a chip-on-board technique. The IRFPAs are completely fabricated at Fraunhofer IMS on 8" CMOS wafers with an additional surface micromachining process. In this paper, the architecture of the readout electronics, the packaging, and the electro-optical performance characterization are presented.
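
    The readout principle can be sketched with a first-order sigma-delta modulator in software: the (slowly varying) bolometer signal is integrated against a 1-bit feedback, and averaging the resulting bitstream recovers a multi-bit digital value. This is the textbook modulator, not Fraunhofer IMS's actual circuit.

      import numpy as np

      def sigma_delta_first_order(x):
          """1-bit first-order sigma-delta modulation of a signal in [-1, 1]."""
          integ, fb, bits = 0.0, 0.0, []
          for v in x:
              integ += v - fb                # integrate the quantization error
              bit = 1 if integ >= 0 else 0   # 1-bit quantizer
              fb = 1.0 if bit else -1.0      # 1-bit feedback DAC
              bits.append(bit)
          return np.array(bits)

      level = 0.37                           # steady "bolometer" input level
      bits = sigma_delta_first_order(np.full(4096, level))
      decoded = 2 * bits.mean() - 1          # decimation by simple averaging
      print(f"input {level}, decoded {decoded:.4f}")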

  4. Improvements of a digital 25 μm pixel-pitch uncooled amorphous silicon TEC-less VGA IRFPA with massively parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, Dirk; Ruß, Marco; Würfel, Daniel; Lerch, Renee; Yang, Pin; Bauer, Jochen; Heß, Jennifer; Kropelnicki, Piotr; Vogt, Holger

    2011-06-01

    This paper presents the improvements of an advanced digital VGA-IRFPA developed by Fraunhofer-IMS. The uncooled IRFPA is designed for thermal imaging applications in the LWIR (8 .. 14 μm) range with a full-frame frequency of 30 Hz and a high sensitivity with NETD < 100 mK @ f/1. The microbolometer, with a pixel pitch of 25 μm, uses amorphous silicon as the sensing layer. The structure of the microbolometer has been optimized for better performance compared to the first-generation IRFPA. The thermal isolation has been doubled by increasing the length and decreasing the width of the legs. To increase the fill factor, the contact areas have been reduced. The microbolometers are read out by a novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs. This results in a direct digital conversion of the resistance change of the microbolometer induced by incident infrared radiation. Two different solutions for the vacuum package have been developed. To reduce production costs, a chip-scale package is used. This vacuum package consists of an IR-transparent window with an antireflection coating and a soldering frame which is fixed by a wafer-to-chip process directly on top of the readout substrate. An alternative solution based on a standard ceramic package is utilized as a vacuum package for high-performance applications. The IRFPAs are completely fabricated at Fraunhofer-IMS on 8" CMOS wafers with an additional surface micromachining process.

  5. Quaternary Morphodynamics of Fluvial Dispersal Systems Revealed: The Fly River, PNG, and the Sunda Shelf, SE Asia, simulated with the Massively Parallel GPU-based Model 'GULLEM'

    NASA Astrophysics Data System (ADS)

    Aalto, R. E.; Lauer, J. W.; Darby, S. E.; Best, J.; Dietrich, W. E.

    2015-12-01

    During glacial-marine transgressions vast volumes of sediment are deposited due to the infilling of lowland fluvial systems and shallow shelves, material that is removed during ensuing regressions. Modelling these processes would illuminate system morphodynamics, fluxes, and 'complexity' in response to base level change, yet such problems are computationally formidable. Environmental systems are characterized by strong interconnectivity, yet traditional supercomputers have slow inter-node communication -- whereas rapidly advancing Graphics Processing Unit (GPU) technology offers vastly higher (>100x) bandwidths. GULLEM (GpU-accelerated Lowland Landscape Evolution Model) employs massively parallel code to simulate coupled fluvial-landscape evolution for complex lowland river systems over large temporal and spatial scales. GULLEM models the accommodation space carved/infilled by representing a range of geomorphic processes, including: river & tributary incision within a multi-directional flow regime, non-linear diffusion, glacial-isostatic flexure, hydraulic geometry, tectonic deformation, sediment production, transport & deposition, and full 3D tracking of all resulting stratigraphy. Model results concur with the Holocene dynamics of the Fly River, PNG -- as documented with dated cores, sonar imaging of floodbasin stratigraphy, and the observations of topographic remnants from LGM conditions. Other supporting research was conducted along the Mekong River, the largest fluvial system of the Sunda Shelf. These and other field data provide tantalizing empirical glimpses into the lowland landscapes of large rivers during glacial-interglacial transitions, observations that can be explored with this powerful numerical model. GULLEM affords estimates for the timing and flux budgets within the Fly and Sunda Systems, illustrating complex internal system responses to the external forcing of sea level and climate. Furthermore, GULLEM can be applied to most ANY fluvial system to

  6. A Genomic and Protein-Protein Interaction Analyses of Nonsyndromic Hearing Impairment in Cameroon Using Targeted Genomic Enrichment and Massively Parallel Sequencing.

    PubMed

    Lebeko, Kamogelo; Manyisa, Noluthando; Chimusa, Emile R; Mulder, Nicola; Dandara, Collet; Wonkam, Ambroise

    2017-02-01

    Hearing impairment (HI) is one of the leading causes of disability in the world, impacting the social, economic, and psychological well-being of the affected individual. This is particularly true in sub-Saharan Africa, which carries one of the highest burdens of this condition. Despite this, there are limited data on the most prevalent genes or mutations that cause HI among sub-Saharan Africans. Next-generation technologies, such as targeted genomic enrichment and massively parallel sequencing, offer new promise in this context. This study reports, for the first time to the best of our knowledge, on the prevalence of novel mutations identified through a platform of 116 HI genes (OtoSCOPE®) among 82 African probands with HI. Only the variants OTOF NM_194248.2:c.766-2A>G and MYO7A NM_000260.3:c.1996C>T, p.Arg666Stop were found, in 3 (3.7%) and 5 (6.1%) patients, respectively. In addition, and uniquely, the analysis of protein-protein interactions (PPI) through interrogation of gene subnetworks, using a custom script, two databases (Enrichr and PANTHER), and an algorithm in the igraph package of R, identified enrichment of sensory perception and mechanical stimulus biological processes; the most significant molecular functions of these variants pertained to binding or structural activity. Furthermore, 10 genes (MYO7A, MYO6, KCTD3, NUMA1, MYH9, KCNQ1, UBC, DIAPH1, PSMC2, and RDX) were identified as significant hubs within the subnetworks. The results reveal that the novel variants identified among familial cases of HI in Cameroon are not common, and the PPI analysis highlights the role of 10 genes, potentially important in understanding HI genomics among Africans.

  7. Assessing quality standards for ChIP-seq and related massive parallel sequencing-generated datasets: When rating goes beyond avoiding the crisis.

    PubMed

    Mendoza-Parra, Marco Antonio; Gronemeyer, Hinrich

    2014-12-01

    Massive parallel DNA sequencing combined with chromatin immunoprecipitation and a large variety of DNA/RNA-enrichment methodologies is at the origin of data resources of major importance. Indeed, these resources, available for multiple genomes, represent the most comprehensive catalogue of (i) cell-, development- and signal transduction-specified patterns of binding sites for transcription factors ('cistromes') and for transcription and chromatin modifying machineries and (ii) the patterns of specific local post-translational modifications of histones and DNA ('epigenome') or of regulatory chromatin binding factors. In addition, (iii) resources specifying chromatin structure alterations are emerging. Importantly, these types of "omics" datasets increasingly populate public repositories and provide highly valuable resources for the exploration of general principles of cell function in a multi-dimensional genome-transcriptome-epigenome-chromatin structure context. However, data mining is critically dependent on data quality, an issue that, surprisingly, is still largely ignored by scientists and well-financed consortia, data repositories and scientific journals. So what determines the quality of ChIP-seq experiments and the datasets generated therefrom, and what keeps scientists from associating quality criteria with their data? In this 'opinion' we trace the various parameters that influence the quality of this type of dataset, as well as the computational efforts made so far to qualify them. Moreover, we describe a universal quality control (QC) certification approach that provides a quality rating for ChIP-seq and enrichment-related assays. The corresponding QC tool and a regularly updated database, from which at present the quality parameters of more than 8000 datasets can be retrieved, are freely accessible at www.ngs-qc.org.

  8. Massively parallel sequencing of phyllodes tumours of the breast reveals actionable mutations, and TERT promoter hotspot mutations and TERT gene amplification as likely drivers of progression

    PubMed Central

    Piscuoglio, Salvatore; Ng, Charlotte K Y; Murray, Melissa; Burke, Kathleen A; Edelweiss, Marcia; Geyer, Felipe C; Macedo, Gabriel S; Inagaki, Akiko; Papanastasiou, Anastasios D; Martelotto, Luciano G; Marchio, Caterina; Lim, Raymond S; Ioris, Rafael A; Nahar, Pooja K; De Bruijn, Ino; Smyth, Lillian; Akram, Muzaffar; Ross, Dara; Petrini, John H; Norton, Larry; Solit, David B; Baselga, Jose; Brogi, Edi; Ladanyi, Marc; Weigelt, Britta; Reis-Filho, Jorge S

    2015-01-01

    Phyllodes tumours (PTs) are breast fibroepithelial lesions that are graded based on histological criteria as benign, borderline or malignant. PTs may recur locally. Borderline PTs and malignant PTs may metastasize to distant sites. Breast fibroepithelial lesions, including PTs and fibroadenomas, are characterized by recurrent MED12 exon 2 somatic mutations. We sought to define the repertoire of somatic genetic alterations in PTs and whether these may assist in the differential diagnosis of these lesions. We collected 100 fibroadenomas, 40 benign PTs, 14 borderline PTs and 22 malignant PTs. Six benign, six borderline and 13 malignant PTs and their matched normal tissues were subjected to targeted massively parallel sequencing (MPS) using the MSK-IMPACT sequencing assay. Recurrent MED12 mutations were found in 56% of PTs; in addition, mutations affecting cancer genes (e.g. TP53, RB1, SETD2 and EGFR) were exclusively detected in borderline and malignant PTs. We found a novel recurrent clonal hotspot mutation in the TERT promoter (−124 C>T) in 52% and TERT gene amplification in 4% of PTs. Laser capture microdissection revealed that these mutations were restricted to the mesenchymal component of PTs. Sequencing analysis of the entire cohort revealed that the frequency of TERT alterations increased from benign (18%) to borderline (57%) and to malignant PTs (68%; P<0.01), and TERT alterations were associated with increased levels of TERT mRNA (P<0.001). No TERT alterations were observed in fibroadenomas. An analysis of TERT promoter sequencing and gene amplification distinguished PTs from fibroadenomas with a sensitivity and a positive predictive value of 100% (CI 95.38%–100%) and 100% (CI 85.86%–100%), respectively, and a sensitivity and a negative predictive value of 39% (CI 28.65%–51.36%) and 68% (CI 60.21%–75.78%), respectively. Our results suggest that TERT alterations may drive the progression of PTs and may assist in the differential
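
    For reference, the diagnostic figures quoted above follow the standard confusion-matrix definitions, with TP/FP/TN/FN the true/false positives and negatives:

      \text{sensitivity} = \frac{TP}{TP + FN}, \qquad
      \text{PPV} = \frac{TP}{TP + FP}, \qquad
      \text{NPV} = \frac{TN}{TN + FN}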

  9. Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

    PubMed

    Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

    2016-05-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with the currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established
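
    One way to picture a sequence-based allele descriptor is a bracketed repeat-motif string. The sketch below collapses a raw repeat region into that notation; it is a simplified illustration of the idea, not the ISFG nomenclature standard, and the motif length and example sequence are hypothetical.

        def bracket_notation(seq, motif_len=4):
            """Collapse an STR sequence into bracketed repeat notation,
            e.g. 'TCTATCTATCTATCTGTCTG' -> '[TCTA]3 [TCTG]2'."""
            out, i = [], 0
            while i < len(seq):
                motif = seq[i:i + motif_len]
                n = 1
                while seq[i + n * motif_len : i + (n + 1) * motif_len] == motif:
                    n += 1
                out.append(f"[{motif}]{n}" if n > 1 else motif)
                i += n * motif_len
            return " ".join(out)

        print(bracket_notation("TCTATCTATCTATCTGTCTG"))   # [TCTA]3 [TCTG]2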

  10. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle and optical properties of materials and nanostructures

    NASA Astrophysics Data System (ADS)

    Deslippe, Jack; Samsonidze, Georgy; Strubbe, David A.; Jain, Manish; Cohen, Marvin L.; Louie, Steven G.

    2012-06-01

    BerkeleyGW is a massively parallel computational package for electron excited-state properties that is based on the many-body perturbation theory employing the ab initio GW and GW plus Bethe-Salpeter equation methodology. It can be used in conjunction with many density-functional theory codes for ground-state properties, including PARATEC, PARSEC, Quantum ESPRESSO, SIESTA, and Octopus. The package can be used to compute the electronic and optical properties of a wide variety of material systems from bulk semiconductors and metals to nanostructured materials and molecules. The package scales to 10 000s of CPUs and can be used to study systems containing up to 100s of atoms.
    Program summary:
    Program title: BerkeleyGW
    Catalogue identifier: AELG_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELG_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Open source BSD License. See code for licensing details.
    No. of lines in distributed program, including test data, etc.: 576 540
    No. of bytes in distributed program, including test data, etc.: 110 608 809
    Distribution format: tar.gz
    Programming language: Fortran 90, C, C++, Python, Perl, BASH
    Computer: Linux/UNIX workstations or clusters
    Operating system: Tested on a variety of Linux distributions in parallel and serial, as well as AIX and Mac OS X
    RAM: 50-2000 MB per CPU (highly dependent on system size)
    Classification: 7.2, 7.3, 16.2, 18
    External routines: BLAS, LAPACK, FFTW, ScaLAPACK (optional), MPI (optional). All available under open-source licenses.
    Nature of problem: The excited state properties of materials involve the addition or subtraction of electrons as well as the optical excitations of electron-hole pairs. The excited particles interact strongly with other electrons in a material system. This interaction affects the electronic energies, wavefunctions and lifetimes. It is well known that ground-state theories, such as standard methods

  11. NON-CONFORMING FINITE ELEMENTS; MESH GENERATION, ADAPTIVITY AND RELATED ALGEBRAIC MULTIGRID AND DOMAIN DECOMPOSITION METHODS IN MASSIVELY PARALLEL COMPUTING ENVIRONMENT

    SciTech Connect

    Lazarov, R; Pasciak, J; Jones, J

    2002-02-01

    Construction, analysis and numerical testing of efficient solution techniques for solving elliptic PDEs that allow for parallel implementation have been the focus of the research. A number of discretization and solution methods for solving second-order elliptic problems, including mortar and penalty approximations and domain decomposition methods for finite elements and finite volumes, have been investigated and analyzed. Techniques for parallel domain decomposition algorithms in the framework of PETSc and HYPRE have been studied and tested. Hierarchical parallel grid refinement and adaptive solution methods have been implemented and tested on various model problems. A parallel code implementing the mortar method with algebraically constructed multiplier spaces was developed.

  12. Implementation of molecular dynamics and its extensions with the coarse-grained UNRES force field on massively parallel systems; towards millisecond-scale simulations of protein structure, dynamics, and thermodynamics.

    PubMed

    Liwo, Adam; Ołdziej, Stanisław; Czaplewski, Cezary; Kleinerman, Dana S; Blood, Philip; Scheraga, Harold A

    2010-03-09

    We report the implementation of our united-residue UNRES force field for simulations of protein structure and dynamics with massively parallel architectures. In addition to the coarse-grained parallelism already implemented in our previous work, in which each conformation was treated by a different task, we introduce a fine-grained level in which energy and gradient evaluation are split between several tasks. The Message Passing Interface (MPI) libraries have been utilized to construct the parallel code. The parallel performance of the code has been tested on a professional Beowulf cluster (Xeon Quad Core), a Cray XT3 supercomputer, and two IBM BlueGene/P supercomputers with canonical and replica-exchange molecular dynamics. With IBM BlueGene/P, about 50% efficiency and a 120-fold speed-up of the fine-grained part were achieved for a single trajectory of a 767-residue protein with the use of 256 processors/trajectory. Because of averaging over the fast degrees of freedom, UNRES provides an effective 1000-fold speed-up compared to the experimental time scale and, therefore, enables us to effectively carry out millisecond-scale simulations of proteins with 500 and more amino-acid residues in days of wall-clock time.
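
    The fine-grained level lends itself to a simple pattern: each MPI task evaluates a strided subset of energy terms and a reduction combines them. The sketch below illustrates that pattern with mpi4py; energy_term() is a hypothetical stand-in, not a UNRES routine.

        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        def energy_term(i, conformation):
            # Hypothetical stand-in for evaluating one term of the force field.
            return 0.01 * i

        n_terms = 10_000
        # Each rank takes a strided slice of the terms for one conformation.
        local = sum(energy_term(i, None) for i in range(rank, n_terms, size))
        total = comm.reduce(local, op=MPI.SUM, root=0)   # combine fine-grained pieces
        if rank == 0:
            print("total energy:", total)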

  13. [Post-tubal ligation syndrome].

    PubMed

    Satoh, K; Osada, H

    1993-01-01

    Post-tubal ligation syndrome includes pain during intercourse, aching lower back, premenstrual tension syndrome, difficulty in menstruating, uterine hemorrhage, and absence of menstruation. The syndrome is caused by blood circulation problems in and around the Fallopian tubes and ovaries, pressure on nerves, and intrapelvic adhesion. In diagnosis, it is very important to differentiate this syndrome from endometritis, and to distinguish functional hemorrhage due to hormonal abnormality from anatomical hemorrhage due to a polyp or tumor. Since the symptoms of this syndrome are mild, simple symptomatic treatment is sufficient in most cases. In some cases, however, desquamation surgery or reversal of tubal ligation may be necessary. Endoscopic surgery is also available. In Japan, because of the widespread use of condoms and IUDs, tubal ligation is not very common.

  14. Parallel processing ITS

    SciTech Connect

    Fan, W.C.; Halbleib, J.A. Sr.

    1996-09-01

    This report provides a users' guide for parallel processing ITS on a UNIX workstation network, a shared-memory multiprocessor or a massively parallel processor. The parallelized version of ITS is based on a master/slave model with message passing. Parallel issues such as random number generation, load balancing, and communication software are briefly discussed. Timing results for example problems are presented for demonstration purposes.
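
    A minimal sketch of the master/slave pattern described here, with independent random-number streams per batch of Monte Carlo histories. This is a generic illustration using Python's multiprocessing, not the ITS code itself; the "history" being scored is hypothetical.

        import random
        from multiprocessing import Pool

        def run_batch(args):
            seed, n_histories = args
            rng = random.Random(seed)            # independent RNG stream per batch
            # Hypothetical "history": tally 1 if the particle survives a 50% test.
            return sum(1 for _ in range(n_histories) if rng.random() < 0.5)

        if __name__ == "__main__":
            batches = [(seed, 1000) for seed in range(8)]    # distinct seeds per batch
            with Pool(processes=4) as pool:                  # 4 slaves; Pool balances load
                tallies = pool.map(run_batch, batches)
            print("estimated survival fraction:", sum(tallies) / 8000)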

  15. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by semi-randomly varying routing policies for different packets

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-23

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Nodes vary a choice of routing policy for routing data in the network in a semi-random manner, so that similarly situated packets are not always routed along the same path. Semi-random variation of the routing policy tends to avoid certain local hot spots of network activity, which might otherwise arise using more consistent routing determinations. Preferably, the originating node chooses a routing policy for a packet, and all intermediate nodes in the path route the packet according to that policy. Policies may be rotated on a round-robin basis, selected by generating a random number, or otherwise varied.
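
    The abstract's policy-selection idea reduces to a small amount of logic at the originating node. Below is a hedged sketch of per-packet policy choice by round-robin rotation or random selection; the policy names and packet fields are hypothetical, not taken from the patent.

        import itertools
        import random

        POLICIES = ["x_first", "y_first", "z_first", "adaptive"]   # hypothetical names
        _rr = itertools.cycle(POLICIES)

        def choose_policy(mode):
            # Round-robin rotation or random selection, per the abstract's examples.
            return next(_rr) if mode == "round_robin" else random.choice(POLICIES)

        def send(packet, mode="random"):
            packet["policy"] = choose_policy(mode)   # fixed for the packet's whole
            return packet                            # path; intermediate nodes honor it

        print(send({"dst": (3, 1, 2)}, "round_robin"))
        print(send({"dst": (3, 1, 2)}))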

  16. The post-tubal ligation syndrome.

    PubMed

    Faber, E; Rocko, J M; Timmes, J J; Zolli, A F

    1981-01-01

    The frequency of symptoms following tubal ligation calls for an examination of the basic problem with the methods now used. This discussion recommends a modification of tubal ligation which, as performed during the past 2-1/2 years, has been symptom-free postoperatively. Symptom-free here refers to those symptoms which can be directly related to tubal ligation. Symptomatology is complex and insidious. Characteristically, there is a latent period of no symptoms. This asymptomatic period may be totally subjective and may last several years, during which time the correlation between surgery and symptoms is obscured. This is particularly the case if purely symptomatic therapy has had some degree of success. The latent period is followed by the gradual development of the following: menstrual disorders; abdominal pain which is usually located in the lower abdomen and is of 2 varieties, i.e., dysmenorrhea and nonmenstrual pain; and infection. Physical examination demonstrates little. This set of symptoms, also documented by Poma et al., when taken as a whole constitutes a syndrome which should be termed the post-tubal ligation syndrome. These patients give a history of repeat X-rays, biopsies, endoscopies, and surgical exploration. Some of these patients have had 4 or 5 celiotomies. A modification of the traditional method of tubal ligation definitely requires consideration. The characteristics of the oviducts which need mention and emphasis are reviewed. On the basis of the reviewed considerations, it becomes obvious that smooth transport of the ovum is a necessity and that obstruction in the tubes will cause a reaction similar to obstruction anywhere in the body. Tubal ligation should be performed in such a manner so as not to obstruct the ova from passing down the tube. The tubes should be cut fairly close to the uterus and be tied. The rest of the tube from fimbria to the isthmus should be left open. In this manner, the ovum passes into the

  17. rf resistance and inductance of massively parallel single walled carbon nanotubes: Direct, broadband measurements and near perfect 50 Ω impedance matching

    NASA Astrophysics Data System (ADS)

    Rutherglen, Chris; Jain, Dheeraj; Burke, Peter

    2008-08-01

    We report using dielectrophoresis to accumulate hundreds to thousands of solubilized single-walled carbon nanotubes in parallel to achieve impedance values very close to 50 Ω. This allows us to clearly measure the real (resistive) and imaginary (inductive) impedance over a broad frequency range. We find a negligible to mild frequency-dependent resistance for the devices and an imaginary impedance that is significantly smaller than the resistance over the range of dc to 20 GHz. This clearly and unambiguously demonstrates that kinetic inductance is not the major issue facing nanotube array interconnects, when compared to the real impedance (the resistance).
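
    A back-of-envelope check on why so many tubes are needed for matching, assuming the textbook quantum-limited resistance of a single metallic nanotube, h/(4e^2) ≈ 6.5 kΩ. The numbers are illustrative, not the paper's measurements:

        # N identical tubes in parallel divide the single-tube resistance by N.
        R_TUBE = 6.5e3                       # ohms; idealized h/(4e^2) lower bound
        for n in (10, 130, 1000):
            print(f"{n:5d} tubes -> {R_TUBE / n:8.1f} ohm")
        # ~130 ideal tubes already reach ~50 ohm; contact resistance and tube-quality
        # spread push the practical count into the hundreds-to-thousands quoted above.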

  18. [Latex ligation in treatment of chronic hemorrhoids].

    PubMed

    Ektov, V N; Somov, K A

    2015-01-01

    We analyzed the results of treatment of 432 patients with chronic hemorrhoids using different variants of latex ligation. A new technique is developed and suggested that includes ligation of the mucosa and submucosa of the low-ampullar rectum, providing ligation of the hemorrhoidal vessels, lifting, and recto-anal repair. This method is advisable for stage I and II chronic internal hemorrhoids. The authors recommend simultaneous combined ligation of the mucosa of the low-ampullar rectum and internal hemorrhoids for stages III and IV. Different variants of latex ligation with external hemorrhoid excision were used in 103 patients. These variants of latex ligation preserve important advantages, including minimal invasiveness, simplicity, wide availability, and low cost. Good long-term results were obtained after these procedures in 87.3% of cases. The suggested tactics extend the use of latex ligation and increase its effectiveness in the treatment of different stages and forms of chronic hemorrhoids.

  19. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    NASA Astrophysics Data System (ADS)

    Gasper, F.; Goergen, K.; Kollet, S.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.

    2014-06-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing non-linear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm and require memory and load balancing considerations in the exchange of the coupling fields between different component models and in the allocation of computational resources, respectively. These considerations can be addressed with advanced profiling and tracing tools, leading to the efficient use of massively parallel computing environments, which is then mainly determined by the parallel performance of the individual component models. However, the problem of model I/O and initialization in the peta-scale range requires major attention, because it constitutes a true, and still unsolved, big data challenge in the perspective of future exa-scale capabilities.

  20. The Application of a Massively Parallel Computer to the Simulation of Electrical Wave Propagation Phenomena in the Heart Muscle Using Simplified Models

    NASA Technical Reports Server (NTRS)

    Karpoukhin, Mikhii G.; Kogan, Boris Y.; Karplus, Walter J.

    1995-01-01

    The simulation of heart arrhythmia and fibrillation is a very important and challenging task. The solution of these problems using sophisticated mathematical models is beyond the capabilities of modern supercomputers. To overcome these difficulties it is proposed to break the whole simulation problem into two tightly coupled stages: generation of the action potential using sophisticated models, and propagation of the action potential using simplified models. The well-known simplified models are compared and modified to bring the rate of depolarization and the action potential duration restitution closer to reality. The modified method of lines is used to parallelize the computational process. The conditions for the appearance of 2D spiral waves after the application of a premature beat and the subsequent traveling of the spiral wave inside the simulated tissue are studied.

  1. Massive Stars

    NASA Astrophysics Data System (ADS)

    Livio, Mario; Villaver, Eva

    2009-11-01

    Participants; Preface Mario Livio and Eva Villaver; 1. High-mass star formation by gravitational collapse of massive cores M. R. Krumholz; 2. Observations of massive star formation N. A. Patel; 3. Massive star formation in the Galactic center D. F. Figer; 4. An X-ray tour of massive star-forming regions with Chandra L. K. Townsley; 5. Massive stars: feedback effects in the local universe M. S. Oey and C. J. Clarke; 6. The initial mass function in clusters B. G. Elmegreen; 7. Massive stars and star clusters in the Antennae galaxies B. C. Whitmore; 8. On the binarity of Eta Carinae T. R. Gull; 9. Parameters and winds of hot massive stars R. P. Kudritzki and M. A. Urbaneja; 10. Unraveling the Galaxy to find the first stars J. Tumlinson; 11. Optically observable zero-age main-sequence O stars N. R. Walborn; 12. Metallicity-dependent Wolf-Rayet winds P. A. Crowther; 13. Eruptive mass loss in very massive stars and Population III stars N. Smith; 14. From progenitor to afterlife R. A. Chevalier; 15. Pair-production supernovae: theory and observation E. Scannapieco; 16. Cosmic infrared background and Population III: an overview A. Kashlinsky.

  2. Assessment of HaloPlex amplification for sequence capture and massively parallel sequencing of arrhythmogenic right ventricular cardiomyopathy-associated genes.

    PubMed

    Gréen, Anna; Gréen, Henrik; Rehnberg, Malin; Svensson, Anneli; Gunnarsson, Cecilia; Jonasson, Jon

    2015-01-01

    The genetic basis of arrhythmogenic right ventricular cardiomyopathy (ARVC) is complex. Mutations in genes encoding components of the cardiac desmosomes have been implicated as being causally related to ARVC. Next-generation sequencing allows parallel sequencing and duplication/deletion analysis of many genes simultaneously, which is appropriate for screening of mutations in disorders with heterogeneous genetic backgrounds. We designed and validated a next-generation sequencing test panel for ARVC using HaloPlex. We used SureDesign to prepare a HaloPlex enrichment system for sequencing of DES, DSC2, DSG2, DSP, JUP, PKP2, RYR2, TGFB3, TMEM43, and TTN from patients with ARVC using a MiSeq instrument. Performance characteristics were determined by comparison with Sanger, as the gold standard, and TruSeq Custom Amplicon sequencing of DSC2, DSG2, DSP, JUP, and PKP2. All the samples were successfully sequenced after HaloPlex capture, with >99% of targeted nucleotides covered by >20×. The sequences were of high quality, although one problematic area due to a presumptive context-specific sequencing error-causing motif located in exon 1 of the DSP gene was detected. The mutations found by Sanger sequencing were also found using the HaloPlex technique. Depending on the bioinformatics pipeline, sensitivity varied from 99.3% to 100%, and specificity varied from 99.9% to 100%. Three variant positions found by Sanger and HaloPlex sequencing were missed by TruSeq Custom Amplicon owing to loss of coverage.

  3. PORTA: A three-dimensional multilevel radiative transfer code for modeling the intensity and polarization of spectral lines with massively parallel computers

    NASA Astrophysics Data System (ADS)

    Štěpán, Jiří; Trujillo Bueno, Javier

    2013-09-01

    The interpretation of the intensity and polarization of the spectral line radiation produced in the atmosphere of the Sun and of other stars requires solving a radiative transfer problem that can be very complex, especially when the main interest lies in modeling the spectral line polarization produced by scattering processes and the Hanle and Zeeman effects. One of the difficulties is that the plasma of a stellar atmosphere can be highly inhomogeneous and dynamic, which implies the need to solve the non-equilibrium problem of the generation and transfer of polarized radiation in realistic three-dimensional (3D) stellar atmospheric models. Here we present PORTA, an efficient multilevel radiative transfer code we have developed for the simulation of the spectral line polarization caused by scattering processes and the Hanle and Zeeman effects in 3D models of stellar atmospheres. The numerical method of solution is based on the non-linear multigrid iterative method and on a novel short-characteristics formal solver of the Stokes-vector transfer equation which uses monotonic Bézier interpolation. Therefore, with PORTA the computing time needed to obtain at each spatial grid point the self-consistent values of the atomic density matrix (which quantifies the excitation state of the atomic system) scales linearly with the total number of grid points. Another crucial feature of PORTA is its parallelization strategy, which allows us to speed up the numerical solution of complicated 3D problems by several orders of magnitude with respect to sequential radiative transfer approaches, given its excellent linear scaling with the number of available processors. The PORTA code can also be conveniently applied to solve the simpler 3D radiative transfer problem of unpolarized radiation in multilevel systems.

  4. Parallel computing works

    SciTech Connect

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  5. A novel DNA deletion-ligation reaction catalyzed in vitro by a developmentally controlled activity from Tetrahymena cells.

    PubMed

    Robinson, E K; Cohen, P D; Blackburn, E H

    1989-09-08

    Developmentally controlled genomic deletion-ligations occur during ciliate macronuclear differentiation. We have identified a novel activity in Tetrahymena cell-free extracts that efficiently catalyzes a specific set of intramolecular DNA deletion-ligation reactions. When synthetic DNA oligonucleotide substrates were used, all the deletion-ligation products resembled those formed in vivo in that they resulted from deletions between pairs of short direct repeats. The reaction is ATP-dependent, salt-sensitive, and strongly influenced by the oligonucleotide substrate sequence. The deletion-ligation activity has an apparent size of 200-500 kd, no nuclease-sensitive component, and is highly enriched in cells developing new macronuclei. The temperature inactivation profile of the activity parallels the temperature lethality profile specific for Tetrahymena cells developing new macronuclei. We suggest that this deletion-ligation activity carries out the genomic deletions in developing macronuclei in vivo.

  6. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP v1.0) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    NASA Astrophysics Data System (ADS)

    Gasper, F.; Goergen, K.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.; Kollet, S.

    2014-10-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing nonlinear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP v1.0) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm using the OASIS suite of external couplers, and require memory and load balancing considerations in the exchange of the coupling fields between different component models and the allocation of computational resources, respectively. Using the advanced profiling and tracing tool Scalasca to determine an optimum load balancing leads to a 19% speedup. In massively parallel supercomputer environments, the coupler OASIS-MCT is recommended, which resolves memory limitations that may be significant in case of very large computational domains and exchange fields as they occur in these specific test cases and in many applications in terrestrial research. However, model I/O and initialization in the petascale range still require major attention, as they constitute true big data challenges in light of future exascale computing resources. Based on a factor-two speedup due to compiler optimizations, a refactored coupling interface using OASIS-MCT and an optimum load balancing, the problem size in a weak scaling study can be increased by a factor of 64 from 512 to 32 768 processes while maintaining parallel efficiencies above 80% for the component models.

  7. Massive Hemoptysis.

    PubMed

    Rali, Parth; Gandhi, Viral; Tariq, Cheema

    2016-01-01

    Hemoptysis, or coughing of blood, oftentimes triggers anxiety and fear for patients. The etiology of hemoptysis will determine the clinical course, which ranges from watchful waiting to intensive care admission. Any amount of hemoptysis that compromises the patient's respiratory status is considered massive hemoptysis and should be treated as a medical emergency. In this article, we review the definition and etiology of massive hemoptysis, the anatomy of the bronchial circulation, and the management of this condition.

  8. Reconstruction of the 1997/1998 El Nino from TOPEX/POSEIDON and TOGA/TAO Data Using a Massively Parallel Pacific-Ocean Model and Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.

    1999-01-01

    Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 1997-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal-to-Interannual Prediction Project (NSIPP) quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
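
    The ensemble Kalman filter analysis step can be written compactly when the forecast-error covariance is estimated from ensemble anomalies. The sketch below is a generic stochastic (perturbed-observation) EnKF update in NumPy; the state size, observation operator, and error statistics are toy values, not the NSIPP configuration.

        import numpy as np

        rng = np.random.default_rng(0)
        n_state, n_ens, n_obs = 100, 20, 5               # toy sizes
        X = rng.normal(size=(n_state, n_ens))            # forecast ensemble (columns)
        H = np.eye(n_obs, n_state)                       # observe first 5 state elements
        R = 0.1 * np.eye(n_obs)                          # observation-error covariance
        y = rng.normal(size=n_obs)                       # observations

        A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
        PHt = A @ (H @ A).T / (n_ens - 1)                # P H^T estimated from ensemble
        K = PHt @ np.linalg.inv(H @ PHt + R)             # Kalman gain
        for k in range(n_ens):                           # perturbed-observation update
            innov = y + rng.multivariate_normal(np.zeros(n_obs), R) - H @ X[:, k]
            X[:, k] += K @ innov
        print("analysis ensemble mean:", X.mean(axis=1)[:5])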

  9. Determinants of tubal ligation in Puebla, Mexico.

    PubMed

    Rudzik, Alanna E F; Leonard, Susan H; Sievert, Lynnette L

    2011-06-21

    Tubal ligation provides an effective and reliable method by which women can choose to limit the number of children they will bear. However, because of the irreversibility of the procedure and other potential disadvantages, it is important to understand factors associated with women's choice of this method of birth control. Between May 1999 and August 2000, data were collected from 755 women aged 40 to 60 years from a cross-section of neighborhoods of varying socio-economic make-up in Puebla, Mexico; the tubal ligation rate was 42.2%. Multiple logistic regression models were utilized to examine demographic, socio-economic, and reproductive history characteristics in relation to women's choice of tubal ligation. Regression analyses were repeated with participants grouped by age to determine how the timing of availability of tubal ligation related to the decision to undergo the procedure. The results of this study suggest that younger age, more education, use of some forms of birth control, and increased parity were associated with women's decisions to undergo tubal ligation. The statistically significant difference of greater tubal ligation and lower hysterectomy rates across age groups reflects increased access to tubal ligation in Mexico from the early 1970s, supporting the idea that women's choice of tubal ligation was related to access.

  10. Formation of oligonucleotide-PNA-chimeras by template-directed ligation

    NASA Technical Reports Server (NTRS)

    Koppitz, M.; Nielsen, P. E.; Orgel, L. E.; Bada, J. L. (Principal Investigator)

    1998-01-01

    DNA sequences have previously been reported to act as templates for the synthesis of PNA, and vice versa. A continuous evolutionary transition from an informational replicating system based on one polymer to a system based on the other would be facilitated if it were possible to form chimeras, that is molecules that contain monomers of both types. Here we show that ligation to form chimeras proceeds efficiently both on PNA and on DNA templates. The efficiency of ligation is primarily determined by the number of backbone bonds at the ligation site and the relative orientation of template and substrate strands. The most efficient reactions result in the formation of chimeras with ligation junctions resembling the structures of the backbones of PNA and DNA and with antiparallel alignment of both components of the chimera with the template, that is, ligations involving formation of 3'-phosphoramidate and 5'-ester bonds. However, double helices involving PNA are stable both with antiparallel and parallel orientation of the two strands. Ligation on PNA but not on DNA templates is, therefore, sometimes possible on templates with reversed orientation. The relevance of these findings to discussions of possible transitions between genetic systems is discussed.

  11. Parallel image compression

    NASA Technical Reports Server (NTRS)

    Reif, John H.

    1987-01-01

    A parallel compression algorithm for the 16,384-processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless text compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.
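
    The vector-quantization half of such a scheme maps naturally onto a massively parallel machine because each image block is coded independently. A minimal NumPy sketch of nearest-codeword assignment, with a hypothetical random codebook standing in for a trained one:

        import numpy as np

        rng = np.random.default_rng(1)
        blocks = rng.random((1000, 16))          # 1000 flattened 4x4 image blocks
        codebook = rng.random((64, 16))          # 64 codewords (trained elsewhere)

        # Squared distance from every block to every codeword, argmin per block;
        # each block's search is independent, hence trivially parallel.
        d2 = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        indices = d2.argmin(axis=1)              # the compressed representation
        reconstructed = codebook[indices]        # lossy decompression
        print(indices[:8])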

  12. Associative Networks on a Massively Parallel Computer.

    DTIC Science & Technology

    1985-10-01

  13. Massively parallel X-ray holography

    SciTech Connect

    Spence, John C.H; Marchesini, Stefano; Boutet, Sebastien; Sakdinawat, Anne E.; Bogan, Michael J.; Bajt, Sasa; Barty, Anton; Chapman, Henry N.; Frank, Matthias; Hau-Riege, Stefan P.; Szöke, Abraham; Cui, Congwu; Shapiro, David A.; Howells, MAlcolm R.; Shaevitz, Joshua W; Lee, Joanna Y.; Hajdu, Janos; Seibert, Marvin M.

    2008-08-01

    Advances in the development of free-electron lasers offer the realistic prospect of nanoscale imaging on the timescale of atomic motions. We identify X-ray Fourier-transform holography as a promising but, so far, inefficient scheme to do this. We show that a uniformly redundant array placed next to the sample multiplies the efficiency of X-ray Fourier-transform holography by more than three orders of magnitude, approaching that of a perfect lens, and provides holographic images with both amplitude- and phase-contrast information. The experiments reported here demonstrate this concept by imaging a nano-fabricated object at a synchrotron source, and a bacterial cell with a soft-X-ray free-electron laser, where illumination by a single 15-fs pulse was successfully used in producing the holographic image. As X-ray lasers move to shorter wavelengths we expect to obtain higher spatial resolution ultrafast movies of transient states of matter.

  14. Robustness of Massively Parallel Sequencing Platforms

    PubMed Central

    Kavak, Pınar; Yüksel, Bayram; Aksu, Soner; Kulekci, M. Oguzhan; Güngör, Tunga; Hach, Faraz; Şahinalp, S. Cenk; Alkan, Can; Sağıroğlu, Mahmut Şamil

    2015-01-01

    The improvements in high-throughput sequencing (HTS) technologies made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in the accuracy and reproducibility of HTS-based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates near-perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before use in clinical applications. PMID:26382624
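
    The concordance comparison described above reduces, at its simplest, to intersecting call sets keyed by variant coordinates and alleles. A hedged sketch with hypothetical tuples; a real comparison would parse VCFs and normalize allele representations first:

        # Variant calls from two centers, keyed by (chrom, pos, ref, alt).
        center_a = {("chr1", 10177, "A", "AC"), ("chr2", 4567, "G", "T"),
                    ("chr7", 140453136, "A", "T")}
        center_b = {("chr1", 10177, "A", "AC"), ("chr7", 140453136, "A", "T"),
                    ("chrX", 999, "C", "G")}

        concordant = center_a & center_b
        only_a, only_b = center_a - center_b, center_b - center_a
        print(f"concordant={len(concordant)} onlyA={len(only_a)} onlyB={len(only_b)}")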

  15. Modeling groundwater flow on massively parallel computers

    SciTech Connect

    Ashby, S.F.; Falgout, R.D.; Fogwell, T.W.; Tompson, A.F.B.

    1994-12-31

    The authors will explore the numerical simulation of groundwater flow in three-dimensional heterogeneous porous media. An interdisciplinary team of mathematicians, computer scientists, hydrologists, and environmental engineers is developing a sophisticated simulation code for use on workstation clusters and MPPs. To date, they have concentrated on modeling flow in the saturated zone (single phase), which requires the solution of a large linear system. They will discuss their implementation of preconditioned conjugate gradient solvers. The preconditioners under consideration include simple diagonal scaling, s-step Jacobi, adaptive Chebyshev polynomial preconditioning, and multigrid. They will present some preliminary numerical results, including simulations of groundwater flow at the LLNL site. They also will demonstrate the code's scalability.
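
    Of the preconditioners listed, diagonal scaling is the simplest to show. Below is a minimal, dense-matrix sketch of preconditioned conjugate gradient with Jacobi scaling on a toy SPD system; production groundwater codes would use distributed sparse operators instead.

        import numpy as np

        def pcg(A, b, tol=1e-8, max_iter=500):
            """Conjugate gradient with diagonal (Jacobi) preconditioning."""
            M_inv = 1.0 / np.diag(A)                 # the preconditioner
            x = np.zeros_like(b)
            r = b - A @ x
            z = M_inv * r
            p = z.copy()
            rz = r @ z
            for _ in range(max_iter):
                Ap = A @ p
                alpha = rz / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                if np.linalg.norm(r) < tol:
                    break
                z = M_inv * r
                rz_new = r @ z
                p = z + (rz_new / rz) * p
                rz = rz_new
            return x

        n = 50                                        # toy SPD system
        A = np.diag(np.linspace(1, 100, n)) + 0.01 * np.ones((n, n))
        b = np.ones(n)
        x = pcg(A, b)
        print("residual norm:", np.linalg.norm(A @ x - b))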

  16. Massively parallel collaboration : a literature review.

    SciTech Connect

    Dornburg, Courtney C.; Stevens, Susan Marie; Davidson, George S.; Forsythe, James Chris

    2007-09-01

    The present paper explores group dynamics and electronic communication, two components of wicked problem solving that are inherent to the national security environment (as well as many other business environments). First, because there can be no "right" answer or solution without first having agreement about the definition of the problem and the social meaning of a "right solution", these problems (often) fundamentally relate to the social aspects of groups, an area with much empirical research and application still needed. Second, as computer networks have been increasingly used to conduct business with decreased costs, increased information accessibility, and rapid document, database, and message exchange, electronic communication enables a new form of problem solving group that has yet to be well understood, especially as it relates to solving wicked problems.

  17. High-speed massively parallel scanning

    DOEpatents

    Decker, Derek E.

    2010-07-06

    A new technique for recording a series of images of a high-speed event (such as, but not limited to: ballistics, explosives, laser-induced changes in materials, etc.) is presented. The technique makes use of a lenslet array to take image picture elements (pixels) and concentrate light from each pixel into a spot that is much smaller than the pixel. This array of spots illuminates a detector region (e.g., film, in one embodiment) which is scanned transverse to the light, creating tracks of exposed regions. Each track is a time history of the light intensity for a single pixel. By appropriately configuring the array of concentrated spots with respect to the scanning direction of the detection material, tracks fit between pixels and track lengths sufficient for several high-speed imaging applications become possible.

  18. Massively parallel sequencing technology in pathogenic microbes.

    PubMed

    Tripathy, Sucheta; Jiang, Rays H Y

    2012-01-01

    Next-generation sequencing (NGS) methods have revolutionized various aspects of genomics, including transcriptome analysis. Digital expression analysis is set to replace analog expression analysis based on microarray chips because of its cost-effectiveness, reproducibility, accuracy, and speed. The last 2 years have seen a surge in the development of statistical methods and software tools for the analysis and visualization of NGS data. Large amounts of NGS data are available for pathogenic fungi and oomycetes. As the analysis results start pouring in, they bring about a paradigm shift in the understanding of host-pathogen interactions, with the discovery of new transcripts, splice variants, mutations, regulatory elements, and epigenetic controls. Here we describe the core technology of the new sequencing platforms, the methodology of data analysis, and different aspects of applications.

  19. Cloning of DNA fragments: ligation reactions in agarose gel.

    PubMed

    Furtado, Agnelo

    2014-01-01

    Ligation reactions to insert a desired DNA fragment into a vector can be challenging for beginners, especially if the amount of insert is limiting. Although additives known as crowding agents, such as PEG 8000, can increase the success of ligation reactions when added to the ligation mix, in practice the amount of insert used can determine the success or failure of the reaction. The method described here, which uses insert DNA in a gel slice added directly to the ligation reaction, has two benefits: (a) it uses agarose as the crowding agent and (b) it reduces the number of insert-purification steps. The use of rapid ligation buffer and incubation of the ligation reaction at room temperature greatly increase the efficiency of the ligation reaction, even for blunt-ended ligation.

  20. Parallel Dislocation Simulator

    SciTech Connect

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  1. Heterotopic Pregnancy Following Reversal of Tubal Ligation.

    PubMed

    Eschenbach, Stephanie; Entzian, Dirk; Baumgarten, Deborah A

    2008-01-01

    We describe the case of a 37-year-old woman who had undergone reversal of tubal ligation with subsequent heterotopic pregnancy. We review the initial radiological evaluation of heterotopic pregnancies and the subsequent radiologic findings following appropriate therapy.

  2. Appendix E: Parallel Pascal development system

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Parallel Pascal Development System enables Parallel Pascal programs to be developed and tested on a conventional computer. It consists of several system programs, including a Parallel Pascal to standard Pascal translator, and a library of Parallel Pascal subprograms. The library includes subprograms for using Parallel Pascal on a parallel system with a fixed degree of parallelism, such as the Massively Parallel Processor, to conveniently manipulate arrays which have dimensions larger than the hardware supports. Programs can be conveniently tested with small-sized arrays on the conventional computer before attempting to run on a parallel system.

  3. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  4. Torque expression in self-ligating orthodontic brackets and conventionally ligated brackets: A systematic review

    PubMed Central

    Al-Thomali, Yousef; Mohamed, Roshan-Noor; Basha, Sakeenabi

    2017-01-01

    Background To evaluate the torque expression of self-ligating (SL) orthodontic brackets and conventionally ligated brackets, and the torque expression in active and passive SL brackets. Material and Methods Our systematic search included MEDLINE, EMBASE, CINAHL, PsychINFO, Scopus, and key journals and review articles; the date of the last search was April 4, 2016. We graded the methodological quality of the studies by means of the Quality Assessment Tool for Quantitative Studies, developed for the Effective Public Health Practice Project (EPHPP). Results In total, 87 studies were identified for screening, and 9 studies were eligible. The quality assessment rated one of the studies as being of strong quality and 7 (77.78%) as being of moderate quality. Three of the 7 studies that compared SL and conventionally ligated brackets found the highest torque expression in conventionally ligated brackets. Badawi reported higher torque expression in active SL brackets than in passive SL brackets. Major and Brauchli found no significant differences in the torque expression of active and passive SL brackets. Conclusions Conventionally ligated brackets presented the highest torque expression compared with SL brackets. Only minor differences were recorded in the torque expression of active and passive SL brackets. Key words: Systematic review, self-ligation, torque expression, conventional ligation. PMID:28149476

  5. Differences Between Distributed and Parallel Systems

    SciTech Connect

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  6. Parallel architectures for vision

    SciTech Connect

    Maresca, M.; Lavin, M.A.; Li, H.

    1988-08-01

    Vision computing involves the execution of a large number of operations on large sets of structured data. Sequential computers cannot achieve the speed required by most of the current applications and therefore parallel architectural solutions have to be explored. In this paper the authors examine the options that drive the design of a vision oriented computer, starting with the analysis of the basic vision computation and communication requirements. They briefly review the classical taxonomy for parallel computers, based on the multiplicity of the instruction and data stream, and apply a recently proposed criterion, the degree of autonomy of each processor, to further classify fine-grain SIMD massively parallel computers. They identify three types of processor autonomy, namely operation autonomy, addressing autonomy, and connection autonomy. For each type they give the basic definitions and show some examples. They focus on the concept of connection autonomy, which they believe is a key point in the development of massively parallel architectures for vision. They show two examples of parallel computers featuring different types of connection autonomy - the Connection Machine and the Polymorphic-Torus - and compare their cost and benefit.

  7. Adaptive parallel logic networks

    NASA Technical Reports Server (NTRS)

    Martinez, Tony R.; Vidal, Jacques J.

    1988-01-01

    Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.

  8. Post tubal ligation syndrome or iatrogenic hydrosalpinx.

    PubMed

    Gregory, M G

    1981-10-01

    The purpose of this case report is as follows: to attempt to establish an association between the observed increase in hydrosalpinx and the phenomenal increase in surgical sterilization; to present a credible etiology for iatrogenic hydrosalpinx; and to discuss the pathogenesis of a disease process henceforth referred to as post-tubal ligation syndrome. A 36-year-old white woman was admitted to Park View Hospital in Nashville, Tennessee on January 7, 1981 for evaluation of continuous lower abdominal pain, abdominal pressure, and dyspareunia for several months. The woman had 2 children who were delivered vaginally. An abdominal tubal ligation was performed for sterilization when she was 27, and vaginal hysterectomy, with anterior and posterior colporrhaphy, was done for symptomatic pelvic relaxation at age 33. Physical examination showed tenderness without palpable masses in the pelvic adnexal areas. Laboratory studies were within normal limits. On January 9, 1981, the patient underwent exploratory laparotomy and bilateral salpingo-oophorectomy. She was found to have bilateral hydrosalpinx. Historically, hydrosalpinx has been considered an intermediary step in pelvic inflammatory disease. Iatrogenic hydrosalpinx is initiated by an initial insult, e.g., tubal ligation, fulguration, or application of a mechanical clip or band. Theoretically, single-point interruption of a fallopian tube should produce no ill effects. The popularity and success of tubal ligation attest to single-point interruption of an otherwise normal fallopian tube as an innocuous procedure. A schematic drawing is provided of the same tube insulted a 2nd time; the situation is consequently prefatory to the development of hydrosalpinx, i.e., a tube lined with secretory epithelium is closed at both ends. Secretion within this closed system will produce dilatation. This "2nd" insult to the normal fallopian tube, post tubal ligation, may take 1 of several forms. The symptoms of iatrogenic

  9. Bidirectional Glenn with interruption of antegrade pulmonary blood flow: Which is the preferred option: Ligation or division of the pulmonary artery?

    PubMed Central

    Chowdhury, Ujjwal Kumar; Kapoor, Poonam Malhotra; Rao, Keerthi; Gharde, Parag; Kumawat, Mukesh; Jagia, Priya

    2016-01-01

    We report a rare complication of massive aneurysm of the proximal ligated end of the main pulmonary artery which occurred in the setting of a patient with a functionally univentricular heart and increased pulmonary blood flow undergoing superior cavopulmonary connection. Awareness of this possibility may guide others to electively transect the pulmonary artery in such a clinical setting. PMID:27397472

  10. Potential performance of methods for parallelism across time in ODEs

    SciTech Connect

    Xuhai, Xu; Gear, C.W.

    1990-01-01

    An earlier report described a possible approach to massive parallelism across time. This report describes some numerical experiments on the method and two modifications to the method to increase the parallelism and improve the behavior for nonlinear stiff problems.

  11. Neurodevelopmental Outcomes Following Two Different Treatment Approaches (Early Ligation versus Selective Ligation) for Patent Ductus Arteriosus

    PubMed Central

    Wickremasinghe, Andrea C.; Rogers, Elizabeth E.; Piecuch, Robert E.; Johnson, Bridget C.; Golden, Suzanne; Moon-Grady, Anita J.; Clyman, Ronald I.

    2012-01-01

    Objective To examine whether a change in the approach to management of persistent patent ductus arteriosus (PDA), from "early ligation" to "selective ligation," is associated with an increased risk of abnormal neurodevelopmental outcome. Study design In 2005, we changed our PDA treatment protocol (in infants ≤27 6/7 weeks gestation) from an "early ligation" approach, in which PDAs were ligated promptly if they failed to close after indomethacin (Period 1: 1/99–12/04), to a "selective ligation" approach, with PDA ligation only if specific criteria were met (Period 2: 1/05–5/09). All infants in both periods received prophylactic indomethacin. Multivariate analysis was used to compare the odds of a composite Abnormal Neurodevelopmental Outcome (Bayley MDI or Cognitive score <70, cerebral palsy, blindness, and/or deafness) associated with each treatment approach at 18–36 months (n=224). Results During Period 1, 23% of the infants in follow-up failed indomethacin treatment, and all were ligated; during Period 2, 30% of infants failed indomethacin, and 66% were ligated after meeting pre-specified criteria. Infants treated with the "selective ligation" strategy had fewer Abnormal Outcomes than infants treated with the "early ligation" approach (OR=0.07, p=0.046). Infants ligated before 10 days of age had an increased incidence of Abnormal Neurodevelopmental Outcome. The significant difference in outcomes between the two PDA treatment strategies could be accounted for, in part, by the earlier age of ligation during Period 1. Conclusions A "selective ligation" approach for PDAs that fail to close with indomethacin does not worsen neurodevelopmental outcome at 18–36 months. PMID:22795222

  12. Atrophy of submandibular gland by the duct ligation and a blockade of SP receptor in rats.

    PubMed

    Hishida, Sumiyo; Ozaki, Noriyuki; Honda, Takashi; Shigetomi, Toshio; Ueda, Minoru; Hibi, Hideharu; Sugiura, Yasuo

    2016-05-01

    To clarify the mechanisms underlying the submandibular gland atrophies associated with ptyalolithiasis, morphological changes were examined in the rat submandibular gland following either surgical intervention of the duct or functional blockade at substance P receptors (SPRs). Progressive acinar atrophy was observed after duct ligation or avulsion of periductal tissues. This suggested that damage to periductal tissue involving nerve fibers might contribute to ligation-associated acinar atrophy. Immunohistochemically labeled-substance P positive nerve fibers (SPFs) coursed in parallel with the main duct and were distributed around the interlobular, striated, granular and intercalated duct, and glandular acini. Strong SPR immunoreactivity was observed in the duct. Injection into the submandibular gland of a SPR antagonist induced marked acinar atrophy. The results revealed that disturbance of SPFs and SPRs might be involved in the atrophy of the submandibular gland associated with ptyalolithiasis.

  14. The physics of parallel machines

    NASA Technical Reports Server (NTRS)

    Chan, Tony F.

    1988-01-01

    The paper considers the idea that architectures for massively parallel computers must be designed to support not merely a particular class of algorithms but the underlying physical processes being modelled. Physical processes modelled by partial differential equations (PDEs) are discussed, as is the argument that an efficient architecture must go beyond nearest-neighbor mesh interconnections and support global and hierarchical communications.
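
    To make the contrast concrete, the following minimal sketch (ours, not the paper's; all sizes are illustrative) shows the nearest-neighbor pattern a mesh interconnect supports directly: a 1-D Jacobi relaxation in which each simulated processor owns a chunk of the grid and exchanges one ghost cell per step with each neighbor. A global operation such as a convergence check is exactly what would require the hierarchical communication the paper argues for.

      import numpy as np

      # 1-D heat-equation (Jacobi) relaxation with halo exchange: each
      # "processor" owns local_n cells plus two ghost cells, and per step
      # communicates only with its immediate neighbors.
      n_procs, local_n, steps = 4, 16, 100
      chunks = [np.zeros(local_n + 2) for _ in range(n_procs)]
      chunks[0][0] = 100.0  # fixed hot boundary in the leftmost ghost cell

      for _ in range(steps):
          # "communication": copy edge values into neighbors' ghost cells
          for p in range(n_procs - 1):
              chunks[p][-1] = chunks[p + 1][1]   # ghost from right neighbor
              chunks[p + 1][0] = chunks[p][-2]   # ghost from left neighbor
          # "computation": purely local Jacobi update on interior points
          for p in range(n_procs):
              chunks[p][1:-1] = 0.5 * (chunks[p][:-2] + chunks[p][2:])

      print(np.concatenate([c[1:-1] for c in chunks]).round(1))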

  15. Sequencing by Cyclic Ligation and Cleavage (CycLiC) directly on a microarray captured template.

    PubMed

    Mir, Kalim U; Qi, Hong; Salata, Oleg; Scozzafava, Giuseppe

    2009-01-01

    Next-generation sequencing methods that can be applied both to the resequencing of whole genomes and to the selective resequencing of specific parts of genomes are needed. We describe (i) a massively scalable biochemistry, Cyclic Ligation and Cleavage (CycLiC), for contiguous base sequencing and (ii) its application directly to a template captured on a microarray. CycLiC uses four color-coded DNA/RNA chimeric oligonucleotide libraries (OL) to extend a primer, a base at a time, along a template. The cycles comprise the steps: (i) ligation of OLs, (ii) identification of the extended base by label detection, and (iii) cleavage to remove the label/terminator and undetermined bases. For proof-of-principle, we show that the method conforms to design and that we can read contiguous bases of sequence correctly from a template captured by hybridization from solution to a microarray probe. The method is amenable to massive scale-up, miniaturization, and automation. Implementation on a microarray format offers the potential for both selection and sequencing of a large number of genomic regions on a single platform. Because the method uses commonly available reagents, it can be developed further by a community of users.
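
    As an illustration of the cycle described above, here is a toy simulation (ours; it models only the bookkeeping of the cycle, not the chemistry, and the function name is invented). Each iteration "ligates" a probe against the template, "detects" the label encoding one base, and "cleaves" so that exactly one determined base is retained per cycle.

      # Toy model of a CycLiC-style cycle: ligate, detect, cleave.
      COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

      def cyclic_sequencing(template: str, cycles: int) -> str:
          read = []
          for position in range(min(cycles, len(template))):
              # (i) ligation: a probe whose first base pairs the template
              probe_first_base = COMPLEMENT[template[position]]
              # (ii) detection: the color label encodes that first base;
              # report the template base it implies
              read.append(COMPLEMENT[probe_first_base])
              # (iii) cleavage: modelled implicitly; the label/terminator
              # and undetermined probe bases are discarded, keeping one base
          return "".join(read)

      print(cyclic_sequencing("GATTACA", cycles=7))  # -> GATTACA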

  16. Successful Management of Two Cases of Placenta Accreta and a Literature Review: Use of the B-Lynch Suture and Bilateral Uterine Artery Ligation Procedures

    PubMed Central

    Arab, Maliheh; Ghavami, Behnaz; Saraeian, Samaneh; Sheibani, Samaneh; Abbasian Azar, Fatemeh; Hosseini-Zijoud, Seyed-Mostafa

    2016-01-01

    Introduction Placenta accreta is an increasingly common complication of pregnancy that can result in massive hemorrhage. Case Presentation We describe two cases of placenta accreta, with successful conservative management, in a referral hospital in Tehran, Iran. In both cases, two procedures were performed: a compression suture (B-Lynch) and a perfusion-decreasing procedure (bilateral uterine artery ligation). We also present the results of a narrative literature review. Conclusions The combined B-Lynch and uterine artery ligation procedure in cases of abnormal placentation might be strongly considered when fertility preservation is desired, or in settings of coagulopathy, coexisting medical disease, limited access to blood products, limited surgical experience, or distant local hospitals where additional help is unavailable. PMID:27354921

  17. Massive transfusion and massive transfusion protocol

    PubMed Central

    Patil, Vijaya; Shetmahajan, Madhavi

    2014-01-01

    Haemorrhage remains a major cause of potentially preventable deaths. Rapid transfusion of large volumes of blood products is required in patients with haemorrhagic shock, and it may lead to a unique set of complications. Recently, protocol-based management of these patients using massive transfusion protocols has been shown to improve outcomes. This section discusses in detail both the management and the complications of massive blood transfusion. PMID:25535421

  18. Enzyme-mediated ligation technologies for peptides and proteins.

    PubMed

    Schmidt, Marcel; Toplak, Ana; Quaedflieg, Peter Jlm; Nuijens, Timo

    2017-02-18

    With the steadily increasing complexity and quantity requirements for peptides in industry and academia, the efficient and site-selective ligation of peptides and proteins represents a highly desirable goal. Within this context, enzyme-mediated ligation technologies for peptides and proteins have attracted great interest in recent years as they represent an extremely powerful extension to the scope of chemical methodologies (e.g. native chemical ligation) in basic and applied research. Compared to chemical ligation methods, enzymatic strategies using ligases such as sortase, butelase, peptiligase or omniligase generally feature excellent chemoselectivity, therefore making them valuable tools for protein and peptide chemists.

  19. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
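
    As a concrete example of data decomposition in rendering, the sketch below (ours, not from the article; names and sizes are illustrative) partitions the frame into horizontal bands and shades each band in a separate worker process: task granularity is one band, and the final image is assembled by concatenation. Finer tiles handed out from a shared queue would give dynamic load balancing at the cost of more communication.

      # Image-space decomposition: one horizontal band per worker.
      from multiprocessing import Pool
      import numpy as np

      WIDTH, HEIGHT, WORKERS = 320, 240, 4

      def shade_band(bounds):
          y0, y1 = bounds
          band = np.zeros((y1 - y0, WIDTH, 3), dtype=np.float32)
          for y in range(y0, y1):
              for x in range(WIDTH):
                  # stand-in for a real shader: a smooth gradient
                  band[y - y0, x] = (x / WIDTH, y / HEIGHT, 0.5)
          return band

      if __name__ == "__main__":
          step = HEIGHT // WORKERS
          bands = [(i * step, (i + 1) * step) for i in range(WORKERS)]
          with Pool(WORKERS) as pool:
              image = np.vstack(pool.map(shade_band, bands))  # assembly
          print(image.shape)  # (240, 320, 3): the composed frame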

  20. Streamlined expressed protein ligation using split inteins.

    PubMed

    Vila-Perelló, Miquel; Liu, Zhihua; Shah, Neel H; Willis, John A; Idoyaga, Juliana; Muir, Tom W

    2013-01-09

    Chemically modified proteins are invaluable tools for studying the molecular details of biological processes, and they also hold great potential as new therapeutic agents. Several methods have been developed for the site-specific modification of proteins, one of the most widely used being expressed protein ligation (EPL) in which a recombinant α-thioester is ligated to an N-terminal Cys-containing peptide. Despite the widespread use of EPL, the generation and isolation of the required recombinant protein α-thioesters remain challenging. We describe here a new method for the preparation and purification of recombinant protein α-thioesters using engineered versions of naturally split DnaE inteins. This family of autoprocessing enzymes is closely related to the inteins currently used for protein α-thioester generation, but they feature faster kinetics and are split into two inactive polypeptides that need to associate to become active. Taking advantage of the strong affinity between the two split intein fragments, we devised a streamlined procedure for the purification and generation of protein α-thioesters from cell lysates and applied this strategy for the semisynthesis of a variety of proteins including an acetylated histone and a site-specifically modified monoclonal antibody.

  1. As fast and selective as enzymatic ligations: unpaired nucleobases increase the selectivity of DNA-controlled native chemical PNA ligation.

    PubMed

    Ficht, Simon; Dose, Christian; Seitz, Oliver

    2005-11-01

    DNA-controlled reactions offer interesting opportunities in biological, chemical, and nanosciences. In practical applications, such as in DNA sequence analysis, the sequence fidelity of the chemical-ligation reaction is of central importance. We present a ligation reaction that is as fast as and much more selective than enzymatic T4 ligase-mediated oligonucleotide ligations. The selectivity was higher than 3000-fold in discriminating matched from singly mismatched DNA templates. It is demonstrated that this enormous selectivity is the hallmark of the particular ligation architecture, which is distinct from previous ligation architectures designed as "nick ligations". Interestingly, the fidelity of the native chemical ligation of peptide nucleic acids was increased by more than one order of magnitude when performing the ligation in such a way that an abasic-site mimic was formed opposite an unpaired template base. It is shown that the high sequence fidelity of the abasic ligation could facilitate the MALDI-TOF mass-spectrometric analysis of early cancer onset by allowing the detection of as little as 0.2 % of single-base mutant DNA in the presence of 99.8 % wild-type DNA.

  2. Runtime volume visualization for parallel CFD

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu

    1995-01-01

    This paper discusses some aspects of the design of a data-distributed, massively parallel volume rendering library for runtime visualization of parallel computational fluid dynamics simulations in a message-passing environment. Unlike the traditional scheme in which visualization is a postprocessing step, the rendering is done in place on each node processor. Computational scientists who run large-scale simulations on a massively parallel computer can thus perform interactive monitoring of their simulations. The current library provides an interface to handle volume data on rectilinear grids. The same design principles can be generalized to handle other types of grids. For demonstration, we run a parallel Navier-Stokes solver making use of this rendering library on the Intel Paragon XP/S. The interactive visual response achieved is found to be very useful. Performance studies show that the parallel rendering process is scalable with the size of the simulation as well as with the parallel computer.
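
    The render-in-place idea can be sketched as follows (our illustration, not the library's API; data sizes are invented): each simulated node renders only the sub-volume it owns into a partial image, and the partial images are then composited in depth order, with a serial "over" reduction standing in here for the parallel image-compositing stage.

      import numpy as np

      def render_subvolume(subvol):
          # stand-in renderer: integrate density along the view (z) axis
          return 1.0 - np.exp(-subvol.sum(axis=2) * 0.1)  # per-pixel opacity

      def composite(partials):
          # front-to-back 'over' operator on scalar opacities
          out = np.zeros_like(partials[0])
          transparency = np.ones_like(partials[0])
          for a in partials:  # assumed sorted front to back
              out += transparency * a
              transparency *= 1.0 - a
          return out

      rng = np.random.default_rng(0)
      volume = rng.random((64, 64, 64))
      slabs = np.split(volume, 4, axis=2)        # one slab per "node"
      partials = [render_subvolume(s) for s in slabs]
      image = composite(partials)
      print(image.shape, float(image.max()))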

  3. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that maximize the throughput of packet communications between nodes while minimizing latency. The multiple networks may include three high-speed networks for parallel algorithm message passing, including a Torus, a collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. A DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
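
    The uniform nearest-neighbor connectivity of a torus network comes down to modular arithmetic on node coordinates. The sketch below (ours, with made-up dimensions; the patent does not specify this code) maps a linear node rank to 3-D torus coordinates and enumerates the six wraparound neighbors.

      # Addressing arithmetic for a 3-D torus interconnect.
      DIMS = (8, 8, 16)  # hypothetical torus dimensions

      def rank_to_coord(rank):
          x, y, _ = DIMS
          return (rank % x, (rank // x) % y, rank // (x * y))

      def coord_to_rank(cx, cy, cz):
          x, y, _ = DIMS
          return cx + cy * x + cz * x * y

      def torus_neighbors(rank):
          cx, cy, cz = rank_to_coord(rank)
          deltas = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                    (0, -1, 0), (0, 0, 1), (0, 0, -1)]
          return [coord_to_rank((cx + dx) % DIMS[0],
                                (cy + dy) % DIMS[1],
                                (cz + dz) % DIMS[2])
                  for dx, dy, dz in deltas]

      # node 0 sits at (0,0,0); wraparound links make e.g. (7,0,0)
      # a direct neighbor despite being at the far edge of the mesh
      print(rank_to_coord(0), torus_neighbors(0))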

  4. (Bio)molecular surface patterning by phototriggered oxime ligation.

    PubMed

    Pauloehrl, Thomas; Delaittre, Guillaume; Bruns, Michael; Meißler, Maria; Börner, Hans G; Bastmeyer, Martin; Barner-Kowollik, Christopher

    2012-09-03

    Making light work of ligation: A novel method utilizes light for oxime ligation chemistry. A quantitative, low-energy photodeprotection generates an aldehyde, which subsequently reacts with aminooxy moieties. The spatial control allows patterning of surfaces with a fluoro marker and a GRGSGR peptide, which can be imaged by time-of-flight secondary-ion mass spectrometry.

  5. Parallel fast gauss transform

    SciTech Connect

    Sampath, Rahul S; Sundar, Hari; Veerapaneni, Shravan

    2010-01-01

    We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel which permits 'diagonal translation'. We use parallel octrees and a new scheme for translating the plane-waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is 'kernel-independent' and can handle other 'Gaussian-type' kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
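
    For reference, the sum being accelerated is the discrete Gauss transform G(x_j) = Σ_i q_i exp(−|x_j − y_i|²/δ). A brute-force O(N²) evaluation like the sketch below (ours; array shapes and δ are illustrative) is the standard correctness reference for fast algorithms at small N.

      import numpy as np

      def direct_gauss_transform(targets, sources, weights, delta):
          # pairwise squared distances: shape (N_targets, N_sources)
          d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=2)
          return np.exp(-d2 / delta) @ weights

      rng = np.random.default_rng(1)
      y = rng.random((500, 3))   # source points
      x = rng.random((400, 3))   # target points
      q = rng.random(500)        # source weights
      print(direct_gauss_transform(x, y, q, delta=0.1).shape)  # (400,)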

  6. A new 10-min ligation method using a modified buffer system with a very low amount of T4 DNA ligase: the "Coffee Break Ligation" technique.

    PubMed

    Yoshino, Yuki; Ishida, Masaharu; Horii, Akira

    2007-10-01

    The ligation reaction is widely used in molecular biology. Several kits are available that complete the ligation reaction very rapidly, but they are rather expensive. In this study, we successfully modified the ligation buffer to allow rapid ligation at much lower cost than existing kits. The ligation reaction can be completed in 10 min using as little as 0.01 U of T4 DNA ligase, at a cost of only $1 per 100 reactions at a 20 µl scale. We name this ligation system the "Coffee Break Ligation" system; one can complete a ligation reaction while drinking a cup of coffee, and perform 100 reactions for the price of a cup of coffee.

  7. Irreversible sortase A-mediated ligation driven by diketopiperazine formation.

    PubMed

    Liu, Fa; Luo, Ethan Y; Flora, David B; Mezo, Adam R

    2014-01-17

    Sortase A (SrtA)-mediated ligation has emerged as an attractive tool in bioorganic chemistry owing to the remarkable specificity of the ligation reaction and the physiological reaction conditions. However, the reversible nature of this reaction limits the efficiency of the ligation reaction and has become a significant constraint to its more widespread use. We report herein a novel set of SrtA substrates (LPETGG-isoacyl-Ser and LPETGG-isoacyl-Hse) that can be irreversibly ligated to N-terminal Gly-containing moieties via the deactivation of the SrtA-excised peptide fragment through diketopiperazine (DKP) formation. The convenience of the synthetic procedure and the stability of the substrates in the ligation buffer suggest that both LPETGG-isoacyl-Ser and LPETGG-isoacyl-Hse are valuable alternatives to existing irreversible SrtA substrate sequences.

  8. A simple microfluidic assay for the detection of ligation product.

    PubMed

    Zhang, Lei; Wang, Jingjing; Roebelen, Johann; Tripathi, Anubhav

    2015-02-01

    We present a novel microfluidic-based approach to detect ligation products. The conformal specificity of ligases is used in various molecular assays to detect point mutations. Traditional methods of detecting ligation products include denaturing gel electrophoresis, sequence amplification, and melting curve analysis. Gel electrophoresis is a labor- and time-intensive process, while sequence amplification and melting curve analysis require instruments capable of accurate thermal ramping and sensitive optical detection. Microfluidics has been widely applied in genomics, proteomics, and cell cytometry to enable rapid and automated assays. We designed an assay that fluorogenically detects ligation products following a simple magnetic separation through a microfluidic channel. Synthetic HIV-1 K103N minority mutant templates were successfully detected at 100 nM within 30 min. This simple and rapid method can be coupled with any ligation assay for the detection of ligation products.

  9. Template-directed oligonucleotide ligation on hydroxylapatite

    NASA Technical Reports Server (NTRS)

    Acevedo, O. L.; Orgel, L. E.

    1986-01-01

    It has been suggested that the prebiotic synthesis of the precursors of biopolymers could have occurred on a solid surface such as that provided by clay or some other mineral. One such scheme envisages that growing polymers were localized by adsorption to a mineral surface where an activating agent or activated monomers were supplied continuously or cyclically. Here, it is reported that such a cycle, in which initially formed oligo(G)s are reactivated by conversion to phosphorimidazolides in the presence of poly(C) and then allowed to ligate, is well suited to this scheme: repeated cycles can be carried out on the surface of hydroxylapatite, whereas in the liquid phase the cycle could be achieved only with considerable difficulty.

  10. Parallel machines: Parallel machine languages

    SciTech Connect

    Iannucci, R.A. )

    1990-01-01

    This book presents a framework for understanding the tradeoffs between the conventional view and the dataflow view, with the objective of discovering the critical hardware structures which must be present in any scalable, general-purpose parallel computer to effectively tolerate latency and synchronization costs. The author presents an approach to scalable, general-purpose parallel computation, treating linguistic concerns, compiling issues, intermediate-language issues, and hardware/technological constraints as parts of a combined approach to architectural development, and introduces the notion of a parallel machine language.

  11. Higher dimensional massive bigravity

    NASA Astrophysics Data System (ADS)

    Do, Tuan Q.

    2016-08-01

    We study higher-dimensional scenarios of massive bigravity, which is a very interesting extension of nonlinear massive gravity since its reference metric is assumed to be fully dynamical. In particular, the Einstein field equations, along with the corresponding constraint equations, for both the physical and reference metrics of a five-dimensional massive bigravity will be addressed. Then, we study some well-known cosmological spacetimes such as the Friedmann-Lemaitre-Robertson-Walker, Bianchi type I, and Schwarzschild-Tangherlini metrics for the five-dimensional massive bigravity. As a result, we find that the massive graviton terms serve as effective cosmological constants in both the physical and reference sectors if a special scenario, in which the reference metrics are chosen to be proportional to the physical ones, is considered for all of the mentioned metrics. Thanks to the constancy of the massive graviton terms, consistent cosmological solutions can then be obtained.
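
    For orientation, the action usually taken as the starting point for such studies is the Hassan-Rosen bigravity action, sketched here in its standard five-dimensional form (quoted from the general literature, not from this paper):

      S = \frac{M_g^3}{2}\int d^5x\,\sqrt{-g}\,R[g]
        + \frac{M_f^3}{2}\int d^5x\,\sqrt{-f}\,R[f]
        + m^2 M_{\mathrm{eff}}^3 \int d^5x\,\sqrt{-g}
          \sum_{n=0}^{5}\beta_n\, e_n\!\big(\sqrt{g^{-1}f}\big)

    where g and f are the physical and reference metrics, the e_n are the elementary symmetric polynomials of the eigenvalues of the matrix square root of g^{-1}f, and the beta_n are free parameters. The "massive graviton terms" of the abstract are the contributions of the final interaction term to each set of field equations.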

  12. Efficient Ligation of the Schistosoma Hammerhead Ribozyme †

    PubMed Central

    Canny, Marella D.; Jucker, Fiona M.; Pardi, Arthur

    2011-01-01

    The hammerhead ribozyme from Schistosoma mansoni is the best characterized of the natural hammerhead ribozymes. Biophysical, biochemical, and structural studies have shown that the formation of the loop-loop tertiary interaction between stems I and II alters the global folding, cleavage kinetics, and conformation of the catalytic core of this hammerhead, leading to a ribozyme that is readily cleaved under physiological conditions. This study investigates the ligation kinetics and the internal equilibrium between cleavage and ligation for the Schistosoma hammerhead. Single-turnover kinetic studies on a construct where the ribozyme cleaves and ligates substrate(s) in trans showed up to 23% ligation when starting from fully cleaved products. This was achieved by a ~2,000-fold increase in the rate of ligation compared to a minimal hammerhead without the loop-loop tertiary interaction, yielding an internal equilibrium that ranges from 2 to 3 at physiological Mg2+ ion concentrations (0.1–1 mM). Thus, the natural Schistosoma hammerhead ribozyme is almost as efficient at ligation as it is at cleavage. The results here are consistent with a model where formation of the loop-loop tertiary interaction leads to a higher population of catalytically active molecules, and where formation of this tertiary interaction has a much larger effect on the ligation than on the cleavage activity of the Schistosoma hammerhead ribozyme. PMID:17319693

  13. Selective suppression of interleukin-12 induction after macrophage receptor ligation.

    PubMed

    Sutterwala, F S; Noel, G J; Clynes, R; Mosser, D M

    1997-06-02

    Interleukin (IL)-12 is a monocyte- and macrophage-derived cytokine that plays a crucial role in both the innate and the acquired immune response. In this study, we examined the effects that ligating specific macrophage receptors had on the induction of IL-12 by lipopolysaccharide (LPS). We report that ligation of the macrophage Fcgamma, complement, or scavenger receptors inhibited the induction of IL-12 by LPS. Both mRNA synthesis and protein secretion were diminished to near-undetectable levels following receptor ligation. Suppression was specific to IL-12 since IL-10 and tumor necrosis factor-alpha (TNF-alpha) production were not inhibited by ligating macrophage receptors. The results of several different experimental approaches suggest that IL-12 downregulation was due to extracellular calcium influxes that resulted from receptor ligation. First, preventing extracellular calcium influxes, by performing the assays in EGTA, abrogated FcgammaR-mediated IL-12(p40) mRNA suppression. Second, exposure of macrophages to the calcium ionophores, ionomycin or A23187, mimicked receptor ligation and inhibited IL-12(p40) mRNA induction by LPS. Finally, bone marrow-derived macrophages from FcR gamma chain-deficient mice, which fail to flux calcium after receptor ligation, failed to inhibit IL-12(p40) mRNA induction. These results indicate that the calcium influxes that occur as a result of receptor ligation are responsible for inhibiting the induction of IL-12 by LPS. Hence, the ligation of phagocytic receptors on macrophages can lead to a dramatic decrease in IL-12 induction. This downregulation may be a way of limiting proinflammatory responses of macrophages to extracellular pathogens, or suppressing the development of cell-mediated immunity to intracellular pathogens.

  14. Gang scheduling a parallel machine

    SciTech Connect

    Gorda, B.C.; Brooks, E.D. III.

    1991-03-01

    Program development on parallel machines can be a nightmare of scheduling headaches. We have developed a portable time-sharing mechanism to handle the problem of scheduling gangs of processes. User programs and their gangs of processes are put to sleep and awakened by the gang scheduler to provide a time-sharing environment. Time quanta are adjusted according to priority queues and a system of fair-share accounting. The initial platform for this software is the 128-processor BBN TC2000 in use in the Massively Parallel Computing Initiative at the Lawrence Livermore National Laboratory.
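
    The gang-scheduling idea reduces to rotating whole gangs of processes on and off the machine each time quantum, so that cooperating processes always run (and synchronize) together. A toy sketch (ours; job names, sizes, and the quantum are invented):

      from collections import deque

      N_PROCESSORS = 8
      QUANTUM = 2  # ticks each gang holds the machine (hypothetical)
      gangs = deque([("jobA", 8), ("jobB", 4), ("jobC", 8)])  # (name, procs)

      remaining = QUANTUM
      for tick in range(6):
          name, procs = gangs[0]
          assert procs <= N_PROCESSORS   # the whole gang must fit at once
          print(f"t={tick}: gang {name} holds {procs} processor(s)")
          remaining -= 1
          if remaining == 0:        # quantum expired: context-switch the
              gangs.rotate(-1)      # whole gang out, bring in the next
              remaining = QUANTUM   # (fair-share accounting would vary this)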

  16. Implementation of the T-matrix method on a massively parallel machine: a comparison of hexagonal ice cylinder single-scattering properties using the T-matrix and improved geometric optics methods

    NASA Astrophysics Data System (ADS)

    Havemann, S.; Baran, A. J.; Edwards, J. M.

    2003-09-01

    This paper describes a series of improvements to an existing T-matrix implementation for calculating the single-scattering properties of non-axisymmetric particles at large size parameters. The improvements aim to use the computational resources, in terms of memory and time, as efficiently as possible by taking into account various simplifications of the T-matrix formulation. In this paper the randomly oriented hexagonal ice crystal is considered and the orientational averaging is done analytically. Within the framework of the T-matrix approach, symmetries of the hexagonal cylinders allow the formulation of eight independent subproblems, for each of which the memory demand is reduced by a factor of 144 compared to the general formulation. The memory demand strongly increases as a function of size parameter; for non-axisymmetric particles it rises approximately as the fourth power of the size parameter. In order to have more memory available, the T-matrix code was parallelized and implemented on a multi-processor CRAY T3E supercomputer, with extended precision emulated in software. With these new developments the single-scattering properties of randomly oriented hexagonal columns can be calculated for size parameters up to about 40. This size parameter reaches into the region of Improved Geometric Optics (IGO), which is applicable down to size parameters of around 20. Comparisons of IGO and T-matrix results for the single-scattering albedo and the extinction efficiency show that at a size parameter of around 30, the single-scattering albedos generally agree to within about 2%. IGO underpredicts the extinction efficiency by 4% at λ = 3.775 μm and by 11% at the more absorbing wavelength of λ = 4.9 μm. This underprediction may be due to the version of IGO used, since it does not take into account the inhomogeneity of the refracted wave in an absorbing medium.

  17. Analysis of Genes, Transcripts, and Proteins via DNA Ligation

    NASA Astrophysics Data System (ADS)

    Conze, Tim; Shetye, Alysha; Tanaka, Yuki; Gu, Jijuan; Larsson, Chatarina; Göransson, Jenny; Tavoosidana, Gholamreza; Söderberg, Ola; Nilsson, Mats; Landegren, Ulf

    2009-07-01

    Analytical reactions in which short DNA strands are used in combination with DNA ligases have proven useful for measuring, decoding, and locating most classes of macromolecules. Given the need to accumulate large amounts of precise molecular information from biological systems in research and in diagnostics, ligation reactions will continue to offer valuable strategies for advanced analytical reactions. Here, we provide a basis for further development of methods by reviewing the history of analytical ligation reactions, discussing the properties of ligation reactions that render them suitable for engineering novel assays, describing a wide range of successful ligase-based assays, and briefly considering future directions.

  18. Convergent synthesis of proteins by kinetically controlled ligation

    DOEpatents

    Kent, Stephen; Pentelute, Brad; Bang, Duhee; Johnson, Erik; Durek, Thomas

    2010-03-09

    The present invention concerns methods and compositions for synthesizing a polypeptide using kinetically controlled reactions involving fragments of the polypeptide for a fully convergent process. In more specific embodiments, a ligation involves reacting a first peptide having a protected cysteyl group at its N-terminus and a phenylthioester at its C-terminus with a second peptide having a cysteine residue at its N-terminus and a thioester at its C-terminus to form a ligation product. Subsequent reactions may involve deprotecting the cysteyl group of the resulting ligation product and/or converting the thioester into a thiophenylester.

  19. Parallel pipelining

    SciTech Connect

    Joseph, D.D.; Bai, R.; Liao, T.Y.; Huang, A.; Hu, H.H.

    1995-09-01

    In this paper the authors introduce the idea of parallel pipelining for water-lubricated transportation of oil (or other viscous material). A parallel system can have major advantages over a single pipe with respect to the cost of maintenance and continuous operation of the system, the pressure gradients required to restart a stopped system, and the reduction and even elimination of the fouling of pipe walls in continuous operation. The authors show that the action of capillarity in small pipes is more favorable for restart than in large pipes. In a parallel pipeline system, they estimate the number of small pipes needed to deliver the same oil flux as in one larger pipe as N = (R/r)^α, where r and R are the radii of the small and large pipes, respectively, and α = 4 or 19/7 when the lubricating water flow is laminar or turbulent.
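
    A worked example of the pipe-count estimate (radii invented for illustration): for R/r = 5, the formula gives N = 5^4 = 625 small pipes in the laminar case and N = 5^(19/7) ≈ 79 in the turbulent case, as the short computation below confirms.

      # Evaluate N = (R/r)**alpha for the two flow regimes in the abstract.
      R, r = 0.5, 0.1  # radii in metres (hypothetical)
      for label, alpha in [("laminar", 4.0), ("turbulent", 19 / 7)]:
          print(f"{label}: N = {(R / r) ** alpha:.1f}")
      # prints N = 625.0 (laminar) and N = 78.9 (turbulent)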

  20. Formatting and ligating biopolymers using adjustable nanoconfinement

    NASA Astrophysics Data System (ADS)

    Berard, Daniel J.; Shayegan, Marjan; Michaud, Francois; Henkin, Gil; Scott, Shane; Leslie, Sabrina

    2016-07-01

    Sensitive visualization and conformational control of long, delicate biopolymers present critical challenges to emerging biotechnologies and biophysical studies. Next-generation nanofluidic manipulation platforms strive to maintain the structural integrity of genomic DNA prio