Loo, Lit-Hsin; Laksameethanasan, Danai; Tung, Yi-Ling
2014-03-01
Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein functions and how these functions were acquired in cells from different organisms or species. A public web interface of PLAST is available at http://plast.bii.a-star.edu.sg. PMID:24603469
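As a rough illustration of what comparing quantitative localization signatures can look like, the sketch below ranks proteins by the similarity of their numeric profiles. It is not PLAST's actual method: the abstract does not state which features or dissimilarity measure PLAST uses, so the three-element profiles, the protein names, and the choice of cosine similarity are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must be non-zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, profiles):
    """Return (name, similarity) pairs for all other proteins, best match first."""
    q = profiles[query]
    scores = [
        (name, cosine_similarity(q, vec))
        for name, vec in profiles.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles: each entry is an invented feature value, not real data.
profiles = {
    "protein_A": [0.8, 0.1, 0.1],
    "protein_B": [0.7, 0.2, 0.1],
    "protein_C": [0.0, 0.1, 0.9],
}

ranking = rank_by_similarity("protein_A", profiles)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A continuous score of this kind is what allows proteins to be ordered by how alike their patterns are, which a discrete compartment label cannot do.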
Priolo, M; Lerone, M; Rosaia, L; Calcagno, E P; Sadeghi, A K; Ghezzi, F; Ravazzolo, R; Silengo, M
2000-10-01
We report a boy with prominent, peculiarly malformed ears, abnormality of the ramus of the mandible and hypotonia. An isolated peculiar bilateral ear deformity named 'question mark ear' has been delineated in plastic reconstruction surgery reviews [Cosman et al., 1970 Plast Reconstr Surg 46:454-457; Cosman (1984) Plast Reconstr Surg 73:572-576; Takato et al. (1989) Ann Plast Surg 22:69-73; Brodovsky (1997) Plast Reconstr Surg 100:1254-1257; Park (1998) Plast Reconstr Surg 101:1620-1623; Al-Quattan (1998) Plast Reconstr Surg 102:439-441] and a similar deformity of the ear and changes in the temporo-mandibular joint and condyle has been described by Jampol et al. [(1998) Am J Med Genet 75:449-452] and by Guion-Almeida et al. [(1999) Am J Med Genet 86:130-133]. The present case may be the third description of this malformation complex with additional clinical features characterized by hypotonia and mild developmental delay, or possibly a new distinct entity.
Ramkumar, S; Narayanan, V; Laing, J H E
2006-01-01
The perceived benefits of bandaging for 10 days following pinnaplasty have been questioned by previous studies. The problems arising from these dressings are many [Powell BWEM. The value of head dressings in the postoperative management of the prominent ear. Br J Plast Surg 1989;42:692-4. Bartley J. How long should ears be bandaged after otoplasty? J Laryngol Otol 1998;112:531-2. Wong MC, Sylaidis P. Head dressings for pinnaplasty: a tradition not supported by evidence. Br J Plast Surg 2001;54:81-2], including their slippage [Powell BWEM. The value of head dressings in the postoperative management of the prominent ear. Br J Plast Surg 1989;42:692-4. Bradbury ET, Hewison J, Timmons MJ. Psychological and social outcome of prominent ear correction in children. Br J Plast Surg 1992;45:97-100. Jeffery SLA. Complications following correction of prominent ears: an audit review of 122 cases. Br J Plast Surg 1999;52:588-90]. Eighty children were recruited into a prospective randomised controlled trial comparing the use of a head bandage for only 24 h with the standard practice of a 10-day head bandage. A preoperative measurement of the lateral ear projection (LEP) was made. The outcome measures recorded during the two planned postoperative visits at 10 days (visit 1) and 2 months (visit 2) were: patient satisfaction score, LEP, complications and any unscheduled hospital visits associated with the surgery. There was no significant difference in LEP or patient satisfaction between the two groups at either scheduled postoperative visit. Differences between the groups in the number of unscheduled visits (p=0.21) did not reach statistical significance. The findings indicate that it is safe and effective to use a head bandage for only 24 h following surgical correction of prominent ears. This study shows no benefit from the application of a formal head bandage for any longer than 1 day.
Freshwater, M Felix
2015-11-01
Laboratory animal research must be designed in a manner that minimizes bias if it is to yield valid and reproducible results. In 2009, a survey that examined 271 animal studies found that 87% did not use randomization and 86% did not use blinding. This has been called "research waste" because it wastes time and resources. This systematic review measured the quantity of research waste in plastic surgery journals in 2014. The PRISMA-P protocol was used. SCOPUS and PubMed searches were done for all animal studies published in 2014 in Aesthetic Plast Surg, Aesthet Surg J, Ann Plast Surg, JPRAS, J Plast Surg Hand Surg and Plast Reconstr Surg. These were supplemented by manual searches of the 2014 issues not indexed. Articles were analyzed for descriptions of randomization, randomization methodology, allocation concealment, and blinding of the primary outcome assessment. Corresponding authors who mentioned randomization without elaborating were emailed for details. 112 of 154 articles met the inclusion criteria. Only 24/112 (21.4%) had blinding of the primary outcome measure, and only 28/110 (25.5%) of the articles that required randomization mentioned it. While 12/28 articles clearly described randomizing the intervention, only 4/28 described the method of randomization, and 2/28 mentioned allocation concealment. Only two authors responded and described the randomization methodology. The quality of plastic surgery laboratory animal research published in 2014 was poor. Use of the National Centre for the Replacement Refinement & Reduction of Animals in Research's "Animal Research: Reporting In Vivo Experiments" (ARRIVE) Guidelines by authors, and enforcement of them by editors and reviewers could improve research quality and reduce waste. Copyright © 2015 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
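The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below recomputes them from the counts given in the text; the helper name `percent` and the dictionary labels are introduced here for illustration only, and the percentages for 12/28, 4/28 and 2/28 are computed values that the abstract itself does not state.

```python
def percent(numerator, denominator, digits=1):
    """Percentage of numerator over denominator, rounded to `digits` places."""
    return round(100.0 * numerator / denominator, digits)


# Counts taken from the abstract; the labels are paraphrases.
counts = {
    "met inclusion criteria": (112, 154),
    "blinded primary outcome": (24, 112),
    "mentioned randomization": (28, 110),
    "described randomizing the intervention": (12, 28),
    "described randomization method": (4, 28),
    "mentioned allocation concealment": (2, 28),
}

for label, (num, den) in counts.items():
    print(f"{label}: {num}/{den} = {percent(num, den)}%")
```

The two percentages the abstract reports (21.4% and 25.5%) agree with the counts it gives.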
Sternal exploration or closure
... Beauchamp RD, Evers BM, Mattox KL, eds. Sabiston Textbook of Surgery . 20th ed. Philadelphia, PA: Elsevier; 2017:chap 12. Singh K, Anderson E. Harper JG. Overview and management of sternal wound infection. Semin Plast Surg . 2011; ...
Using DDGS in industrial materials
USDA-ARS's Scientific Manuscript database
Adding biological materials as fillers to plastics can enhance any existing biodegradability or provide biodegradability where none had previously existed. One potential biofiller is DDGS. In fact, several studies have been conducted recently that have investigated the use of DDGS in various plast...
A Farewell to Harms: The Audacity to Design Safer Products
The unprecedented explosion of the manufacture and adoption of synthetic chemicals into commerce after World War II introduced a panoply of products and materials that improved the standard of living for many people and spawned a multitrillion dollar chemical industry. From plast...
USDA-ARS's Scientific Manuscript database
Reniform nematodes (Rotylenchulus reniformis) from pot culture were attached with in vitro produced Pasteuria spp. spores using a centrifuge attachment technique that resulted in 40-50% of the vermiform nematodes with spores adhering to their cuticles. Attached nematodes were placed into small plast...
Management of Complex Abdominal Wall Defects Associated with Penetrating Abdominal Trauma
2014-05-09
recruitment): a new method of wound closure. Ann Plast Surg 2005;55:660–4. 8 Ramirez OM, Ruas E, Dellon AL. ‘Components separation’ method for closure of...patients with open abdomens closed by either permanent mesh, vicryl mesh or a modification of Ramirez ’ original method of components separation. These
Modeling Finite Deformations in Trigonal Ceramic Crystals with Lattice Defects
2010-02-08
International Journal of Plasticity 26 (2010) 1357–1386 1385Farber, Y.A., Yoon, S.Y., Lagerlof, K.P.D., Heuer, A.H., 1993. Microplasticity during high... microplasticity -induced deformation in uniaxially strained ceramics by 3-D Voronoi polycrystal modeling. Int. J. Plast. 21, 801–834. Zhang, C., Kalia, R.K
Factors affecting the efficacy of a vinegar trap for Drosophila suzukii (Diptera: Drosophilidae)
USDA-ARS's Scientific Manuscript database
Studies were conducted to develop an optimized, economical trap for monitoring the spotted wing fruit fly, Drosophila suzukii Matsumura. Flies were attracted to dark colors ranging from red to black compared with low attraction to white, yellow, and light blue. Similarly, fly catches in 237 ml plast...
Prefailure Evaluation Techniques for Marine Coatings
1971-01-01
Rozenfeld, et al(269) Naumova et al(261), Morozumi, et al(56), Baczewski , et al (34), Holtzman(48), Fridman(42), and Mori, et al(215) have all studied...resistance of paints by d.c. pulses", Plaste Kaut., 18 (12), 931-5 (1971). Baczewski , J. and Zagrodzhi, S., "Evaluation of the protective properties
Beckett, K S; Gault, D T
2006-01-01
Prominent ear correction is a common operation. Complication as a result of infection has been quoted at between 3% and 5% [Calder JC, Nasaan A. Morbidity of otoplasty: a review of 562 consecutive cases. Br J Plast Surg 1994;47:170-4 and Jeffery SLA. Complications following correction of prominent ears: an audit review of 122 cases. Br J Plast Surg 1999;52:588-90.]. We present two cases referred for ear reconstruction following catastrophic post-operative infection at the time of pinnaplasty, leaving each patient with significant helical rim deformities. Both patients displayed evidence of active post-auricular eczema at the time of their primary surgery. Dermatological research has highlighted the increased colonisation of Staphylococcus aureus in particular within areas of atopic eczema in comparison to normal skin. We advise delaying ear surgery in the presence of a rash in view of the potentially devastating complications that may result. This approach may be extended to all cutaneous surgery where treatment of the rash is advocated prior to embarking on an elective surgical procedure.
Female Rats are Less Susceptible during Puberty to Lethal Effects of Percutaneous Exposure to VX
2015-12-17
Harvell, I. Hussona-Saeed, H.I. Maibach, Changes in transepidermal water loss and cutaneous blood flow during the menstrual cycle, Contact ... Dermatitis 27 (1992) 294–301. [13] A. Leung, S. Balaji, S.G. Keswani, Biology and function of fetal and pediatric skin, Facial Plast Surg. Clin. North Am. 21
Federal Register 2010, 2011, 2012, 2013, 2014
2011-09-29
... food packaging solutions. EuroPlast, Ltd 100 S. Industrial Lane, 9/9/2011 The firm manufactures plastic... Centech Road, Omaha, 9/19/2011 The firm designs and NE 68138. manufactures equipment used to manufacture... manufactures custom Covington, LA 70433. cabinetry and millwork. Oakridge Seafood, LLC 3408 E. Old Spanish 9/12...
Engineered Joint Lubrication for OA Prevention and Treatment
2015-09-01
Williams, C. G., Khan, M., Manson, P. & Elisseeff, J .H. In vivo chondrogenesis of mesenchymal stem cells in photopolymerized hydrogel. Plast...protecting cells from free-radical damage20–22. Coating surfaces with HA may also physically protect the surfaces from cytokines and degrading enzymes...modification provides a biomimetic mechanism to concentrate HA on the surface. Numerous endogenous enzymes and reactive oxygen species can degrade HA
Adaptive multi-time-domain subcycling for crystal plasticity FE modeling of discrete twin evolution
NASA Astrophysics Data System (ADS)
Ghosh, Somnath; Cheng, Jiahao
2018-02-01
Crystal plasticity finite element (CPFE) models that account for discrete micro-twin nucleation-propagation have been recently developed for studying complex deformation behavior of hexagonal close-packed (HCP) materials (Cheng and Ghosh in Int J Plast 67:148-170, 2015, J Mech Phys Solids 99:512-538, 2016). A major difficulty with conducting high fidelity, image-based CPFE simulations of polycrystalline microstructures with explicit twin formation is the prohibitively high demand on computing time. High strain localization within fast propagating twin bands requires very fine simulation time steps and leads to enormous computational cost. To mitigate this shortcoming and improve the simulation efficiency, this paper proposes a multi-time-domain subcycling algorithm. It is based on adaptive partitioning of the evolving computational domain into twinned and untwinned domains. Based on the local deformation rate, the algorithm accelerates simulations by adopting different time steps for each sub-domain. The sub-domains are coupled back after coarse time increments using a predictor-corrector algorithm at the interface. The subcycling-augmented CPFEM is validated with a comprehensive set of numerical tests. Significant speed-up is observed with this novel algorithm without any loss of accuracy, which is advantageous for predicting twinning in polycrystalline microstructures.
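The general idea of subcycling, in which fast-evolving regions take many fine time steps while slow regions take one coarse step, can be shown with a toy problem. The sketch below is not the authors' CPFE algorithm: it integrates independent scalar decay equations dy/dt = -k*y with explicit Euler, partitions cells by a rate threshold, and omits the predictor-corrector interface coupling entirely. The function name, the threshold and all numeric values are assumptions chosen for illustration.

```python
def subcycled_step(values, rates, coarse_dt, rate_threshold, n_sub):
    """Advance every cell of dy/dt = -rate * y by one coarse time increment.

    Cells whose rate exceeds `rate_threshold` (the "fast" sub-domain) are
    integrated with `n_sub` fine explicit-Euler steps; all other cells take
    a single coarse step. All cells end at the same time level.
    """
    new_values = []
    for y, k in zip(values, rates):
        if k > rate_threshold:
            steps, dt = n_sub, coarse_dt / n_sub  # fine steps for fast cells
        else:
            steps, dt = 1, coarse_dt  # one coarse step for slow cells
        for _ in range(steps):
            y = y + dt * (-k * y)  # explicit Euler update
        new_values.append(y)
    return new_values


# Toy setup: one slow cell and one fast cell (values are illustrative only).
values = [1.0, 1.0]
rates = [0.1, 50.0]

subcycled = subcycled_step(values, rates, coarse_dt=0.1, rate_threshold=1.0, n_sub=100)
# Forcing a single coarse step everywhere (threshold set above all rates).
uniform_coarse = subcycled_step(values, rates, coarse_dt=0.1, rate_threshold=1e9, n_sub=100)

print("subcycled:     ", subcycled)
print("uniform coarse:", uniform_coarse)
```

In this toy case the single coarse step drives the fast cell to a negative, non-physical value, while the subcycled run keeps it positive and decaying, and the slow cell costs only one update per increment. Coupling between neighbouring sub-domains, which the paper handles with a predictor-corrector scheme, would need to be added for any real problem.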
High-Energy Trauma and Damage Control in the Lower Limb
2010-01-01
Reconstruction : From Microsurgery Reconstruction to Transplantation; Guest Editors, Chih-Hung Lin, M.D., and Fu-Chan Wei, M.D. Semin Plast Surg...continue intraoperatively.12–14 The goal is to achieve hemostasis, restore normal physiology, and potentially complete a vascular reconstruction upon...injuries and the need for vascular reconstruction at the time of admission is crucial to the success of grafting and maximizes the chances of limb
Etiology and Treatment of Congenital Festoons.
Asaadi, Mokhtar
2018-04-18
Festoons and malar bags present a particular challenge to the plastic surgeon and commonly persist after the traditional lower blepharoplasty. They are more common than we think and a trained eye will be able to recognize them. Lower blepharoplasty in these patients requires addressing the lid-cheek junction and midcheek using additional techniques such as orbicularis retaining ligament (ORL) and zygomaticocutaneous ligament (ZCL) release, midface lift, microsuction, or even direct excision (Kpodzo et al. in Aesthet Surg J 34(2):235-248, 2014; Goldberg et al. in Plast Reconstr Surg 115(5):1395-1402, 2005; Mendelson et al. in Plast Reconstr Surg 110(3):885-896, 2002). The goal in these patients is to restore a smooth contour from the lower eyelid to the cheek. The review of literature shows the need for more than one surgery for treatment of the festoons (Furnas in Plast Reconstr Surg 61(4):540-546, 1978). One of the reasons why these cases are so challenging is that the festoons tend to persist even after surgical treatment. As Furnas said, "Malar mounds have acquired some notoriety for their persistence in the face of surgical efforts to remove them" (Furnas in Clin Plast Surg 20(2):367-385, 1993). This could be due to different etiology between acquired and congenital festoons. There are currently no cases of congenital festoons described in the literature. In the last 10 years, we have treated a total of 59 patients with festoons or malar mounds. We used the terminology of festoon for acquired cases and malar mound for congenital ones (Kpodzo et al. 2014). We were successful with treating 56 patients who developed acquired festoons later on in life; however, three cases required an additional treatment to improve residual puffiness that they had after the first operation. From the above findings, we hypothesized that there should be something common to patients with congenital festoons or malar mounds that differs from acquired festoons.
All three of these patients had one thing in common: a history of puffiness of the prezygomatic space since childhood. Each of these patients reported that the condition had been present since a young age but had become worse with aging. To date, there are no descriptions of the cause or treatment for congenital festoons. Here, we present the first case series of three patients with congenital festoons. We discuss the possible etiology of congenital festoons, the physical exam, and the surgical approaches. We performed a retrospective review of 59 patients who had surgical correction of festoons in the past 10 years, three of whom had festoons present since childhood. In this paper, we will discuss the pathophysiology and the surgical treatments for congenital festoons. Only patients with festoons present since birth were included. The first two cases were treated with a subciliary blepharoplasty with release of the orbicularis retaining and zygomaticocutaneous ligaments and midface lift with canthopexy and orbicularis muscle suspension. The third case had a subciliary lower blepharoplasty approach, skin and muscle flap, and direct excision of the fat through the orbicularis from the subcutaneous space. In addition, each patient required further treatments to address supra-orbicularis fat by various methods. All patients with acquired festoons had successful results with one operation by subciliary skin muscle flap, release of the ORL and ZCL, midface lift, and muscle suspension. All three patients with congenital festoons had residual puffiness that required surgical and non-surgical treatments. There were no complications. Our first case required three surgical treatments for complete correction. The second and third cases required Kybella injections after their initial surgical treatments.
The specimen of the first patient (Fig. 10), who had direct excision, showed localized fat collection immediately under the skin and above the orbicularis oculi muscle. Correction of congenital festoons or malar mounds requires a combination of subciliary lower blepharoplasty with skin muscle flap, midface lift, and orbicularis muscle suspension, as well as addressing the supra-orbicularis fat via direct excision, off-label Kybella injection or liposuction. This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .
2012-04-01
M. Winey and Y. M. Gupta, J. Appl. Phys. 107, 103505 (2010). 13R. Becker, Int. J. Plast. 20, 1983 (2004). 14B. Olinger, B. Roof, and H. H. Cady ...f011g, f021g (010), f011g, 021ð Þ 063512-8 J. D. Clayton and R. Becker J. Appl. Phys. 111, 063512 (2012) 18H. H. Cady , J. Chem. Eng. Data 17, 369
Gao, Jianwei; Qiu, Xiangyu; Li, Xinying; Fan, Hang; Zhang, Fang; Lv, Tangfeng; Song, Yong
2018-04-06
Exosomes are membrane-bound, virus-sized vesicles present in circulating blood. Tumor cells are avid producers of exosomes, which are thought to mimic molecular features of parent tumor cells. T-cell immunoglobulin- and mucin-domain-containing molecule 3 (Tim-3) is a next-generation immune checkpoint that can be activated by its ligand Galectin-9 to negatively regulate the anti-tumor immune response. However, the characteristics of plasma exosomal Tim-3 and Galectin-9 (Exo-T/G) in cancer remain unknown. This study was conducted to investigate the expression patterns and clinical value of plasma exosomal total protein (Exo-pro) and Exo-T/G in non-small cell lung cancer (NSCLC). Plasma was collected from 103 NSCLC patients, including 60 early-stage and 43 advanced-stage disease samples, as well as 56 healthy subjects. Exosomes were isolated from plasma by commercial exosome precipitation solution and identified by western blotting of CD63 and transmission electron microscopy. Exo-pro concentration was measured by the BCA assay. Enzyme-linked immunosorbent assay was used to quantify Exo-T/G. Additionally, 34 NSCLC samples were used to directly detect plasma TIM-3 (Plas-T) and Galectin-9 (Plas-G). Our results showed that Exo-pro, Exo-T, and Exo-G were significantly increased in NSCLC plasma compared to that in the healthy samples. High levels of Exo-T and Exo-G were both positively correlated with several malignant parameters, including larger tumor size, advanced stages, and more distant metastasis. High levels of Exo-pro and Exo-T were also correlated with more lymph node metastasis. Additionally, plasma from lung squamous cell carcinoma showed higher Exo-T and Exo-G compared with that from lung adenocarcinoma. ALK-positive patients showed decreased Exo-T and Exo-G levels. Pearson's correlation analysis revealed a significant correlation between Exo-pro and Exo-T/G, Exo-T and Exo-G, Exo-T and Plas-T, Exo-G and Plas-G, and Plas-T and Plas-G.
Together, our data revealed that Exo-pro and, especially, Exo-T and Exo-G could be potential biomarkers for NSCLC. Further studies focusing on pure tumor-derived exosomes isolated from plasma are needed. Copyright © 2018 Elsevier Inc. All rights reserved.
1999-11-03
increases in the posterior and anterior implants of 0.42 mm/year and 0.14 mm/year, respectively. Korn and Baumrind also found no evidence to...0.23 mm/year, respectively. Also reporting asymmetric growth of the maxilla, Bjork and Skieller,34 and Korn and Baumrind observed posterior midpalatal...Skieller V. Growth in width of the maxilla studied by the implant method. Scand J Plast Reconstr Surg 1974;8:26-33. 35. Korn EL, Baumrind S
Perforator Peroneal Artery Flap for Tongue Reconstruction.
Chauhan, Shubhra; Chavre, Sachin; Chandrashekar, Naveen Hedne; B S, Naveen
2017-03-01
Reconstruction has evolved a long way from primary closure to flaps. Over time, a better understanding of flap vascularity has led to the development of innovative reconstructive techniques. These flaps can be raised from various parts of the body for reconstruction and have shown minimal donor site morbidity. We use one such flap, the peroneal artery perforator flap, for tongue reconstruction, with the advantages of a thin pliable flap, minimal donor site morbidity and a hidden scar. Our patient, a 57-year-old lady, underwent wide local excision with selective neck dissection. Perforators are marked about 10 and 15 cm inferior to the fibular head using a hand-held Doppler. The leg is positioned to give better exposure during dissection of the flap, and the flap is harvested under a tourniquet with the pressure kept at 350 mm Hg. The perforator is kept at an eccentric location so as to gain pedicle length. The skin incision is placed over the peroneal muscle and deepened down to the deep fascia; the dissection is then continued over the muscle and the perforator arising from the lateral septum. The proximal perforator, about 10 cm from the fibular head, is a constant and larger perforator, which is traced up to the peroneal vessel. We could obtain a pedicle length of 6 cm. Finally, the flap is islanded on this perforator, the pedicle is ligated and the flap is harvested. Anastomosis was done to the ipsilateral facial vessels. The donor site is closed primarily; in the upper half, one can harvest a flap 5 cm wide and 8 to 12 cm long without requiring a skin graft.
Various local and free flaps have been used for reconstruction of partial tongue defects, each with its own donor site problems, such as less pliable skin and inadequate tissue from local flaps and the sacrifice of an important artery in the radial forearm flap, which serves as the workhorse in reconstruction of partial tongue defects. The concept of supermicrosurgery was popularized by the Japanese in the 1980s, and the concept of the angiosome proposed by Taylor paved the way for the development of new flaps. True perforator flaps are those where the source vessel is left undisturbed and the overlying skin flap is raised. Yoshimura proposed that a cutaneous flap could be raised from the peroneal artery (Br J Plast Surg 42:715-718, 1989). Wolff et al. (Plast Reconstr Surg 113:107-113, 2004) first used a perforator-based peroneal artery flap for oral reconstruction. The location of the perforators varies; hence preoperative localisation can be done by ultrasound Doppler, CT angiography or MR angiography. Disadvantages compared with the radial flap include the varying anatomic location of perforators, the need for imaging and the difficult dissection of delicate vessels through muscle, and hence a learning curve. Our patient had an arterial thrombus within a few hours post-operatively, which was successfully salvaged with immediate re-exploration and re-anastomosis of the artery. Post-operative healing was uneventful and the donor site was closed primarily without the need for a graft. The perforator peroneal flap is a useful addition to the armamentarium for reconstruction of moderate-size defects of the tongue, buccal mucosa and floor of mouth, with the advantages of a thin pliable flap, minimal donor site morbidity and a hidden scar.
Two-stage hypospadias repair: audit in a district general hospital.
Price, R D; Lambe, G F; Jones, R P
2003-12-01
The number of techniques for hypospadias repair is testament to the challenges associated with this condition. In 1994, the senior author undertook an audit of his repairs using the van der Meulen [Plast. Reconstr. Surg. 59 (1977) 20615] technique and determined that the revision rate of 11% was unsatisfactory and the cosmetic result sub-optimal. He, therefore, retrained and began in 1995, using the two-stage technique popularised by Bracka [Br. J. Plast. Surg. 48 (1995) 345]. We undertook an audit of all corrections performed in the period from September 1995 to March 2002. The computer database in the main theatre suite was used to identify all patients on whom such a repair had been undertaken and those notes retrieved. Data was collected on a number of variables including age at operations, complications such as urinary tract infection and fistulae, and total number of corrective operations. One hundred and nineteen patients were identified, of which seven had no records available. Of the remaining 112, 81 were primary repairs, in whom the complication rate was 2.5% for stage I (graft loss) and 9.8% for stage II (fistula rate 7.4%, stenosis 1.2%, baggy urethra requiring reconstruction 1.2%). The remaining 31 patients were those with unsatisfactory single-stage repairs and in this group, graft loss was seen in three cases (10%). The fistula rate was 4/31 (12.9%) and the stenosis rate 2/31 (6.5%). These results compare favourably with a number of published series from surgeons who have super-specialised in this field. We conclude that the two-stage repair is a useful and reliable technique in the hands of a Plastic Surgeon who has a broader interest.
Aminoacyl-tRNA synthetases database Y2K.
Szymanski, M; Barciszewski, J
2000-01-01
The aminoacyl-tRNA synthetases (AARS) are a diverse group of enzymes that ensure the fidelity of transfer of genetic information from DNA into protein. They catalyse the attachment of amino acids to transfer RNAs and thereby establish the rules of the genetic code by virtue of matching the nucleotide triplet of the anticodon with its cognate amino acid. Currently, 818 AARS primary structures have been reported from archaebacteria, eubacteria, mitochondria, chloroplasts and eukaryotic cells. The database is a compilation of the amino acid sequences of all AARSs known to date, which are available as separate entries or alignments of related proteins via the WWW at http://rose.man.poznan.pl/aars/index.html
Hwee, Yin Kan; Park, Daniel; Vinas, Marisa; Litts, Christopher; Friedman, David
2017-08-01
Collagenase clostridium histolyticum (CCH) injection is an alternative to surgery for patients with Dupuytren disease (DD) of the metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joints. The success of surgical and nonsurgical treatment modalities for DD is reported to vary widely between 25% and 80% (J Bone Joint Surg Am. 1985;67:1439-1443; Plast Reconstr Surg. 2007;120:44e-54e; J Bone Joint Surg Am. 2007;89:189-198; J Hand Surg Am. 2011;36:936-942; J Hand Surg Am. 1990;15:755-761; J Hand Surg Br. 1996;21:797-800; J Bone Joint Surg Br. 2000;82:90-94; Plast Reconstr Surg. 2005;115:802-810; Ann Plast Surg. 2006;57:13-17). This study presents the outcomes of patients with DD contractures treated with CCH injections at a single institution. An institutional review board-approved retrospective study was conducted of patients with DD of the hand treated with CCH injections in a single institution from February 2010 to April 2015. All patients received the recommended dose of 0.58 mg of CCH and returned for joint manipulation the following day. Data for follow-up at 7 and 30 days after injection and up to 5 years for patients who returned seeking further therapy for recurrent symptoms were reviewed. One hundred thirteen patients with a total of 146 affected joints (72 MCP; 74 PIP) were treated with CCH injections (95 males; 18 females; age, 40-92 y). Successful CCH therapy occurred in 75% of injected joints (109/146 joints; 59 MCP; 50 PIP), as defined by less than 5 degrees of contracture after treatment. Twenty-three percent of treated joints had partial correction (34/146 joints; 13 MCP; 21 PIP), as defined by between 5 and 30 degrees of residual contracture after treatment. Three patients (2%) had a failure of treatment, as defined by unchanged or worsened contracture from pretreatment baseline measurements.
Fifteen patients (13%) returned to the clinic seeking additional therapy for recurrent joint contracture symptoms in 17 joints over a span of 1.5 months to 4 years after initial successful or partially successful treatment (17/143, 12%; 5 MCP; 12 PIP). Recurrence was defined as patients who sought treatment for a return of symptoms or greater than 20 degrees contracture in the setting of a palpable cord after initial full or partial contracture correction. Our 5-year outcome of CCH injections for DD contractures revealed full correction in 75% and partial correction in 23% of treated joints, with failure of treatment seen in only 2% of patients. Thirteen percent of the patients returned for additional treatment because of symptoms resulting from contracture recurrence in 12% of initially corrected or partially corrected joints. These positive outcomes are comparable with current surgical treatment modalities (J Hand Surg Am. 1990;15:755-761; J Bone Joint Surg Am. 1962;44B:602-613; J Clin Epidemiol. 2000;53:291-296). The use of CCH injections is an important nonsurgical treatment alternative for DD contractures of the MCP and PIP joints.
Heading in the right direction? An innovative approach toward proper patient head positioning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grush, William H.; Steffen, Gary A
2002-12-31
An in-house-manufactured modification of the standard A-F foam rubber head-neck supports (aka. Timo Supports) was designed to eliminate clinical setup problems with head immobilization and instability during treatment, thus providing for a more comfortable head rest for the patient. The custom design of this head holder seeks to eliminate superior-to-inferior shift, and minimize the lateral right-to-left rotational movement of the head when coupled with an AquaPlast casting system. By focusing attention to the seating of the occipital portion of the head and contour of the patient's neck, the aforementioned problems of movement were addressed, while adhering to the interests of patient comfort in this modified head support system.
SGR9, a RING type E3 ligase, modulates amyloplast dynamics important for gravity sensing.
NASA Astrophysics Data System (ADS)
Morita, Miyo T.; Nakamura, Moritaka; Tasaka, Masao
Gravitropism is triggered when the directional change of gravity is sensed in specific cells called statocytes. In higher plants, statocytes contain dense, sedimenting amyloplasts, which are plastids that accumulate starch granules. The displacement of amyloplasts within the statocytes is thought to be the initial event of gravity perception. We have demonstrated that endodermal cells are most likely to be the statocytes in Arabidopsis shoots. Live cell imaging of endodermal cells of the stem has shown that most amyloplasts sediment in the direction of gravity but are not static. Several amyloplasts move dynamically in an actin filament (F-actin) dependent manner. In the presence of an actin polymerization inhibitor, all amyloplasts become static and sediment in the direction of gravity. In addition, stems treated with the inhibitor can exhibit gravitropism. These results suggest that F-actin-dependent dynamic movement of amyloplasts is not essential for gravity sensing. The sgr (shoot gravitropism) 9 mutant exhibits greatly reduced shoot gravitropism. In endodermal cells of sgr9, dynamic amyloplast movement was predominantly observed and amyloplasts did not sediment in the direction of gravity. Interestingly, inhibition of actin polymerization restored both gravitropism and amyloplast sedimentation in sgr9. SGR9 encodes a novel RING finger protein, which is localized to amyloplasts in endodermal cells. SGR9 showed ubiquitin E3 ligase activity in vitro. Together with live cell imaging of amyloplasts and F-actin, our data suggest that SGR9 modulates the interaction between amyloplasts and F-actin on amyloplasts. SGR9 acts positively on amyloplast sedimentation, probably by releasing amyloplasts from F-actin. SGR9, which is localized to amyloplasts, possibly degrades unknown substrates through its E3 ligase activity, and this might promote release of amyloplasts from F-actin.
The In Vivo Pericapsular Tissue Response to Modern Polyurethane Breast Implants.
Frame, James; Kamel, Dia; Olivan, Marcelo; Cintra, Henrique
2015-10-01
Polyurethane breast implants were first introduced by Ashley (Plast Reconstr Surg 45:421-424, 1970), with the intention of trying to reduce the high incidence of capsular contracture associated with smooth shelled, high gel bleed, silicone breast implants. The sterilization of the polyurethane foam in the early days was questionable. More recently, ethylene oxide (ETO)-sterilized polyurethane has been used in the manufacturing process and this has been shown to reduce the incidence of biofilm. The improved method of attachment of polyurethane onto the underlying high cohesive gel, barrier shell layered, silicone breast implants also encourages bio-integration. Polyurethane covered, cohesive gel, silicone implants have also been shown to reduce the incidence of other problems commonly associated with smooth or textured silicone implants, especially with reference to displacement, capsular contracture, seroma, reoperation, biofilm and implant rupture. Since the introduction of the conical polyurethane implant (Silimed, Brazil) into the United Kingdom in 2009 (Eurosurgical, UK), we have had the opportunity to review histology taken from the capsules of polyurethane implants in three women ranging from a few months to over 3 years after implantation. All implants had been inserted into virgin subfascial, extra-pectoral planes. The results add to the important previously described histological findings of Bassetto et al. (Aesthet Plast Surg 34:481-485, 2010). Five distinct layers are identified and reasons for the development of each layer are discussed. Breast capsule around polyurethane implants, in situ for fifteen and 20 years, has recently been obtained and analysed in Brazil, and the histology has been incorporated into this study. After 20 years, the polyurethane is almost undetectable and capsular contracture may appear. 
These findings contribute to our understanding of polyurethane implant safety, and give reasoning for a significant reduction in clinical capsular contracture rate, up to 10 years after implantation, compared to contemporary silicone implants. A more permanent matrix equivalent to polyurethane may be the solution for reducing long-term capsular contracture.
Crashworthiness simulations with DYNA3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schauer, D.A.; Hoover, C.G.; Kay, G.J.
1996-04-01
Current progress in parallel algorithm research and applications in vehicle crash simulation is described for the explicit, finite element algorithms in DYNA3D. Problem partitioning methods and parallel algorithms for contact at material interfaces are the two challenging algorithm research problems that are addressed. Two prototype parallel contact algorithms have been developed for treating the cases of local and arbitrary contact. Demonstration problems for local contact are crashworthiness simulations with 222 locally defined contact surfaces and a vehicle/barrier collision modeled with arbitrary contact. A simulation of crash tests conducted for a vehicle impacting a U-channel small sign post embedded in soil has been run on both the serial and parallel versions of DYNA3D. A significant reduction in computational time has been observed when running these problems on the parallel version. However, to achieve maximum efficiency, complex problems must be appropriately partitioned, especially when contact dominates the computation.
A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nomura, K; Seymour, R; Wang, W
2009-02-17
A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based on hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e. petaflops · day of computing) is estimated as NT = 2.14 (e.g. N = 2.14 million atoms for T = 1 microseconds).
Local rollback for fault-tolerance in parallel computing systems
Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY
2012-01-24
A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
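The control flow described above can be sketched in software as follows. This is only an illustrative model, not the patented hardware logic: the function name `run_with_local_rollback`, the two predicate callbacks, and the `max_retries` limit are all hypothetical, and a deep-copied dictionary stands in for whatever state the cache memory would preserve.

```python
import copy
from typing import Callable, List


def run_with_local_rollback(
    intervals: List[Callable[[dict], None]],
    error_detected: Callable[[int, int], bool],
    unrecoverable: Callable[[int, int], bool],
    max_retries: int = 10,  # hypothetical safety limit, not from the source
) -> dict:
    """Run each interval; on a recoverable error, restore state and rerun it."""
    state: dict = {}
    for idx, interval in enumerate(intervals):
        attempt = 0
        while True:
            checkpoint = copy.deepcopy(state)  # snapshot at interval start
            interval(state)                    # run the interval's instructions
            if not error_detected(idx, attempt):
                break                          # interval committed
            if unrecoverable(idx, attempt) or attempt + 1 >= max_retries:
                raise RuntimeError(f"interval {idx}: cannot roll back locally")
            state = checkpoint                 # local rollback, then restart
            attempt += 1
    return state


def add_one(s: dict) -> None:
    s["x"] = s.get("x", 0) + 1


# One transient error in interval 1 is absorbed by a single restart.
result = run_with_local_rollback(
    [add_one] * 3,
    error_detected=lambda i, a: i == 1 and a == 0,
    unrecoverable=lambda i, a: False,
)
```

Because the snapshot is restored before the restart, the repeated interval does not apply its effects twice.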
Local and nonlocal parallel heat transport in general magnetic fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Del-Castillo-Negrete, Diego B; Chacon, Luis
2011-01-01
A novel approach for the study of parallel transport in magnetized plasmas is presented. The method avoids numerical pollution issues of grid-based formulations and applies to integrable and chaotic magnetic fields with local or nonlocal parallel closures. In weakly chaotic fields, the method gives the fractal structure of the devil's staircase radial temperature profile. In fully chaotic fields, the temperature exhibits self-similar spatiotemporal evolution with a stretched-exponential scaling function for local closures and an algebraically decaying one for nonlocal closures. It is shown that, for both closures, the effective radial heat transport is incompatible with the quasilinear diffusion model.
A Green's function method for local and non-local parallel transport in general magnetic fields
NASA Astrophysics Data System (ADS)
Del-Castillo-Negrete, Diego; Chacón, Luis
2009-11-01
The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), χ_∥, and the perpendicular, χ_⊥, conductivities (χ_∥/χ_⊥ may exceed 10^10 in fusion plasmas); (ii) Magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields. The numerical implementation employs a volume-preserving field-line integrator [Finn and Chacón, Phys. Plasmas, 12 (2005)] for an accurate representation of the magnetic field lines regardless of the level of stochasticity. The general formalism and its algorithmic properties are discussed along with illustrative analytical and numerical examples. Problems of particular interest include: the departures from the Rochester-Rosenbluth diffusive scaling in the weak magnetic chaos regime, the interplay between non-locality and chaos, and the robustness of transport barriers in reverse shear configurations.
Ji, Eun-Kyu; Lee, Sang-Heon
2016-11-01
[Purpose] The purpose of this study was to investigate the effects of virtual reality training combined with modified constraint-induced movement therapy on upper extremity motor function recovery in acute stage stroke patients. [Subjects and Methods] Four acute stage stroke patients participated in the study. A multiple baseline single subject experimental design was utilized. Modified constraint-induced movement therapy was used according to the EXplaining PLastICITy after stroke protocol during baseline sessions. Virtual reality training with modified constraint-induced movement therapy was applied during treatment sessions. The Manual Function Test and the Box and Block Test were used to measure upper extremity function before every session. [Results] The subjects' upper extremity function improved during the intervention period. [Conclusion] Virtual reality training combined with modified constraint-induced movement is effective for upper extremity function recovery in acute stroke patients.
Characterizing and Mitigating Work Time Inflation in Task Parallel Programs
Olivier, Stephen L.; de Supinski, Bronis R.; Schulz, Martin; ...
2013-01-01
Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
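The definition of work time inflation given above lends itself to a small accounting sketch. The decomposition used here (total thread time = sequential work + idle time + scheduling overhead + inflation) is an assumption made for illustration, and `efficiency_breakdown` and its parameter names are hypothetical; the abstract does not specify how the authors measured each term.

```python
def efficiency_breakdown(t_seq, threads, t_par, idle, overhead):
    """Split the aggregate thread time of a parallel run into its components.

    t_seq    : time of the sequential run
    threads  : number of threads
    t_par    : wall-clock time of the parallel run
    idle     : idle time summed over all threads
    overhead : scheduling overhead summed over all threads
    """
    total = threads * t_par            # aggregate thread time
    work = total - idle - overhead     # time threads actually spent working
    inflation = work - t_seq           # extra work time versus sequential
    return {
        "work": work,
        "inflation": inflation,
        "efficiency": t_seq / total,
    }


# Made-up numbers: 4 threads, 40 s wall clock, 100 s sequential.
breakdown = efficiency_breakdown(t_seq=100.0, threads=4, t_par=40.0,
                                 idle=20.0, overhead=10.0)
```

With these illustrative inputs, 30 of the 60 seconds of lost thread time are attributed to inflation rather than to idleness or scheduling.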
A Parallel Saturation Algorithm on Shared Memory Architectures
NASA Technical Reports Server (NTRS)
Ezekiel, Jonathan; Siminiceanu
2007-01-01
Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
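As a rough illustration of firing events through a thread pool, the sketch below computes the reachable states of a tiny model. It is deliberately not the Saturation algorithm: states are held in explicit Python sets rather than decision diagrams, and the function `reachable` and the two-place token model are hypothetical examples chosen only to show the dispatch pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def reachable(initial, events, workers=4):
    """Explicit-state fixed point; event firings are dispatched to a pool."""
    seen = {initial}
    frontier = {initial}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Every (event, state) pair is an independent firing task.
            tasks = [(ev, s) for s in frontier for ev in events]
            new_states = set()
            for successors in pool.map(lambda t: t[0](t[1]), tasks):
                new_states |= successors
            frontier = new_states - seen
            seen |= frontier
    return seen


# Hypothetical model: tokens move between two places, state = (a, b).
def move_right(state):
    a, b = state
    return {(a - 1, b + 1)} if a > 0 else set()


def move_left(state):
    a, b = state
    return {(a + 1, b - 1)} if b > 0 else set()


states = reachable((2, 0), [move_right, move_left])
```

Each firing touches only its own input state, so the tasks need no locking; the results are merged on the calling thread.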
Parallel heat transport in integrable and chaotic magnetic fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Del-Castillo-Negrete, Diego B; Chacon, Luis
2012-01-01
The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), χ_∥, and the perpendicular, χ_⊥, conductivities (χ_∥/χ_⊥ may exceed 10^10 in fusion plasmas); (ii) Magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island chain), weakly chaotic (devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local closures, is non-diffusive, thus casting doubts on the appropriateness of the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.
Large-scale trench-normal mantle flow beneath central South America
NASA Astrophysics Data System (ADS)
Reiss, M. C.; Rümpker, G.; Wölbern, I.
2018-01-01
We investigate the anisotropic properties of the fore-arc region of the central Andean margin between 17-25°S by analyzing shear-wave splitting from teleseismic and local earthquakes from the Nazca slab. With recording times that in part exceed ten years, the data set is uniquely suited to address the long-standing debate about the mantle flow field at the South American margin and in particular whether the flow field beneath the slab is parallel or perpendicular to the trench. Our measurements suggest two anisotropic layers located within the crust and mantle beneath the stations, respectively. The teleseismic measurements show a moderate change of fast polarizations from North to South along the trench, ranging from parallel to subparallel to the absolute plate motion, and are oriented mostly perpendicular to the trench. Shear-wave splitting measurements from local earthquakes show fast polarizations roughly aligned trench-parallel but exhibit short-scale variations which are indicative of a relatively shallow origin. Comparisons between fast polarization directions from local earthquakes and the strike of the local fault systems yield a good agreement. To infer the parameters of the lower anisotropic layer we employ an inversion of the teleseismic waveforms based on two-layer models, where the anisotropy of the upper (crustal) layer is constrained by the results from the local splitting. The waveform inversion yields a mantle layer that is best characterized by a fast axis parallel to the absolute plate motion which is more-or-less perpendicular to the trench. This orientation is likely caused by a combination of the fossil crystallographic preferred orientation of olivine within the slab and entrained mantle flow beneath the slab. The anisotropy within the crust of the overriding continental plate is explained by the shape-preferred orientation of micro-cracks in relation to local fault zones which are oriented parallel to the overall strike of the Andean range.
Our results do not provide any evidence for a significant contribution of trench-parallel mantle flow beneath the subducting slab.
Development of parallel algorithms for electrical power management in space applications
NASA Technical Reports Server (NTRS)
Berry, Frederick C.
1989-01-01
The application of parallel techniques for electrical power system analysis is discussed. The Newton-Raphson method of load flow analysis was used along with the decomposition-coordination technique to perform load flow analysis. The decomposition-coordination technique enables tasks to be performed in parallel by partitioning the electrical power system into independent local problems. Each independent local problem represents a portion of the total electrical power system on which a load flow analysis can be performed. The load flow analysis is performed on these partitioned elements by using the Newton-Raphson load flow method. These independent local problems will produce results for voltage and power which can then be passed to the coordinator portion of the solution procedure. The coordinator problem uses the results of the local problems to determine if any correction is needed on the local problems. The coordinator problem is also solved by an iterative method much like the local problem. The iterative method for the coordination problem will also be the Newton-Raphson method. Therefore, each iteration at the coordination level will result in new values for the local problems. The local problems will have to be solved again along with the coordinator problem until some convergence conditions are met.
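The interplay between local Newton-Raphson solves and a Newton-Raphson coordinator can be sketched on a toy problem. Everything below is a stand-in: the scalar equations are not load flow equations, and `newton`, `solve_coupled`, `loads`, and `coupling` are hypothetical names. Each local problem solves x^3 + x = b + z for its own x given the shared variable z, and the coordinator solves z = coupling * (x_1 + x_2), re-solving the local problems at every coordinator iteration.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Scalar Newton-Raphson iteration; stops when the step is below tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


def solve_coupled(loads=(1.0, 2.0), coupling=0.1):
    def local(z, b):
        # Independent local problem: could run in parallel for each b.
        return newton(lambda x: x**3 + x - (b + z),
                      lambda x: 3 * x * x + 1, 0.0)

    def g(z):
        # Coordinator residual, built from the local results.
        return z - coupling * sum(local(z, b) for b in loads)

    def dg(z):
        # dx/dz = 1 / (3 x^2 + 1) by implicit differentiation.
        return 1.0 - coupling * sum(1.0 / (3 * local(z, b) ** 2 + 1)
                                    for b in loads)

    z = newton(g, dg, 0.0)
    return z, [local(z, b) for b in loads]


z, xs = solve_coupled()
```

The local solves are written sequentially for brevity; since they depend only on z and their own data, they are the part that a parallel implementation would distribute.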
Influence of equilibrium shear flow in the parallel magnetic direction on edge localized mode crash
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Y.; Xiong, Y. Y.; Chen, S. Y., E-mail: sychen531@163.com
2016-04-15
The influence of the parallel shear flow on the evolution of peeling-ballooning (P-B) modes is studied with the BOUT++ four-field code in this paper. The parallel shear flow has different effects in linear simulation and nonlinear simulation. In the linear simulations, the growth rate of edge localized mode (ELM) can be increased by Kelvin-Helmholtz term, which can be caused by the parallel shear flow. In the nonlinear simulations, the results accord with the linear simulations in the linear phase. However, the ELM size is reduced by the parallel shear flow in the beginning of the turbulence phase, which is recognized as the P-B filaments' structure. Then during the turbulence phase, the ELM size is decreased by the shear flow.
Improved treatment of exact exchange in Quantum ESPRESSO
Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...
2017-01-18
Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.
A METHOD FOR IN-SITU CHARACTERIZATION OF RF HEATING IN PARALLEL TRANSMIT MRI
Alon, Leeor; Deniz, Cem Murat; Brown, Ryan; Sodickson, Daniel K.; Zhu, Yudong
2012-01-01
In ultra high field magnetic resonance imaging, parallel radio-frequency (RF) transmission presents both opportunities and challenges for specific absorption rate (SAR) management. On one hand, parallel transmission provides flexibility in tailoring electric fields in the body while facilitating magnetization profile control. On the other hand, it increases the complexity of energy deposition as well as possibly exacerbating local SAR by improper design or delivery of RF pulses. This study shows that the information needed to characterize RF heating in parallel transmission is contained within a local power correlation matrix. Building upon a calibration scheme involving a finite number of magnetic resonance thermometry measurements, the present work establishes a way of estimating the local power correlation matrix. Determination of this matrix allows prediction of temperature change for an arbitrary parallel transmit RF pulse. In the case of a three transmit coil MR experiment in a phantom, determination and validation of the power correlation matrix was conducted in less than 200 minutes with induced temperature changes of <4 degrees C. Further optimization and adaptation are possible, and simulations evaluating potential feasibility for in vivo use are presented. The method allows general characteristics indicative of RF coil/pulse safety determined in situ. PMID:22714806
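To make the role of a local power correlation matrix concrete, the sketch below evaluates a quadratic form w^H Λ w for a vector of complex channel weights w. Treating the predicted local heating as proportional to this quadratic form is an assumption made here for illustration (the abstract only states that the matrix allows prediction of temperature change), and the function name `local_power` and the example matrix are hypothetical.

```python
def local_power(weights, matrix):
    """Return w^H * Lambda * w for complex weights and a Hermitian matrix."""
    n = len(weights)
    total = 0j
    for i in range(n):
        for j in range(n):
            total += weights[i].conjugate() * matrix[i][j] * weights[j]
    # A Hermitian matrix yields a real quadratic form; drop rounding residue.
    return total.real


# Hypothetical 2-channel Hermitian matrix (arbitrary units) and weights.
lam = [[2 + 0j, 1j],
       [-1j, 3 + 0j]]
w = [1 + 0j, 1j]
power = local_power(w, lam)
```

The off-diagonal entries capture how the channels' fields interfere at that location, which is why the result depends on the relative phase of the weights and not only on their magnitudes.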
Parallel architectures for iterative methods on adaptive, block structured grids
NASA Technical Reports Server (NTRS)
Gannon, D.; Vanrosendale, J.
1983-01-01
A parallel computer architecture well suited to the solution of partial differential equations in complicated geometries is proposed. Algorithms for partial differential equations contain a great deal of parallelism. But this parallelism can be difficult to exploit, particularly on complex problems. One approach to extraction of this parallelism is the use of special purpose architectures tuned to a given problem class. The architecture proposed here is tuned to boundary value problems on complex domains. An adaptive elliptic algorithm which maps effectively onto the proposed architecture is considered in detail. Two levels of parallelism are exploited by the proposed architecture. First, by making use of the freedom one has in grid generation, one can construct grids which are locally regular, permitting a one to one mapping of grids to systolic style processor arrays, at least over small regions. All local parallelism can be extracted by this approach. Second, though there may be a regular global structure to the grids constructed, there will be parallelism at this level. One approach to finding and exploiting this parallelism is to use an architecture having a number of processor clusters connected by a switching network. The use of such a network creates a highly flexible architecture which automatically configures to the problem being solved.
NASA Astrophysics Data System (ADS)
Baregheh, Mandana; Mezentsev, Vladimir; Schmitz, Holger
2011-06-01
We describe a parallel multi-threaded approach for high performance modelling of a wide class of phenomena in ultrafast nonlinear optics. A specific implementation has been performed using the highly parallel capabilities of a programmable graphics processor.
Idle waves in high-performance computing
NASA Astrophysics Data System (ADS)
Markidis, Stefano; Vencels, Juris; Peng, Ivy Bo; Akhmetova, Dana; Laure, Erwin; Henri, Pierre
2015-01-01
The vast majority of parallel scientific applications distributes computation among processes that are in a busy state when computing and in an idle state when waiting for information from other processes. We identify the propagation of idle waves through processes in scientific applications with a local information exchange between the two processes. Idle waves are nondispersive and have a phase velocity inversely proportional to the average busy time. The physical mechanism enabling the propagation of idle waves is the local synchronization between two processes due to remote data dependency. This study provides a description of the large number of processes in parallel scientific applications as a continuous medium. This work also is a step towards an understanding of how localized idle periods can affect remote processes, leading to the degradation of global performance in parallel scientific applications.
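The propagation of an idle period through locally synchronized processes can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the authors' model: processes sit on a chain, each iteration starts only after the process and its nearest neighbours have finished the previous one, every iteration takes the same busy time, and a single delay is injected at process 0. The function name `idle_wave_front` is hypothetical.

```python
def idle_wave_front(n_procs, n_iters, busy, delay):
    """Return {process index: time at which it first had to wait}."""
    finish = [0.0] * n_procs
    arrival = {0: 0.0}  # process 0 is where the delay originates
    for k in range(n_iters):
        new_finish = []
        for i in range(n_procs):
            neighbours = [finish[j] for j in (i - 1, i, i + 1)
                          if 0 <= j < n_procs]
            start = max(neighbours)  # local synchronization
            if start > finish[i] and i not in arrival:
                arrival[i] = finish[i]  # became idle when its own work ended
            extra = delay if (i == 0 and k == 0) else 0.0
            new_finish.append(start + busy + extra)
        finish = new_finish
    return arrival


fast = idle_wave_front(n_procs=8, n_iters=6, busy=2.0, delay=5.0)
slow = idle_wave_front(n_procs=8, n_iters=6, busy=4.0, delay=5.0)

# Front speed in processes per unit time between process 1 and process 3.
speed_fast = (3 - 1) / (fast[3] - fast[1])
speed_slow = (3 - 1) / (slow[3] - slow[1])
```

In this toy model the front advances one process per iteration, so doubling the busy time halves the front speed, in line with the inverse proportionality stated in the abstract.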
NASA Astrophysics Data System (ADS)
Wang, Xin; Tu, Chuanyi; Marsch, Eckart; He, Jiansen; Wang, Linghua
2016-01-01
Turbulence in the solar wind was recently reported to be anisotropic, with the average power spectral index close to -2 when sampling parallel to the local mean magnetic field B0 and close to -5/3 when sampling perpendicular to the local B0. This result was widely considered to be observational evidence for the critical balance theory (CBT), which is derived by making the assumption that the turbulence strength is close to one. However, this basic assumption has not yet been checked carefully with observational data. Here we present for the first time the scale-dependent magnetic-field fluctuation amplitude, which is normalized by the local B0 and evaluated for both parallel and perpendicular sampling directions, using two 30-day intervals of Ulysses data. From our results, the turbulence strength is evaluated as much less than one at small scales in the parallel direction. An even stricter criterion is imposed when selecting the wavelet coefficients for a given sampling direction, so that the time stationarity of the local B0 is better ensured during the local sampling interval. The spectral index for the parallel direction is then found to be -1.75, whereas the spectral index in the perpendicular direction remains close to -1.65. These two new results, namely that the value of the turbulence strength is much less than one in the parallel direction and that the angle dependence of the spectral index is weak, cannot be explained by existing turbulence theories, like CBT, and thus will require new theoretical considerations and promote further observations of solar-wind turbulence.
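A spectral index such as the values quoted above is commonly estimated as the slope of a straight-line fit in log-log space. The sketch below shows that generic procedure on synthetic data; it does not reproduce the wavelet-based, direction-selective analysis of the study, and the function name `spectral_index` is hypothetical.

```python
import math


def spectral_index(freqs, power):
    """Least-squares slope of log(power) against log(frequency)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic power-law spectra with known indices.
freqs = [0.01 * 2 ** k for k in range(10)]
perp_index = spectral_index(freqs, [f ** (-5.0 / 3.0) for f in freqs])
para_index = spectral_index(freqs, [3.0 * f ** -2.0 for f in freqs])
```

A constant prefactor in the spectrum shifts the fitted line but leaves the slope, and hence the index, unchanged.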
Parallelization of sequential Gaussian, indicator and direct simulation algorithms
NASA Astrophysics Data System (ADS)
Nunes, Ruben; Almeida, José A.
2010-08-01
Improving the performance and robustness of algorithms on new high-performance parallel computing architectures is a key issue in efficiently performing 2D and 3D studies with large amounts of data. In geostatistics, sequential simulation algorithms are good candidates for parallelization. When compared with other computational applications in geosciences (such as fluid flow simulators), sequential simulation software is not extremely computationally intensive, but parallelization can make it more efficient and creates alternatives for its integration in inverse modelling approaches. This paper describes the implementation and benchmarking of a parallel version of the three classic sequential simulation algorithms: direct sequential simulation (DSS), sequential indicator simulation (SIS) and sequential Gaussian simulation (SGS). For this purpose, the source used was GSLIB, but the entire code was extensively modified to take into account the parallelization approach and was also rewritten in the C programming language. The paper also explains in detail the parallelization strategy and the main modifications. Regarding the integration of secondary information, the DSS algorithm is able to perform simple kriging with local means, kriging with an external drift and collocated cokriging with both local and global correlations. SIS includes a local correction of probabilities. Finally, a brief comparison is presented of simulation results using one, two and four processors. All performance tests were carried out on 2D soil data samples. The source code is completely open source and easy to read. It should be noted that the code is only fully compatible with Microsoft Visual C and should be adapted for other systems/compilers.
A novel parallel architecture for local histogram equalization
NASA Astrophysics Data System (ADS)
Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan
2005-07-01
Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
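As a software illustration of the block-wise idea mentioned in this abstract, the following sketch equalizes each non-overlapping image block independently. It is only a minimal serial reference, assuming 8-bit grey levels stored as nested lists; it does not reproduce the paper's HDL architecture or its bit-serial access scheme, and the function names are hypothetical.

```python
def equalize_block(block, levels=256):
    """Histogram-equalize one block (list of rows of ints in [0, levels))."""
    pixels = [p for row in block for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of grey levels within the block.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    denom = len(pixels) - cdf_min
    if denom == 0:  # constant block: nothing to stretch
        return [row[:] for row in block]
    lut = [max(0, round((c - cdf_min) * (levels - 1) / denom)) for c in cdf]
    return [[lut[p] for p in row] for row in block]


def local_histogram_equalization(image, block_size=8):
    """Apply equalize_block to each non-overlapping block of the image."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(0, h, block_size):
        for c in range(0, w, block_size):
            tile = [row[c:c + block_size] for row in image[r:r + block_size]]
            for dr, row in enumerate(equalize_block(tile)):
                out[r + dr][c:c + len(row)] = row
    return out
```

Because each block is processed without reference to any other block, the loop over blocks is the natural unit to hand to separate processing elements.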
Radiative instabilities in sheared magnetic field
NASA Technical Reports Server (NTRS)
Drake, J. F.; Sparks, L.; Van Hoven, G.
1988-01-01
The structure and growth rate of the radiative instability in a sheared magnetic field B have been calculated analytically using the Braginskii fluid equations. In a shear layer, temperature and density perturbations are linked by the propagation of sound waves parallel to the local magnetic field. As a consequence, density clumping or condensation plays an important role in driving the instability. Parallel thermal conduction localizes the mode to a narrow layer where K(parallel) is small and stabilizes short wavelengths k > k(c), where k(c) depends on the local radiation and conduction rates. Thermal coupling to ions also limits the width of the unstable spectrum. It is shown that a broad spectrum of modes is typically unstable in tokamak edge plasmas and it is argued that this instability is sufficiently robust to drive the large-amplitude density fluctuations often measured there.
Versatility of Capsular Flaps in the Salvage of Exposed Breast Implants
Tenna, Stefania; Cagli, Barbara; Pallara, Tiziano; Campa, Stefano; Persichetti, Paolo
2015-01-01
Summary: Breast implant exposure due to poor tissue coverage or previous irradiation represents a surgical challenge in both reconstructive and aesthetic plastic surgery practice. In case of implant extrusion or incipient exposure, the commonly suggested strategies, such as targeted antibiotic therapy, drainage and lavage of the cavity, fistulectomy, and primary closure, may be ineffective, leading the surgeon to an unwanted implant removal or to adopt more invasive flap coverage procedures. Breast implant capsule, in its physiological clinical behavior, can be considered as a new reliable source of tissue, which can be used in a wide range of clinical situations. In our hands, capsular flaps proved to be a versatile solution not only to treat breast contour deformities or inframammary fold malpositions but also to salvage exposed breast implants. In this scenario, the use of more invasive surgical techniques can be avoided or simply saved and delayed for future recurrences. (Plast Reconstr Surg Glob Open 2015;3:e340; doi:10.1097/GOX.0000000000000307; Published online 30 March 2015.) PMID:26034647
Tzou, Chieh-Han John; Pona, Igor; Placheta, Eva; Hold, Alina; Michaelidou, Maria; Artner, Nicole; Kropatsch, Walter; Gerber, Hans; Frey, Manfred
2012-08-01
Since the implementation of the computer-aided system for assessing facial palsy in 1999 by Frey et al (Plast Reconstr Surg. 1999;104:2032-2039), no similar system that can make an objective, three-dimensional, quantitative analysis of facial movements has been marketed. This system has been in routine use since its launch, and it has proven to be reliable, clinically applicable, and therapeutically accurate. With the cooperation of international partners, more than 200 patients were analyzed. Recent developments in computer vision, mostly in the area of generative face models, applying active appearance models (and extensions), optical flow, and video tracking, have been successfully incorporated to automate the prototype system. Further market-ready development and a business partner will be needed to enable the production of this system to enhance clinical methodology in diagnostic and prognostic accuracy as a personalized therapy concept, leading to better results and higher quality of life for patients with impaired facial function.
Relation of Parallel Discrete Event Simulation algorithms with physical models
NASA Astrophysics Data System (ADS)
Shchur, L. N.; Shchur, L. V.
2015-09-01
We extend the concept of local simulation times in parallel discrete event simulation (PDES) in order to take into account the architecture of current hardware and software in high-performance computing. We briefly review previous research on the mapping of PDES onto physical problems, and emphasise how physical results may help to predict the behaviour of parallel algorithms.
Performing a global barrier operation in a parallel computer
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E
2014-12-09
Executing computing tasks on a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
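The two-level scheme in this abstract can be sketched with Python threads standing in for tasks. This is only an illustration under stated assumptions, not the patented implementation: the compute nodes are simulated inside one process, and the semaphore used so that the master learns when the other local tasks have arrived is a hypothetical mechanism chosen for the sketch.

```python
import threading


def run_hierarchical_barrier(n_nodes=3, tasks_per_node=4):
    """Run one two-level barrier and return the ordered arrive/leave log."""
    global_barrier = threading.Barrier(n_nodes)  # joined by master tasks only
    events, events_lock = [], threading.Lock()

    def log(kind):
        with events_lock:
            events.append(kind)

    def task(local_barrier, arrivals, is_master):
        log("arrive")
        if is_master:
            # Wait until every other task on this node has reached the barrier.
            for _ in range(tasks_per_node - 1):
                arrivals.acquire()
            global_barrier.wait()  # synchronize with the other nodes' masters
            local_barrier.wait()   # joining last releases the local tasks
        else:
            arrivals.release()     # tell the master this task has arrived
            local_barrier.wait()
        log("leave")

    threads = []
    for _ in range(n_nodes):
        local_barrier = threading.Barrier(tasks_per_node)
        arrivals = threading.Semaphore(0)
        for rank in range(tasks_per_node):
            threads.append(threading.Thread(
                target=task, args=(local_barrier, arrivals, rank == 0)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Only one task per node takes part in the global barrier, so the cross-node synchronization involves n_nodes participants rather than n_nodes × tasks_per_node.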
Parallel algorithms for boundary value problems
NASA Technical Reports Server (NTRS)
Lin, Avi
1990-01-01
A general approach to solving boundary value problems numerically in a parallel environment is discussed. The basic algorithm consists of two steps: the local step, where all the P available processors work in parallel, and the global step, where one processor solves a tridiagonal linear system of order P. The main advantages of this approach are twofold. First, the suggested approach is very flexible, especially in the local step, and thus the algorithm can be used with any number of processors and with any of the SIMD or MIMD machines. Second, the communication complexity is very small, and thus the approach can be used just as easily with shared memory machines. Several examples of using this strategy are discussed.
A template-based approach for parallel hexahedral two-refinement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Owen, Steven J.; Shih, Ryan M.; Ernst, Corey D.
2016-10-17
Here, we provide a template-based approach for generating locally refined all-hex meshes. We focus specifically on refinement of initially structured grids utilizing a 2-refinement approach where uniformly refined hexes are subdivided into eight child elements. The refinement algorithm consists of identifying marked nodes that are used as the basis for a set of four simple refinement templates. The target application for 2-refinement is a parallel grid-based all-hex meshing tool for high performance computing in a distributed environment. The result is a parallel consistent locally refined mesh requiring minimal communication and where minimum mesh quality is greater than scaled Jacobian 0.3 prior to smoothing.
Parallel volume ray-casting for unstructured-grid data on distributed-memory architectures
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu
1995-01-01
As computing technology continues to advance, computational modeling of scientific and engineering problems produces data of increasing complexity: large in size and unstructured in shape. Volume visualization of such data is a challenging problem. This paper proposes a distributed parallel solution that makes ray-casting volume rendering of unstructured-grid data practical. Both the data and the rendering process are distributed among processors. At each processor, ray-casting of local data is performed independent of the other processors. The global image composing processes, which require inter-processor communication, are overlapped with the local ray-casting processes to achieve maximum parallel efficiency. This algorithm differs from previous ones in four ways: it is completely distributed, less view-dependent, reasonably scalable, and flexible. Without using dynamic load balancing, test results on the Intel Paragon using from two to 128 processors show, on average, about 60% parallel efficiency.
Sublattice parallel replica dynamics.
Martínez, Enrique; Uberuaga, Blas P; Voter, Arthur F
2014-06-01
Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998)] by combining it with the synchronous sublattice approach of Shim and Amar [Phys. Rev. B 71, 125432 (2005)], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers.
Costa-Font, Joan; Kanavos, Panos
2007-01-01
To examine the effects of parallel simvastatin importation on drug price in three of the main parallel importing countries in the European Union, namely the United Kingdom, Germany, and the Netherlands. To estimate the market share of parallel imported simvastatin and the unit price (both locally produced and parallel imported) adjusted by defined daily dose in the importing country and in the exporting country (Spain). Ordinary least squares regression was used to examine the potential price competition resulting from parallel drug trade between 1997 and 2002. The market share of parallel imported simvastatin progressively expanded (especially in the United Kingdom and Germany) in the period examined, although the price difference between parallel imported and locally sourced simvastatin was not significant. Prices tended to rise in the United Kingdom and Germany and declined in the Netherlands. We found no evidence of pro-competitive effects resulting from the expansion of parallel trade. The development of parallel drug importation in the European Union produced unexpected effects (limited competition) on prices that differ from those expected from the introduction of a new competitor. This is partially the result of the scant incentives to competition in drug price regulation and of the lack of transparency in the drug reimbursement system, especially due to the effect of informal discounts (not observable to researchers). The case of simvastatin reveals that savings to the health system from parallel trade are trivial. Finally, of the three countries examined, the only country that shows a moderate downward pattern in simvastatin prices is the Netherlands. This effect can be attributed to the existence of a system that claws back informal discounts.
[CMACPAR a modified parallel neuro-controller for control processes].
Ramos, E; Surós, R
1999-01-01
CMACPAR is a parallel neurocontroller oriented to real-time systems such as control processes. Its main characteristics are a fast learning algorithm, a reduced number of calculations, great generalization capacity, local learning and intrinsic parallelism. This type of neurocontroller is used in real-time applications required by refineries, hydroelectric centers, factories, etc. In this work we present the analysis and the parallel implementation of a modified scheme of the Cerebellar Model CMAC for the n-dimensional space projection using a medium-granularity parallel neurocontroller. The proposed memory management allows for a significant reduction in training time and required memory size.
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm
NASA Technical Reports Server (NTRS)
Povitsky, A.
1998-01-01
In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
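For readers unfamiliar with the forward and backward steps referred to in this abstract, the following is a minimal sketch of the standard serial Thomas algorithm for a single tridiagonal system. It is not the pipelined or reformulated variant developed in the paper; the function name and the storage convention (lower[0] and upper[-1] are unused placeholders) are assumptions made for the example, and no pivoting is performed.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the serial Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    All four lists have length n; lower[0] and upper[n-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    # Forward step: eliminate the sub-diagonal.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Backward step: back-substitute from the last unknown.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The backward step for a line cannot begin until the forward step for that line has reached its last unknown, which is the data dependency that leaves processors idle in a straightforward pipelined parallelization.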
A Robust and Scalable Software Library for Parallel Adaptive Refinement on Unstructured Meshes
NASA Technical Reports Server (NTRS)
Lou, John Z.; Norton, Charles D.; Cwik, Thomas A.
1999-01-01
The design and implementation of Pyramid, a software library for performing parallel adaptive mesh refinement (PAMR) on unstructured meshes, is described. This software library can be easily used in a variety of unstructured parallel computational applications, including parallel finite element, parallel finite volume, and parallel visualization applications using triangular or tetrahedral meshes. The library contains a suite of well-designed and efficiently implemented modules that perform operations in a typical PAMR process. Among these are mesh quality control during successive parallel adaptive refinement (typically guided by a local-error estimator), parallel load-balancing, and parallel mesh partitioning using the ParMeTiS partitioner. The Pyramid library is implemented in Fortran 90 with an interface to the Message-Passing Interface (MPI) library, supporting code efficiency, modularity, and portability. An EM waveguide filter application, adaptively refined using the Pyramid library, is illustrated.
Qiu, Mingfeng; Bailey, Brian N.; Stoll, Rob
2014-01-01
The validity of the compressible Reynolds equation to predict the local pressure in a gas-lubricated, textured parallel slider bearing is investigated. The local bearing pressure is numerically simulated using the Reynolds equation and the Navier-Stokes equations for different texture geometries and operating conditions. The respective results are compared and the simplifying assumptions inherent in the application of the Reynolds equation are quantitatively evaluated. The deviation between the local bearing pressure obtained with the Reynolds equation and the Navier-Stokes equations increases with increasing texture aspect ratio, because a significant cross-film pressure gradient and a large velocity gradient in the sliding direction develop in the lubricant film. Inertia is found to be negligible throughout this study. PMID:25049440
Optimisation of a parallel ocean general circulation model
NASA Astrophysics Data System (ADS)
Beare, M. I.; Stevens, D. P.
1997-10-01
This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hau, L.-N.; Department of Physics, National Central University, Jhongli, Taiwan; Lai, Y.-T.
Harris-type current sheets with the magnetic field model of $\mathbf{B} = B_x(z)\hat{x} + B_y(z)\hat{y}$ have many important applications to space, astrophysical, and laboratory plasmas for which the temperature or pressure usually exhibits the gyrotropic form of $\overleftrightarrow{p} = p_\parallel \hat{b}\hat{b} + p_\perp(\overleftrightarrow{I} - \hat{b}\hat{b})$. Here, $p_\parallel$ and $p_\perp$ are, respectively, the pressure components along and perpendicular to the local magnetic field, $\hat{b} = \mathbf{B}/B$. This study presents the general formulation for magnetohydrodynamic (MHD) wave propagation, fire-hose, and mirror instabilities in general Harris-type current sheets. The wave equations are expressed in terms of the four MHD characteristic speeds of fast, intermediate, slow, and cusp waves, and in the local $(k_\parallel, k_\perp, z)$ coordinates. Here, $k_\parallel$ and $k_\perp$ are, respectively, the wave vector components along and perpendicular to the local magnetic field. The parameter regimes for the existence of discrete and resonant modes are identified, which may become unstable at the local fire-hose and mirror instability thresholds. Numerical solutions for discrete eigenmodes are shown for stable and unstable cases. The results have important implications for the anomalous heating and stability of thin current sheets.
Metascalable molecular dynamics simulation of nano-mechano-chemistry
NASA Astrophysics Data System (ADS)
Shimojo, F.; Kalia, R. K.; Nakano, A.; Nomura, K.; Vashishta, P.
2008-07-01
We have developed a metascalable (or 'design once, scale on new architectures') parallel application-development framework for first-principles based simulations of nano-mechano-chemical processes on emerging petaflops architectures based on spatiotemporal data locality principles. The framework consists of (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms, (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these scalable algorithms onto hardware. The EDC-STEP-HCD framework exposes and expresses maximal concurrency and data locality, thereby achieving parallel efficiency as high as 0.99 for 1.59-billion-atom reactive force field molecular dynamics (MD) and 17.7-million-atom (1.56 trillion electronic degrees of freedom) quantum mechanical (QM) MD in the framework of the density functional theory (DFT) on adaptive multigrids, in addition to 201-billion-atom nonreactive MD, on 196 608 IBM BlueGene/L processors. We have also used the framework for automated execution of adaptive hybrid DFT/MD simulation on a grid of six supercomputers in the US and Japan, in which the number of processors changed dynamically on demand and tasks were migrated according to unexpected faults. The paper presents the application of the framework to the study of nanoenergetic materials: (1) combustion of an Al/Fe2O3 thermite and (2) shock initiation and reactive nanojets at a void in an energetic crystal.
pcircle - A Suite of Scalable Parallel File System Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
WANG, FEIYI
2015-10-01
Most file system software is written for conventional local file systems; it is serial and cannot take advantage of a large-scale parallel file system. The "pcircle" software builds on the ubiquitous MPI in cluster computing environments and on the "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as asynchronous progress reporting, checkpoint and restart, and integrity checking.
Han, S; Humphreys, G W; Chen, L
1999-10-01
The role of perceptual grouping and the encoding of closure of local elements in the processing of hierarchical patterns was studied. Experiments 1 and 2 showed a global advantage over the local level for 2 tasks involving the discrimination of orientation and closure, but there was a local advantage for the closure discrimination task relative to the orientation discrimination task. Experiment 3 showed a local precedence effect for the closure discrimination task when local element grouping was weakened by embedding the stimuli from Experiment 1 in a background made up of cross patterns. Experiments 4A and 4B found that dissimilarity of closure between the local elements of hierarchical stimuli and the background figures could facilitate the grouping of closed local elements and enhanced the perception of global structure. Experiment 5 showed that the advantage for detecting the closure of local elements in hierarchical analysis also held under divided- and selective-attention conditions. Results are consistent with the idea that grouping between local elements takes place in parallel and competes with the computation of closure of local elements in determining the selection between global and local levels of hierarchical patterns for response.
Device-independent parallel self-testing of two singlets
NASA Astrophysics Data System (ADS)
Wu, Xingyao; Bancal, Jean-Daniel; McKague, Matthew; Scarani, Valerio
2016-06-01
Device-independent self-testing offers the possibility of certifying the quantum state and measurements, up to local isometries, using only the statistics observed by querying uncharacterized local devices. In this paper we study parallel self-testing of two maximally entangled pairs of qubits; in particular, the local tensor product structure is not assumed but derived. We prove two criteria that achieve the desired result: a double use of the Clauser-Horne-Shimony-Holt inequality and the 3×3 magic square game. This demonstrates that the magic square game can only be perfectly won by measuring a two-singlet state. The tolerance to noise is well within reach of state-of-the-art experiments.
Accessing and Visualizing scientific spatiotemporal data
NASA Technical Reports Server (NTRS)
Katz, Daniel S.; Bergou, Attila; Berriman, Bruce G.; Block, Gary L.; Collier, Jim; Curkendall, David W.; Good, John; Husman, Laura; Jacob, Joseph C.; Laity, Anastasia;
2004-01-01
This paper discusses work done by JPL's Parallel Applications Technologies Group in helping scientists access and visualize very large data sets through the use of multiple computing resources, such as parallel supercomputers, clusters, and grids. These tools do one or more of the following tasks: visualize local data sets for local users, visualize local data sets for remote users, and access and visualize remote data sets. The tools are used for various types of data, including remotely sensed image data, digital elevation models, astronomical surveys, etc. The paper attempts to pull some common elements out of these tools that may be useful for others who have to work with similarly large data sets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsugane, Keisuke; Boku, Taisuke; Murai, Hitoshi; ...
2016-06-01
Recently, the Partitioned Global Address Space (PGAS) parallel programming model has emerged as a usable distributed memory programming model. XcalableMP (XMP) is a PGAS parallel programming language that extends base languages such as C and Fortran with directives in OpenMP-like style. XMP supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the concept of a coarray is also employed for local-view programming. In this study, we port Gyrokinetic Toroidal Code - Princeton (GTC-P), which is a three-dimensional gyrokinetic PIC code developed at Princeton University to study the microturbulence phenomenon in magnetically confined fusion plasmas, to XMP as an example of hybrid memory model coding with the global-view and local-view programming models. In local-view programming, the coarray notation is simple and intuitive compared with Message Passing Interface (MPI) programming while the performance is comparable to that of the MPI version. Thus, because the global-view programming model is suitable for expressing the data parallelism for a field of grid space data, we implement a hybrid-view version using a global-view programming model to compute the field and a local-view programming model to compute the movement of particles. Finally, the performance is degraded by 20% compared with the original MPI version, but the hybrid-view version facilitates more natural data expression for static grid space data (in the global-view model) and dynamic particle data (in the local-view model), and it also increases the readability of the code for higher productivity.
Waterfalls drive parallel evolution in a freshwater goby
Kano, Yuichi; Nishida, Shin; Nakajima, Jun
2012-01-01
Waterfalls may affect fish distribution and genetic structure within drainage networks even to the extent of leading evolutionary events. Here, parallel evolution was studied by focusing on waterfalls and the landlocked freshwater goby Rhinogobius sp. YB (YB), which evolved from amphidromous R. brunneus (BR). The fish fauna was surveyed at 30 sites in 11 rivers on Iriomote Island, Japan, the geography of which was characterized by terraces/tablelands with many waterfalls. We found that all YB individuals were distributed only above waterfalls (height 6.8–58.7 m), whereas BR, and other fishes, were mostly distributed below waterfalls. Mitochondrial DNA analysis showed that every YB local population above the waterfall was independently evolved from BR. In contrast, cluster analysis of nine morphological characters, such as fin color and body pattern, showed that the morphology of YB individuals held a similarity beyond the genetic divergence, suggesting parallel evolution has occurred relating to their morphology. Genetic distance between each YB local population and BR was significantly correlated with waterfall height (r² = 0.94), suggesting that the waterfalls have been heightened due to the constant geological erosion and that their height represents the isolation period of YB local populations from BR (ca. 11,000–88,000 years). Each local population of BR was once landlocked upstream by waterfall formation, consequently evolving to YB in each site. Although the morphology of YB had a high degree of similarity among local populations, finer scale analysis showed that the morphology of YB was significantly correlated with the genetic distance from BR. Consequently, there could be simultaneous multiple phases of allopatric/parallel evolution of the goby due to variations in waterfall height on this small island. PMID:22957183
ERIC Educational Resources Information Center
Wise, Dena; Sneed, Christopher; Velandia, Margarita; Berry, Ann; Rhea, Alice; Fairhurst, Ann
2013-01-01
The Local Table project compared results from parallel surveys of consumers and restaurateurs regarding local food purchasing and use. Results were also compared with producers' perception of, capacity for and participation in direct marketing through local venues, on-farm outlets, and restaurants. The surveys found consumers' and restaurateurs'…
NASA Astrophysics Data System (ADS)
Ying, Jia-ju; Chen, Yu-dan; Liu, Jie; Wu, Dong-sheng; Lu, Jun
2016-10-01
Misalignment of the binocular optical axes of a photoelectric instrument directly degrades the observation quality. A digital calibration system for binocular optical axis parallelism is designed. Based on the principle of optical axis calibration for binocular photoelectric instruments, the system scheme is designed and the calibration system is realized with four modules: a multiband parallel light tube (collimator), optical axis translation, an image acquisition system, and a software system. According to the different characteristics of thermal infrared imagers and low-light-level night viewers, different algorithms are used to localize the center of the cross reticle. Binocular optical axis parallelism calibration is thereby realized for both low-light-level night viewers and thermal infrared imagers.
Parallel and serial grouping of image elements in visual perception.
Houtkamp, Roos; Roelfsema, Pieter R
2010-12-01
The visual system groups image elements that belong to an object and segregates them from other objects and the background. Important cues for this grouping process are the Gestalt criteria, and most theories propose that these are applied in parallel across the visual scene. Here, we find that Gestalt grouping can indeed occur in parallel in some situations, but we demonstrate that there are also situations where Gestalt grouping becomes serial. We observe substantial time delays when image elements have to be grouped indirectly through a chain of local groupings. We call this chaining process incremental grouping and demonstrate that it can occur for only a single object at a time. We suggest that incremental grouping requires the gradual spread of object-based attention so that eventually all the object's parts become grouped explicitly by an attentional labeling process. Our findings inspire a new incremental grouping theory that relates the parallel, local grouping process to feedforward processing and the serial, incremental grouping process to recurrent processing in the visual cortex.
Developing Local Lifelong Guidance Strategies.
ERIC Educational Resources Information Center
Watts, A. G.; Hawthorn, Ruth; Hoffbrand, Jill; Jackson, Heather; Spurling, Andrea
1997-01-01
Outlines the background, rationale, methodology, and outcomes of developing local lifelong guidance strategies in four geographic areas. Analyzes the main components of the strategies developed and addresses a number of issues relating to the process of strategy development. Explores implications for parallel work in other localities. (RJM)
NASA Astrophysics Data System (ADS)
Song, Y.; Lysak, R. L.
2015-12-01
Parallel E-fields play a crucial role in the acceleration of charged particles, creating discrete aurorae. However, once parallel electric fields are produced, they will disappear right away unless they can be continuously generated and sustained for a fairly long time. Thus, the crucial question in auroral physics is how to generate such powerful and self-sustained parallel electric fields, which can effectively accelerate charged particles to high energy over a fairly long time. We propose that nonlinear interaction of incident and reflected Alfven wave packets in the inhomogeneous auroral acceleration region can produce quasi-stationary non-propagating electromagnetic plasma structures, such as Alfvenic double layers (DLs) and Charge Holes. Such Alfvenic quasi-static structures often constitute powerful high energy particle accelerators. The Alfvenic DL consists of localized self-sustained powerful electrostatic electric fields nested in a low density cavity and surrounded by enhanced magnetic and mechanical stresses. The enhanced magnetic and velocity fields carrying the free energy serve as a local dynamo, which continuously creates the electrostatic parallel electric field for a fairly long time. The generated parallel electric fields will deepen the seed low density cavity, which then further quickly boosts the stronger parallel electric fields creating both Alfvenic and quasi-static discrete aurorae. The parallel electrostatic electric field can also cause ion outflow, perpendicular ion acceleration and heating, and may excite Auroral Kilometric Radiation.
Large-scale trench-perpendicular mantle flow beneath northern Chile
NASA Astrophysics Data System (ADS)
Reiss, M. C.; Rumpker, G.; Woelbern, I.
2017-12-01
We investigate the anisotropic properties of the forearc region of the central Andean margin by analyzing shear-wave splitting from teleseismic and local earthquakes from the Nazca slab. The data stem from the Integrated Plate boundary Observatory Chile (IPOC) located in northern Chile, covering an approximately 120 km wide coastal strip between 17° and 25° S with an average station spacing of 60 km. With more than ten years of data at some stations, this data set is uniquely suited to address the long-standing debate about the mantle flow field at the South American margin and in particular whether the flow field beneath the slab is parallel or perpendicular to the trench. Our measurements yield two distinct anisotropic layers. The teleseismic measurements show a change of fast polarization directions from North to South along the trench, ranging from parallel to subparallel to the absolute plate motion and, given the geometry of absolute plate motion and strike of the trench, mostly perpendicular to the trench. Shear-wave splitting from local earthquakes shows fast polarizations roughly aligned trench-parallel but exhibits short-scale variations which are indicative of a relatively shallow source. Comparisons between fast polarization directions and the strike of the local fault systems yield a good agreement. We use forward modelling to test the influence of the upper layer on the teleseismic measurements. We show that the observed variations of teleseismic measurements along the trench are caused by the anisotropy in the upper layer. Accordingly, the mantle layer is best characterized by an anisotropic fast axis parallel to the absolute plate motion, which is roughly trench-perpendicular. This anisotropy is likely caused by a combination of crystallographic preferred orientation of the mantle mineral olivine as fossilized anisotropy in the slab and entrained flow beneath the slab. 
We interpret the upper anisotropic layer to be confined to the crust of the overriding continental plate. This is explained by the shape-preferred orientation of micro-cracks in relation to local fault zones which are oriented parallel to the overall strike of the Andean range. Our results do not provide any evidence for a significant contribution of trench-parallel mantle flow beneath the subducting slab to the measurements.
Fully Parallel MHD Stability Analysis Tool
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2014-10-01
Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
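The inverse iteration step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the MARS implementation: it assumes a small real symmetric matrix, a standard (not generalized) eigenvalue problem, and a dense serial linear solve, and the names `solve` and `inverse_iteration` are hypothetical. In a parallel code, the linear solve inside the loop is the part that would be handed to a parallel library.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def inverse_iteration(a, shift, iterations=50):
    """Return (eigenvalue, eigenvector) of symmetric `a` closest to `shift`."""
    n = len(a)
    shifted = [[a[i][j] - (shift if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    v = [1.0] * n  # assumed start vector; must not be orthogonal to the target
    for _ in range(iterations):
        w = solve(shifted, v)  # one linear solve per iteration
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v
```

The shift must not coincide exactly with an eigenvalue, since the shifted matrix would then be singular; the closer the shift is to the target eigenvalue, the faster the iteration converges.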
Measures of three-dimensional anisotropy and intermittency in strong Alfvénic turbulence
NASA Astrophysics Data System (ADS)
Mallet, A.; Schekochihin, A. A.; Chandran, B. D. G.; Chen, C. H. K.; Horbury, T. S.; Wicks, R. T.; Greenan, C. C.
2016-06-01
We measure the local anisotropy of numerically simulated strong Alfvénic turbulence with respect to two local, physically relevant directions: along the local mean magnetic field and along the local direction of one of the fluctuating Elsasser fields. We find significant scaling anisotropy with respect to both these directions: the fluctuations are 'ribbon-like' - statistically, they are elongated along both the mean magnetic field and the fluctuating field. The latter form of anisotropy is due to scale-dependent alignment of the fluctuating fields. The intermittent scalings of the nth-order conditional structure functions in the direction perpendicular to both the local mean field and the fluctuations agree well with the theory of Chandran, Schekochihin & Mallet, while the parallel scalings are consistent with those implied by the critical-balance conjecture. We quantify the relationship between the perpendicular scalings and those in the fluctuation and parallel directions, and find that the scaling exponent of the perpendicular anisotropy (i.e., of the aspect ratio of the Alfvénic structures in the plane perpendicular to the mean magnetic field) depends on the amplitude of the fluctuations. This is shown to be equivalent to the anticorrelation of fluctuation amplitude and alignment at each scale. The dependence of the anisotropy on amplitude is shown to be more significant for the anisotropy between the perpendicular and fluctuation-direction scales than it is between the perpendicular and parallel scales.
Massively parallel processor networks with optical express channels
Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.
1999-08-24
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.
Massively parallel processor networks with optical express channels
Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.
1999-01-01
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kargupta, H.; Stafford, B.; Hamzaoglu, I.
This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.
Quantum communication beyond the localization length in disordered spin chains.
Allcock, Jonathan; Linden, Noah
2009-03-20
We study the effects of localization on quantum state transfer in spin chains. We show how to use quantum error correction and multiple parallel spin chains to send a qubit with high fidelity over arbitrary distances, in particular, distances much greater than the localization length of the chain.
Zhu, Xiang; Zhang, Dianwen
2013-01-01
We present a fast, accurate and robust parallel Levenberg-Marquardt minimization optimizer, GPU-LMFit, which is implemented on graphics processing unit for high performance scalable parallel model fitting processing. GPU-LMFit can provide a dramatic speed-up in massive model fitting analyses to enable real-time automated pixel-wise parametric imaging microscopy. We demonstrate the performance of GPU-LMFit for the applications in superresolution localization microscopy and fluorescence lifetime imaging microscopy. PMID:24130785
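As a rough illustration of the Levenberg-Marquardt iteration that the abstract above refers to, the following sketch fits a single-exponential decay for one pixel in serial. It is not GPU-LMFit and does not use its API: the model y = amplitude * exp(-t / tau), the function name `fit_decay`, the starting values, and the damping schedule are all assumptions chosen for illustration. Each pixel's fit is independent of the others, which is what makes the problem suitable for running many fits in parallel.

```python
import math


def fit_decay(ts, ys, amplitude=1.0, tau=1.0, iterations=100):
    """Fit y = amplitude * exp(-t / tau) by a basic Levenberg-Marquardt loop."""
    damping = 1e-3  # assumed initial damping factor

    def sum_sq(a, k):
        return sum((y - a * math.exp(-t / k)) ** 2 for t, y in zip(ts, ys))

    cost = sum_sq(amplitude, tau)
    for _ in range(iterations):
        # Accumulate J^T J (2x2) and J^T r for the two parameters.
        jaa = jat = jtt = ga = gt = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-t / tau)
            r = y - amplitude * e
            da = e                             # d(model)/d(amplitude)
            dt = amplitude * e * t / tau ** 2  # d(model)/d(tau)
            jaa += da * da
            jat += da * dt
            jtt += dt * dt
            ga += da * r
            gt += dt * r
        # Damped normal equations: (J^T J + damping * diag(J^T J)) step = J^T r
        m00 = jaa * (1 + damping)
        m11 = jtt * (1 + damping)
        det = m00 * m11 - jat * jat
        if det == 0.0:
            break
        step_a = (ga * m11 - gt * jat) / det
        step_t = (gt * m00 - ga * jat) / det
        new_a, new_tau = amplitude + step_a, tau + step_t
        new_cost = sum_sq(new_a, new_tau) if new_tau > 0 else float("inf")
        if new_cost < cost:   # accept the step and trust the model more
            amplitude, tau, cost = new_a, new_tau, new_cost
            damping /= 10
        else:                 # reject the step and damp more strongly
            damping *= 10
    return amplitude, tau
```

The accept/reject rule is what distinguishes Levenberg-Marquardt from plain Gauss-Newton: small damping gives near Gauss-Newton steps, while large damping gives short, gradient-like steps.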
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chacon, Luis; del-Castillo-Negrete, Diego; Hauck, Cory D.
2014-09-01
We propose a Lagrangian numerical algorithm for a time-dependent, anisotropic temperature transport equation in magnetized plasmas in the large guide field regime. The approach is based on an analytical integral formal solution of the parallel (i.e., along the magnetic field) transport equation with sources, and it is able to accommodate both local and non-local parallel heat flux closures. The numerical implementation is based on an operator-split formulation, with two straightforward steps: a perpendicular transport step (including sources), and a Lagrangian (field-line integral) parallel transport step. Algorithmically, the first step is amenable to the use of modern iterative methods, while the second step has a fixed cost per degree of freedom (and is therefore scalable). Accuracy-wise, the approach is free from the numerical pollution introduced by the discrete parallel transport term when the perpendicular to parallel transport coefficient ratio $\chi_\perp/\chi_\parallel$ becomes arbitrarily small, and is shown to capture the correct limiting solution when $\epsilon = \chi_\perp L_\parallel^2 / (\chi_\parallel L_\perp^2) \to 0$ (with $L_\parallel$ and $L_\perp$ the parallel and perpendicular diffusion length scales, respectively). Therefore, the approach is asymptotic-preserving. We demonstrate the capabilities of the scheme with several numerical experiments with varying magnetic field complexity in two dimensions, including the case of transport across a magnetic island.
Crustal origin of trench-parallel shear-wave fast polarizations in the Central Andes
NASA Astrophysics Data System (ADS)
Wölbern, I.; Löbl, U.; Rümpker, G.
2014-04-01
In this study, SKS and local S phases are analyzed to investigate variations of shear-wave splitting parameters along two dense seismic profiles across the central Andean Altiplano and Puna plateaus. In contrast to previous observations, the vast majority of the measurements reveal fast polarizations sub-parallel to the subduction direction of the Nazca plate with delay times between 0.3 and 1.2 s. Local phases show larger variations of fast polarizations and exhibit delay times ranging between 0.1 and 1.1 s. Two 70 km and 100 km wide sections along the Altiplano profile exhibit larger delay times and are characterized by fast polarizations oriented sub-parallel to major fault zones. Based on finite-difference wavefield calculations for anisotropic subduction zone models we demonstrate that the observations are best explained by fossil slab anisotropy with fast symmetry axes oriented sub-parallel to the slab movement in combination with a significant component of crustal anisotropy of nearly trench-parallel fast-axis orientation. From the modeling we exclude a sub-lithospheric origin of the observed strong anomalies due to the short-scale variations of the fast polarizations. Instead, our results indicate that anisotropy in the Central Andes generally reflects the direction of plate motion while the observed trench-parallel fast polarizations likely originate in the continental crust above the subducting slab.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer
Faraj, Ahmad
2013-02-12
Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
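The three phases described in the abstract above (local reduction, ring allreduce among representatives, local broadcast) can be sketched as a serial simulation on in-memory lists. This is an illustrative simplification, not the patented method: it assumes one representative core per node, a single logical ring, and a commutative and associative operator, and the function name `hierarchical_allreduce` is hypothetical.

```python
from functools import reduce
from operator import add


def hierarchical_allreduce(contributions, op=add):
    """Simulate a node-local reduce, a ring allreduce, and a node-local broadcast.

    contributions[n][c] is the value contributed by core c of node n.
    Returns a structure of the same shape in which every core holds the
    global result. `op` is assumed commutative and associative.
    """
    # Phase 1: local reduction on each node, held by one representative core.
    local_results = [reduce(op, cores) for cores in contributions]

    # Phase 2: allreduce among representatives on one logical ring. At each
    # step every node passes on the value it last received to its right-hand
    # neighbour and folds the incoming value into its own accumulator.
    n = len(local_results)
    accumulated = list(local_results)
    in_flight = list(local_results)
    for _ in range(n - 1):
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        accumulated = [op(accumulated[i], in_flight[i]) for i in range(n)]

    # Phase 3: local broadcast of the global result to every core of each node.
    return [[accumulated[i]] * len(contributions[i]) for i in range(n)]
```

For example, with three nodes of two cores each contributing `[[1, 2], [3, 4], [5, 6]]`, every core ends up holding the global sum. Only one value per node travels on the ring, which is the point of reducing locally first.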
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brächer, T.; Graduate School Materials Science in Mainz, Gottlieb-Daimler-Strasse 47, D-67663 Kaiserslautern; Pirro, P.
2014-03-03
We present the experimental observation of localized parallel parametric generation of spin waves in a transversally in-plane magnetized Ni81Fe19 magnonic waveguide. The localization is realized by combining the threshold character of parametric generation with a spatially confined enhancement of the amplifying microwave field. The latter is achieved by modulating the width of the microstrip transmission line which is used to provide the pumping field. By employing microfocussed Brillouin light scattering spectroscopy, we analyze the spatial distribution of the generated spin waves and compare it with numerical calculations of the field distribution along the Ni81Fe19 waveguide. This provides a local spin-wave excitation in transversally in-plane magnetized waveguides for a wide wave-vector range which is not restricted by the size of the generation area.
Ongoing data reduction, theoretical studies and supporting research in magnetospheric physics
NASA Technical Reports Server (NTRS)
Scarf, F. L.; Greenstadt, E. W.
1984-01-01
Data from ISEE-3, Pioneer Venus Orbiter, and Voyager 1 and 2 were analyzed. The predictability of local shock macrostructure at ISEE-1, at the Earth's bow shock, from solar wind measurements made upstream by ISEE-3 was examined using a computer graphic format. The morphology of the quasi-parallel shock was reviewed. The review attempted to interrelate various measurements and computations involving the q-parallel structure and foreshock elements connected to it. A new classification for q-parallel morphology was suggested.
Fast, Massively Parallel Data Processors
NASA Technical Reports Server (NTRS)
Heaton, Robert A.; Blevins, Donald W.; Davis, ED
1994-01-01
Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on "X" interconnection grid with external memory via high-capacity input/output bus. This approach to conditional operation nearly doubles speed of various arithmetic operations.
Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P
2014-10-30
Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine. Copyright © 2014 Wiley Periodicals, Inc.
A New Parallel Boundary Condition for Turbulence Simulations in Stellarators
NASA Astrophysics Data System (ADS)
Martin, Mike F.; Landreman, Matt; Dorland, William; Xanthopoulos, Pavlos
2017-10-01
For gyrokinetic simulations of core turbulence, the "twist-and-shift" parallel boundary condition (Beer et al., PoP, 1995), which involves a shift in radial wavenumber proportional to the global shear and a quantization of the simulation domain's aspect ratio, is the standard choice. But as this condition was derived under the assumption of axisymmetry, "twist-and-shift" as it stands is formally incorrect for turbulence simulations in stellarators. Moreover, for low-shear stellarators like W7X and HSX, the use of a global shear in the traditional boundary condition places an inflexible constraint on the aspect ratio of the domain, requiring more grid points to fully resolve its extent. Here, we present a parallel boundary condition for "stellarator-symmetric" simulations that relies on the local shear along a field line. This boundary condition is similar to "twist-and-shift", but has an added flexibility in choosing the parallel length of the domain based on local shear consideration in order to optimize certain parameters such as the aspect ratio of the simulation domain.
Parallel heat transport in reversed shear magnetic field configurations
NASA Astrophysics Data System (ADS)
Blazevski, D.; Del-Castillo-Negrete, D.
2012-03-01
Transport in magnetized plasmas is a key problem in controlled fusion, space plasmas, and astrophysics. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), $\chi_\parallel$, and the perpendicular, $\chi_\perp$, conductivities ($\chi_\parallel/\chi_\perp$ may exceed $10^{10}$ in fusion plasmas); (ii) Magnetic field line chaos; and (iii) Nonlocal parallel transport. We have recently developed a Lagrangian Green's function (LG) method to solve the local and non-local parallel ($\chi_\parallel/\chi_\perp \to \infty$) transport equation applicable to integrable and chaotic magnetic fields [D. del-Castillo-Negrete, L. Chacón, PRL 106, 195004 (2011); D. del-Castillo-Negrete, L. Chacón, Phys. Plasmas, APS Invited paper, submitted (2011)]. The proposed method overcomes many of the difficulties faced by standard finite difference methods related to the three issues mentioned above. Here we apply the LG method to study transport in reversed shear configurations. We focus on the following problems: (i) separatrix reconnection of magnetic islands and transport; (ii) robustness of shearless, q'=0, transport barriers; (iii) leaky barriers and shearless Cantori.
Magnetic spectral signatures in the Earth's magnetosheath and plasma depletion layer
NASA Technical Reports Server (NTRS)
Anderson, Brian J.; Fuselier, Stephen A.; Gary, S. Peter; Denton, Richard E.
1994-01-01
Correlations between plasma properties and magnetic fluctuations in the sub-solar magnetosheath downstream of a quasi-perpendicular shock have been found and indicate that mirror and ion cyclotron-like fluctuations correlate with the magnetosheath proper and plasma depletion layer, respectively (Anderson and Fuselier, 1993). We explore the entire range of magnetic spectral signatures observed from the Active Magnetospheric Particle Tracer Explorers/Charge Composition Explorer (AMPTE/CCE) spacecraft in the magnetosheath downstream of a quasi-perpendicular shock. The magnetic spectral signatures typically progress from predominantly compressional fluctuations, $\delta B_\parallel/\delta B_\perp \approx 3$, with $F/F_p < 0.2$ ($F$ and $F_p$ are the wave frequency and proton gyrofrequency, respectively) to predominantly transverse fluctuations, $\delta B_\parallel/\delta B_\perp \approx 0.3$, extending up to $F_p$. The compressional fluctuations are characterized by anticorrelation between the field magnitude and electron density, $n_e$, and by a small compressibility, $C_e \equiv (\delta n_e/n_e)^2 (B/\delta B_\parallel)^2 \approx 0.13$, indicative of mirror waves. The spectral characteristics of the transverse fluctuations are in agreement with predictions of linear Vlasov theory for the H$^+$ and He$^{2+}$ cyclotron modes. The power spectra and local plasma parameters are found to vary in concert: mirror waves occur for $\beta_{\parallel p} \approx 2$ (where $\beta_{\parallel p} \equiv 2\mu_0 n_p k T_{\parallel p}/B^2$) and $A_p \equiv T_{\perp p}/T_{\parallel p} - 1 \approx 0.4$, whereas cyclotron waves occur for $\beta_{\parallel p} \approx 0.2$ and $A_p \approx 2$. The transition from mirror to cyclotron modes is predicted by linear theory. 
The spectral characteristics overlap for intermediate plasma parameters. The plasma observations are described by $A_p = 0.85\,\beta_{\parallel p}^{-0.48}$ with a log regression coefficient of $-0.74$. This inverse $A_p$-$\beta_{\parallel p}$ correlation corresponds closely to the isocontours of maximum ion anisotropy instability growth, $\gamma_m/\omega_p = 0.01$, for the mirror and cyclotron modes. The agreement of observed properties and predictions of local theory suggests that the spectral signatures reflect the local plasma environment and that the anisotropy instabilities regulate $A_p$. We suggest that the spectral characteristics may provide a useful basis for ordering observations in the magnetosheath and that the $A_p$-$\beta_{\parallel p}$ inverse correlation may be used as a beta-dependent upper limit on the proton anisotropy to represent kinetic effects.
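To make the fitted anisotropy relation quoted in this abstract concrete, the short sketch below evaluates it at the two representative proton parallel beta values mentioned in the text. The function name `anisotropy_limit` is hypothetical; only the coefficients 0.85 and -0.48 come from the abstract.

```python
def anisotropy_limit(beta_parallel):
    """Evaluate the fitted relation A_p = 0.85 * beta_parallel ** -0.48."""
    return 0.85 * beta_parallel ** -0.48


# Representative parallel beta values quoted in the abstract for the
# mirror-wave (about 2) and cyclotron-wave (about 0.2) regimes.
for beta in (2.0, 0.2):
    print(f"beta = {beta}: A_p = {anisotropy_limit(beta):.2f}")
```

The fit gives roughly 0.61 at a beta of 2 and roughly 1.84 at a beta of 0.2, which is the same order as the typical anisotropies of about 0.4 and about 2 reported for the mirror and cyclotron regimes and shows the inverse trend the abstract describes.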
Numerical characteristics of quantum computer simulation
NASA Astrophysics Data System (ADS)
Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.
2016-12-01
The simulation of quantum circuits is important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, so the use of modern high-performance parallel computation is relevant. As is well known, arbitrary quantum computation in the circuit model can be done using only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. We investigate how the unique properties of quantum nature determine the computational properties of the considered algorithms: quantum parallelism makes the simulation of quantum gates highly parallel, while quantum entanglement leads to the problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. The experimental part was carried out on the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good basis for the research and testing of development methods for data-intensive parallel software, and that the considered methodology of analysis can be successfully used for the improvement of algorithms in quantum information science.
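A minimal state-vector sketch may help show why single-qubit gate simulation is highly parallel yet sensitive to memory locality. It is not the code analyzed in the paper: the function name `apply_single_qubit_gate` and the convention that qubit 0 is the least significant bit of the amplitude index are assumptions made here for illustration.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 `gate` to qubit `target` of a state vector of length 2**n.

    Assumed convention: qubit 0 is the least significant bit of the index.
    """
    stride = 1 << target
    new_state = list(state)
    for i in range(len(state)):
        if (i & stride) == 0:   # i has the target bit clear
            j = i | stride      # partner index with the target bit set
            a0, a1 = state[i], state[j]
            # Each (i, j) pair is updated independently of all other pairs.
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


h = 1.0 / math.sqrt(2.0)
hadamard = [[h, h], [h, -h]]

state = [1.0, 0.0, 0.0, 0.0]                         # two qubits in |00>
state = apply_single_qubit_gate(state, hadamard, 0)
state = apply_single_qubit_gate(state, hadamard, 1)  # uniform superposition
```

All pair updates can run concurrently, but the two amplitudes in a pair sit `2**target` entries apart, so gates on high-numbered qubits touch memory locations that are far from each other. In a distributed setting that distance can span different nodes.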
Fully Parallel MHD Stability Analysis Tool
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2015-11-01
Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Results of MARS parallelization and of the development of a new fixed-boundary equilibrium code adapted for MARS input will be reported. Work is supported by the U.S. DOE SBIR program.
A mirror for lab-based quasi-monochromatic parallel x-rays
NASA Astrophysics Data System (ADS)
Nguyen, Thanhhai; Lu, Xun; Lee, Chang Jun; Jung, Jin-Ho; Jin, Gye-Hwan; Kim, Sung Youb; Jeon, Insu
2014-09-01
A multilayered parabolic mirror with six W/Al bilayers was designed and fabricated to generate monochromatic parallel x-rays using a lab-based x-ray source. Using this mirror, curved bright bands were obtained in x-ray images as reflected x-rays. The parallelism of the reflected x-rays was investigated using the shape of the bands. The intensity and monochromatic characteristics of the reflected x-rays were evaluated through measurements of the x-ray spectra in the band. High intensity, nearly monochromatic, and parallel x-rays, which can be used for high resolution x-ray microscopes and local radiation therapy systems, were obtained.
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry
1998-01-01
This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
Performing a local reduction operation on a parallel computer
Blocksome, Michael A; Faraj, Daniel A
2013-06-04
A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
Performing a local reduction operation on a parallel computer
Blocksome, Michael A.; Faraj, Daniel A.
2012-12-11
A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
Automatic mesh refinement and parallel load balancing for Fokker-Planck-DSMC algorithm
NASA Astrophysics Data System (ADS)
Küchlin, Stephan; Jenny, Patrick
2018-06-01
Recently, a parallel Fokker-Planck-DSMC algorithm for rarefied gas flow simulation in complex domains at all Knudsen numbers was developed by the authors. Fokker-Planck-DSMC (FP-DSMC) is an augmentation of the classical DSMC algorithm, which mitigates the near-continuum deficiencies in terms of computational cost of pure DSMC. At each time step, based on a local Knudsen number criterion, the discrete DSMC collision operator is dynamically switched to the Fokker-Planck operator, which is based on the integration of continuous stochastic processes in time, and has fixed computational cost per particle, rather than per collision. In this contribution, we present an extension of the previous implementation with automatic local mesh refinement and parallel load-balancing. In particular, we show how the properties of discrete approximations to space-filling curves enable an efficient implementation. Exemplary numerical studies highlight the capabilities of the new code.
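As a rough illustration of how a discrete space-filling curve can drive load balancing, the sketch below orders 2-D cells along a Morton (Z-order) curve and cuts the ordered list into contiguous segments of similar total load. The choice of the Morton curve, the function names, and the midpoint-based cut are all assumptions made for illustration; the abstract does not state which curve or partitioning rule the authors use.

```python
def morton_index(ix, iy, bits=16):
    """Interleave the bits of (ix, iy) to get the cell's position on a Z-order curve."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)
        code |= ((iy >> b) & 1) << (2 * b + 1)
    return code


def partition_by_curve(cell_loads, n_ranks):
    """Assign cells to ranks as contiguous curve segments with similar total load.

    cell_loads maps (ix, iy) -> load (e.g. a particle count, hypothetical here).
    Returns a dict mapping (ix, iy) -> rank.
    """
    ordered = sorted(cell_loads, key=lambda c: morton_index(*c))
    total = sum(cell_loads.values())
    assignment, running = {}, 0.0
    for cell in ordered:
        # Pick the rank from the load midpoint of this cell along the curve.
        mid = running + cell_loads[cell] / 2.0
        rank = min(int(mid * n_ranks / total), n_ranks - 1) if total > 0 else 0
        assignment[cell] = rank
        running += cell_loads[cell]
    return assignment
```

Because neighbouring positions on the curve tend to be neighbouring cells in space, each rank receives a spatially compact group of cells, which is the property that makes such curves attractive for partitioning.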
Yu, Dongjun; Wu, Xiaowei; Shen, Hongbin; Yang, Jian; Tang, Zhenmin; Qi, Yong; Yang, Jingyu
2012-12-01
Membrane proteins are encoded by ~30% of the genome and play important roles in living organisms. Previous studies have revealed that membrane proteins' structures and functions show obvious cell organelle-specific properties. Hence, it is highly desirable to predict a membrane protein's subcellular location from its primary sequence, considering the extreme difficulty of membrane protein wet-lab studies. Although many models have been developed for predicting protein subcellular locations, only a few are specific to membrane proteins. Existing prediction approaches were constructed based on statistical machine learning algorithms with serial combination of multi-view features, i.e., different feature vectors are simply serially combined to form a super feature vector. However, such simple combination of features will simultaneously increase the information redundancy that could, in turn, deteriorate the final prediction accuracy. This is why prediction success rates in the serial super space were often found to be even lower than those in a single-view space. The purpose of this paper is to investigate a proper method for fusing multiple multi-view protein sequential features for subcellular location predictions. Instead of a serial strategy, we propose a novel parallel framework for fusing multiple membrane protein multi-view attributes that will represent protein samples in complex spaces. We also propose generalized principal component analysis (GPCA) for feature reduction in the complex geometry. All the experimental results through different machine learning algorithms on benchmark membrane protein subcellular localization datasets demonstrate that the newly proposed parallel strategy outperforms the traditional serial approach. We also demonstrate the efficacy of the parallel strategy on a soluble protein subcellular localization dataset, indicating that the parallel technique is flexible enough to suit other computational biology problems. 
The software and datasets are available at: http://www.csbio.sjtu.edu.cn/bioinf/mpsp.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vydyanathan, Naga; Krishnamoorthy, Sriram; Sabin, Gerald M.
2009-08-01
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application-tasks with dependences. These applications exhibit both task- and data-parallelism, and combining these two (also called mixed parallelism), has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task- and data-parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks and the inter-task data communication volumes. A locality conscious scheduling strategy is used to improve inter-task data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications as well as synthetic graphs shows that our algorithm consistently generates schedules with lower makespan as compared to CPR and CPA, two previously proposed scheduling algorithms. Our algorithm also produces schedules that have lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.
Checkpoint-Restart in User Space
DOE Office of Scientific and Technical Information (OSTI.GOV)
CRUISE implements a user-space file system that stores data in main memory and transparently spills over to other storage, like local flash memory or the parallel file system, as needed. CRUISE also exposes file contents for remote direct memory access, allowing external tools to copy files to the parallel file system in the background with reduced CPU interruption.
An annealed chaotic maximum neural network for bipartite subgraph problem.
Wang, Jiahai; Tang, Zheng; Wang, Ronglong
2004-04-01
In this paper, based on the maximum neural network, we propose a new parallel algorithm that can help the maximum neural network escape from local minima by including transient chaotic neurodynamics for the bipartite subgraph problem. The goal of the bipartite subgraph problem, which is an NP-complete problem, is to remove the minimum number of edges in a given graph such that the remaining graph is a bipartite graph. Lee et al. presented a parallel algorithm using the maximum neural model (winner-take-all neuron model) for this NP-complete problem. The maximum neural model always guarantees a valid solution and greatly reduces the search space without a burden on the parameter-tuning. However, the model has a tendency to converge to a local minimum easily because it is based on the steepest descent method. By adding a negative self-feedback to the maximum neural network, we propose a new parallel algorithm that introduces richer and more flexible chaotic dynamics and can prevent the network from getting stuck at local minima. After the chaotic dynamics vanishes, the proposed algorithm is then fundamentally governed by the gradient descent dynamics and usually converges to a stable equilibrium point. The proposed algorithm has the advantages of both the maximum neural network and the chaotic neurodynamics. A large number of instances have been simulated to verify the proposed algorithm. The simulation results show that our algorithm finds optimum or near-optimum solutions for the bipartite subgraph problem that are superior to those of the best existing parallel algorithms.
Performance of GeantV EM Physics Models
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2017-10-01
The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
Creating a Parallel Version of VisIt for Microsoft Windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitlock, B J; Biagas, K S; Rawson, P L
2011-12-07
VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing power is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPUs has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. 
We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.
Evaluating local indirect addressing in SIMD processors
NASA Technical Reports Server (NTRS)
Middleton, David; Tomboulian, Sherryl
1989-01-01
In the design of parallel computers, there exists a tradeoff between the number and power of individual processors. The single instruction stream, multiple data stream (SIMD) model of parallel computers lies at one extreme of the resulting spectrum. The available hardware resources are devoted to creating the largest possible number of processors, and consequently each individual processor must use the fewest possible resources. Disagreement exists as to whether SIMD processors should be able to generate addresses individually into their local data memory, or all processors should access the same address. The tradeoff is examined between the increased capability and the reduced number of processors that occurs in this single instruction stream, multiple, locally addressed, data (SIMLAD) model. Factors are assembled that affect this design choice, and the SIMLAD model is compared with the bare SIMD and the MIMD models.
Magnetosheath Filamentary Structures Formed by Ion Acceleration at the Quasi-Parallel Bow Shock
NASA Technical Reports Server (NTRS)
Omidi, N.; Sibeck, D.; Gutynska, O.; Trattner, K. J.
2014-01-01
Results from 2.5-D electromagnetic hybrid simulations show the formation of field-aligned, filamentary plasma structures in the magnetosheath. They begin at the quasi-parallel bow shock and extend far into the magnetosheath. These structures exhibit anticorrelated, spatial oscillations in plasma density and ion temperature. Closer to the bow shock, magnetic field variations associated with density and temperature oscillations may also be present. Magnetosheath filamentary structures (MFS) form primarily in the quasi-parallel sheath; however, they may extend to the quasi-perpendicular magnetosheath. They occur over a wide range of solar wind Alfvénic Mach numbers and interplanetary magnetic field directions. At lower Mach numbers with lower levels of magnetosheath turbulence, MFS remain highly coherent over large distances. At higher Mach numbers, magnetosheath turbulence decreases the level of coherence. Magnetosheath filamentary structures result from localized ion acceleration at the quasi-parallel bow shock and the injection of energetic ions into the magnetosheath. The localized nature of ion acceleration is tied to the generation of fast magnetosonic waves at and upstream of the quasi-parallel shock. The increased pressure in flux tubes containing the shock accelerated ions results in the depletion of the thermal plasma in these flux tubes and the enhancement of density in flux tubes void of energetic ions. This results in the observed anticorrelation between ion temperature and plasma density.
Dynamic load balancing of applications
Wheat, Stephen R.
1997-01-01
An application-level method for dynamically maintaining global load balance on a parallel computer, particularly on massively parallel MIMD computers. Global load balancing is achieved by overlapping neighborhoods of processors, where each neighborhood performs local load balancing. The method supports a large class of finite element and finite difference based applications and provides an automatic element management system to which applications are easily integrated.
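To make the idea of overlapping neighborhoods concrete, here is a minimal sketch in which processors sit on a ring and each neighborhood consists of a processor and its two ring neighbors. Every neighborhood moves its center toward the local mean, and because neighborhoods overlap, repeated local steps spread work across the whole machine. The ring topology, the averaging rule, the function names, and the treatment of load as a continuous quantity are all assumptions for illustration; the abstract does not specify the method at this level, and a real application would migrate discrete elements.

```python
def balance_step(loads):
    """One synchronous sweep: each processor takes the mean load of its
    three-member neighborhood (itself plus its two ring neighbors)."""
    n = len(loads)
    return [
        (loads[(p - 1) % n] + loads[p] + loads[(p + 1) % n]) / 3.0
        for p in range(n)
    ]


def balance(loads, tol=1e-6, max_steps=10_000):
    """Repeat local balancing until the global spread falls below tol."""
    steps = 0
    while max(loads) - min(loads) > tol and steps < max_steps:
        loads = balance_step(loads)
        steps += 1
    return loads, steps
```

Each sweep conserves the total load (up to floating-point rounding), so the only thing that changes is how the work is distributed.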
SCaLeM: A Framework for Characterizing and Analyzing Execution Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chavarría-Miranda, Daniel; Manzano Franco, Joseph B.; Krishnamoorthy, Sriram
2014-10-13
As scalable parallel systems evolve towards more complex nodes with many-core architectures and larger trans-petascale & upcoming exascale deployments, there is a need to understand, characterize and quantify the underlying execution models being used on such systems. Execution models are a conceptual layer between applications & algorithms and the underlying parallel hardware and systems software on which those applications run. This paper presents the SCaLeM (Synchronization, Concurrency, Locality, Memory) framework for characterizing and analyzing execution models. SCaLeM consists of three basic elements: attributes, compositions and mapping of these compositions to abstract parallel systems. The fundamental Synchronization, Concurrency, Locality and Memory attributes are used to characterize each execution model, while the combinations of those attributes in the form of compositions are used to describe the primitive operations of the execution model. The mapping of the execution model’s primitive operations, described by compositions, to an underlying abstract parallel system can be evaluated quantitatively to determine its effectiveness. Finally, SCaLeM also enables the representation and analysis of applications in terms of execution models, for the purpose of evaluating the effectiveness of such mapping.
Microfluidic local perfusion chambers for the visualization and manipulation of synapses
Taylor, Anne M.; Dieterich, Daniela C.; Ito, Hiroshi T.; Kim, Sally A.; Schuman, Erin M.
2010-01-01
Summary The polarized nature of neurons as well as the size and density of synapses complicates the manipulation and visualization of cell biological processes that control synaptic function. Here we developed a microfluidic local perfusion (μLP) chamber to access and manipulate synaptic regions and pre- and post-synaptic compartments in vitro. This chamber directs the formation of synapses in >100 parallel rows connecting separate neuron populations. A perfusion channel transects the parallel rows allowing access to synaptic regions with high spatial and temporal resolution. We used this chamber to investigate synapse-to-nucleus signaling. Using the calcium indicator dye, Fluo-4, we measured changes in calcium at dendrites and somata, following local perfusion of glutamate. Exploiting the high temporal resolution of the chamber, we exposed synapses to “spaced” or “massed” application of glutamate and then examined levels of pCREB in somata. Lastly, we applied the metabotropic receptor agonist, DHPG, to dendrites and observed increases in Arc transcription and Arc transcript localization. PMID:20399729
Method and system for optical figuring by imagewise heating of a solvent
Rushford, Michael C.
2005-08-30
A method and system of imagewise etching the surface of a substrate, such as thin glass, in a parallel process. The substrate surface is placed in contact with an etchant solution which increases in etch rate with temperature. A local thermal gradient is then generated in each of a plurality of selected local regions of a boundary layer of the etchant solution to imagewise etch the substrate surface in a parallel process. In one embodiment, the local thermal gradient is a local heating gradient produced at selected addresses chosen from an indexed array of addresses. The activation of each of the selected addresses is independently controlled by a computer processor so as to imagewise etch the substrate surface at region-specific etch rates. Moreover, etching progress is preferably concurrently monitored in real time over the entire surface area by an interferometer so as to deterministically control the computer processor to image-wise figure the substrate surface where needed.
Spatio-temporal dynamics of processing non-symbolic number: An ERP source localization study
Hyde, Daniel C.; Spelke, Elizabeth S.
2013-01-01
Coordinated studies with adults, infants, and nonhuman animals provide evidence for two distinct systems of non-verbal number representation. The ‘parallel individuation’ system selects and retains information about 1–3 individual entities and the ‘numerical magnitude’ system establishes representations of the approximate cardinal value of a group. Recent ERP work has demonstrated that these systems reliably evoke functionally and temporally distinct patterns of brain response that correspond to established behavioral signatures. However, relatively little is known about the neural generators of these ERP signatures. To address this question, we targeted known ERP signatures of these systems, by contrasting processing of small versus large non-symbolic numbers, and used a source localization algorithm (LORETA) to identify their cortical origins. Early processing of small numbers, showing the signature effects of parallel individuation on the N1 (∼150 ms), was localized primarily to extrastriate visual regions. In contrast, qualitatively and temporally distinct processing of large numbers, showing the signatures of approximate number representation on the mid-latency P2p (∼200–250 ms), was localized primarily to right intraparietal regions. In comparison, mid-latency small number processing was localized to the right temporal-parietal junction and left-lateralized intraparietal regions. These results add spatial information to the emerging ERP literature documenting the process by which we represent number. Furthermore, these results substantiate recent claims that early attentional processes determine whether a collection of objects will be represented through parallel individuation or as an approximate numerical magnitude by providing evidence that downstream processing diverges to distinct cortical regions. PMID:21830257
schwimmbad: A uniform interface to parallel processing pools in Python
NASA Astrophysics Data System (ADS)
Price-Whelan, Adrian M.; Foreman-Mackey, Daniel
2017-09-01
Many scientific and computing problems require doing some calculation on all elements of some data set. If the calculations can be executed in parallel (i.e. without any communication between calculations), these problems are said to be perfectly parallel. On computers with multiple processing cores, these tasks can be distributed and executed in parallel to greatly improve performance. A common paradigm for handling these distributed computing problems is to use a processing "pool": the "tasks" (the data) are passed in bulk to the pool, and the pool handles distributing the tasks to a number of worker processes when available. schwimmbad provides a uniform interface to parallel processing pools and enables switching easily between local development (e.g., serial processing or with multiprocessing) and deployment on a cluster or supercomputer (via, e.g., MPI or JobLib).
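The pool paradigm described above can be sketched with the Python standard library alone. The example below is not schwimmbad's API: `SerialPool`, `choose_pool`, `worker`, and `run` are hypothetical names used only to show why a uniform `map` interface lets the same driver code run serially during development and on several processes later.

```python
from multiprocessing import Pool


class SerialPool:
    """Drop-in stand-in that runs every task in the calling process."""

    def map(self, func, tasks):
        return list(map(func, tasks))

    def close(self):
        pass


def choose_pool(processes=1):
    """Return an object exposing map() and close(), serial or multi-process."""
    return SerialPool() if processes <= 1 else Pool(processes)


def worker(task):
    """Independent calculation on one element (no communication between tasks)."""
    return task ** 2


def run(tasks, processes=1):
    """Pass the tasks in bulk to whichever pool was chosen."""
    pool = choose_pool(processes)
    try:
        return pool.map(worker, tasks)
    finally:
        pool.close()
```

Calling `run(range(5))` executes serially; `run(range(5), processes=4)` distributes the same tasks over worker processes and should be invoked under an `if __name__ == "__main__":` guard on platforms that spawn rather than fork.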
Cost-effective GPU-grid for genome-wide epistasis calculations.
Pütz, B; Kam-Thong, T; Karbalai, N; Altmann, A; Müller-Myhsok, B
2013-01-01
Until recently, genotype studies were limited to the investigation of single SNP effects due to the computational burden incurred when studying pairwise interactions of SNPs. However, some genetic effects as simple as coloring (in plants and animals) cannot be ascribed to a single locus but only understood when epistasis is taken into account [1]. It is expected that such effects are also found in complex diseases where many genes contribute to the clinical outcome of affected individuals. Only recently have such problems become feasible computationally. The inherently parallel structure of the problem makes it a perfect candidate for massive parallelization on either grid or cloud architectures. Since we are also dealing with confidential patient data, we were not able to consider a cloud-based solution but had to find a way to process the data in-house and aimed to build a local GPU-based grid structure. Sequential epistasis calculations were ported to GPU using CUDA at various levels. Parallelization on the CPU was compared to corresponding GPU counterparts with regard to performance and cost. A cost-effective solution was created by combining custom-built nodes equipped with relatively inexpensive consumer-level graphics cards with highly parallel GPUs in a local grid. The GPU method outperforms current cluster-based systems on a price/performance criterion, as a single GPU shows speed performance comparable to that of up to 200 CPU cores. The outlined approach will work for problems that easily lend themselves to massive parallelization. Code for various tasks has been made available and ongoing development of tools will further ease the transition from sequential to parallel algorithms.
The SPoRT concept of bracing for idiopathic scoliosis.
Zaina, Fabio; Fusco, Claudia; Atanasio, Salvatore; Negrini, Stefano
2011-01-01
The SPoRT (acronym: Symmetrical, Patient-oriented, Rigid, Three-dimensional, active) concept of bracing is a new way to build braces based on our 20 years of experience and the biomechanical principles of scoliosis correction, inclusive of the Sibilla and Sforzesco braces. The concept always requires a custom brace, which is made according to the patient's individual requirements. New technologies such as CAD-CAM can be applied, and often for better results, without the customary use of prebuilt forms whose measurements are stored in databases. Once the initial draft brace is completed, a final test must be made on the patient to modify and adapt it, depending on his or her real interaction between the body and the brace. The results that are today available on the SPoRT concept relate to the Sforzesco brace and are necessarily short-term, because the first treated patients are now reaching the fourth-year follow-up examination and haven't yet completed their treatments. On the basis of the initial evaluations, we can state that the Sforzesco brace is more effective than the Lyon brace after 6 months of treatment and that the Sforzesco brace is equally effective as the Risser Plast brace.
Synchronous parallel system for emulation and discrete event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor)
1992-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to state variables of the simulation object attributable to the event object, and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring the events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
Synchronous Parallel System for Emulation and Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor)
2001-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to the state variables of the simulation object attributable to the event object and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
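The horizon computation described in the abstract reduces to two nested minima, which the following sketch spells out. The function names and the list-of-time-stamps representation are assumptions for illustration, and the sketch only computes the horizons and selects the events whose time stamps fall below the global one; it does not model message handling or state restoral.

```python
def local_event_horizon(new_event_times):
    """Earliest time stamp among the new events generated on one node.

    Returns None when the node generated no new events.
    """
    return min(new_event_times) if new_event_times else None


def global_event_horizon(per_node_new_event_times):
    """Earliest local event horizon over all nodes (infinity if there is none)."""
    horizons = [
        h for h in map(local_event_horizon, per_node_new_event_times)
        if h is not None
    ]
    return min(horizons) if horizons else float("inf")


def events_before_horizon(pending_times, horizon):
    """Select the pending events whose time stamps are less than the horizon."""
    return [t for t in pending_times if t < horizon]
```

For example, with new events at times `[5.0, 3.5]` on one node and `[4.0]` on another, the local horizons are 3.5 and 4.0 and the global event horizon is 3.5.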
Design of k-Space Channel Combination Kernels and Integration with Parallel Imaging
Beatty, Philip J.; Chang, Shaorong; Holmes, James H.; Wang, Kang; Brau, Anja C. S.; Reeder, Scott B.; Brittain, Jean H.
2014-01-01
Purpose In this work, a new method is described for producing local k-space channel combination kernels using a small amount of low-resolution multichannel calibration data. Additionally, this work describes how these channel combination kernels can be combined with local k-space unaliasing kernels produced by the calibration phase of parallel imaging methods such as GRAPPA, PARS and ARC. Methods Experiments were conducted to evaluate both the image quality and computational efficiency of the proposed method compared to a channel-by-channel parallel imaging approach with image-space sum-of-squares channel combination. Results Results indicate comparable image quality overall, with some very minor differences seen in reduced field-of-view imaging. It was demonstrated that this method enables a speed up in computation time on the order of 3–16X for 32-channel data sets. Conclusion The proposed method enables high quality channel combination to occur earlier in the reconstruction pipeline, reducing computational and memory requirements for image reconstruction. PMID:23943602
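For reference, the image-space sum-of-squares combination used as the comparison baseline above can be written in a few lines. This is a sketch of that baseline only, not of the proposed k-space kernels; the function name and the flat list-of-pixels layout are assumptions for illustration.

```python
import math


def sum_of_squares_combine(channel_images):
    """Combine per-channel complex images pixel by pixel.

    channel_images is a list of equally long sequences of complex pixel values,
    one sequence per receive channel. Each output pixel is
    sqrt(sum over channels of |pixel|^2).
    """
    n_pixels = len(channel_images[0])
    return [
        math.sqrt(sum(abs(channel[i]) ** 2 for channel in channel_images))
        for i in range(n_pixels)
    ]
```

Because this step needs a fully reconstructed image for every channel, performing the combination earlier in the pipeline is what allows the reported savings in computation and memory.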
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Dynamic load balancing of applications
Wheat, S.R.
1997-05-13
An application-level method for dynamically maintaining global load balance on a parallel computer, particularly on massively parallel MIMD computers is disclosed. Global load balancing is achieved by overlapping neighborhoods of processors, where each neighborhood performs local load balancing. The method supports a large class of finite element and finite difference based applications and provides an automatic element management system to which applications are easily integrated. 13 figs.
Controls on Early-Rift Geometry: New Perspectives From the Bilila-Mtakataka Fault, Malawi
NASA Astrophysics Data System (ADS)
Hodge, M.; Fagereng, Å.; Biggs, J.; Mdala, H.
2018-05-01
We use the ~110-km long Bilila-Mtakataka fault in the amagmatic southern East African Rift, Malawi, to investigate the controls on early-rift geometry at the scale of a major border fault. Morphological variations along the 14 ± 8-m high scarp define six 10- to 40-km long segments, which are either foliation parallel or oblique to both foliation and the current regional extension direction. As the scarp is neither consistently parallel to foliation nor well oriented for the current regional extension direction, we suggest that the segmented surface expression is related to the local reactivation of well-oriented weak shallow fabrics above a broadly continuous structure at depth. Using a geometrical model, the geometry of the best fitting subsurface structure is consistent with the local strain field from recent seismicity. In conclusion, within this early-rift, preexisting weaknesses only locally control border fault geometry at subsurface.
Kinematic Analysis and Performance Evaluation of Novel PRS Parallel Mechanism
NASA Astrophysics Data System (ADS)
Balaji, K.; Khan, B. Shahul Hamid
2018-02-01
In this paper, a novel 3 DoF (Degree of Freedom) PRS (Prismatic-Revolute-Spherical) type parallel mechanism is designed and presented. The combination of straight and arc type linkages for a 3 DoF parallel mechanism is introduced for the first time. The performance of the mechanisms is evaluated based on indices such as Minimum Singular Value (MSV), Condition Number (CN), Local Conditioning Index (LCI), Kinematic Configuration Index (KCI) and Global Conditioning Index (GCI). The overall reachable workspace of all mechanisms is presented. The kinematic measure, dexterity measure and workspace analysis for all the mechanisms have been evaluated and compared.
Harvey-Girard, Erik; Lewis, John; Maler, Leonard
2010-04-28
Weakly electric fish can enhance the detection and localization of important signals such as those of prey in part by cancellation of redundant spatially diffuse electric signals due to, e.g., their tail bending. The cancellation mechanism is based on descending input, conveyed by parallel fibers emanating from cerebellar granule cells, that produces a negative image of the global low-frequency signals in pyramidal cells within the first-order electrosensory region, the electrosensory lateral line lobe (ELL). Here we demonstrate that the parallel fiber synaptic input to ELL pyramidal cells undergoes long-term depression (LTD) whenever both parallel fiber afferents and their target cells are stimulated to produce paired burst discharges. Paired large bursts (4-4) induce robust LTD over pre-post delays of up to ±50 ms, whereas smaller bursts (2-2) induce weaker LTD. Single spikes (either presynaptic or postsynaptic) paired with bursts did not induce LTD. Tetanic presynaptic stimulation was also ineffective in inducing LTD. Thus, we have demonstrated a form of anti-Hebbian LTD that depends on the temporal correlation of burst discharge. We then demonstrated that the burst-induced LTD is postsynaptic and requires the NR2B subunit of the NMDA receptor, elevation of postsynaptic Ca2+, and activation of CaMKIIβ. A model incorporating local inhibitory circuitry and previously identified short-term presynaptic potentiation of the parallel fiber synapses further suggests that the combination of burst-induced LTD, presynaptic potentiation, and local inhibition may be sufficient to explain the generation of the negative image and cancellation of redundant sensory input by ELL pyramidal cells.
Alignments of the galaxies in and around the Virgo cluster with the local velocity shear
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Jounghun; Rey, Soo Chang; Kim, Suk, E-mail: jounghun@astro.snu.ac.kr
2014-08-10
Observational evidence is presented for the alignment between the cosmic sheet and the principal axis of the velocity shear field at the position of the Virgo cluster. The galaxies in and around the Virgo cluster from the Extended Virgo Cluster Catalog that was recently constructed by Kim et al. are used to determine the direction of the local sheet. The peculiar velocity field reconstructed from the Sloan Digital Sky Survey Data Release 7 is analyzed to estimate the local velocity shear tensor at the Virgo center. Showing first that the minor principal axis of the local velocity shear tensor is almost parallel to the direction of the line of sight, we detect a clear signal of alignment between the positions of the Virgo satellites and the intermediate principal axis of the local velocity shear projected onto the plane of the sky. Furthermore, the dwarf satellites are found to appear more strongly aligned than their normal counterparts, which is interpreted as an indication of the following. (1) The normal satellites and the dwarf satellites fall in the Virgo cluster preferentially along the local filament and the local sheet, respectively. (2) The local filament is aligned with the minor principal axis of the local velocity shear while the local sheet is parallel to the plane spanned by the minor and intermediate principal axes. Our result is consistent with the recent numerical claim that the velocity shear is a good tracer of the cosmic web.
Local SAR in Parallel Transmission Pulse Design
Lee, Joonsung; Gebhardt, Matthias; Wald, Lawrence L.; Adalsteinsson, Elfar
2011-01-01
The management of local and global power deposition in human subjects (Specific Absorption Rate, SAR) is a fundamental constraint to the application of parallel transmission (pTx) systems. Even though pTx and single-channel systems have to meet the same SAR requirements, the complex behavior of the spatial distribution of local SAR for transmission arrays poses problems that are not encountered in conventional single-channel systems and places additional requirements on pTx RF pulse design. We propose a pTx pulse design method which builds on recent work to capture the spatial distribution of local SAR in numerical tissue models in a compressed parameterization in order to incorporate local SAR constraints within computation times that accommodate pTx pulse design during an in vivo MRI scan. Additionally, the algorithm yields a Protocol-specific Ultimate Peak in Local SAR (PUPiL SAR), which is shown to bound the achievable peak local SAR for a given excitation profile fidelity. The performance of the approach was demonstrated using a numerical human head model and a 7T eight-channel transmit array. The method reduced peak local 10g SAR by 14–66% for slice-selective pTx excitations and 2D selective pTx excitations compared to a pTx pulse design constrained only by global SAR. The primary tradeoff incurred for reducing peak local SAR was an increase in global SAR, up to 34% for the evaluated examples, which is favorable in cases where local SAR constraints dominate the pulse applications. PMID:22083594
NASA Astrophysics Data System (ADS)
Lian, Yanping; Lin, Stephen; Yan, Wentao; Liu, Wing Kam; Wagner, Gregory J.
2018-05-01
In this paper, a parallelized 3D cellular automaton computational model is developed to predict grain morphology for solidification of metal during the additive manufacturing process. Solidification phenomena are characterized by highly localized events, such as the nucleation and growth of multiple grains. As a result, parallelization requires careful treatment of load balancing between processors as well as interprocess communication in order to maintain a high parallel efficiency. We give a detailed summary of the formulation of the model, as well as a description of the communication strategies implemented to ensure parallel efficiency. Scaling tests on a representative problem with about half a billion cells demonstrate parallel efficiency of more than 80% on 8 processors and around 50% on 64; loss of efficiency is attributable to load imbalance due to near-surface grain nucleation in this test problem. The model is further demonstrated through an additive manufacturing simulation with resulting grain structures showing reasonable agreement with those observed in experiments.
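The scaling figures quoted in this abstract can be converted to speedups with the standard definition, parallel efficiency = speedup / number of processors. The sketch below uses a hypothetical helper name and treats "more than 80%" as a lower bound and "around 50%" as an approximate value.

```python
def speedup_from_efficiency(efficiency, processors):
    """Speedup implied by a parallel efficiency on a given processor count.

    Uses efficiency = speedup / processors, i.e. speedup = efficiency * processors.
    """
    return efficiency * processors

# Figures taken from the abstract; the 8-processor value is a lower bound.
speedup_8 = speedup_from_efficiency(0.80, 8)    # at least about 6.4x
speedup_64 = speedup_from_efficiency(0.50, 64)  # roughly 32x
```

So the run on 64 processors is still about five times faster than the run on 8, even though half of the aggregate compute capacity is lost, which the abstract attributes to load imbalance.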
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation
NASA Astrophysics Data System (ADS)
Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.
2011-03-01
A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
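As a rough illustration of the inverse power algorithm mentioned in this abstract, the sketch below finds the largest k satisfying A φ = (1/k) F φ for small dense matrices. The problem form, the matrix names A and F, and the toy 2 x 2 data are assumptions chosen for illustration; the solver described in the abstract works on discretized simplified transport equations and is far more elaborate.

```python
import numpy as np

def inverse_power_iteration(A, F, max_iters=500, tol=1e-13):
    """Estimate the largest k with A @ phi = (1/k) * F @ phi (assumed form).

    Each iteration solves A @ new_phi = F @ phi, i.e. applies A^{-1} F,
    so the iterate converges to the dominant eigenvector of A^{-1} F.
    """
    phi = np.ones(A.shape[0])
    phi /= np.linalg.norm(phi)
    k = 0.0
    for _ in range(max_iters):
        new_phi = np.linalg.solve(A, F @ phi)
        new_k = np.linalg.norm(new_phi)  # phi has unit norm
        phi = new_phi / new_k
        if abs(new_k - k) < tol:
            k = new_k
            break
        k = new_k
    return k, phi

# Hypothetical toy operators, not taken from the paper.
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
F = np.array([[1.0, 0.0], [0.0, 0.5]])
k_eff, flux = inverse_power_iteration(A, F)
```

Every iteration requires one linear solve with A, so any scheme that increases the number of power iterations multiplies the cost of those solves, consistent with the efficiency loss the abstract reports for the first implementation.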
Speculation and replication in temperature accelerated dynamics
Zamora, Richard J.; Perez, Danny; Voter, Arthur F.
2018-02-12
Accelerated Molecular Dynamics (AMD) is a class of MD-based algorithms for the long-time scale simulation of atomistic systems that are characterized by rare-event transitions. Temperature-Accelerated Dynamics (TAD), a traditional AMD approach, hastens state-to-state transitions by performing MD at an elevated temperature. Recently, Speculatively-Parallel TAD (SpecTAD) was introduced, allowing the TAD procedure to exploit parallel computing systems by concurrently executing in a dynamically generated list of speculative future states. Although speculation can be very powerful, it is not always the most efficient use of parallel resources. In this paper, we compare the performance of speculative parallelism with a replica-based technique, similar to the Parallel Replica Dynamics method. A hybrid SpecTAD approach is also presented, in which each speculation process is further accelerated by a local set of replicas. Overall, this work motivates the use of hybrid parallelism whenever possible, as some combination of speculation and replication is typically most efficient.
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.
1994-01-01
The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
NASA Astrophysics Data System (ADS)
Timchenko, Leonid; Yarovyi, Andrii; Kokriatskaya, Nataliya; Nakonechna, Svitlana; Abramenko, Ludmila; Ławicki, Tomasz; Popiel, Piotr; Yesmakhanova, Laura
2016-09-01
The paper presents a method of parallel-hierarchical transformations for rapid recognition of dynamic images using GPU technology. The direct parallel-hierarchical transformations are based on a cluster CPU- and GPU-oriented hardware platform. Mathematical models for training the parallel-hierarchical (PH) network for the transformation are developed, as well as a training method of the PH network for recognition of dynamic images. This research is most relevant to problems of organizing high-performance computations on very large arrays of information designed to implement multi-stage sensing and processing as well as compaction and recognition of data in informational structures and computer devices. The method has advantages such as high performance through the use of recent advances in parallelization, the ability to work with very high-dimensional images, ease of scaling when the number of nodes in the cluster changes, and automatic scanning of the local network to detect compute nodes.
High-Resolution Study of the First Stretching Overtones of H3Si79Br.
Ceausu; Graner; Bürger; Mkadmi; Pracna; Lafferty
1998-11-01
The Fourier transform infrared spectrum of monoisotopic H3Si79Br (resolution 7.7 × 10⁻³ cm⁻¹) was studied from 4200 to 4520 cm⁻¹, in the region of the first overtones of the Si-H stretching vibration. The investigation of the spectrum revealed the presence of two band systems, the first consisting of one parallel (ν0 = 4340.2002 cm⁻¹) and one perpendicular (ν0 = 4342.1432 cm⁻¹) strong component, and the second of one parallel (ν0 = 4405.789 cm⁻¹) and one perpendicular (ν0 = 4416.233 cm⁻¹) weak component. The rovibrational analysis shows strong local perturbations for both strong and weak systems. Seven hundred eighty-one nonzero-weighted transitions belonging to the strong system [the (200) manifold in the local mode picture] were fitted to a simple model involving a perpendicular component interacting by a weak Coriolis resonance with a parallel component. The most severely perturbed transitions (whose |obs − calc| values exceeded 3 × 10⁻³ cm⁻¹) were given zero weights. The standard deviations of the fit were 1.0 × 10⁻³ and 0.69 × 10⁻³ cm⁻¹ for the parallel and the perpendicular components, respectively. The weak band system, severely perturbed by many "dark" perturbers, was fitted to a model involving one parallel and one perpendicular band, connected by a Coriolis-type resonance. The K″·ΔK = +10 to +18 subbands of the perpendicular component, which showed very high observed − calculated values (approximately 0.5 cm⁻¹), were excluded from this calculation. The standard deviations of the fit were 11 × 10⁻³ and 13 × 10⁻³ cm⁻¹ for the parallel and the perpendicular components, respectively. Copyright 1998 Academic Press.
Scheduling for Locality in Shared-Memory Multiprocessors
1993-05-01
Submitted in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy... architecture on parallel program performance, explain the implications of this trend on popular parallel programming models, and propose system software to ...decomposition and scheduling algorithms. Subject terms: shared-memory multiprocessors; architecture trends; loop scheduling. Number of pages: 110.
1992-12-01
Dynamics and Free Energy Perturbation Methods." Reviews in Computational Chemistry, edited by Kenny B. Lipkowitz and Donald B. Boyd, chapter 8, 295-320...atomic motions during annealing, allows the search to probabilistically move in a locally non-optimal direction. The probability of doing so is...Network processors communicate via communication links. This type of communication is generally very slow relative to other processor activities
NASA Astrophysics Data System (ADS)
Shen, Yanfeng; Cesnik, Carlos E. S.
2016-04-01
This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables highly parallelized computing on powerful graphics cards. Both the explicit contact formulation and the parallel implementation contribute to LISA's computational efficiency advantage over the conventional finite element method (FEM). The theoretical formulation based on the penalty method is introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.
Robust Parallel Motion Estimation and Mapping with Stereo Cameras in Underground Infrastructure
NASA Astrophysics Data System (ADS)
Liu, Chun; Li, Zhengning; Zhou, Yuan
2016-06-01
We developed a novel, robust motion estimation method for localization and mapping in underground infrastructure using a pre-calibrated rigid stereo camera rig. Localization and mapping in underground infrastructure is important to safety, yet it is nontrivial because most underground infrastructure has poor lighting and featureless structure. To overcome these difficulties, we found that a parallel system is more efficient than the EKF-based SLAM approach, because the parallel system divides the motion estimation and 3D mapping tasks into separate threads, eliminating the data-association problem that is a significant issue in SLAM. Moreover, the motion estimation thread takes advantage of a state-of-the-art robust visual odometry algorithm that works well under low illumination and provides accurate pose information. We designed and built an unmanned vehicle and used it to collect a dataset in an underground garage. The parallel system was evaluated on this dataset. Motion estimation results indicated a relative position error of 0.3%, and 3D mapping results showed a mean position error of 13 cm. Off-line processing reduced the position error to 2 cm. The evaluation showed that our system is capable of robust motion estimation and accurate 3D mapping in poorly illuminated and featureless underground environments.
Parallel computation of GA search for the artery shape determinants with CFD
NASA Astrophysics Data System (ADS)
Himeno, M.; Noda, S.; Fukasaku, K.; Himeno, R.
2010-06-01
We studied which factors play an important role in determining the shape of arteries at the carotid artery bifurcation by performing multi-objective optimization with computational fluid dynamics (CFD) and the genetic algorithm (GA). The most difficult problem in doing so is how to reduce the turn-around time of the GA optimization with 3D unsteady computation of blood flow. We devised a two-level parallel computation method with the following features: level 1: parallel CFD computation with an appropriate number of cores; level 2: parallel jobs generated by a "master", which quickly finds an available job queue and dispatches jobs, to reduce turn-around time. As a result, the turn-around time of one GA trial, which would have taken 462 days with one core, was reduced to less than two days on the RIKEN supercomputer system, RICC, with 8192 cores. We performed a multi-objective optimization to minimize the maximum mean WSS and to minimize the sum of circumference for four different shapes and obtained a set of trade-off solutions for each shape. In addition, we found that the carotid bulb has the feature of the minimum local mean WSS and minimum local radius. We confirmed that our method is effective for examining determinants of artery shapes.
On the nature of the NAA diffusion attenuated MR signal in the central nervous system.
Kroenke, Christopher D; Ackerman, Joseph J H; Yablonskiy, Dmitriy A
2004-11-01
In the brain, on a macroscopic scale, diffusion of the intraneuronal constituent N-acetyl-L-aspartate (NAA) appears to be isotropic. In contrast, on a microscopic scale, NAA diffusion is likely highly anisotropic, with displacements perpendicular to neuronal fibers being markedly hindered, and parallel displacements less so. In this report we first substantiate that local anisotropy influences NAA diffusion in vivo by observing differing diffusivities parallel and perpendicular to human corpus callosum axonal fibers. We then extend our measurements to large voxels within rat brains. As expected, the macroscopic apparent diffusion coefficient (ADC) of NAA is practically isotropic due to averaging of the numerous and diverse fiber orientations. We demonstrate that the substantially non-monoexponential diffusion-mediated MR signal decay vs. b value can be quantitatively explained by a theoretical model of NAA confined to an ensemble of differently oriented neuronal fibers. On the microscopic scale, NAA diffusion is found to be strongly anisotropic, with displacements occurring almost exclusively parallel to the local fiber axis. This parallel diffusivity, ADC_parallel, is 0.36 ± 0.01 µm²/ms, and ADC_perpendicular is essentially zero. From ADC_parallel the apparent viscosity of the neuron cytoplasm is estimated to be twice as large as that of a temperature-matched dilute aqueous solution. (c) 2004 Wiley-Liss, Inc.
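To see why an ensemble of differently oriented fibers gives a non-monoexponential decay, consider a simplified version of such a model: fibers oriented uniformly over the sphere, zero perpendicular diffusivity, and a signal exp(-b D cos²θ) from a fiber at angle θ to the gradient. Averaging over orientations gives S(b)/S(0) = sqrt(π / (4 b D)) · erf(sqrt(b D)). These simplifications and the b-values used are assumptions for illustration; the model in the paper may differ in detail.

```python
import math

def powder_signal(b, d_par):
    """Orientation-averaged signal for uniformly oriented fibers.

    b: diffusion weighting in ms/µm^2; d_par: parallel diffusivity in µm^2/ms.
    Assumes zero perpendicular diffusivity (a simplification).
    """
    x = b * d_par
    if x == 0.0:
        return 1.0
    return math.sqrt(math.pi / (4.0 * x)) * math.erf(math.sqrt(x))

def powder_signal_numeric(b, d_par, n=20000):
    """Midpoint-rule check of the same average over u = cos(theta) in [0, 1]."""
    h = 1.0 / n
    return h * sum(math.exp(-b * d_par * ((k + 0.5) * h) ** 2) for k in range(n))

D_PAR = 0.36  # µm^2/ms, value quoted in the abstract
analytic = powder_signal(10.0, D_PAR)          # b = 10 ms/µm^2 is a hypothetical value
numeric = powder_signal_numeric(10.0, D_PAR)
monoexp = math.exp(-10.0 * D_PAR / 3.0)        # isotropic decay with the same mean diffusivity
```

At large b the averaged signal falls off much more slowly than a single exponential with the same mean diffusivity, because fibers nearly perpendicular to the gradient retain their signal.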
A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications
NASA Technical Reports Server (NTRS)
Povitsky, Alex; Morris, Philip J.
1999-01-01
In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm, processors are idle due to the forward and backward recurrences of the Thomas algorithm. To utilize processors during this time, we propose to use them for either non-local data-independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the standard pipelined algorithm and close to that for the explicit DRP algorithm.
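The forward and backward recurrences that cause the idle time described in this abstract are visible in a plain serial implementation of the Thomas algorithm for a tridiagonal system. The sketch below is the textbook serial version, not the scheduled parallel algorithm of the paper; the argument convention (`lower[0]` and `upper[-1]` unused) is an assumption of this example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be, for example, diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward recurrence: step i needs the result of step i - 1.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Backward recurrence: step i needs the result of step i + 1.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Hypothetical 3 x 3 test system whose exact solution is [1, 2, 3].
solution = thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
```

When a line of unknowns is split across processors, each processor must wait for its neighbor to finish its part of the sweep in both passes, which is the idle time the abstract proposes to fill with other scheduled work.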
Variable Anisotropic Brain Electrical Conductivities in Epileptogenic Foci
Mandelkern, M.; Bui, D.; Salamon, N.; Vinters, H. V.; Mathern, G. W.
2010-01-01
Source localization models assume brain electrical conductivities are isotropic at about 0.33 S/m. These assumptions have not been confirmed ex vivo in humans. This study determined bidirectional electrical conductivities from pediatric epilepsy surgery patients. Electrical conductivities perpendicular and parallel to the pial surface of neocortex and subcortical white matter (n = 15) were measured using the 4-electrode technique and compared with clinical variables. Mean (±SD) electrical conductivities were 0.10 ± 0.01 S/m, and varied by 243% from patient to patient. Perpendicular and parallel conductivities differed by 45%, and the larger values were perpendicular to the pial surface in 47% and parallel in 40% of patients. A perpendicular principal axis was associated with normal, while isotropy and parallel principal axes were linked with epileptogenic lesions by MRI. Electrical conductivities were decreased in patients with cortical dysplasia compared with non-dysplasia etiologies. The electrical conductivity values of freshly excised human brain tissues were approximately 30% of assumed values, varied by over 200% from patient to patient, and had erratic anisotropic and isotropic shapes if the MRI showed a lesion. Understanding brain electrical conductivity and ways to non-invasively measure them are probably necessary to enhance the ability to localize EEG sources from epilepsy surgery patients. PMID:20440549
Parallel magnetic resonance imaging using coils with localized sensitivities.
Goldfarb, James W; Holland, Agnes E
2004-09-01
The purpose of this study was to present clinical examples and illustrate the inefficiencies of a conventional reconstruction using a commercially available phased array coil with localized sensitivities. Five patients were imaged at 1.5 T using a cardiac-synchronized gadolinium-enhanced acquisition and a commercially available four-element phased array coil. Four unique sets of images were reconstructed from the acquired k-space data: (a) sum-of-squares image using four elements of the coil; localized sum-of-squares images from the (b) anterior coils and (c) posterior coils; and (d) a local reconstruction. Images were analyzed for artifacts and usable field-of-view. Conventional image reconstruction produced images with fold-over artifacts in all cases spanning a portion of the image (mean 90 mm; range 36-126 mm). The local reconstruction removed fold-over artifacts and resulted in an effective increase in the field-of-view (mean 50%; range 20-70%). Commercially available phased array coils do not always have overlapping sensitivities. Fold-over artifacts can be removed using an alternate reconstruction method. When assessing the advantages of parallel imaging techniques, gains achieved using techniques such as SENSE and SMASH should be gauged against the acquisition time of the localized method rather than the conventional sum-of-squares method.
Guérin, Bastien; Setsompop, Kawin; Ye, Huihui; Poser, Benedikt A; Stenger, Andrew V; Wald, Lawrence L
2015-05-01
To design parallel transmit (pTx) simultaneous multislice (SMS) spokes pulses with explicit control for peak power and local and global specific absorption rate (SAR). We design SMS pTx least-squares and magnitude least squares spokes pulses while constraining local SAR using the virtual observation points (VOPs) compression of SAR matrices. We evaluate our approach in simulations of a head (7T) and a body (3T) coil with eight channels arranged in two z-rows. For many of our simulations, control of average power by Tikhonov regularization of the SMS pTx spokes pulse design yielded pulses that violated hardware and SAR safety limits. On the other hand, control of peak power alone yielded pulses that violated local SAR limits. Pulses optimized with control of both local SAR and peak power satisfied all constraints and therefore had the best excitation performance under limited power and SAR constraints. These results extend our previous results for single slice pTx excitations but are more pronounced because of the large power demands and SAR of SMS pulses. Explicit control of local SAR and peak power is required to generate optimal SMS pTx excitations satisfying both the system's hardware limits and regulatory safety limits. © 2014 Wiley Periodicals, Inc.
Parallel Monotonic Basin Hopping for Low Thrust Trajectory Optimization
NASA Technical Reports Server (NTRS)
McCarty, Steven L.; McGuire, Melissa L.
2018-01-01
Monotonic Basin Hopping has been shown to be an effective method of solving low thrust trajectory optimization problems. This paper outlines an extension to the common serial implementation by parallelizing it over any number of available compute cores. The Parallel Monotonic Basin Hopping algorithm described herein is shown to be an effective way to more quickly locate feasible solutions, and improve locally optimal solutions in an automated way without requiring a feasible initial guess. The increased speed achieved through parallelization enables the algorithm to be applied to more complex problems that would otherwise be impractical for a serial implementation. Low thrust cislunar transfers and a hybrid Mars example case demonstrate the effectiveness of the algorithm. Finally, a preliminary scaling study quantifies the expected decrease in solve time compared to a serial implementation.
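A minimal serial sketch of monotonic basin hopping on a one-dimensional toy function is given below. It is not the implementation described in this abstract: the crude pattern search stands in for a real local optimizer, and the objective, hop radius, hop count and seed are all hypothetical choices.

```python
import math
import random

def local_minimize(f, x, step=0.1, tol=1e-6):
    """Crude 1-D pattern search, a stand-in for a real local optimizer."""
    fx = f(x)
    while step > tol:
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
        else:
            step *= 0.5  # no improvement at this step size, so refine
    return x, fx

def monotonic_basin_hopping(f, x0, hops=200, radius=1.0, seed=0):
    """Perturb the incumbent, re-optimize locally, keep only improvements."""
    rng = random.Random(seed)
    best_x, best_f = local_minimize(f, x0)
    for _ in range(hops):
        trial = best_x + rng.uniform(-radius, radius)
        x, fx = local_minimize(f, trial)
        if fx < best_f:  # the "monotonic" rule: never accept a worse point
            best_x, best_f = x, fx
    return best_x, best_f

def objective(x):
    """Hypothetical multimodal objective with many local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)

start_x, start_f = local_minimize(objective, 4.0)
best_x, best_f = monotonic_basin_hopping(objective, 4.0)
```

Each hop depends only on the current incumbent, so several hops (or several independently seeded chains) could be evaluated concurrently and the best result kept. That is one plausible route to parallelization and is not necessarily the scheme used in the paper.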
Clustering and flow around a sphere moving into a grain cloud.
Seguin, A; Lefebvre-Lepot, A; Faure, S; Gondret, P
2016-06-01
A bidimensional simulation of a sphere moving at constant velocity into a cloud of smaller spherical grains far from any boundaries and without gravity is presented with a non-smooth contact dynamics method. A dense granular "cluster" zone builds progressively around the moving sphere until a stationary regime appears with a constant upstream cluster size. The key point is that the upstream cluster size increases with the initial solid fraction [Formula: see text] but the cluster packing fraction takes a roughly constant value independent of [Formula: see text]. Although the upstream cluster size around the moving sphere diverges when [Formula: see text] approaches a critical value, the drag force exerted by the grains on the sphere does not. The detailed analysis of the local strain rate and local stress fields made in the non-parallel granular flow inside the cluster allows us to extract the local invariants of the two tensors: dilation rate, shear rate, pressure and shear stress. Despite different spatial variations of these invariants, the local friction coefficient μ appears to depend only on the local inertial number I as well as the local solid fraction, which means that a local rheology does exist in the present non-parallel flow. The key point is that the spatial variations of I inside the cluster do not depend on the sphere velocity and explore only a small range around the value one.
Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...
2016-09-18
This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
Re-forming supercritical quasi-parallel shocks. I - One- and two-dimensional simulations
NASA Technical Reports Server (NTRS)
Thomas, V. A.; Winske, D.; Omidi, N.
1990-01-01
The process of reforming supercritical quasi-parallel shocks is investigated using one-dimensional and two-dimensional hybrid (particle ion, massless fluid electron) simulations both of shocks and of simpler two-stream interactions. It is found that the supercritical quasi-parallel shock is not steady. Instead of a well-defined shock ramp between upstream and downstream states that remains at a fixed position in the flow, the ramp periodically steepens, broadens, and then reforms upstream of its former position. It is concluded that the wave generation process is localized at the shock ramp and that the reformation process proceeds in the absence of upstream perturbations intersecting the shock.
Local SAR in parallel transmission pulse design.
Lee, Joonsung; Gebhardt, Matthias; Wald, Lawrence L; Adalsteinsson, Elfar
2012-06-01
The management of local and global power deposition in human subjects (specific absorption rate, SAR) is a fundamental constraint to the application of parallel transmission (pTx) systems. Even though the pTx and single channel have to meet the same SAR requirements, the complex behavior of the spatial distribution of local SAR for transmission arrays poses problems that are not encountered in conventional single-channel systems and places additional requirements on pTx radio frequency pulse design. We propose a pTx pulse design method which builds on recent work to capture the spatial distribution of local SAR in numerical tissue models in a compressed parameterization in order to incorporate local SAR constraints within computation times that accommodate pTx pulse design during an in vivo magnetic resonance imaging scan. Additionally, the algorithm yields a protocol-specific ultimate peak in local SAR, which is shown to bound the achievable peak local SAR for a given excitation profile fidelity. The performance of the approach was demonstrated using a numerical human head model and a 7 Tesla eight-channel transmit array. The method reduced peak local 10 g SAR by 14-66% for slice-selective pTx excitations and 2D selective pTx excitations compared to a pTx pulse design constrained only by global SAR. The primary tradeoff incurred for reducing peak local SAR was an increase in global SAR, up to 34% for the evaluated examples, which is favorable in cases where local SAR constraints dominate the pulse applications. Copyright © 2011 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Du, Zhidong; Chen, Chen; Pan, Liang
2017-04-01
Maskless lithography using parallel electron beamlets is a promising solution for next generation scalable maskless nanolithography. Researchers have focused on this goal but have been unable to find a robust technology to generate and control high-quality electron beamlets with satisfactory brightness and uniformity. In this work, we will aim to address this challenge by developing a revolutionary surface-plasmon-enhanced-photoemission (SPEP) technology to generate massively-parallel electron beamlets for maskless nanolithography. The new technology is built upon our recent breakthroughs in plasmonic lenses, which will be used to excite and focus surface plasmons to generate massively-parallel electron beamlets through photoemission. Specifically, the proposed SPEP device consists of an array of plasmonic lens and electrostatic micro-lens pairs, each pair independently producing an electron beamlet. During lithography, a spatial optical modulator will dynamically project light onto individual plasmonic lenses to control the switching and brightness of electron beamlets. The photons incident onto each plasmonic lens are concentrated into a diffraction-unlimited spot as localized surface plasmons to excite the local electrons to near their vacuum levels. Meanwhile, the electrostatic micro-lens extracts the excited electrons to form a focused beamlet, which can be rastered across a wafer to perform lithography. Studies showed that surface plasmons can enhance the photoemission by orders of magnitude. This SPEP technology can scale up the maskless lithography process to write at wafers per hour. In this talk, we will report the mechanism of the strong electron-photon couplings and the locally enhanced photoexcitation, the design of a SPEP device, an overview of our proof-of-concept study, and the demonstrated parallel lithography of 20-50 nm features.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model
NASA Astrophysics Data System (ADS)
Kang, Hyun-Gyu; Cheong, Hyeong-Bin
2017-03-01
A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O (Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and then only the central cell is used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
NASA Astrophysics Data System (ADS)
Byun, Hye Suk; El-Naggar, Mohamed Y.; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya
2017-10-01
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time-dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked-list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.
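The scaling problem described above can be made concrete with a sketch of one step of the conventional serial KMC algorithm. This is a generic textbook-style illustration, not the SPKMC code from the paper; the function names and the uniform per-site rate are assumptions.

```python
import math
import random

def kmc_step(rates, rng):
    """One step of conventional KMC: pick an event, then advance the clock."""
    total = sum(rates)
    # Choose an event with probability proportional to its rate.
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against round-off at the upper edge
    for i, r in enumerate(rates):
        cumulative += r
        if target < cumulative:
            chosen = i
            break
    # Waiting time is exponentially distributed with mean 1 / total.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt

def mean_time_step(n_sites, rate_per_site=1.0, steps=20000, seed=1):
    """Average simulated time advanced per KMC step for n_sites identical sites."""
    rng = random.Random(seed)
    rates = [rate_per_site] * n_sites
    return sum(kmc_step(rates, rng)[1] for _ in range(steps)) / steps
```

Since the total rate grows with the number of sites, the mean time increment per step shrinks as one over the system size. That is the inverse proportionality the abstract refers to, and it is what a synchronous parallel scheme aims to remove.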
Transmission Index Research of Parallel Manipulators Based on Matrix Orthogonal Degree
NASA Astrophysics Data System (ADS)
Shao, Zhu-Feng; Mo, Jiao; Tang, Xiao-Qiang; Wang, Li-Ping
2017-11-01
Performance indices are the standard for performance evaluation and the foundation of both performance analysis and optimal design of parallel manipulators. Seeking suitable kinematic indices is always an important and challenging issue for the parallel manipulator. So far, there are extensive studies in this field, but few existing indices meet all the requirements, such as being simple, intuitive, and universal. To solve this problem, the matrix orthogonal degree is adopted, and generalized transmission indices that can evaluate motion/force transmissibility of fully parallel manipulators are proposed. Transmission performance analysis of typical branches, end effectors, and parallel manipulators is given to illustrate the proposed indices and analysis methodology. Simulation and analysis results reveal that the proposed transmission indices possess significant advantages, such as being normalized and finite (ranging from 0 to 1), dimensionally homogeneous, frame-free, intuitive and easy to calculate. Besides, the proposed indices indicate the good transmission region and the closeness to the singularity with better resolution than the traditional local conditioning index, and provide a novel tool for kinematic analysis and optimal design of fully parallel manipulators.
Study of Solid State Drives performance in PROOF distributed analysis system
NASA Astrophysics Data System (ADS)
Panitkin, S. Y.; Ernst, M.; Petkus, R.; Rind, O.; Wenaus, T.
2010-04-01
Solid State Drives (SSD) are a promising storage technology for High Energy Physics parallel analysis farms. Their combination of low random access time and relatively high read speed is very well suited for situations where multiple jobs concurrently access data located on the same drive. They also have lower energy consumption and higher vibration tolerance than Hard Disk Drives (HDD), which makes them an attractive choice in many applications ranging from personal laptops to large analysis farms. The Parallel ROOT Facility - PROOF is a distributed analysis system which makes it possible to exploit the inherent event-level parallelism of high energy physics data. PROOF is especially efficient together with distributed local storage systems like Xrootd, when data are distributed over computing nodes. In such an architecture the local disk subsystem I/O performance becomes a critical factor, especially when computing nodes use multi-core CPUs. We will discuss our experience with SSDs in the PROOF environment. We will compare the performance of HDD with SSD in I/O intensive analysis scenarios. In particular we will discuss how PROOF system performance scales with the number of simultaneously running analysis jobs.
Yang, Daejong; Kang, Kyungnam; Kim, Donghwan; Li, Zhiyong; Park, Inkyu
2015-01-01
A facile top-down/bottom-up hybrid nanofabrication process based on programmable temperature control and parallel chemical supply within a microfluidic platform has been developed for the all liquid-phase synthesis of heterogeneous nanomaterial arrays. The synthesized materials and locations can be controlled by local heating with integrated microheaters and guided liquid chemical flow within the microfluidic platform. As proofs-of-concept, we have demonstrated the synthesis of two types of nanomaterial arrays: (i) parallel array of TiO2 nanotubes, CuO nanospikes and ZnO nanowires, and (ii) parallel array of ZnO nanowire/CuO nanospike hybrid nanostructures, CuO nanospikes and ZnO nanowires. The laminar flow with negligible ionic diffusion between different precursor solutions as well as localized heating was verified by numerical calculation and by the experimental results of nanomaterial array synthesis. The devices made of heterogeneous nanomaterial arrays were utilized as a multiplexed sensor for toxic gases such as NO2 and CO. This method would be very useful for the facile fabrication of functional nanodevices based on highly integrated arrays of heterogeneous nanomaterials. PMID:25634814
A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Geveci, Berk
2014-11-01
The evolution of the computing world from teraflop to petaflop has been relatively effortless, with several of the existing programming models scaling effectively to the petascale. The migration to exascale, however, poses considerable challenges. All industry trends indicate that the exascale machine will be built using processors containing hundreds to thousands of cores per chip. It can be inferred that efficient concurrency on exascale machines requires a massive number of concurrent threads, each performing many operations on a localized piece of data. Currently, visualization libraries and applications are based on what is known as the visualization pipeline. In the pipeline model, algorithms are encapsulated as filters with inputs and outputs. These filters are connected by setting the output of one component to the input of another. Parallelism in the visualization pipeline is achieved by replicating the pipeline for each processing thread. This works well for today’s distributed memory parallel computers but cannot be sustained when operating on processors with thousands of cores. Our project investigates a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. Our framework achieves this by defining algorithms in terms of worklets, which are localized stateless operations. Worklets are atomic operations that execute when invoked, unlike filters, which execute when a pipeline request occurs. The worklet design allows execution on a massive number of lightweight threads with minimal overhead. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale machine.
Banana regime pressure anisotropy in a bumpy cylinder magnetic field
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia-Perciante, A.L.; Callen, J.D.; Shaing, K.C.
The pressure anisotropy is calculated for a plasma in a bumpy cylindrical magnetic field in the low collisionality (banana) regime for small magnetic-field modulations (ε ≡ δB/2B ≪ 1). Solutions are obtained by integrating the drift-kinetic equation along field lines in steady state. A closure for the local value of the parallel viscous force B·∇·π_∥ is then calculated and is shown to exceed the flux-surface-averaged parallel viscous force by a factor of O(1/ε). A high-frequency limit (ω ≫ ν) for the pressure anisotropy is also determined and the calculation is then extended to include the full frequency dependence by using an expansion in Cordey eigenfunctions.
Parallel processing optimization strategy based on MapReduce model in cloud storage environment
NASA Astrophysics Data System (ADS)
Cui, Jianming; Liu, Jiayi; Li, Qiuyan
2017-05-01
Currently, a large number of documents in the cloud storage process are packaged only after all of their packets have been received. On the path from the local transmitter to the server, packing and unpacking consume a lot of time, and the transmission efficiency is low as well. A new parallel processing algorithm is proposed to optimize the transmission mode. Following the operation machine graph model, the Mapper and Reducer mechanisms are executed in parallel using MPI technology. Simulation experiments on the Hadoop cloud computing platform show that this algorithm can not only accelerate the file transfer rate, but also shorten the waiting time of the Reducer mechanism. It breaks through traditional sequential transmission constraints and reduces the storage coupling to improve the transmission efficiency.
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations
NASA Astrophysics Data System (ADS)
Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.
2016-07-01
Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
Regional-scale calculation of the LS factor using parallel processing
NASA Astrophysics Data System (ADS)
Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong
2015-05-01
With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the algorithm characteristics, including a decomposition method for maintaining the integrity of the results, an optimized workflow for reducing the time taken for exporting unnecessary intermediate data, and a buffer-communication-computation strategy for improving the communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
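For a local algorithm such as slope, a decomposition that preserves the integrity of the results can be sketched with overlapping row strips. This is only an illustration under stated assumptions: the central-difference slope, the one-row halo, and both function names are hypothetical choices, and the paper's actual MPI decomposition is not described in the abstract.

```python
def local_slope(grid):
    """Per-cell slope magnitude from central differences (edges clamped)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            up, down = grid[max(r - 1, 0)][c], grid[min(r + 1, rows - 1)][c]
            left, right = grid[r][max(c - 1, 0)], grid[r][min(c + 1, cols - 1)]
            out[r][c] = (((down - up) / 2) ** 2 + ((right - left) / 2) ** 2) ** 0.5
    return out

def slope_by_strips(grid, n_strips):
    """Process row strips independently, each padded with one halo row per side."""
    rows = len(grid)
    bounds = [rows * k // n_strips for k in range(n_strips + 1)]
    result = []
    for start, stop in zip(bounds, bounds[1:]):
        lo, hi = max(start - 1, 0), min(stop + 1, rows)
        strip = local_slope(grid[lo:hi])  # each strip could run on its own process
        # Discard the halo rows so only the rows owned by this strip are kept.
        result.extend(strip[start - lo : start - lo + (stop - start)])
    return result
```

The halo rows give every owned cell the same 3x3 neighborhood it would see in the full raster, so the strip-wise result matches the serial one. Global algorithms such as flow accumulation depend on cells arbitrarily far away and would need communication between strips, which this sketch does not attempt.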
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jalas, S.; Dornmair, I.; Lehe, R.
Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR) resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. In this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
2008-05-30
Tribological behavior and graphitization of carbon nanotubes grown on 440C stainless steel . Tribo. Lett., 19(2):119-125, 2005. D-2 ...with a stainless steel parallel plate configuration as shown in figure 1. Due to the radial variation of the local shear stress T in the parallel...using a force transducer that is mounted below the surface. B-1 Exploded View Stainless Steel Plate Lower Fixture Microscale View Figure 1:
Deniz, Cem M; Vaidya, Manushka V; Sodickson, Daniel K; Lattanzi, Riccardo
2016-01-01
We investigated global specific absorption rate (SAR) and radiofrequency (RF) power requirements in parallel transmission as the distance between the transmit coils and the sample was increased. We calculated ultimate intrinsic SAR (UISAR), which depends on object geometry and electrical properties but not on coil design, and we used it as the reference to compare the performance of various transmit arrays. We investigated the case of fixing coil size and increasing the number of coils while moving the array away from the sample, as well as the case of fixing coil number and scaling coil dimensions. We also investigated RF power requirements as a function of lift-off, and tracked local SAR distributions associated with global SAR optima. In all cases, the target excitation profile was achieved and global SAR (as well as associated maximum local SAR) decreased with lift-off, approaching UISAR, which was constant for all lift-offs. We observed a lift-off value that optimizes the balance between global SAR and power losses in coil conductors. We showed that, using parallel transmission, global SAR can decrease at ultra high fields for finite arrays with a sufficient number of transmit elements. For parallel transmission, the distance between coils and object can be optimized to reduce SAR and minimize RF power requirements associated with homogeneous excitation. © 2015 Wiley Periodicals, Inc.
Johnson, Shannon A; Blaha, Leslie M; Houpt, Joseph W; Townsend, James T
2010-02-01
Previous studies of global-local processing in autism spectrum disorders (ASDs) have indicated mixed findings, with some evidence of a local processing bias, or preference for detail-level information, and other results suggesting typical global advantage, or preference for the whole or gestalt. Findings resulting from this paradigm have been used to argue for or against a detail focused processing bias in ASDs, and thus have important theoretical implications. We applied Systems Factorial Technology, and the associated Double Factorial Paradigm (both defined in the text), to examine information processing characteristics during a divided attention global-local task in high-functioning individuals with an ASD and typically developing controls. Group data revealed global advantage for both groups, contrary to some current theories of ASDs. Information processing models applied to each participant revealed that task performance, although showing no differences at the group level, was supported by different cognitive mechanisms in ASD participants compared to controls. All control participants demonstrated inhibitory parallel processing and the majority demonstrated a minimum-time stopping rule. In contrast, ASD participants showed exhaustive parallel processing with mild facilitatory interactions between global and local information. Thus our results indicate fundamental differences in the stopping rules and channel dependencies in individuals with an ASD.
Local spatio-temporal analysis in vision systems
NASA Astrophysics Data System (ADS)
Geisler, Wilson S.; Bovik, Alan; Cormack, Lawrence; Ghosh, Joydeep; Gildeen, David
1994-07-01
The aims of this project are the following: (1) develop a physiologically and psychophysically based model of low-level human visual processing (a key component of which are local frequency coding mechanisms); (2) develop image models and image-processing methods based upon local frequency coding; (3) develop algorithms for performing certain complex visual tasks based upon local frequency representations, (4) develop models of human performance in certain complex tasks based upon our understanding of low-level processing; and (5) develop a computational testbed for implementing, evaluating and visualizing the proposed models and algorithms, using a massively parallel computer. Progress has been substantial on all aims. The highlights include the following: (1) completion of a number of psychophysical and physiological experiments revealing new, systematic and exciting properties of the primate (human and monkey) visual system; (2) further development of image models that can accurately represent the local frequency structure in complex images; (3) near completion in the construction of the Texas Active Vision Testbed; (4) development and testing of several new computer vision algorithms dealing with shape-from-texture, shape-from-stereo, and depth-from-focus; (5) implementation and evaluation of several new models of human visual performance; and (6) evaluation, purchase and installation of a MasPar parallel computer.
Decentralized Interleaving of Paralleled Dc-Dc Buck Converters: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Brian B; Rodriguez, Miguel; Sinha, Mohit
We present a decentralized control strategy that yields switch interleaving among parallel connected dc-dc buck converters without communication. The proposed method is based on the digital implementation of the dynamics of a nonlinear oscillator circuit as the controller. Each controller is fully decentralized, i.e., it only requires the locally measured output current to synthesize the pulse width modulation (PWM) carrier waveform. By virtue of the intrinsic electrical coupling between converters, the nonlinear oscillator-based controllers converge to an interleaved state with uniform phase-spacing across PWM carriers. To the knowledge of the authors, this work represents the first fully decentralized strategy for switch interleaving of paralleled dc-dc buck converters.
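The interleaved state with uniform phase-spacing mentioned above can be illustrated numerically. The sketch below only describes the target state, not the oscillator-based controller that reaches it; representing each converter's switching-frequency ripple as a unit phasor is a simplifying assumption, and both function names are hypothetical.

```python
import cmath
import math

def interleaved_phases(n_converters):
    """Uniformly spaced carrier phases (radians) for n paralleled converters."""
    return [2 * math.pi * k / n_converters for k in range(n_converters)]

def fundamental_ripple_sum(phases):
    """Magnitude of the summed switching-frequency phasors, one unit phasor each."""
    return abs(sum(cmath.exp(1j * p) for p in phases))
```

With four converters, `fundamental_ripple_sum(interleaved_phases(4))` is zero up to round-off, whereas four synchronized carriers (all phases equal) add up to a magnitude of 4. Under this simplified model, that cancellation of the fundamental ripple component is the motivation for interleaving.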
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boman, Erik G.
This LDRD project was a campus exec fellowship to fund (in part) Donald Nguyen’s PhD research at UT-Austin. His work has focused on parallel programming models, and scheduling irregular algorithms on shared-memory systems using the Galois framework. Galois provides a simple but powerful way for users and applications to automatically obtain good parallel performance using certain supported data containers. The naïve user can write serial code, while advanced users can optimize performance by advanced features, such as specifying the scheduling policy. Galois was used to parallelize two sparse matrix reordering schemes: RCM and Sloan. Such reordering is important in high-performance computing to obtain better data locality and thus reduce run times.
ERIC Educational Resources Information Center
Hult, Francis M.; Källkvist, Marie
2016-01-01
In this paper, the language policies of three Swedish universities are examined as instances of language planning in local contexts. Although Sweden has the national Language Act of 2009 (SFS 2009:600) as well as a general Higher Education Ordinance (SFS 1993:100; SFS 2014:1096), language planning for higher education is left to the purview of…
Eroglu, Duygu Yilmaz; Ozmutlu, H Cenk
2014-01-01
We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job sequence and machine-dependent setup times and with job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We proposed a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum number of relocation operations on the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which let us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature and the results validate the effectiveness of the proposed algorithms.
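How a chromosome of random key numbers can encode a parallel-machine schedule is sketched below. The convention used (integer part selects the machine, fractional part orders the jobs on it) is one common random-key scheme assumed for illustration; the abstract does not state the exact encoding used in GAspLA, and `decode_random_keys` is a hypothetical name.

```python
def decode_random_keys(keys, n_machines):
    """Decode one random key per (sub)job into per-machine job sequences.

    Assumed convention: each key lies in [0, n_machines); the integer part
    selects the machine and the fractional part sets the order on that machine.
    """
    schedule = [[] for _ in range(n_machines)]
    for job, key in enumerate(keys):
        machine = min(int(key), n_machines - 1)  # clamp a key equal to n_machines
        schedule[machine].append((key - int(key), job))
    # Sort each machine's jobs by fractional part to obtain the processing order.
    return [[job for _, job in sorted(entries)] for entries in schedule]
```

For example, `decode_random_keys([0.7, 1.2, 0.1, 1.9, 0.4], 2)` yields `[[2, 4, 0], [1, 3]]`. Under this encoding, moving a job within a sequence after local search means assigning it a new key between its new neighbors' keys, which hints at why the abstract emphasizes keeping relocation operations on the keys to a minimum.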
The benefits of otoplasty for children: further evidence to satisfy the modern NHS.
Cooper-Hobson, G; Jaffe, W
2009-02-01
To take standards from, and revalidate an existing study which addressed the psychological and social outcomes following otoplasty in children [Bradbury E, Hewison J, Timmons M. Psychological and social outcome of prominent ear correction in children. Br J Plast Surg 1992;45:97-100]. The psychosocial experiences of children undergoing otoplasty at the University Hospital of North Staffordshire were retrospectively examined and compared to the cohort in the existing study. Retrospective questionnaires were sent to all children aged 5-16 (n=101) who were on the hospital records having undergone otoplasty between 1999 and 2003, investigating social experiences, and experience of surgery. This study found: 97% reported an increase in happiness; 92% reported an increase in self-confidence; 79% noted improved social experience; 100% reported bullying reduced or stopped. The Wilcoxon Rank Sum Test confirmed the statistical validity of these findings (P<0.001). The existing study found: 63% of children reported increase in happiness and confidence; 13% noted improved social experiences; 53% noted bullying had stopped entirely. Otoplasty is an effective procedure in alleviating psychosocial distress in the vast majority of children that undergo the operation, and hence this study supports the continued availability of otoplasty on the NHS for children with prominent ears.
Cooper, Lilli; Seth, Rohit; Rhodes, Elizabeth; Alousi, Mohammed; Sivakumar, Bran
2017-01-01
Sickle cell disease (SCD) is an increasingly common condition in the UK. The safety of free tissue transfer in these patients is controversial, and no specific guidelines exist. The aim of this paper is to create recommendations for the plastic surgical multidisciplinary team for use in the assessment and management of SCD patients undergoing free tissue transfer and reconstruction. A literature review was performed in PubMed using the search 'sickle [TiAb] AND plast* adj3 surg*'. Sickle cell disease is explained, as is the relative peri-operative risk in different genotypes of SCD. Acute and chronic manifestations of SCD are described by system, for consideration at pre-operative assessment and post-operative review. The evidence surrounding free tissue transfer and SCD is discussed and the outcomes in published cases summarised. An algorithm for peri-operative multi-disciplinary management is outlined and justified. Free tissue transfer theoretically carries a high risk of a crisis, due not only to long anaesthetic times, but also to the potential requirement for tourniquet use and the relatively hypoxic state of the transferred tissue. This paper outlines a useful, practical algorithm to optimise the safety of free tissue transfer in patients with SCD. Copyright © 2016 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
Material-Model-Based Determination of the Shock-Hugoniot Relations in Nanosegregated Polyurea
NASA Astrophysics Data System (ADS)
Grujicic, Mica; Snipes, J. S.; Galgalikar, R.; Ramaswami, S.
2014-02-01
Previous experimental investigations reported in the open literature have indicated that applying polyurea external coatings and/or internal linings can substantially improve ballistic penetration resistance and blast survivability of buildings, vehicles, and laboratory/field test-plates, as well as the blast-mitigation capacity of combat helmets. The protective role of polyurea coatings/linings has been linked to polyurea microstructure, which consists of discrete hard-domains distributed randomly within a compliant/soft matrix. When this protective role is investigated computationally, the availability of reliable, high-fidelity constitutive models for polyurea is vitally important. In the present work, a comprehensive overview and a critical assessment of a polyurea material constitutive model, recently proposed by Shim and Mohr (Int J Plast 27:868-886, 2011), are carried out. The review revealed that this model can accurately account for the experimentally measured uniaxial-stress versus strain data obtained under monotonic and multistep compressive loading/unloading conditions, as well as under stress relaxation conditions. On the other hand, by combining analytical and finite-element procedures with the material model in order to define the basic shock-Hugoniot relations for this material, it was found that the computed shock-Hugoniot relations differ significantly from their experimental counterparts. Potential reasons for the disagreement between the computed and experimental shock-Hugoniot relations are identified.
NASA Astrophysics Data System (ADS)
Rodrigues, Manuel J.; Fernandes, David E.; Silveirinha, Mário G.; Falcão, Gabriel
2018-01-01
This work introduces a parallel computing framework to characterize the propagation of electron waves in graphene-based nanostructures. The electron wave dynamics is modeled using both "microscopic" and effective medium formalisms and the numerical solution of the two-dimensional massless Dirac equation is determined using a Finite-Difference Time-Domain scheme. The propagation of electron waves in graphene superlattices with localized scattering centers is studied, and the role of the symmetry of the microscopic potential in the electron velocity is discussed. The computational methodologies target the parallel capabilities of heterogeneous multi-core CPU and multi-GPU environments and are built with the OpenCL parallel programming framework which provides a portable, vendor agnostic and high throughput-performance solution. The proposed heterogeneous multi-GPU implementation achieves speedup ratios up to 75x when compared to multi-thread and multi-core CPU execution, reducing simulation times from several hours to a couple of minutes.
3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies
Cuomo, Salvatore; De Michele, Pasquale; Piccialli, Francesco
2014-01-01
The Nonlocal Means (NLM) algorithm is widely considered a state-of-the-art denoising filter in many research fields. Its high computational complexity has led researchers to develop parallel programming approaches and to use massively parallel architectures such as GPUs. In recent years, GPU devices have made it possible to achieve reasonable running times by filtering 3D datasets slice by slice with a 2D NLM algorithm. In our approach we design and implement a fully 3D NonLocal Means parallel approach, adopting different algorithm mapping strategies on GPU architecture and multi-GPU framework, in order to demonstrate its high applicability and scalability. The experimental results we obtained encourage the use of our approach in a large spectrum of application scenarios such as magnetic resonance imaging (MRI) or video sequence denoising. PMID:25045397
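To make the "fully 3D" idea concrete, the following is a minimal serial sketch of a nonlocal means filter on a small 3D volume stored as nested lists. It is not the authors' GPU implementation: the function name nlm3d, the parameters patch, search and h, the Gaussian-style weight, and the clamped border handling are all assumptions chosen for illustration. Each output voxel depends only on read-only input data, which is why the per-voxel loop maps naturally onto one GPU thread per voxel.

```python
import math
from itertools import product


def nlm3d(vol, patch=1, search=1, h=0.1):
    """Nonlocal means on a 3D volume (nested lists indexed [z][y][x]).

    patch  : half-width of the cubic patch used to compare neighbourhoods
    search : half-width of the cubic search window around each voxel
    h      : filtering strength (hypothetical default, must be > 0)
    """
    if h <= 0:
        raise ValueError("h must be positive")
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])

    def at(z, y, x):
        # Clamp coordinates so that patches near the border stay defined.
        z = min(max(z, 0), nz - 1)
        y = min(max(y, 0), ny - 1)
        x = min(max(x, 0), nx - 1)
        return vol[z][y][x]

    offsets = list(product(range(-patch, patch + 1), repeat=3))
    shifts = list(product(range(-search, search + 1), repeat=3))
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]

    # Every iteration of this loop is independent of the others.
    for z, y, x in product(range(nz), range(ny), range(nx)):
        weight_sum = 0.0
        acc = 0.0
        for dz, dy, dx in shifts:
            # Mean squared difference between the two 3D patches.
            d2 = sum(
                (at(z + a, y + b, x + c) - at(z + dz + a, y + dy + b, x + dx + c)) ** 2
                for a, b, c in offsets
            ) / len(offsets)
            w = math.exp(-d2 / (h * h))
            weight_sum += w
            acc += w * at(z + dz, y + dy, x + dx)
        out[z][y][x] = acc / weight_sum
    return out
```

Because the result is a weighted average with positive weights, every output value lies between the minimum and maximum of the input, and a constant volume is returned unchanged.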
Parallel Implicit Runge-Kutta Methods Applied to Coupled Orbit/Attitude Propagation
NASA Astrophysics Data System (ADS)
Hatten, Noble; Russell, Ryan P.
2017-12-01
A variable-step Gauss-Legendre implicit Runge-Kutta (GLIRK) propagator is applied to coupled orbit/attitude propagation. Concepts previously shown to improve efficiency in 3DOF propagation are modified and extended to the 6DOF problem, including the use of variable-fidelity dynamics models. The impact of computing the stage dynamics of a single step in parallel is examined using up to 23 threads and 22 associated GLIRK stages; one thread is reserved for an extra dynamics function evaluation used in the estimation of the local truncation error. Efficiency is found to peak for typical examples when using approximately 8 to 12 stages for both serial and parallel implementations. Accuracy and efficiency compare favorably to explicit Runge-Kutta and linear-multistep solvers for representative scenarios. However, linear-multistep methods are found to be more efficient for some applications, particularly in a serial computing environment, or when parallelism can be applied across multiple trajectories.
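As a rough illustration of where the stage-level parallelism comes from, here is a minimal fixed-step sketch of the two-stage Gauss-Legendre implicit Runge-Kutta method (order 4) for a scalar ODE, solved by fixed-point iteration. It is not the authors' variable-step propagator: the function names, the tolerance, and the choice of fixed-point iteration are assumptions. Within each iteration, the stage derivatives are evaluated independently of one another, which is the work that could be assigned to separate threads.

```python
import math

SQRT3 = math.sqrt(3.0)
# Butcher tableau of the 2-stage Gauss-Legendre method.
A = ((0.25, 0.25 - SQRT3 / 6.0),
     (0.25 + SQRT3 / 6.0, 0.25))
B = (0.5, 0.5)
C = (0.5 - SQRT3 / 6.0, 0.5 + SQRT3 / 6.0)


def glirk2_step(f, t, y, h, tol=1e-14, max_iter=100):
    """Advance y' = f(t, y) by one step of size h (scalar y)."""
    k = [f(t, y), f(t, y)]  # initial guess for the stage derivatives
    for _ in range(max_iter):
        # The two stage evaluations below do not depend on each other,
        # so they could be computed in parallel.
        new_k = [
            f(t + C[i] * h, y + h * (A[i][0] * k[0] + A[i][1] * k[1]))
            for i in range(2)
        ]
        converged = max(abs(new_k[i] - k[i]) for i in range(2)) <= tol
        k = new_k
        if converged:
            break
    return y + h * (B[0] * k[0] + B[1] * k[1])


def integrate(f, t0, y0, h, steps):
    """Apply glirk2_step repeatedly with a fixed step size."""
    t, y = t0, y0
    for _ in range(steps):
        y = glirk2_step(f, t, y, h)
        t += h
    return y
```

For example, integrate(lambda t, y: -y, 0.0, 1.0, 0.1, 10) approximates exp(-1). Fixed-point iteration only converges for sufficiently small h relative to the stiffness of f; a production propagator would more likely use a Newton-type solve and an error-controlled step size.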
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1998-01-01
A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
MMS Observations and Hybrid Simulations of Surface Ripples at a Marginally Quasi-Parallel Shock
NASA Astrophysics Data System (ADS)
Gingell, Imogen; Schwartz, Steven J.; Burgess, David; Johlander, Andreas; Russell, Christopher T.; Burch, James L.; Ergun, Robert E.; Fuselier, Stephen; Gershman, Daniel J.; Giles, Barbara L.; Goodrich, Katherine A.; Khotyaintsev, Yuri V.; Lavraud, Benoit; Lindqvist, Per-Arne; Strangeway, Robert J.; Trattner, Karlheinz; Torbert, Roy B.; Wei, Hanying; Wilder, Frederick
2017-11-01
Simulations and observations of collisionless shocks have shown that deviations of the nominal local shock normal orientation, that is, surface waves or ripples, are expected to propagate in the ramp and overshoot of quasi-perpendicular shocks. Here we identify signatures of a surface ripple propagating during a crossing of Earth's marginally quasi-parallel (θBn ∼ 45°) or quasi-parallel bow shock on 27 November 2015 06:01:44 UTC by the Magnetospheric Multiscale (MMS) mission and determine the ripple's properties using multispacecraft methods. Using two-dimensional hybrid simulations, we confirm that surface ripples are a feature of marginally quasi-parallel and quasi-parallel shocks under the observed solar wind conditions. In addition, since these marginally quasi-parallel and quasi-parallel shocks are expected to undergo a cyclic reformation of the shock front, we discuss the impact of multiple sources of nonstationarity on shock structure. Importantly, ripples are shown to be transient phenomena, developing faster than an ion gyroperiod and only during the period of the reformation cycle when a newly developed shock ramp is unaffected by turbulence in the foot. We conclude that the change in properties of the ripple observed by MMS is consistent with the reformation of the shock front over a time scale of an ion gyroperiod.
Activation of preexisting transverse structures in an evolving magmatic rift in East Africa
NASA Astrophysics Data System (ADS)
Muirhead, J. D.; Kattenhorn, S. A.
2018-01-01
Inherited crustal weaknesses have long been recognized as important factors in strain localization and basin development in the East African Rift System (EARS). However, the timing and kinematics (e.g., sense of slip) of transverse (rift-oblique) faults that exploit these weaknesses are debated, and thus the roles of inherited weaknesses at different stages of rift basin evolution are often overlooked. The mechanics of transverse faulting were addressed through an analysis of the Kordjya fault of the Magadi basin (Kenya Rift). Fault kinematics were investigated from field and remote-sensing data collected on fault and joint systems. Our analysis indicates that the Kordjya fault consists of a complex system of predominantly NNE-striking, rift-parallel fault segments that collectively form a NNW-trending array of en echelon faults. The transverse Kordjya fault therefore reactivated existing rift-parallel faults in ∼1 Ma lavas as oblique-normal faults with a component of sinistral shear. In all, these fault motions accommodate dip-slip on an underlying transverse structure that exploits the Aswa basement shear zone. This study shows that transverse faults may be activated through a complex interplay among magma-assisted strain localization, preexisting structures, and local stress rotations. Rather than forming during rift initiation, transverse structures can develop after the establishment of pervasive rift-parallel fault systems, and may exhibit dip-slip kinematics when activated from local stress rotations. The Kordjya fault is shown here to form a kinematic linkage that transfers strain to a newly developing center of concentrated magmatism and normal faulting. It is concluded that recently activated transverse faults not only reveal the effects of inherited basement weaknesses on fault development, but also provide important clues regarding developing magmatic and tectonic systems as young continental rift basins evolve.
Giner, Emmanuel; Tenti, Lorenzo; Angeli, Celestino; Malrieu, Jean-Paul
2016-09-28
The impact of the antisymmetrization is often addressed as a local property of the many-electron wave function, namely that the wave function should vanish when two electrons with parallel spins are in the same position in space. In this paper, we emphasize that this presentation is unduly restrictive: we illustrate the strong non-local character of the antisymmetrization principle, together with the fact that it is a matter of spin symmetry rather than spin parallelism. To this aim, we focus our attention on the simplest representation of various states of two-electron systems, both in atomic (helium atom) and molecular (H2 and the π system of the ethylene molecule) cases. We discuss the non-local property of the nodal structure of some two-electron wave functions, both using analytical derivations and graphical representations of cuttings of the nodal hypersurfaces. The attention is then focussed on the impact of the antisymmetrization on the maxima of the two-body density, and we show that it introduces strong correlation effects (radial and/or angular) with a non-local character. These correlation effects are analyzed in terms of inflation and depletion zones, which are easily identifiable, thanks to the nodes of the orbitals composing the wave function. Also, we show that the correlation effects induced by the antisymmetrization occur also for anti-parallel spins since all M_s components of a given spin state have the same N-body densities. Finally, we illustrate that these correlation effects occur also for the singlet states, but they have strictly opposite impacts: the inflation zones in the triplet become depletion zones in the singlet and vice versa.
Proof of Concept for the Rewrite Rule Machine: Interensemble Studies
1994-02-23
[Figure 1: Concurrent Rewriting of Fibonacci Expressions] ...exploit a problem’s parallelism at several levels. We call this...property multigrain concurrency; it makes the RRM very well suited for solving not only homogeneous problems, but also complex, locally homogeneous but...interprocessor message passing over a network - is not well suited to data parallelism. A key goal of the RRM is to combine the best of these two approaches in a
2012-09-30
platform (HPC) was developed, called the HPC-Acoustic Data Accelerator, or HPC-ADA for short. The HPC-ADA was designed based on fielded systems [1-4...software (Detection cLassification for MAchine learning - High Performance Computing). The software package was designed to utilize parallel and...Sedna [7] and is designed using a parallel architecture, allowing existing algorithms to distribute to the various processing nodes with minimal changes
Connectionist Models: Proceedings of the Summer School Held in San Diego, California on 1990
1990-01-01
modes: control network continues activation spreading based There is the sequential version and the parallel version on the actual inputs instead of...ent). 2. Execute all motoric actions based on activations of r a ent. The parallel version of the algorithm is local in time, units in A. Update the...a- movements that help to recognize an entering person.) tions like ’move focus left’, ’rotate focus’ are based on the activations of the C’s output
Pattern recognition with parallel associative memory
NASA Technical Reports Server (NTRS)
Toth, Charles K.; Schenk, Toni
1990-01-01
An examination is conducted of the feasibility of searching targets in aerial photographs by means of a parallel associative memory (PAM) that is based on the nearest-neighbor algorithm; the Hamming distance is used as a measure of closeness, in order to discriminate patterns. Attention has been given to targets typically used for ground-control points. The method developed sorts out approximate target positions where precise localizations are needed, in the course of the data-acquisition process. The majority of control points in different images were correctly identified.
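The nearest-neighbor lookup with Hamming distance described above can be sketched in a few lines. This is only an illustration of the matching rule, not the parallel associative memory itself; the function names and the toy 3x3 binary patterns are hypothetical. In a parallel associative memory, the distance to every stored pattern would be computed simultaneously rather than in a loop.

```python
from typing import Dict, Sequence, Tuple


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_pattern(query: Sequence[int],
                    stored: Dict[str, Sequence[int]]) -> Tuple[str, int]:
    """Return (label, distance) of the stored pattern closest to the query."""
    if not stored:
        raise ValueError("no stored patterns")
    # Each distance is independent of the others, so this comparison
    # is the part that an associative memory evaluates in parallel.
    label = min(stored, key=lambda name: hamming(query, stored[name]))
    return label, hamming(query, stored[label])


# Toy 3x3 binary templates, flattened row by row (hypothetical targets).
TEMPLATES = {
    "cross": [0, 1, 0, 1, 1, 1, 0, 1, 0],
    "square": [1, 1, 1, 1, 0, 1, 1, 1, 1],
}
```

A query such as [0, 1, 0, 1, 1, 1, 0, 0, 0], which is the cross template with one pixel flipped, is matched to "cross" at distance 1.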
Polarization-dependent DANES study on vertically-aligned ZnO nanorods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Chengjun; Park, Chang-In; Jin, Zhenlan
2016-05-01
The local structure and local density of states of vertically-aligned ZnO nanorods were examined using polarization-dependent diffraction anomalous near edge structure (DANES) measurements on c-oriented ZnO nanorods at the Zn K edge, with the incident x-ray electric field parallel and perpendicular to the x-ray momentum transfer direction. Orientation-dependent local structures determined by DANES were comparable with polarization-dependent EXAFS results. Unlike other techniques, polarization-dependent DANES can uniquely describe the orientation-dependent local structural properties and the local density of states of a selected element in selected-phased crystals of compounds or mixed-phased structures.
NASA Astrophysics Data System (ADS)
Emter, Thomas; Petereit, Janko
2014-05-01
An integrated multi-sensor fusion framework for localization and mapping for autonomous navigation in unstructured outdoor environments based on extended Kalman filters (EKF) is presented. The sensors for localization include an inertial measurement unit, a GPS, a fiber optic gyroscope, and wheel odometry. Additionally, a 3D LIDAR is used for simultaneous localization and mapping (SLAM). A 3D map is built while, concurrently, a localization in the 2D map established so far is estimated from the current LIDAR scan. Despite the longer run time of the SLAM algorithm compared to the EKF update, a high update rate is still guaranteed by carefully joining and synchronizing two parallel localization estimators.
Tuning iteration space slicing based tiled multi-core code implementing Nussinov's RNA folding.
Palkowski, Marek; Bielecki, Wlodzimierz
2018-01-15
RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithm are among the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov's recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov's RNA folding. Such techniques are within the iteration space slicing framework - the transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem in generating parallel tiled code is defining a proper tile size and tile dimension, which impact the degree of parallelism and code locality. To choose the best tile size and tile dimension, we first construct parallel parametric tiled code (parameters are variables defining tile size). For this purpose, we first generate two nonparametric tiled codes with different fixed tile sizes but with the same code structure and then derive a general affine model, which describes all integer factors available in expressions of those codes. Using this model and known integer factors present in the mentioned expressions (they define the left-hand side of the model), we find unknown integers in this model for each integer factor available in the same fixed tiled code position and replace the expressions in this code that include integer factors with expressions that include parameters. Then we use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover in a given search space the best tile size and tile dimension maximizing target code performance.
For a given search space, the presented approach allows us to choose the best tile size and tile dimension in parallel tiled code implementing Nussinov's RNA folding. Experimental results, obtained on modern Intel multi-core processors, demonstrate that this code outperforms known closely related implementations when the length of RNA strands is greater than 2500.
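For readers unfamiliar with the loop nest being tiled, the following is a plain, untiled, serial sketch of Nussinov's base-pair maximization recurrence. It is not the generated parallel tiled code discussed above; the function name, the min_loop parameter, and the inclusion of G-U wobble pairs are assumptions made for illustration. The triple loop over (span, i, k) is the affine loop nest whose iteration space tiling and parallelization techniques operate on.

```python
def nussinov(seq: str, min_loop: int = 0) -> int:
    """Maximum number of nested base pairs in an RNA sequence.

    min_loop is the minimum number of unpaired bases required between
    the two partners of a pair (0 allows adjacent bases to pair).
    """
    # Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # S[i][j] holds the best score for the subsequence seq[i..j].
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):              # length of the interval minus one
        for i in range(n - span):
            j = i + span
            best = max(S[i + 1][j], S[i][j - 1])   # i or j left unpaired
            if (seq[i], seq[j]) in pairs and j - i > min_loop:
                inner = S[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)         # i pairs with j
            for k in range(i + 1, j):               # split into two parts
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S[0][n - 1]
```

All cells with the same span depend only on cells with smaller spans, so each diagonal of the table can in principle be computed in parallel; the cubic cost comes from the innermost loop over k.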
DOE Office of Scientific and Technical Information (OSTI.GOV)
Masters, A.; Dougherty, M. K.; Sulaiman, A. H.
A leading explanation for the origin of Galactic cosmic rays is acceleration at high-Mach number shock waves in the collisionless plasma surrounding young supernova remnants. Evidence for this is provided by multi-wavelength non-thermal emission thought to be associated with ultrarelativistic electrons at these shocks. However, the dependence of the electron acceleration process on the orientation of the upstream magnetic field with respect to the local normal to the shock front (quasi-parallel/quasi-perpendicular) is debated. Cassini spacecraft observations at Saturn’s bow shock have revealed examples of electron acceleration under quasi-perpendicular conditions, and the first in situ evidence of electron acceleration at a quasi-parallel shock. Here we use Cassini data to make the first comparison between energy spectra of locally accelerated electrons under these differing upstream magnetic field regimes. We present data taken during a quasi-perpendicular shock crossing on 2008 March 8 and during a quasi-parallel shock crossing on 2007 February 3, highlighting that both were associated with electron acceleration to at least MeV energies. The magnetic signature of the quasi-perpendicular crossing has a relatively sharp upstream–downstream transition, and energetic electrons were detected close to the transition and immediately downstream. The magnetic transition at the quasi-parallel crossing is less clear, energetic electrons were encountered upstream and downstream, and the electron energy spectrum is harder above ∼100 keV. We discuss whether the acceleration is consistent with diffusive shock acceleration theory in each case, and suggest that the quasi-parallel spectral break is due to an energy-dependent interaction between the electrons and short, large-amplitude magnetic structures.
Palkowski, Marek; Bielecki, Wlodzimierz
2017-06-02
RNA secondary structure prediction is a compute-intensive task that lies at the core of several search algorithms in bioinformatics. Fortunately, the RNA folding approaches, such as the Nussinov base pair maximization, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. Polyhedral compilation techniques have proven to be a powerful tool for optimization of dense array codes. However, classical affine loop nest transformations used with these techniques do not effectively optimize dynamic programming codes for RNA structure prediction. The purpose of this paper is to present a novel approach allowing for generation of a parallel tiled Nussinov RNA loop nest exhibiting significantly higher performance than that of known related code. This effect is achieved by improving code locality and calculation parallelization. In order to improve code locality, we apply our previously published technique of automatic loop nest tiling to all three loops of the Nussinov loop nest. This approach first forms original rectangular 3D tiles and then corrects them to establish their validity by means of applying the transitive closure of a dependence graph. To produce parallel code, we apply the loop skewing technique to a tiled Nussinov loop nest. The technique is implemented as a part of the publicly available polyhedral source-to-source TRACO compiler. Generated code was run on modern Intel multi-core processors and coprocessors. We present the speed-up factor of generated Nussinov RNA parallel code and demonstrate that it is considerably faster than related codes in which only the two outer loops of the Nussinov loop nest are tiled.
A parallel adaptive mesh refinement algorithm
NASA Technical Reports Server (NTRS)
Quirk, James J.; Hanebutte, Ulf R.
1993-01-01
Over recent years, Adaptive Mesh Refinement (AMR) algorithms which dynamically match the local resolution of the computational grid to the numerical solution being sought have emerged as powerful tools for solving problems that contain disparate length and time scales. In particular, several workers have demonstrated the effectiveness of employing an adaptive, block-structured hierarchical grid system for simulations of complex shock wave phenomena. Unfortunately, from the parallel algorithm developer's viewpoint, this class of scheme is quite involved; these schemes cannot be distilled down to a small kernel upon which various parallelizing strategies may be tested. However, because of their block-structured nature such schemes are inherently parallel, so all is not lost. In this paper we describe the method by which Quirk's AMR algorithm has been parallelized. This method is built upon just a few simple message passing routines and so it may be implemented across a broad class of MIMD machines. Moreover, the method of parallelization is such that the original serial code is left virtually intact, and so we are left with just a single product to support. The importance of this fact should not be underestimated given the size and complexity of the original algorithm.
NASA Technical Reports Server (NTRS)
Hockney, George; Lee, Seungwon
2008-01-01
A computer program known as PyPele, originally written as a Python-language extension module of a C++ language program, has been rewritten in pure Python language. The original version of PyPele dispatches and coordinates parallel-processing tasks on cluster computers and provides a conceptual framework for spacecraft-mission-design and -analysis software tools to run in an embarrassingly parallel mode. The original version of PyPele uses SSH (Secure Shell, a set of standards and an associated network protocol for establishing a secure channel between a local and a remote computer) to coordinate parallel processing. Instead of SSH, the present Python version of PyPele uses Message Passing Interface (MPI) [an unofficial de facto standard language-independent application programming interface for message passing on a parallel computer] while keeping the same user interface. The use of MPI instead of SSH and the preservation of the original PyPele user interface make it possible for parallel application programs written previously for the original version of PyPele to run on MPI-based cluster computers. As a result, engineers using the previously written application programs can take advantage of embarrassing parallelism without need to rewrite those programs.
Parallel Tensor Compression for Large-Scale Scientific Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan
As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed-memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
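The 8 TB figure and the notion of a Tucker compression ratio can be checked with simple arithmetic. The sketch below assumes 8-byte double-precision values, which the abstract does not state, and counts storage for a Tucker representation as one core tensor plus one factor matrix per mode. The helper names and the example ranks are hypothetical and are not taken from the paper.

```python
def dense_bytes(dims, bytes_per_value=8):
    """Storage for a dense tensor, assuming 8-byte values by default."""
    total = bytes_per_value
    for d in dims:
        total *= d
    return total


def tucker_values(dims, ranks):
    """Values stored by a Tucker model: core tensor plus factor matrices."""
    core = 1
    for r in ranks:
        core *= r
    factors = sum(d * r for d, r in zip(dims, ranks))
    return core + factors


def compression_ratio(dims, ranks):
    """Ratio of dense storage to Tucker storage (in number of values)."""
    full = 1
    for d in dims:
        full *= d
    return full / tucker_values(dims, ranks)


# Grid of 512^3 points, 64 variables, 128 time steps.
DIMS = (512, 512, 512, 64, 128)
```

With these dimensions, dense_bytes(DIMS) equals 2**43 bytes, that is 8 TiB, consistent with the figure quoted in the abstract under the double-precision assumption. A call such as compression_ratio(DIMS, (32, 32, 32, 8, 16)) then shows how quickly the ratio grows when the multilinear ranks are small relative to the mode sizes.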
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Baxter, Doug
1988-01-01
The class of problems that can be effectively compiled by parallelizing compilers is discussed. This is accomplished with the doconsider construct which would allow these compilers to parallelize many problems in which substantial loop-level parallelism is available but cannot be detected by standard compile-time analysis. We describe and experimentally analyze mechanisms used to parallelize the work required for these types of loops. In each of these methods, a new loop structure is produced by modifying the loop to be parallelized. We also present the rules by which these loop transformations may be automated in order that they be included in language compilers. The main application area of the research involves problems in scientific computations and engineering. The workload used in our experiment includes a mixture of real problems as well as synthetically generated inputs. From our extensive tests on the Encore Multimax/320, we have reached the conclusion that for the types of workloads we have investigated, self-execution almost always performs better than pre-scheduling. Further, the improvement in performance that accrues as a result of global topological sorting of indices as opposed to the less expensive local sorting, is not very significant in the case of self-execution.
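One way to picture run-time parallelization of a loop whose dependences are only known at execution time is a wavefront computation: an inspection pass assigns each iteration to the earliest stage at which all of its predecessors have completed, and iterations within a stage can then run concurrently. The sketch below is a generic illustration of this topological sorting idea, not the doconsider construct or the authors' implementation; the function name and the representation of dependences as lists of earlier iteration indices are assumptions.

```python
def wavefronts(deps):
    """Group loop iterations into stages that can execute concurrently.

    deps[i] lists the earlier iterations (indices < i) whose results
    iteration i needs. Returns a list of stages; each stage is a list
    of iteration indices with no dependences among them.
    """
    level = [0] * len(deps)
    for i, preds in enumerate(deps):
        for p in preds:
            if not 0 <= p < i:
                raise ValueError("iteration %d depends on invalid index %d" % (i, p))
            # An iteration runs one stage after its latest predecessor.
            level[i] = max(level[i], level[p] + 1)
    stages = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lvl in enumerate(level):
        stages[lvl].append(i)
    return stages
```

For deps = [[], [0], [], [1, 2], [0]] the stages are [[0, 2], [1, 4], [3]]: iterations 0 and 2 are independent, 1 and 4 both wait only on 0, and 3 must wait for 1 and 2. The number of stages is the length of the longest dependence chain, which bounds the achievable parallel speedup.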
A global database with parallel measurements to study non-climatic changes
NASA Astrophysics Data System (ADS)
Venema, Victor; Auchmann, Renate; Aguilar, Enric
2015-04-01
In this work we introduce the rationale behind the ongoing compilation of a parallel measurements database, under the umbrella of the International Surface Temperatures Initiative (ISTI) and with the support of the World Meteorological Organization. We intend this database to become instrumental for a better understanding of inhomogeneities affecting the evaluation of long term changes in daily climate data. Long instrumental climate records are usually affected by non-climatic changes, due to, e.g., relocations and changes in instrumentation, instrument height or data collection and manipulation procedures. These so-called inhomogeneities distort the climate signal and can hamper the assessment of trends and variability. Thus to study climatic changes we need to accurately distinguish non-climatic and climatic signals. The most direct way to study the influence of non-climatic changes on the distribution and to understand the reasons for these biases is the analysis of parallel measurements representing the old and new situation (in terms of e.g. instruments, location). According to the limited number of available studies and our understanding of the causes of inhomogeneity, we expect that they will have a strong impact on the tails of the distribution of temperatures and most likely of other climate elements. Our abilities to statistically homogenize daily data will be increased by systematically studying different causes of inhomogeneity replicated through parallel measurements. Current studies of non-climatic changes using parallel data are limited to local and regional case studies. However, the effect of specific transitions depends on the local climate and the most interesting climatic questions are about the systematic large-scale biases produced by transitions that occurred in many regions.
Important potentially biasing transitions are the adoption of Stevenson screens, efforts to reduce undercatchment of precipitation or the move to automatic weather stations. Thus a large global parallel dataset is highly desirable as it allows for the study of systematic biases in the global record. In the ISTI Parallel Observations Science Team (POST), we will gather parallel data in their native format (to avoid undetectable conversion errors we will convert it to a standard format ourselves). We are interested in data from all climate variables at all time scales, from annual to sub-daily. High-resolution data is important for understanding the physical causes for the differences between the parallel measurements. For the same reason, we are also interested in other climate variables measured at the same station. For example, in case of parallel temperature measurements, the influencing factors are expected to be insolation, wind and cloud cover; in case of parallel precipitation measurements, wind and temperature are potentially important. Metadata that describe the parallel measurements are as important as the data themselves and will be collected as well, for example the types of the instruments, their siting, height, and maintenance. Because they are widely used to study moderate extremes, we will compute the indices of the Expert Team on Climate Change Detection and Indices (ETCCDI). In case the daily data cannot be shared, we would appreciate these indices from parallel measurements. For more information: http://tinyurl.com/ISTI-Parallel
A data distributed parallel algorithm for ray-traced volume rendering
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu; Painter, James S.; Hansen, Charles D.; Krogh, Michael F.
1993-01-01
This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5, and networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolume concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on both the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
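The compositing step can be illustrated with the standard "over" operator applied to subimages sorted front to back. This is a generic serial sketch, not the paper's parallel compositing method: the function names, the use of a single grey-level channel, and the premultiplied-alpha convention are assumptions. Because "over" is associative, neighbouring subimages can be combined pairwise in any grouping as long as the front-to-back order is preserved, which is what makes a parallel, tree-shaped compositing schedule possible.

```python
def over(front, back):
    """Composite one premultiplied (color, alpha) pixel over another."""
    cf, af = front
    cb, ab = back
    # Light from the back pixel is attenuated by the front pixel's opacity.
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)


def composite(subimages):
    """Blend subimages given in front-to-back order.

    Each subimage is a list of (color, alpha) pixels of equal length,
    one per processing unit in this sketch.
    """
    if not subimages:
        raise ValueError("at least one subimage is required")
    result = list(subimages[0])
    for image in subimages[1:]:
        if len(image) != len(result):
            raise ValueError("subimages must have the same size")
        result = [over(f, b) for f, b in zip(result, image)]
    return result
```

For instance, a half-transparent pixel (0.5, 0.5) in front of an opaque pixel (1.0, 1.0) yields (1.0, 1.0), while a fully opaque front pixel hides whatever lies behind it.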
Jalas, S.; Dornmair, I.; Lehe, R.; ...
2017-03-20
Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR) resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. Here in this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
Local search to improve coordinate-based task mapping
Balzuweit, Evan; Bunde, David P.; Leung, Vitus J.; ...
2015-10-31
We present a local search strategy to improve the coordinate-based mapping of a parallel job’s tasks to the MPI ranks of its parallel allocation in order to reduce network congestion and the job’s communication time. The goal is to reduce the number of network hops between communicating pairs of ranks. Our target is applications with a nearest-neighbor stencil communication pattern running on mesh systems with non-contiguous processor allocation, such as Cray XE and XK Systems. Utilizing the miniGhost mini-app, which models the shock physics application CTH, we demonstrate that our strategy reduces application running time while also reducing the runtime variability. We further show that mapping quality can vary based on the selected allocation algorithm, even between allocation algorithms of similar apparent quality.
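To make the idea of hop-reducing local search concrete, here is a minimal sketch. It assumes a 2D mesh with Manhattan-distance hop counts and a plain pairwise-swap neighbourhood; the names `total_hops` and `swap_local_search` are hypothetical, and the paper's actual move set and cost model are not given in the abstract.

```python
from itertools import combinations


def total_hops(mapping, edges, coords):
    """Sum of mesh hops over all communicating task pairs.

    mapping[task] is the rank running that task, coords[rank] is the
    (x, y) mesh position of the rank, and edges lists the task pairs
    that exchange messages.
    """
    hops = 0
    for a, b in edges:
        (xa, ya), (xb, yb) = coords[mapping[a]], coords[mapping[b]]
        hops += abs(xa - xb) + abs(ya - yb)
    return hops


def swap_local_search(mapping, edges, coords):
    """Swap the ranks of two tasks whenever that lowers the hop count."""
    mapping = list(mapping)
    best = total_hops(mapping, edges, coords)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(mapping)), 2):
            mapping[i], mapping[j] = mapping[j], mapping[i]
            cost = total_hops(mapping, edges, coords)
            if cost < best:
                best, improved = cost, True
            else:
                # Undo a swap that did not help.
                mapping[i], mapping[j] = mapping[j], mapping[i]
    return mapping, best


# A 4-task chain placed on a 1 x 4 line of nodes in a poor initial order.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
edges = [(0, 1), (1, 2), (2, 3)]
start = [0, 2, 1, 3]
improved_mapping, cost = swap_local_search(start, edges, coords)
```

A non-contiguous allocation is modelled simply by giving `coords` gaps between the allocated nodes; the search itself is unchanged.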
Synchronizing compute node time bases in a parallel computer
Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip
2015-01-27
Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
Synchronizing compute node time bases in a parallel computer
Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip
2014-12-30
Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
A gyrokinetic one-dimensional scrape-off layer model of an edge-localized mode heat pulse
Shi, E. L.; Hakim, A. H.; Hammett, G. W.
2015-02-03
An electrostatic gyrokinetic-based model is applied to simulate parallel plasma transport in the scrape-off layer to a divertor plate. We focus on a test problem that has been studied previously, using parameters chosen to model a heat pulse driven by an edge-localized mode in JET. Previous work has used direct particle-in-cell equations with full dynamics, or Vlasov or fluid equations with only parallel dynamics. With the use of the gyrokinetic quasineutrality equation and logical sheath boundary conditions, spatial and temporal resolution requirements are no longer set by the electron Debye length and plasma frequency, respectively. Finally, this test problem also helps illustrate some of the physics contained in the Hamiltonian form of the gyrokinetic equations and some of the numerical challenges in developing an edge gyrokinetic code.
Wilson, Robert L.; Frisz, Jessica F.; Hanafin, William P.; Carpenter, Kevin J.; Hutcheon, Ian D.; Weber, Peter K.; Kraft, Mary L.
2014-01-01
The local abundance of specific lipid species near a membrane protein is hypothesized to influence the protein’s activity. The ability to simultaneously image the distributions of specific protein and lipid species in the cell membrane would facilitate testing these hypotheses. Recent advances in imaging the distribution of cell membrane lipids with mass spectrometry have created the desire for membrane protein probes that can be simultaneously imaged with isotope labeled lipids. Such probes would enable conclusive tests of whether specific proteins co-localize with particular lipid species. Here, we describe the development of fluorine-functionalized colloidal gold immunolabels that facilitate the detection and imaging of specific proteins in parallel with lipids in the plasma membrane using high-resolution SIMS performed with a NanoSIMS. First, we developed a method to functionalize colloidal gold nanoparticles with a partially fluorinated mixed monolayer that permitted NanoSIMS detection and rendered the functionalized nanoparticles dispersible in aqueous buffer. Then, to allow for selective protein labeling, we attached the fluorinated colloidal gold nanoparticles to the nonbinding portion of antibodies. By combining these functionalized immunolabels with metabolic incorporation of stable isotopes, we demonstrate that influenza hemagglutinin and cellular lipids can be imaged in parallel using NanoSIMS. These labels enable a general approach to simultaneously imaging specific proteins and lipids with high sensitivity and lateral resolution, which may be used to evaluate predictions of protein co-localization with specific lipid species. PMID:22284327
Locality Aware Concurrent Start for Stencil Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, Sunil; Gao, Guang R.; Manzano Franco, Joseph B.
Stencil computations are at the heart of many physical simulations used in scientific codes. Thus, there exists a plethora of optimization efforts for this family of computations. Among these techniques, tiling techniques that allow concurrent start have proven to be very efficient in providing better performance for these critical kernels. Nevertheless, with many-core designs being the norm, these optimization techniques might not be able to fully exploit locality (both spatial and temporal) on multiple levels of the memory hierarchy without compromising parallelism. It is no longer true that the machine can be seen as a homogeneous collection of nodes with caches, main memory and an interconnect network. New architectural designs exhibit complex grouping of nodes, cores, threads, caches and memory connected by an ever evolving network-on-chip design. These new designs may benefit greatly from carefully crafted schedules and groupings that encourage parallel actors (i.e. threads, cores or nodes) to be aware of the computational history of other actors in close proximity. In this paper, we provide an efficient tiling technique that allows hierarchical concurrent start for memory hierarchy aware tile groups. Each execution schedule and tile shape exploit the available parallelism, load balance and locality present in the given applications. We demonstrate our technique on the Intel Xeon Phi architecture with selected and representative stencil kernels. We show improvement ranging from 5.58% to 31.17% over existing state-of-the-art techniques.
Trapping and Injecting Single Domain Walls in Magnetic Wire by Local Fields
NASA Astrophysics Data System (ADS)
Vázquez, Manuel; Basheed, G. A.; Infante, Germán; Del Real, Rafael P.
2012-01-01
A single domain wall (DW) moves at linearly increasing velocity under an increasing homogeneous drive magnetic field. Present experiments show that the DW is braked and finally trapped at a given position when an additional antiparallel local magnetic field is applied. That position and its velocity are further controlled by suitable tuning of the local field. In turn, the parallel local field of small amplitude does not significantly affect the effective wall speed at long distance, although it generates tail-to-tail and head-to-head pairs of walls moving along opposite directions when that field is strong enough.
Turbulence-driven anisotropic electron tail generation during magnetic reconnection
NASA Astrophysics Data System (ADS)
DuBois, A. M.; Scherer, A.; Almagri, A. F.; Anderson, J. K.; Pandya, M. D.; Sarff, J. S.
2018-05-01
Magnetic reconnection (MR) plays an important role in particle transport, energization, and acceleration in space, astrophysical, and laboratory plasmas. In the Madison Symmetric Torus reversed field pinch, discrete MR events release large amounts of energy from the equilibrium magnetic field, a fraction of which is transferred to electrons and ions. Previous experiments revealed an anisotropic electron tail that favors the perpendicular direction and is symmetric in the parallel. New profile measurements of x-ray emission show that the tail distribution is localized near the magnetic axis, consistent with modeling of the bremsstrahlung emission. The tail appears first near the magnetic axis and then spreads radially, and the dynamics in the anisotropy and diffusion are discussed. The data presented imply that the electron tail formation likely results from a turbulent wave-particle interaction and provide evidence that high energy electrons are escaping the core-localized region through pitch angle scattering into the parallel direction, followed by stochastic parallel transport to the plasma edge. New measurements also show a strong correlation between high energy x-ray measurements and tearing mode dynamics, suggesting that the coupling between core and edge tearing modes is essential for energetic electron tail formation.
Advances in locally constrained k-space-based parallel MRI.
Samsonov, Alexey A; Block, Walter F; Arunachalam, Arjun; Field, Aaron S
2006-02-01
In this article, several theoretical and methodological developments regarding k-space-based, locally constrained parallel MRI (pMRI) reconstruction are presented. A connection between Parallel MRI with Adaptive Radius in k-Space (PARS) and GRAPPA methods is demonstrated. The analysis provides a basis for unified treatment of both methods. Additionally, a weighted PARS reconstruction is proposed, which may absorb different weighting strategies for improved image reconstruction. Next, a fast and efficient method for pMRI reconstruction of data sampled on non-Cartesian trajectories is described. In the new technique, the computational burden associated with the numerous matrix inversions in the original PARS method is drastically reduced by limiting direct calculation of reconstruction coefficients to only a few reference points. The rest of the coefficients are found by interpolating between the reference sets, which is possible due to the similar configuration of points participating in reconstruction for highly symmetric trajectories, such as radial and spirals. As a result, the time requirements are drastically reduced, which makes it practical to use pMRI with non-Cartesian trajectories in many applications. The new technique was demonstrated with simulated and actual data sampled on radial trajectories. Copyright 2006 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Blasevski, D.; Del-Castillo-Negrete, D.
2012-10-01
Heat transport in magnetized plasmas is a problem of fundamental interest in controlled fusion. In Ref. [D. del-Castillo-Negrete and L. Chacón, Phys. Rev. Lett. 106, 195004 (2011); Phys. Plasmas 19, 056112 (2012)], we proposed a Lagrangian-Green's function (LG) method to study this problem in the strongly anisotropic (χ=0) regime. The LG method bypasses the need to discretize the transport operators on a grid and it is applicable to general parallel flux closures and 3-D magnetic fields. Here we apply the LG method to parallel transport (with local and nonlocal parallel flux closures) in reversed shear magnetic field configurations known to exhibit robust transport barriers in the vicinity of the extrema of the q-profile. By shearless Cantori (SC) we mean the invariant Cantor sets remaining after the destruction of toroidal flux surfaces with zero magnetic shear, q'=0. We provide numerical evidence of the role of SC in the anomalously slow relaxation of radial temperature gradients in chaotic magnetic fields with no transport barriers. The spatio-temporal evolution of temperature pulses localized in the reversed shear region exhibits non-diffusive self-similar evolution and nonlocal effective radial transport.
ERIC Educational Resources Information Center
Scott, Mark
2004-01-01
Throughout the 1990s, Europe's rural areas increasingly embraced local action and local development solutions to face the challenge of the continued re-structuring of the agricultural industry. In parallel, in both the EU and the UK, a policy discourse has emerged which envisages a fundamental shift in support policies for rural areas from a…
Decentralized Interleaving of Paralleled Dc-Dc Buck Converters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Brian B; Rodriguez, Miguel; Sinha, Mohit
We present a decentralized control strategy that yields switch interleaving among parallel-connected dc-dc buck converters. The proposed method is based on the digital implementation of the dynamics of a nonlinear oscillator circuit as the controller. Each controller is fully decentralized, i.e., it only requires the locally measured output current to synthesize the pulse width modulation (PWM) carrier waveform and no communication between different controllers is needed. By virtue of the intrinsic electrical coupling between converters, the nonlinear oscillator-based controllers converge to an interleaved state with uniform phase-spacing across PWM carriers. To the knowledge of the authors, this work presents the first fully decentralized strategy for switch interleaving in paralleled dc-dc buck converters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naito, O.
2015-08-15
An analytic formula has been derived for the relativistic incoherent Thomson backscattering spectrum for a drifting anisotropic plasma when the scattering vector is parallel to the drifting direction. The shape of the scattering spectrum is insensitive to the electron temperature perpendicular to the scattering vector, but its amplitude may be modulated. As a result, while the measured temperature correctly represents the electron distribution parallel to the scattering vector, the electron density may be underestimated when the perpendicular temperature is higher than the parallel temperature. Since the scattering spectrum in shorter wavelengths is greatly enhanced by the existence of drift, the diagnostics might be used to measure local electron current density in fusion plasmas.
PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joshi, Abhijit S; Jain, Prashant K; Mudrich, Jaime A
2012-01-01
At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90 and parallelized using the Message Passing Interface (MPI) library. The Silo library is used to compact and write the data files, and the VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multi relaxation time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted, allowing large scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.
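The last two sentences of the abstract above can be sketched numerically. The example assumes lattice units with lattice sound speed squared equal to 1/3, so that viscosity and relaxation time are related by nu = (tau - 1/2) / 3, and it uses the usual Smagorinsky closure nu_t = (C_s * delta)^2 * |S| with |S| = sqrt(2 S_ij S_ij). The constant `c_smag = 0.1`, the filter width `delta = 1.0`, and the function names are illustrative assumptions, not values or routines from PRATHAM, which may obtain the strain rate differently.

```python
import math


def strain_magnitude(strain):
    """|S| = sqrt(2 * S_ij * S_ij) for a 3 x 3 rate-of-strain tensor."""
    return math.sqrt(2.0 * sum(strain[i][j] ** 2
                               for i in range(3) for j in range(3)))


def effective_relaxation_time(nu0, strain, c_smag=0.1, delta=1.0):
    """Local relaxation time including the Smagorinsky eddy viscosity.

    nu0 is the molecular viscosity in lattice units; c_smag and delta are
    illustrative defaults, not values from the report.
    """
    nu_t = (c_smag * delta) ** 2 * strain_magnitude(strain)
    return 3.0 * (nu0 + nu_t) + 0.5


# Quiescent fluid versus a simple shear S_xy = S_yx = 1 (lattice units).
no_strain = [[0.0] * 3 for _ in range(3)]
shear = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
tau_rest = effective_relaxation_time(0.1, no_strain)
tau_shear = effective_relaxation_time(0.1, shear)
```

Where the strain rate is large the relaxation time grows, which damps the unresolved scales without refining the grid.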
Magnetic intermittency of solar wind turbulence in the dissipation range
NASA Astrophysics Data System (ADS)
Pei, Zhongtian; He, Jiansen; Tu, Chuanyi; Marsch, Eckart; Wang, Linghua
2016-04-01
The features, nature, and fate of intermittency in the dissipation range are an interesting topic in solar wind turbulence. We calculate the distribution of flatness for the magnetic field fluctuations as a function of angle and scale. The flatness distribution shows a "butterfly" pattern, with two wings located at angles parallel/anti-parallel to the local mean magnetic field direction and the main body located at angles perpendicular to the local B0. This "butterfly" pattern illustrates that the flatness profile in the (anti-)parallel direction approaches its maximum value at a larger scale and drops faster than that in the perpendicular direction. The contours of the probability distribution functions at different scales illustrate a "vase" pattern, clearer in the parallel direction, which confirms the scale variation of flatness and indicates the generation and dissipation of intermittency. The angular distribution of the structure function in the dissipation range shows an anisotropic pattern. The quasi-mono-fractal scaling of the structure function in the dissipation range is also illustrated and investigated with the mathematical model for inhomogeneous cascading (extended p-model). Different from the inertial range, the extended p-model for the dissipation range results in an approximately uniform fragmentation measure. However, a more complete mathematical and physical model involving both non-uniform cascading and dissipation is needed. The nature of intermittency may be strong structures or large amplitude fluctuations, which may be tested with magnetic helicity. In one case study, we find that the heating effect in terms of entropy seems to be more obvious for large amplitude fluctuations than for strong structures.
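As a brief illustration of the flatness statistic used above, the sketch below computes the ratio of the fourth-order moment of the field increments to the square of their second-order moment at a given lag. It handles only the scale dependence of a single component; the angle binning relative to the local mean field is omitted, and the function name `flatness` is hypothetical.

```python
def flatness(series, lag):
    """Flatness of increments: <d^4> / <d^2>^2 with d = x(t + lag) - x(t).

    Assumes the increments are not all zero; otherwise the ratio is
    undefined and a ZeroDivisionError is raised.
    """
    increments = [b - a for a, b in zip(series, series[lag:])]
    m2 = sum(d ** 2 for d in increments) / len(increments)
    m4 = sum(d ** 4 for d in increments) / len(increments)
    return m4 / (m2 * m2)


# Evenly spread fluctuations versus a single isolated jump.
smooth = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
bursty = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
f_smooth = flatness(smooth, 1)
f_bursty = flatness(bursty, 1)
```

Gaussian increments give a flatness of 3, so values well above 3 at small scales are the usual signature of intermittency; the isolated jump in `bursty` yields a much larger value than the evenly fluctuating `smooth` series.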
Human Corneal Limbal-Epithelial Cell Response to Varying Silk Film Geometric Topography In Vitro
Lawrence, Brian D.; Pan, Zhi; Liu, Aihong; Kaplan, David L.; Rosenblatt, Mark I.
2012-01-01
Silk fibroin films are a promising class of biomaterials that have a number of advantages for use in ophthalmic applications due to their transparent nature, mechanical properties and minimal inflammatory response upon implantation. Freestanding silk films with parallel line and concentric ring topographies were generated for in vitro characterization of human corneal limbal-epithelial (HCLE) cell response upon differing geometric patterned surfaces. Results indicated that silk film topography significantly affected initial HCLE culture substrate attachment, cellular alignment, cell-to-cell contact formation, actin cytoskeleton alignment, and focal adhesion (FA) localization. Most notably, parallel line patterned surfaces displayed a 36%–54% increase on average in initial cell attachment, which corresponded to an over 2-fold increase in FA localization when compared to other silk film surfaces and controls. In addition, distinct localization of FA formation was observed along the edges for all patterned silk film topographies. In conclusion, silk film feature topography appears to help direct corneal epithelial cell response and cytoskeleton development, especially in regards to FA distribution, in vitro. PMID:22705042
Locally adaptive parallel temperature accelerated dynamics method
NASA Astrophysics Data System (ADS)
Shim, Yunsic; Amar, Jacques G.
2010-03-01
The recently-developed temperature-accelerated dynamics (TAD) method [M. Sørensen and A.F. Voter, J. Chem. Phys. 112, 9599 (2000)] along with the more recently developed parallel TAD (parTAD) method [Y. Shim et al, Phys. Rev. B 76, 205439 (2007)] allow one to carry out non-equilibrium simulations over extended time and length scales. The basic idea behind TAD is to speed up transitions by carrying out a high-temperature MD simulation and then use the resulting information to obtain event times at the desired low temperature. In a typical implementation, a fixed high temperature Thigh is used. However, in general one expects that for each configuration there exists an optimal value of Thigh which depends on the particular transition pathways and activation energies for that configuration. Here we present a locally adaptive high-temperature TAD method in which instead of using a fixed Thigh the high temperature is dynamically adjusted in order to maximize simulation efficiency. Preliminary results of the performance obtained from parTAD simulations of Cu/Cu(100) growth using the locally adaptive Thigh method will also be presented.
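The step of turning a high-temperature event time into a low-temperature one can be sketched with the Arrhenius extrapolation commonly associated with TAD, t_low = t_high * exp(E_a * (1/(k_B T_low) - 1/(k_B T_high))). The activation energies, temperatures, and the function name below are illustrative assumptions, not data from the abstract.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def extrapolate_event_time(t_high, e_act, temp_high, temp_low):
    """Map an event time observed at temp_high to the time at temp_low.

    t_high is in arbitrary time units, e_act is the activation energy in
    eV, and temperatures are in kelvin.
    """
    beta_high = 1.0 / (K_B * temp_high)
    beta_low = 1.0 / (K_B * temp_low)
    return t_high * math.exp(e_act * (beta_low - beta_high))


# Two hypothetical events seen at 600 K, extrapolated to 300 K.
t_a = extrapolate_event_time(1.0, 0.5, 600.0, 300.0)  # high barrier, seen first
t_b = extrapolate_event_time(2.0, 0.2, 600.0, 300.0)  # low barrier, seen second
```

The event with the lower barrier happens first at the low temperature even though it was observed second at the high temperature. This reordering is why the choice of high temperature matters: a higher value speeds up the MD run but stretches the extrapolated times further apart.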
Avoiding and tolerating latency in large-scale next-generation shared-memory multiprocessors
NASA Technical Reports Server (NTRS)
Probst, David K.
1993-01-01
A scalable solution to the memory-latency problem is necessary to prevent the large latencies of synchronization and memory operations inherent in large-scale shared-memory multiprocessors from reducing high performance. We distinguish latency avoidance and latency tolerance. Latency is avoided when data is brought to nearby locales for future reference. Latency is tolerated when references are overlapped with other computation. Latency-avoiding locales include: processor registers, data caches used temporally, and nearby memory modules. Tolerating communication latency requires parallelism, allowing the overlap of communication and computation. Latency-tolerating techniques include: vector pipelining, data caches used spatially, prefetching in various forms, and multithreading in various forms. Relaxing the consistency model permits increased use of avoidance and tolerance techniques. Each model is a mapping from the program text to sets of partial orders on program operations; it is a convention about which temporal precedences among program operations are necessary. Information about temporal locality and parallelism constrains the use of avoidance and tolerance techniques. Suitable architectural primitives and compiler technology are required to exploit the increased freedom to reorder and overlap operations in relaxed models.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier
1992-01-01
Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Architectures for reasoning in parallel
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.
1989-01-01
The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system, and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or fewer processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
Parallel community climate model: Description and user's guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Drake, J.B.; Flanery, R.E.; Semeraro, B.D.
This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
Ozmutlu, H. Cenk
2014-01-01
We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job sequence and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum number of relocation operations on the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms. PMID:24977204
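For readers unfamiliar with random key chromosomes, the following sketch shows one common decoding for parallel machine scheduling: the integer part of each key selects the machine and the fractional part orders the jobs on that machine. This encoding is an assumption chosen for illustration; the abstract does not specify how GAspLA decodes its keys or how it represents split subjobs, and `decode_random_keys` is a hypothetical name.

```python
def decode_random_keys(keys, n_machines):
    """Turn one random key per job into per-machine job sequences.

    Keys are floats in [0, n_machines). Sorting by key groups jobs by
    machine (integer part) and orders them within it (fractional part).
    """
    schedule = {machine: [] for machine in range(n_machines)}
    for job, key in sorted(enumerate(keys), key=lambda pair: pair[1]):
        machine = min(int(key), n_machines - 1)  # guard against key == n_machines
        schedule[machine].append(job)
    return schedule


# Four jobs on two machines.
chromosome = [1.7, 0.2, 1.1, 0.9]
plan = decode_random_keys(chromosome, 2)
```

Here jobs 1 and 3 run on machine 0 in that order, and jobs 2 and 0 on machine 1. Any vector of keys decodes to a feasible schedule, so crossover and mutation never need repair. The difficulty the abstract mentions arises in the other direction: after local search changes a schedule, the keys have to be rewritten so that they decode to the improved schedule.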
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dettmer, Simon L.; Keyser, Ulrich F.; Pagliara, Stefano
In this article we present methods for measuring hindered Brownian motion in the confinement of complex 3D geometries using digital video microscopy. Here we discuss essential features of automated 3D particle tracking as well as diffusion data analysis. By introducing local mean squared displacement-vs-time curves, we are able to simultaneously measure the spatial dependence of diffusion coefficients, tracking accuracies and drift velocities. Such local measurements allow a more detailed and appropriate description of strongly heterogeneous systems as opposed to global measurements. Finite size effects of the tracking region on measuring mean squared displacements are also discussed. The use of these methods was crucial for the measurement of the diffusive behavior of spherical polystyrene particles (505 nm diameter) in a microfluidic chip. The particles explored an array of parallel channels with different cross sections as well as the bulk reservoirs. For this experiment we present the measurement of local tracking accuracies in all three axial directions as well as the diffusivity parallel to the channel axis while we observed no significant flow but purely Brownian motion. Finally, the presented algorithm is suitable also for tracking of fluorescently labeled particles and particles driven by an external force, e.g., electrokinetic or dielectrophoretic forces.
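A local mean squared displacement can be sketched as an ordinary MSD whose samples are binned by the position at which each displacement starts. The example below is a simplified 1D version with a single lag; it assumes free diffusion with no drift and no tracking error when converting to a diffusion coefficient, whereas the article fits full MSD-vs-time curves to separate those contributions. The function names and bin layout are hypothetical.

```python
def local_msd(positions, lag, edges):
    """Mean squared displacement at one lag, binned by start position.

    positions are 1D coordinates sampled at a fixed frame interval, lag is
    given in frames, and bin k covers [edges[k], edges[k + 1]). Returns one
    value per bin, or None where no displacement starts in the bin.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for start, end in zip(positions, positions[lag:]):
        for k in range(n_bins):
            if edges[k] <= start < edges[k + 1]:
                sums[k] += (end - start) ** 2
                counts[k] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


def local_diffusivity(msd, lag, frame_time):
    """1D estimate D = MSD / (2 t), valid only without drift or noise."""
    return msd / (2.0 * lag * frame_time)


# Short synthetic track: small steps near x = 0, larger steps beyond x = 1.
track = [0.0, 0.1, 0.2, 1.0, 1.5, 2.0]
msd_per_bin = local_msd(track, 1, [0.0, 1.0, 2.0])
```

With real data each bin would collect many displacements from many trajectories, and repeating the calculation over several lags gives the local MSD-vs-time curve for that region.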
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Weber, Gunther H.
2014-03-31
Topological techniques provide robust tools for data analysis. They are used, for example, for feature extraction, for data de-noising, and for comparison of data sets. This chapter concerns contour trees, a topological descriptor that records the connectivity of the isosurfaces of scalar functions. These trees are fundamental to analysis and visualization of physical phenomena modeled by real-valued measurements. We study the parallel analysis of contour trees. After describing a particular representation of a contour tree, called local-global representation, we illustrate how different problems that rely on contour trees can be solved in parallel with minimal communication.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2017-05-17
PeleC is an adaptive-mesh compressible hydrodynamics code for reacting flows. It solves the compressible Navier-Stokes equations with multispecies transport in a block-structured framework. The resulting algorithm is well suited for flows with localized resolution requirements and is robust to discontinuities. User-controllable refinement criteria have the potential to result in extremely small numerical dissipation and dispersion, making this code appropriate for both research and applied usage. The code is built on the AMReX library, which facilitates hierarchical parallelism and manages distributed-memory parallelism. PeleC algorithms are implemented to express shared-memory parallelism.
The Basal Ganglia and Adaptive Motor Control
NASA Astrophysics Data System (ADS)
Graybiel, Ann M.; Aosaki, Toshihiko; Flaherty, Alice W.; Kimura, Minoru
1994-09-01
The basal ganglia are neural structures within the motor and cognitive control circuits in the mammalian forebrain and are interconnected with the neocortex by multiple loops. Dysfunction in these parallel loops caused by damage to the striatum results in major defects in voluntary movement, exemplified in Parkinson's disease and Huntington's disease. These parallel loops have a distributed modular architecture resembling local expert architectures of computational learning models. During sensorimotor learning, such distributed networks may be coordinated by widely spaced striatal interneurons that acquire response properties on the basis of experienced reward.
A Massively Parallel Code for Polarization Calculations
NASA Astrophysics Data System (ADS)
Akiyama, Shizuka; Höflich, Peter
2001-03-01
We present an implementation of our Monte-Carlo radiation transport method for rapidly expanding, NLTE atmospheres for massively parallel computers which utilizes both the distributed and shared memory models. This allows us to take full advantage of the fast communication and low latency inherent to nodes with multiple CPUs, and to stretch the limits of scalability with the number of nodes compared to a version which is based on the shared memory model. Test calculations on a local 20-node Beowulf cluster with dual CPUs showed an improved scalability by about 40%.
A Queue Simulation Tool for a High Performance Scientific Computing Center
NASA Technical Reports Server (NTRS)
Spear, Carrie; McGalliard, James
2007-01-01
The NASA Center for Computational Sciences (NCCS) at the Goddard Space Flight Center provides high performance highly parallel processors, mass storage, and supporting infrastructure to a community of computational Earth and space scientists. Long running (days) and highly parallel (hundreds of CPUs) jobs are common in the workload. NCCS management structures batch queues and allocates resources to optimize system use and prioritize workloads. NCCS technical staff use a locally developed discrete event simulation tool to model the impacts of evolving workloads, potential system upgrades, alternative queue structures and resource allocation policies.
Unstructured grids on SIMD torus machines
NASA Technical Reports Server (NTRS)
Bjorstad, Petter E.; Schreiber, Robert
1994-01-01
Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.
Parallel implementation of an adaptive scheme for 3D unstructured grids on the SP2
NASA Technical Reports Server (NTRS)
Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10 percent of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all the mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
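The speedups quoted in this abstract are easier to compare as parallel efficiencies, that is, speedup divided by processor count. The short calculation below uses only the three figures given above; the function name `efficiency` is illustrative.

```python
def efficiency(speedup, n_procs):
    """Parallel efficiency: fraction of ideal (linear) speedup achieved."""
    return speedup / n_procs


cases = {
    "random refinement": 47.0,
    "localized refinement": 7.7,
    "localized, with repartitioning": 43.6,
}
for name, speedup in cases.items():
    print(f"{name}: {efficiency(speedup, 64):.1%}")
```

On 64 processors this works out to roughly 73%, 12% and 68% efficiency, so repartitioning before adaption recovers most of the efficiency lost to the localized refinement.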
Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Strawn, Roger C.
1996-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10% of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
Cache Locality Optimization for Recursive Programs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lifflander, Jonathan; Krishnamoorthy, Sriram
We present an approach to optimize the cache locality for recursive programs by dynamically splicing--recursively interleaving--the execution of distinct function invocations. By utilizing data effect annotations, we identify concurrency and data reuse opportunities across function invocations and interleave them to reduce reuse distance. We present algorithms that efficiently track effects in recursive programs, detect interference and dependencies, and interleave execution of function invocations using user-level (non-kernel) lightweight threads. To enable multi-core execution, a program is parallelized using a nested fork/join programming model. Our cache optimization strategy is designed to work in the context of a random work stealing scheduler. We present an implementation using the MIT Cilk framework that demonstrates significant improvements in sequential and parallel performance, competitive with a state-of-the-art compile-time optimizer for loop programs and a domain-specific optimizer for stencil programs.
NASA Astrophysics Data System (ADS)
Pei, Zongrui; Eisenbach, Markus
2017-06-01
Dislocations are among the most important defects in determining the mechanical properties of both conventional alloys and high-entropy alloys. The Peierls-Nabarro model supplies an efficient pathway to their geometries and mobility. The difficulty in solving the integro-differential Peierls-Nabarro equation is how to effectively avoid the local minima in the energy landscape of a dislocation core. Among the methods available to optimize dislocation core structures, we choose Particle Swarm Optimization, an algorithm that simulates the social behaviors of organisms. By employing more particles (a bigger swarm) and more iterative steps (allowing them to explore for a longer time), the local minima can be effectively avoided, although at a higher computational cost. The advantage of this algorithm is that it is readily parallelized on modern high-performance computing architectures. We demonstrate that the performance of our parallelized algorithm scales linearly with the number of employed cores.
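For readers unfamiliar with Particle Swarm Optimization, a minimal generic version is sketched below. It is not the authors' Peierls-Nabarro solver: the objective is a simple sphere function, and the inertia and acceleration parameters (`w`, `c1`, `c2`) are common textbook choices assumed here for illustration. The two knobs mentioned in the abstract correspond to `n_particles` and `n_iters`.

```python
import random


def pso(f, dim, bounds, n_particles=30, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen per particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    return sum(v * v for v in x)


best_x, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_val)
```

The objective evaluations inside the particle loop are independent of one another within an iteration, which is what makes the method straightforward to distribute across cores.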
The parallel globe: a powerful instrument to perform investigations of Earth’s illumination
NASA Astrophysics Data System (ADS)
Rossi, Sabrina; Giordano, Enrica; Lanciano, Nicoletta
2015-01-01
Many researchers have documented the difficulties for learners of different ages and preparations in understanding basic astronomical concepts. Traditional instructional strategies and communication media do not seem to be effective in producing meaningful understanding, or even induce misconceptions and misinterpretations. In line with recent proposals for pedagogical sequences and learning progressions about core concepts and basic procedures in physics and astronomy education, in this paper we suggest an intermediate, essential step in the teaching path from the local geocentric view of the Earth-Sun system to a heliocentric one. With this aim we present data collected over a day and a year from an instrument we call the ‘parallel globe’, a globe positioned locally homothetic to the Earth. Some analyses are suggested, in particular of the phenomenon of illumination of the Earth and its variations, that are consistent with the proposed instructional objectives.
A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database
Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; ...
2013-01-01
Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm of creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n number of ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
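The notion of ghost layers can be made concrete with a toy 1D mesh. The sketch below is only a serial illustration of which cells a part would need copies of; it does not reproduce the article's algorithm or any real mesh database API, and the names `partition_1d` and `ghost_cells` are hypothetical.

```python
def partition_1d(n_cells, n_parts):
    """Split cells 0..n_cells-1 into contiguous, nearly equal parts."""
    base, extra = divmod(n_cells, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def ghost_cells(ranges, part, n_layers):
    """Cells owned by other parts that lie within n_layers of `part`."""
    own = ranges[part]
    lo = max(0, own.start - n_layers)
    hi = min(ranges[-1].stop, own.stop + n_layers)
    return [c for c in range(lo, hi) if c not in own]


parts = partition_1d(10, 3)
print(ghost_cells(parts, 1, 1))   # one layer around the middle part
print(ghost_cells(parts, 0, 2))   # two layers around the first part
```

In a 1D mesh each part only ever needs data from its immediate neighbors until the layer count exceeds a neighbor's size, which is the simplest instance of the neighborhood communication pattern the abstract refers to.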
The effect of anisotropic heat transport on magnetic islands in 3-D configurations
NASA Astrophysics Data System (ADS)
Schlutt, M. G.; Hegna, C. C.
2012-08-01
An analytic theory of nonlinear pressure-induced magnetic island formation using a boundary layer analysis is presented. This theory extends previous work by including the effects of finite parallel heat transport and is applicable to general three dimensional magnetic configurations. In this work, particular attention is paid to the role of finite parallel heat conduction in the context of pressure-induced island physics. It is found that localized currents that require self-consistent deformation of the pressure profile, such as resistive interchange and bootstrap currents, are attenuated by finite parallel heat conduction when the magnetic islands are sufficiently small. However, these anisotropic effects do not change saturated island widths caused by Pfirsch-Schlüter current effects. Implications for finite pressure-induced island healing are discussed.
The architecture of tomorrow's massively parallel computer
NASA Technical Reports Server (NTRS)
Batcher, Ken
1987-01-01
Goodyear Aerospace delivered the Massively Parallel Processor (MPP) to NASA/Goddard in May 1983, over three years ago. Ever since then, Goodyear has tried to look in a forward direction. There is always some debate as to which way is forward when it comes to supercomputer architecture. Improvements to the MPP's massively parallel architecture are discussed in the areas of data I/O, memory capacity, connectivity, and indirect (or local) addressing. In I/O, transfer rates up to 640 megabytes per second can be achieved. There are devices that can supply the data and accept it at this rate. The memory capacity can be increased up to 128 megabytes in the ARU and over a gigabyte in the staging memory. For connectivity, there are several different kinds of multistage networks that should be considered.
2D-RBUC for efficient parallel compression of residuals
NASA Astrophysics Data System (ADS)
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with an efficient SIMD parallel decompression. The residuals originate from lossy or near-lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and seamlessly reconstructed surface, the newly proposed method trades off a small efficiency degradation for a non-negligible compression ratio benefit (measured up to 91%).
MPgrafic: A parallel MPI version of Grafic-1
NASA Astrophysics Data System (ADS)
Prunet, Simon; Pichon, Christophe
2013-04-01
MPgrafic is a parallel MPI version of Grafic-1 which can produce large cosmological initial conditions on a cluster without requiring shared memory. The real Fourier transforms are carried out in place using fftw while minimizing the amount of used memory (at the expense of performance) in the spirit of Grafic-1. The writing of the output file is also carried out in parallel. In addition to the technical parallelization, it provides three extensions over Grafic-1: it can produce power spectra with baryon wiggles (DJ Eisenstein and W. Hu, Ap. J. 496); it has the optional ability to load a lower resolution noise map corresponding to the low frequency component which will fix the larger scale modes of the simulation (extra flag 0/1 at the end of the input process) in the spirit of Grafic-2; it can be used in conjunction with constrfield, which generates initial conditions phases from a list of local constraints on density, tidal field density gradient and velocity.
A general parallel sparse-blocked matrix multiply for linear scaling SCF theory
NASA Astrophysics Data System (ADS)
Challacombe, Matt
2000-06-01
A general approach to the parallel sparse-blocked matrix-matrix multiply is developed in the context of linear scaling self-consistent-field (SCF) theory. The data-parallel message passing method uses non-blocking communication to overlap computation and communication. The space filling curve heuristic is used to achieve data locality for sparse matrix elements that decay with “separation”. Load balance is achieved by solving the bin packing problem for blocks with variable size. With this new method as the kernel, parallel performance of the simplified density matrix minimization (SDMM) for solution of the SCF equations is investigated for RHF/6-31G** water clusters and RHF/3-21G estane globules. Sustained rates above 5.7 GFLOPS for the SDMM have been achieved for (H2O)200 with 95 Origin 2000 processors. Scalability is found to be limited by load imbalance, which increases with decreasing granularity, due primarily to the inhomogeneous distribution of variable block sizes.
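Balancing variable-size blocks across processors is a bin-packing-style problem, and a common greedy heuristic gives a feel for it. The sketch below uses the longest-first rule (largest block to the currently least loaded processor); this heuristic is swapped in for illustration and is not claimed to be the solver used in the paper. The name `balance_blocks` and the block sizes are hypothetical.

```python
import heapq


def balance_blocks(sizes, n_procs):
    """Greedy longest-first assignment of blocks to processors.

    Returns (assignment, loads): the block indices given to each
    processor and the resulting total size per processor.
    """
    heap = [(0, p) for p in range(n_procs)]     # (current load, processor id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_procs)]
    for idx in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        load, p = heapq.heappop(heap)            # least loaded processor
        assignment[p].append(idx)
        heapq.heappush(heap, (load + sizes[idx], p))
    loads = [sum(sizes[i] for i in blocks) for blocks in assignment]
    return assignment, loads


assignment, loads = balance_blocks([7, 5, 4, 3, 1], n_procs=2)
print(assignment, loads)
```

With few large blocks per processor the achievable balance gets worse, which is consistent with the abstract's remark that load imbalance increases with decreasing granularity.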
Sequential Feedback Scheme Outperforms the Parallel Scheme for Hamiltonian Parameter Estimation.
Yuan, Haidong
2016-10-14
Measurement and estimation of parameters are essential for science and engineering, where the main quest is to find the highest achievable precision with the given resources and design schemes to attain it. Two schemes, the sequential feedback scheme and the parallel scheme, are usually studied in the quantum parameter estimation. While the sequential feedback scheme represents the most general scheme, it remains unknown whether it can outperform the parallel scheme for any quantum estimation tasks. In this Letter, we show that the sequential feedback scheme has a threefold improvement over the parallel scheme for Hamiltonian parameter estimations on two-dimensional systems, and an order of O(d+1) improvement for Hamiltonian parameter estimation on d-dimensional systems. We also show that, contrary to the conventional belief, it is possible to simultaneously achieve the highest precision for estimating all three components of a magnetic field, which sets a benchmark on the local precision limit for the estimation of a magnetic field.
NASA Astrophysics Data System (ADS)
Homuth, B.; Löbl, U.; Batte, A. G.; Link, K.; Kasereka, C. M.; Rümpker, G.
2016-09-01
Shear-wave splitting measurements from local and teleseismic earthquakes are used to investigate the seismic anisotropy in the upper mantle beneath the Rwenzori region of the East African Rift system. At most stations, shear-wave splitting parameters obtained from individual earthquakes exhibit only minor variations with backazimuth. We therefore employ a joint inversion of SKS waveforms to derive hypothetical one-layer parameters. The corresponding fast polarizations are generally rift parallel and the average delay time is about 1 s. Shear phases from local events within the crust are characterized by an average delay time of 0.04 s. Delay times from local mantle earthquakes are in the range of 0.2 s. This observation suggests that the dominant source region for seismic anisotropy beneath the rift is located within the mantle. We use finite-frequency waveform modeling to test different models of anisotropy within the lithosphere/asthenosphere system of the rift. The results show that the rift-parallel fast polarizations are consistent with horizontal transverse isotropy (HTI anisotropy) caused by rift-parallel magmatic intrusions or lenses located within the lithospheric mantle—as it would be expected during the early stages of continental rifting. Furthermore, the short-scale spatial variations in the fast polarizations observed in the southern part of the study area can be explained by effects due to sedimentary basins of low isotropic velocity in combination with a shift in the orientation of anisotropic fabrics in the upper mantle. A uniform anisotropic layer in relation to large-scale asthenospheric mantle flow is less consistent with the observed splitting parameters.
NASA Astrophysics Data System (ADS)
Yang, Liping; Zhang, Lei; He, Jiansen; Tu, Chuanyi; Li, Shengtai; Wang, Xin; Wang, Linghua
2018-03-01
Multi-order structure functions in the solar wind are reported to display a monofractal scaling when sampled parallel to the local magnetic field and a multifractal scaling when measured perpendicularly. Whether and to what extent will the scaling anisotropy be weakened by the enhancement of turbulence amplitude relative to the background magnetic strength? In this study, based on two runs of the magnetohydrodynamic (MHD) turbulence simulation with different relative levels of turbulence amplitude, we investigate and compare the scaling of multi-order magnetic structure functions and magnetic probability distribution functions (PDFs) as well as their dependence on the direction of the local field. The numerical results show that for the case of large-amplitude MHD turbulence, the multi-order structure functions display a multifractal scaling at all angles to the local magnetic field, with PDFs deviating significantly from the Gaussian distribution and a flatness larger than 3 at all angles. In contrast, for the case of small-amplitude MHD turbulence, the multi-order structure functions and PDFs have different features in the quasi-parallel and quasi-perpendicular directions: a monofractal scaling and Gaussian-like distribution in the former, and a conversion of a monofractal scaling and Gaussian-like distribution into a multifractal scaling and non-Gaussian tail distribution in the latter. These results hint that when intermittencies are abundant and intense, the multifractal scaling in the structure functions can appear even if it is in the quasi-parallel direction; otherwise, the monofractal scaling in the structure functions remains even if it is in the quasi-perpendicular direction.
Schallmo, Michael-Paul; Grant, Andrea N; Burton, Philip C; Olman, Cheryl A
2016-08-01
Although V1 responses are driven primarily by elements within a neuron's receptive field, which subtends about 1° visual angle in parafoveal regions, previous work has shown that localized fMRI responses to visual elements reflect not only local feature encoding but also long-range pattern attributes. However, separating the response to an image feature from the response to the surrounding stimulus and studying the interactions between these two responses demands both spatial precision and signal independence, which may be challenging to attain with fMRI. The present study used 7 Tesla fMRI with 1.2-mm resolution to measure the interactions between small sinusoidal grating patches (targets) at 3° eccentricity and surrounds of various sizes and orientations to test the conditions under which localized, context-dependent fMRI responses could be predicted from either psychophysical or electrophysiological data. Targets were presented at 8%, 16%, and 32% contrast while manipulating (a) spatial extent of parallel (strongly suppressive) or orthogonal (weakly suppressive) surrounds, (b) locus of attention, (c) stimulus onset asynchrony between target and surround, and (d) blocked versus event-related design. In all experiments, the V1 fMRI signal was lower when target stimuli were flanked by parallel versus orthogonal context. Attention amplified fMRI responses to all stimuli but did not show a selective effect on central target responses or a measurable effect on orientation-dependent surround suppression. Suppression of the V1 fMRI response by parallel surrounds was stronger than predicted from psychophysics but showed a better match to previous electrophysiological reports.
A global database with parallel measurements to study non-climatic changes
NASA Astrophysics Data System (ADS)
Venema, Victor; Auchmann, Renate; Aguilar, Enric; Auer, Ingeborg; Azorin-Molina, Cesar; Brandsma, Theo; Brunetti, Michele; Dienst, Manuel; Domonkos, Peter; Gilabert, Alba; Lindén, Jenny; Milewska, Ewa; Nordli, Øyvind; Prohom, Marc; Rennie, Jared; Stepanek, Petr; Trewin, Blair; Vincent, Lucie; Willett, Kate; Wolff, Mareile
2016-04-01
In this work we introduce the rationale behind the ongoing compilation of a parallel measurements database, in the framework of the International Surface Temperatures Initiative (ISTI) and with the support of the World Meteorological Organization. We intend this database to become instrumental for a better understanding of inhomogeneities affecting the evaluation of long-term changes in daily climate data. Long instrumental climate records are usually affected by non-climatic changes, due to, e.g., (i) station relocations, (ii) instrument height changes, (iii) instrumentation changes, (iv) observing environment changes, (v) different sampling intervals or data collection procedures, among others. These so-called inhomogeneities distort the climate signal and can hamper the assessment of long-term trends and variability of climate. Thus to study climatic changes we need to accurately distinguish non-climatic and climatic signals. The most direct way to study the influence of non-climatic changes on the distribution and to understand the reasons for these biases is the analysis of parallel measurements representing the old and new situation (in terms of e.g. instruments, location, different radiation shields, etc.). According to the limited number of available studies and our understanding of the causes of inhomogeneity, we expect that they will have a strong impact on the tails of the distribution of air temperatures and most likely of other climate elements. Our abilities to statistically homogenize daily data will be increased by systematically studying different causes of inhomogeneity replicated through parallel measurements. Current studies of non-climatic changes using parallel data are limited to local and regional case studies. However, the effect of specific transitions depends on the local climate and the most interesting climatic questions are about the systematic large-scale biases produced by transitions that occurred in many regions. 
Important potentially biasing transitions are the adoption of Stevenson screens, relocations (to airports), efforts to reduce undercatchment of precipitation, or the move to automatic weather stations. Thus a large global parallel dataset is highly desirable as it allows for the study of systematic biases in the global record. We are interested in data from all climate variables at all time scales, from annual to sub-daily. High-resolution data is important for understanding the physical causes for the differences between the parallel measurements. For the same reason, we are also interested in other climate variables measured at the same station. For example, in case of parallel air temperature measurements, the influencing factors are expected to be global radiation, wind, humidity and cloud cover; in case of parallel precipitation measurements, wind and wet-bulb temperature are potentially important. Metadata that describe the parallel measurements are as important as the data itself and will be collected as well, for example the types of the instruments, their siting, height, and maintenance. Because they are widely used to study moderate extremes, we will compute the indices of the Expert Team on Climate Change Detection and Indices (ETCCDI). In case the daily data cannot be shared, we would appreciate contributions containing these indices from parallel measurements. For more information: http://tinyurl.com/ISTI-Parallel
Performance of the Wavelet Decomposition on Massively Parallel Architectures
NASA Technical Reports Server (NTRS)
El-Ghazawi, Tarek A.; LeMoigne, Jacqueline; Zukor, Dorothy (Technical Monitor)
2001-01-01
Traditionally, Fourier Transforms have been utilized for performing signal analysis and representation. But although it is straightforward to reconstruct a signal from its Fourier transform, no local description of the signal is included in its Fourier representation. To alleviate this problem, Windowed Fourier transforms and then wavelet transforms have been introduced, and it has been proven that wavelets give a better localization than traditional Fourier transforms, as well as a better division of the time- or space-frequency plane than Windowed Fourier transforms. Because of these properties and after the development of several fast algorithms for computing the wavelet representation of any signal, in particular the Multi-Resolution Analysis (MRA) developed by Mallat, wavelet transforms have increasingly been applied to signal analysis problems, especially real-life problems, in which speed is critical. In this paper we present and compare efficient wavelet decomposition algorithms on different parallel architectures. We report and analyze experimental measurements, using NASA remotely sensed images. Results show that our algorithms achieve significant performance gains on current high performance parallel systems, and meet scientific applications and multimedia requirements. The extensive performance measurements collected over a number of high-performance computer systems have revealed important architectural characteristics of these systems, in relation to the processing demands of the wavelet decomposition of digital images.
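The multi-resolution decomposition referred to above repeatedly splits a signal into a coarse approximation and detail coefficients. The sketch below shows that pyramid structure with the Haar wavelet in its simple averaging form; the choice of Haar and of unnormalized averages is an assumption made for brevity, since the abstract does not state which filters were used, and the function names are illustrative.

```python
def haar_step(signal):
    """One decomposition level: pairwise averages and half-differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Apply haar_step repeatedly to the approximation (pyramid scheme).

    Assumes len(signal) is divisible by 2**levels.
    """
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


approx, details = haar_decompose([4, 6, 10, 12], levels=2)
print(approx, details)
```

Each output pair depends only on two neighboring inputs, so the work within one level can be split across processors with communication only at partition boundaries for longer filters.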
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
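A de Bruijn graph, the data structure underlying the assembler described above, can be built from reads in a few lines. The sketch below is a serial toy version for illustration only; it does not use Hadoop or Giraph, it ignores reverse complements and sequencing errors, and the function name `de_bruijn` and the example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Returns a dict mapping each prefix (k-1)-mer to the list of suffix
    (k-1)-mers it is followed by, one entry per k-mer occurrence.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


graph = de_bruijn(["ACGT", "CGTA"], k=3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Because every k-mer contributes one edge independently of the others, the construction maps naturally onto a map step that emits edges and a grouping step keyed by node.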
Hybrid Memory Management for Parallel Execution of Prolog on Shared Memory Multiprocessors
1990-06-01
organizing data to increase locality. The stack structure exhibits greater locality than the heap structure. Tradeoff decisions can also be made on... (Performing organization: University of California at Berkeley, Department of Electrical Engineering and Computer Sciences, Berkeley, CA 94720.)
NASA Astrophysics Data System (ADS)
Song, Y.; Lysak, R. L.
2017-12-01
Parallel electrostatic electric fields provide a powerful mechanism to accelerate auroral particles to high energy in the auroral acceleration region (AAR), creating both quasi-static and Alfvenic discrete aurorae. The total field-aligned current can be written as J||total=J||+J||D, where the displacement current is denoted as J||D=(1/4π)(∂E||/∂t), which describes the E||-generation (Song and Lysak, 2006). The generation of the total field-aligned current is related to spatial gradients of the parallel vorticity caused by the axial torque acting on field-aligned flux tubes in M-I coupling system. It should be noticed that parallel electric fields are not produced by the field-aligned current. In fact, the E||-generation is caused by Alfvenic interaction in the M-I coupling system, and is favored by a low plasma density and the enhanced localized azimuthal magnetic flux. We suggest that the nonlinear interaction of incident and reflected Alfven wave packets in the AAR can create reactive stress concentration, and therefore can generate the parallel electrostatic electric fields together with a seed low density cavity. The generated electric fields will quickly deepen the seed low density cavity, which can effectively create even stronger electrostatic electric fields. The electrostatic electric fields nested in a low density cavity and surrounded by enhanced azimuthal magnetic flux constitute Alfvenic electromagnetic plasma structures, such as Alfvenic Double Layers (DLs). The Poynting flux carried by Alfven waves can continuously supply energy from the generator region to the auroral acceleration region, supporting and sustaining Alfvenic DLs with long-lasting electrostatic electric fields which accelerate auroral particles to high energy. The generation of parallel electric fields and the formation of auroral arcs can redistribute perpendicular mechanical and magnetic stresses in auroral flux tubes, decoupling the magnetosphere from ionosphere drag locally. 
This may enhance the magnetotail earthward shear flows and rapidly build up stronger parallel electric fields in the auroral acceleration region, leading to a sudden and violent tail energy release, if there is accumulated free magnetic energy in the tail.
Parallel Tetrahedral Mesh Adaptation with Dynamic Load Balancing
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Gabow, Harold N.
1999-01-01
The ability to dynamically adapt an unstructured grid is a powerful tool for efficiently solving computational problems with evolving physical features. In this paper, we report on our experience parallelizing an edge-based adaptation scheme, called 3D_TAG, using message passing. Results show excellent speedup when a realistic helicopter rotor mesh is randomly refined. However, performance deteriorates when the mesh is refined using a solution-based error indicator since mesh adaptation for practical problems occurs in a localized region, creating a severe load imbalance. To address this problem, we have developed PLUM, a global dynamic load balancing framework for adaptive numerical computations. Even though PLUM primarily balances processor workloads for the solution phase, it reduces the load imbalance problem within mesh adaptation by repartitioning the mesh after targeting edges for refinement but before the actual subdivision. This dramatically improves the performance of parallel 3D_TAG since refinement occurs in a more load balanced fashion. We also present optimal and heuristic algorithms that, when applied to the default mapping of a parallel repartitioner, significantly reduce the data redistribution overhead. Finally, portability is examined by comparing performance on three state-of-the-art parallel machines.
HeNCE: A Heterogeneous Network Computing Environment
Beguelin, Adam; Dongarra, Jack J.; Geist, George Al; ...
1994-01-01
Network computing seeks to utilize the aggregate resources of many networked computers to solve a single problem. In so doing it is often possible to obtain supercomputer performance from an inexpensive local area network. The drawback is that network computing is complicated and error prone when done by hand, especially if the computers have different operating systems and data formats and are thus heterogeneous. The heterogeneous network computing environment (HeNCE) is an integrated graphical environment for creating and running parallel programs over a heterogeneous collection of computers. It is built on a lower level package called parallel virtual machine (PVM). The HeNCE philosophy of parallel programming is to have the programmer graphically specify the parallelism of a computation and to automate, as much as possible, the tasks of writing, compiling, executing, debugging, and tracing the network computation. Key to HeNCE is a graphical language based on directed graphs that describe the parallelism and data dependencies of an application. Nodes in the graphs represent conventional Fortran or C subroutines and the arcs represent data and control flow. This article describes the present state of HeNCE, its capabilities, limitations, and areas of future research.
NASA Astrophysics Data System (ADS)
Leamy, Michael J.; Springer, Adam C.
In this research we report parallel implementation of a Cellular Automata-based simulation tool for computing elastodynamic response on complex, two-dimensional domains. Elastodynamic simulation using Cellular Automata (CA) has recently been presented as an alternative, inherently object-oriented technique for accurately and efficiently computing linear and nonlinear wave propagation in arbitrarily-shaped geometries. The local, autonomous nature of the method should lead to straight-forward and efficient parallelization. We address this notion on symmetric multiprocessor (SMP) hardware using a Java-based object-oriented CA code implementing triangular state machines (i.e., automata) and the MPI bindings written in Java (MPJ Express). We use MPJ Express to reconfigure our existing CA code to distribute a domain's automata to cores present on a dual quad-core shared-memory system (eight total processors). We note that this message passing parallelization strategy is directly applicable to computer clustered computing, which will be the focus of follow-on research. Results on the shared memory platform indicate nearly-ideal, linear speed-up. We conclude that the CA-based elastodynamic simulator is easily configured to run in parallel, and yields excellent speed-up on SMP hardware.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ban, H. Y.; Kavuri, V. C.
Purpose: The authors introduce a state-of-the-art all-optical clinical diffuse optical tomography (DOT) imaging instrument which collects spatially dense, multispectral, frequency-domain breast data in the parallel-plate geometry. Methods: The instrument utilizes a CCD-based heterodyne detection scheme that permits massively parallel detection of diffuse photon density wave amplitude and phase for a large number of source–detector pairs (10^6). The stand-alone clinical DOT instrument thus offers high spatial resolution with reduced crosstalk between absorption and scattering. Other novel features include a fringe profilometry system for breast boundary segmentation, real-time data normalization, and a patient bed design which permits both axial and sagittal breast measurements. Results: The authors validated the instrument using tissue simulating phantoms with two different chromophore-containing targets and one scattering target. The authors also demonstrated the instrument in a case study breast cancer patient; the reconstructed 3D image of endogenous chromophores and scattering gave tumor localization in agreement with MRI. Conclusions: Imaging with a novel parallel-plate DOT breast imager that employs highly parallel, high-resolution CCD detection in the frequency-domain was demonstrated.
Maleki, Ehsan; Babashah, Hossein; Koohi, Somayyeh; Kavehvash, Zahra
2017-07-01
This paper presents an optical processing approach for exploring a large number of genome sequences. Specifically, we propose an optical correlator for global alignment and an extended moiré matching technique for local analysis of spatially coded DNA, whose output is fed to a novel three-dimensional artificial neural network for local DNA alignment. All-optical implementation of the proposed 3D artificial neural network is developed and its accuracy is verified in Zemax. Thanks to its parallel processing capability, the proposed structure performs local alignment of 4 million sequences of 150 base pairs in a few seconds, which is much faster than its electrical counterparts, such as the basic local alignment search tool.
A global database with parallel measurements to study non-climatic changes
NASA Astrophysics Data System (ADS)
Venema, Victor; Auchman, Renate; Aguilar, Enric
2017-04-01
In this work we introduce the rationale behind the ongoing compilation of a parallel measurements database, in the framework of the International Surface Temperatures Initiative (ISTI) and with the support of the World Meteorological Organization. We intend this database to become instrumental for a better understanding of inhomogeneities affecting the evaluation of long-term changes in daily climate data. Long instrumental climate records are usually affected by non-climatic changes, due to, e.g., (i) station relocations, (ii) instrument height changes, (iii) instrumentation changes, (iv) observing environment changes, (v) different sampling intervals or data collection procedures, among others. These so-called inhomogeneities distort the climate signal and can hamper the assessment of long-term trends and variability of climate. Thus to study climatic changes we need to accurately distinguish non-climatic and climatic signals. The most direct way to study the influence of non-climatic changes on the distribution and to understand the reasons for these biases is the analysis of parallel measurements representing the old and new situation (in terms of e.g. instruments, location, different radiation shields, etc.). According to the limited number of available studies and our understanding of the causes of inhomogeneity, we expect that they will have a strong impact on the tails of the distribution of air temperatures and most likely of other climate elements. Our abilities to statistically homogenize daily data will be increased by systematically studying different causes of inhomogeneity replicated through parallel measurements. Current studies of non-climatic changes using parallel data are limited to local and regional case studies. However, the effect of specific transitions depends on the local climate and the most interesting climatic questions are about the systematic large-scale biases produced by transitions that occurred in many regions. 
Important potentially biasing transitions are the adoption of Stevenson screens, relocations (to airports), efforts to reduce undercatchment of precipitation, or the move to automatic weather stations. Thus a large global parallel dataset is highly desirable as it allows for the study of systematic biases in the global record. We are interested in data from all climate variables at all time scales, from annual to sub-daily. High-resolution data is important for understanding the physical causes for the differences between the parallel measurements. For the same reason, we are also interested in other climate variables measured at the same station. For example, in case of parallel air temperature measurements, the influencing factors are expected to be global radiation, wind, humidity and cloud cover; in case of parallel precipitation measurements, wind and wet-bulb temperature are potentially important.
Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.
Bhandarkar, S M; Chirravuri, S; Arnold, J
1996-01-01
Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
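The "multiple independent searches" strategy mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the swap perturbation, the geometric cooling schedule, the parameter values, and the function names (`arrangement_cost`, `anneal`, `independent_searches`) are all assumptions chosen for illustration, and the searches run sequentially here rather than on separate processors. The cost function is the linear arrangement objective (sum of position distances over linked items).

```python
import math
import random

def arrangement_cost(order, edges):
    """Linear arrangement cost: sum of |pos(a) - pos(b)| over all edges."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in edges)

def anneal(n, edges, seed, steps=2000, t0=2.0, cooling=0.995):
    """One simulated annealing search (hypothetical parameters)."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    cost = arrangement_cost(order, edges)
    best, best_cost = order[:], cost
    temp = t0
    for _ in range(steps):
        # Perturbation: swap two randomly chosen positions.
        i, j = rng.sample(range(n), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = arrangement_cost(order, edges)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the rejected swap
        temp *= cooling
    return best, best_cost

def independent_searches(n, edges, seeds):
    """Run one search per seed (conceptually, one per processor) and keep the best."""
    return min((anneal(n, edges, s) for s in seeds), key=lambda r: r[1])

# Usage: six items linked in a chain; the ideal ordering has cost 5.
edges = [(i, i + 1) for i in range(5)]
best, best_cost = independent_searches(6, edges, seeds=range(4))
```

Because the searches share no state, this variant needs no synchronization, which is consistent with the abstract's observation that it suits architectures with high synchronization overhead.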
NASA Astrophysics Data System (ADS)
Hopperstad, O. S.; Børvik, T.; Berstad, T.; Lademo, O.-G.; Benallal, A.
2007-10-01
The constitutive relation proposed by McCormick (1988 Acta Metall. 36 3061-7) for materials exhibiting negative steady-state strain-rate sensitivity and the Portevin-Le Chatelier (PLC) effect is incorporated into an elastic-viscoplastic model for metals with plastic anisotropy. The constitutive model is implemented in LS-DYNA for corotational shell elements. Plastic anisotropy is taken into account by use of the yield criterion Yld2000/Yld2003 proposed by Barlat et al (2003 J. Plast. 19 1297-319) and Aretz (2004 Modelling Simul. Mater. Sci. Eng. 12 491-509). The parameters of the constitutive equations are determined for a rolled aluminium alloy (AA5083-H116) exhibiting negative steady-state strain-rate sensitivity and serrated yielding. The parameter identification is based on existing experimental data. A numerical investigation is conducted to determine the influence of the PLC effect on the onset of necking in uniaxial and biaxial tension for different overall strain rates. The numerical simulations show that the PLC effect leads to significant reductions in the strain to necking for both uniaxial and biaxial stress states. Increased surface roughness with plastic deformation is predicted for strain rates giving serrated yielding in uniaxial tension. It is likely that this is an important reason for the reduced critical strains. The characteristics of the deformation bands (orientation, width, velocity and strain rate) are also studied.
Mechanisms mediating parallel action monitoring in fronto-striatal circuits.
Beste, Christian; Ness, Vanessa; Lukas, Carsten; Hoffmann, Rainer; Stüwe, Sven; Falkenstein, Michael; Saft, Carsten
2012-08-01
Flexible response adaptation and the control of conflicting information play a pivotal role in daily life. Yet, little is known about the neuronal mechanisms mediating parallel control of these processes. We examined these mechanisms using a multi-methodological approach that integrated data from event-related potentials (ERPs) with structural MRI data and source localisation using sLORETA. Moreover, we calculated evoked wavelet oscillations. We applied this multi-methodological approach in healthy subjects and patients in a prodromal phase of a major basal ganglia disorder (i.e., Huntington's disease), to directly focus on fronto-striatal networks. Behavioural data indicated, especially the parallel execution of conflict monitoring and flexible response adaptation was modulated across the examined cohorts. When both processes do not co-incide a high integrity of fronto-striatal loops seems to be dispensable. The neurophysiological data suggests that conflict monitoring (reflected by the N2 ERP) and working memory processes (reflected by the P3 ERP) differentially contribute to this pattern of results. Flexible response adaptation under the constraint of high conflict processing affected the N2 and P3 ERP, as well as their delta frequency band oscillations. Yet, modulatory effects were strongest for the N2 ERP and evoked wavelet oscillations in this time range. The N2 ERPs were localized in the anterior cingulate cortex (BA32, BA24). Modulations of the P3 ERP were localized in parietal areas (BA7). In addition, MRI-determined caudate head volume predicted modulations in conflict monitoring, but not working memory processes. The results show how parallel conflict monitoring and flexible adaptation of action is mediated via fronto-striatal networks. 
While both, response monitoring and working memory processes seem to play a role, especially response selection processes and ACC-basal ganglia networks seem to be the driving force in mediating parallel conflict monitoring and flexible adaptation of actions. Copyright © 2012 Elsevier Inc. All rights reserved.
Aharonov-Bohm and Aharonov-Casher effects for local and nonlocal Cooper pairs
NASA Astrophysics Data System (ADS)
Tomaszewski, Damian; Busz, Piotr; López, Rosa; Žitko, Rok; Lee, Minchul; Martinek, Jan
2018-06-01
We study combined interference effects due to the Aharonov-Bohm (AB) and Aharonov-Casher (AC) phases in a Josephson supercurrent of local and nonlocal (split) Cooper pairs. We analyze a junction between two superconductors interconnected through a normal-state nanostructure with either (i) a ring, where single-electron interference is possible, or (ii) two parallel nanowires, where the single-electron interference can be absent, but the cross Andreev reflection can occur. In the low-transmission regime in both geometries the AB and AC effects can be related to only local or nonlocal Cooper pair transport, respectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
Performing a global barrier operation in a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
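The two-level barrier described above can be sketched with Python threads standing in for tasks. This is an illustrative sketch only, not the patented implementation: the function name `run_hierarchical_barrier`, the choice of rank 0 as master, and the use of a counter with a condition variable to detect that the other tasks have joined are all assumptions.

```python
import threading

def run_hierarchical_barrier(num_nodes, tasks_per_node):
    """Simulate nodes as groups of threads; rank 0 on each node is the (assumed) master."""
    global_barrier = threading.Barrier(num_nodes)  # joined by master tasks only
    log, log_lock = [], threading.Lock()

    def record(event):
        with log_lock:
            log.append(event)

    def task(node, rank, local_barrier, arrived, cond):
        record(("arrive", node, rank))
        if rank != 0:
            # Non-master: announce arrival, then join the single local barrier.
            with cond:
                arrived[0] += 1
                cond.notify_all()
            local_barrier.wait()
        else:
            # Master: wait until all other tasks on this node have joined,
            # then join the global barrier, then the local barrier.
            with cond:
                cond.wait_for(lambda: arrived[0] == tasks_per_node - 1)
            global_barrier.wait()
            local_barrier.wait()
        record(("release", node, rank))

    threads = []
    for node in range(num_nodes):
        local_barrier = threading.Barrier(tasks_per_node)
        arrived, cond = [0], threading.Condition()
        for rank in range(tasks_per_node):
            threads.append(threading.Thread(
                target=task, args=(node, rank, local_barrier, arrived, cond)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

log = run_hierarchical_barrier(num_nodes=3, tasks_per_node=4)
```

In this sketch no task is released from its local barrier until its master has passed the global barrier, so every "arrive" event in the log precedes every "release" event.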
Jet Noise Source Localization Using Linear Phased Array
NASA Technical Reports Server (NTRS)
Agboola, Ferni A.; Bridges, James
2004-01-01
A study was conducted to further clarify the interpretation and application of linear phased array microphone results for localizing aeroacoustic sources in aircraft exhaust jets. Two model engine nozzles were tested at varying power cycles with the array set up parallel to the jet axis. The array position was varied as well to determine the best location for the array. The results showed that it is possible to resolve jet noise sources with bypass and other components separation. The results also showed that a focused near field image provides more realistic noise source localization at low to mid frequencies.
Efficient multitasking of Choleski matrix factorization on CRAY supercomputers
NASA Technical Reports Server (NTRS)
Overman, Andrea L.; Poole, Eugene L.
1991-01-01
A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer
Faraj, Ahmad [Rochester, MN
2012-04-17
Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
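The two-stage data flow described above (a global allreduce per logical ring, followed by a local allreduce per compute node) can be simulated in a few lines of Python. This is only a sketch of the arithmetic, not the disclosed method: it assumes the reduction operator is a sum, that ring c consists of core c from every node, and that contributions are plain numbers; the function name `hierarchical_allreduce` is hypothetical.

```python
def hierarchical_allreduce(contrib):
    """contrib[n][c] is the contribution of core c on node n (sum reduction assumed)."""
    num_nodes = len(contrib)
    cores = len(contrib[0])

    # Stage 1: one logical ring per core index; each ring reduces across all nodes.
    ring_results = [sum(contrib[n][c] for n in range(num_nodes)) for c in range(cores)]

    # Stage 2: on every node, reduce the ring results held by its cores.
    node_result = sum(ring_results)

    # Every core on every node ends up with the same final value.
    return [[node_result] * cores for _ in range(num_nodes)]

# Usage: three nodes with two cores each.
result = hierarchical_allreduce([[1, 2], [3, 4], [5, 6]])
```

With this layout, each ring carries only one core's share of the data between nodes, and the final combination happens entirely within a node.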
Slip-parallel seismic lineations on the Northern Hayward Fault, California
Waldhauser, F.; Ellsworth, W.L.; Cole, A.
1999-01-01
A high-resolution relative earthquake location procedure is used to image the fine-scale seismicity structure of the northern Hayward fault, California. The seismicity defines a narrow, near-vertical fault zone containing horizontal alignments of hypocenters extending along the fault zone. The lineations persist over the 15-year observation interval, implying the localization of conditions on the fault where brittle failure conditions are met. The horizontal orientation of the lineations parallels the slip direction of the fault, suggesting that they are the result of the smearing of frictionally weak material along the fault plane over thousands of years.
Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul
2010-03-16
A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.
Arcmancer: Geodesics and polarized radiative transfer library
NASA Astrophysics Data System (ADS)
Pihajoki, Pauli; Mannerkoski, Matias; Nättilä, Joonas; Johansson, Peter H.
2018-05-01
Arcmancer computes geodesics and performs polarized radiative transfer in user-specified spacetimes. The library supports Riemannian and semi-Riemannian spaces of any dimension and metric; it also supports multiple simultaneous coordinate charts, embedded geometric shapes, local coordinate systems, and automatic parallel propagation. Arcmancer can be used to solve various problems in numerical geometry, such as solving the curve equation of motion using adaptive integration with configurable tolerances and differential equations along precomputed curves. It also provides support for curves with an arbitrary acceleration term and generic tools for generating ray initial conditions and performing parallel computation over the image, among other tools.
Hypercluster - Parallel processing for computational mechanics
NASA Technical Reports Server (NTRS)
Blech, Richard A.
1988-01-01
An account is given of the development status, performance capabilities and implications for further development of NASA-Lewis' testbed 'hypercluster' parallel computer network, in which multiple processors communicate through a shared memory. Processors have local as well as shared memory; the hypercluster is expanded in the same manner as the hypercube, with processor clusters replacing the normal single processor node. The NASA-Lewis machine has three nodes with a vector personality and one node with a scalar personality. Each of the vector nodes uses four board-level vector processors, while the scalar node uses four general-purpose microcomputer boards.
Algorithm implementation on the Navier-Stokes computer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krist, S.E.; Zang, T.A.
1987-03-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Algorithm implementation on the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Krist, Steven E.; Zang, Thomas A.
1987-01-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Ibrahim, Khaled Z.; Madduri, Kamesh; Williams, Samuel; ...
2013-07-18
The Gyrokinetic Toroidal Code (GTC) uses the particle-in-cell method to efficiently simulate plasma microturbulence. This paper presents novel analysis and optimization techniques to enhance the performance of GTC on large-scale machines. We introduce cell access analysis to better manage locality vs. synchronization tradeoffs on CPU and GPU-based architectures. Finally, our optimized hybrid parallel implementation of GTC uses MPI, OpenMP, and NVIDIA CUDA, achieves up to a 2× speedup over the reference Fortran version on multiple parallel systems, and scales efficiently to tens of thousands of cores.
A parallel-machine scheduling problem with two competing agents
NASA Astrophysics Data System (ADS)
Lee, Wen-Chiung; Chung, Yu-Hsiang; Wang, Jen-Ya
2017-06-01
Scheduling with two competing agents has become popular in recent years. Most of the research has focused on single-machine problems. This article considers a parallel-machine problem, the objective of which is to minimize the total completion time of jobs from the first agent given that the maximum tardiness of jobs from the second agent cannot exceed an upper bound. The NP-hardness of this problem is also examined. A genetic algorithm equipped with local search is proposed to search for the near-optimal solution. Computational experiments are conducted to evaluate the proposed genetic algorithm.
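To make the objective and constraint above concrete, the following Python sketch evaluates a given parallel-machine schedule: it computes the total completion time of the first agent's jobs and checks the maximum tardiness of the second agent's jobs against the bound. The data layout, the agent labels "A" and "B", and the function name `evaluate_schedule` are assumptions for illustration; the genetic algorithm and local search themselves are not reproduced here.

```python
def evaluate_schedule(schedule, jobs, tardiness_bound):
    """
    schedule: list of job-id sequences, one per machine.
    jobs: job id -> (agent, processing_time, due_date); due_date is unused for agent "A".
    Returns (total completion time of agent A, max tardiness of agent B, feasible?).
    """
    total_completion_a = 0
    max_tardiness_b = 0
    for sequence in schedule:
        clock = 0
        for job_id in sequence:
            agent, processing_time, due_date = jobs[job_id]
            clock += processing_time  # completion time of this job
            if agent == "A":
                total_completion_a += clock
            else:
                max_tardiness_b = max(max_tardiness_b, clock - due_date)
    return total_completion_a, max_tardiness_b, max_tardiness_b <= tardiness_bound

# Usage: two machines, two jobs per agent.
jobs = {0: ("A", 2, None), 1: ("B", 3, 4), 2: ("A", 1, None), 3: ("B", 2, 2)}
objective, tardiness, feasible = evaluate_schedule([[0, 1], [2, 3]], jobs, tardiness_bound=1)
```

A search procedure such as the genetic algorithm mentioned in the abstract would call an evaluation like this one to compare candidate schedules and reject infeasible ones.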
Li, Yiming; Ishitsuka, Yuji; Hedde, Per Niklas; Nienhaus, G Ulrich
2013-06-25
In localization-based super-resolution microscopy, individual fluorescent markers are stochastically photoactivated and subsequently localized within a series of camera frames, yielding a final image with a resolution far beyond the diffraction limit. Yet, before localization can be performed, the subregions within the frames where the individual molecules are present have to be identified, oftentimes in the presence of high background. In this work, we address the importance of reliable molecule identification for the quality of the final reconstructed super-resolution image. We present a fast and robust algorithm (a-livePALM) that vastly improves the molecule detection efficiency while minimizing false assignments that can lead to image artifacts.
Ultra wide-band localization and SLAM: a comparative study for mobile robot navigation.
Segura, Marcelo J; Auat Cheein, Fernando A; Toibero, Juan M; Mut, Vicente; Carelli, Ricardo
2011-01-01
In this work, a comparative study between an Ultra Wide-Band (UWB) localization system and a Simultaneous Localization and Mapping (SLAM) algorithm is presented. Due to its high bandwidth and short pulse length, UWB potentially allows great accuracy in range measurements based on Time of Arrival (TOA) estimation. SLAM algorithms recursively estimate the map of an environment and the pose (position and orientation) of a mobile robot within that environment. The comparative study presented here involves the performance analysis of implementing in parallel a UWB localization based system and a SLAM algorithm on a mobile robot navigating within an environment. Real time results as well as error analysis are also shown in this work.
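As a brief illustration of why short pulses matter for TOA ranging, the sketch below converts a one-way time of flight into a distance. It assumes synchronized clocks and free-space propagation at the speed of light; the function name `toa_range` is hypothetical and the snippet is not taken from the study.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum

def toa_range(time_of_flight_s):
    """Distance implied by a one-way time of flight (assumes synchronized clocks)."""
    return SPEED_OF_LIGHT * time_of_flight_s

# A timing error of one nanosecond corresponds to roughly 0.3 m of range error.
range_error_m = toa_range(1e-9)
```

Under these assumptions, resolving arrival times to a fraction of a nanosecond is what allows range accuracy on the order of centimetres.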
Sun, Miao; Tang, Yuquan; Yang, Shuang; Li, Jun; Sigrist, Markus W; Dong, Fengzhong
2016-06-06
We propose a method for localizing a fire source using an optical fiber distributed temperature sensor system. A section of two parallel optical fibers employed as the sensing element is installed near the ceiling of a closed room in which the fire source is located. By measuring the temperature of hot air flows, the problem of three-dimensional fire source localization is transformed to two dimensions. The method of the source location is verified with experiments using burning alcohol as fire source, and it is demonstrated that the method represents a robust and reliable technique for localizing a fire source also for long sensing ranges.
Oscillations in stellar atmospheres
NASA Technical Reports Server (NTRS)
Costa, A.; Ringuelet, A. E.; Fontenla, J. M.
1989-01-01
Atmospheric excitation and propagation of oscillations are analyzed for typical pulsating stars. The linear, plane-parallel approach for the pulsating atmosphere gives a local description of the phenomenon. From the local analysis of oscillations, the minimum frequencies are obtained for radially propagating waves. The comparison of the minimum frequencies obtained for a variety of stellar types is in good agreement with the observed periods of the oscillations. The role of the atmosphere in the global stellar pulsations is thus emphasized.
Coupled Ocean/Atmospheric Mesoscale Prediction System (COAMPS), Version 5.0 (User’s Guide)
2010-03-30
provides tools for common modeling functions, as well as regridding, data decomposition, and communication on parallel computers. NRL/MR/7320--10...specified gncomDir. If running COAMPS at the DSRC (e.g. BABBAGE, DAVINCI, or EINSTEIN), the global NCOM files will be copied to /scr/[user]/COAMPS/data...the site (DSRC or local) and the platform (BABBAGE, DAVINCI, EINSTEIN, or local machine) on which COAMPS is being run. site=navy_dsrc (for DSRC
Electric field dependent local structure of (KxNa1-x)0.5Bi0.5TiO3
NASA Astrophysics Data System (ADS)
Goetzee-Barral, A. J.; Usher, T.-M.; Stevenson, T. J.; Jones, J. L.; Levin, I.; Brown, A. P.; Bell, A. J.
2017-07-01
The in situ x-ray pair-distribution function (PDF) characterization technique has been used to study the behavior of (KxNa1-x)0.5Bi0.5TiO3 as a function of electric field. As opposed to conventional x-ray Bragg diffraction techniques, PDF is sensitive to local atomic displacements, detecting local structural changes at the angstrom to nanometer scale. Several field-dependent ordering mechanisms can be observed in x = 0.15, 0.18 and at the morphotropic phase boundary composition x = 0.20. X-ray total scattering shows suppression of diffuse scattering with increasing electric-field amplitude, indicative of an increase in structural ordering. Analysis of PDF peaks in the 3-4 Å range shows ordering of Bi-Ti distances parallel to the applied electric field, illustrated by peak amplitude redistribution parallel and perpendicular to the electric-field vector. A transition from <110> to <112>-type off-center displacements of Bi relative to the neighboring Ti atoms is observable with increasing x. Analysis of PDF peak shift with electric field shows the effects of Bi-Ti redistribution and onset of piezoelectric lattice strain. The combination of these field-induced ordering mechanisms is consistent with local redistribution of Bi-Ti distances associated with domain reorientation and an overall increase in order of atomic displacements.
Electric field dependent local structure of (KxNa1-x)0.5Bi0.5TiO3
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goetzee-Barral, A. J.; Usher, T. -M.; Stevenson, T. J.
The in situ x-ray pair-distribution function (PDF) characterization technique has been used to study the behavior of (KxNa1–x)0.5Bi0.5TiO3 as a function of electric field. As opposed to conventional x-ray Bragg diffraction techniques, PDF is sensitive to local atomic displacements, detecting local structural changes at the angstrom to nanometer scale. Several field-dependent ordering mechanisms can be observed in x = 0.15, 0.18 and at the morphotropic phase boundary composition x = 0.20. X-ray total scattering shows suppression of diffuse scattering with increasing electric-field amplitude, indicative of an increase in structural ordering. Analysis of PDF peaks in the 3–4 Å range shows ordering of Bi-Ti distances parallel to the applied electric field, illustrated by peak amplitude redistribution parallel and perpendicular to the electric-field vector. A transition from <110> to <112>-type off-center displacements of Bi relative to the neighboring Ti atoms is observable with increasing x. Analysis of PDF peak shift with electric field shows the effects of Bi-Ti redistribution and onset of piezoelectric lattice strain. Furthermore, the combination of these field-induced ordering mechanisms is consistent with local redistribution of Bi-Ti distances associated with domain reorientation and an overall increase in order of atomic displacements.
Electric field dependent local structure of ( K x N a 1 - x ) 0.5 B i 0.5 Ti O 3
Goetzee-Barral, A. J.; Usher, T. -M.; Stevenson, T. J.; ...
2017-07-31
The in situ x-ray pair-distribution function (PDF) characterization technique has been used to study the behavior of (KxNa1-x)0.5Bi0.5TiO3 as a function of electric field. As opposed to conventional x-ray Bragg diffraction techniques, PDF is sensitive to local atomic displacements, detecting local structural changes at the angstrom to nanometer scale. Several field-dependent ordering mechanisms can be observed in x = 0.15, 0.18 and at the morphotropic phase boundary composition x = 0.20. X-ray total scattering shows suppression of diffuse scattering with increasing electric-field amplitude, indicative of an increase in structural ordering. Analysis of PDF peaks in the 3–4-Å range shows ordering of Bi-Ti distances parallel to the applied electric field, illustrated by peak amplitude redistribution parallel and perpendicular to the electric-field vector. A transition from <110> to <112>-type off-center displacements of Bi relative to the neighboring Ti atoms is observable with increasing x. Analysis of PDF peak shift with electric field shows the effects of Bi-Ti redistribution and onset of piezoelectric lattice strain. Furthermore, the combination of these field-induced ordering mechanisms is consistent with local redistribution of Bi-Ti distances associated with domain reorientation and an overall increase in order of atomic displacements.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-07
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
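The timing figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: `split_histories` and `speedup` are hypothetical helper names, not part of EGS5 or of the authors' Python script. It divides a batch of particle histories evenly among worker nodes and computes the speed-up implied by 2.58 h locally versus 3.3 min on 100 nodes.

```python
def split_histories(total_histories, num_workers):
    """Divide particle histories as evenly as possible among workers."""
    base, extra = divmod(total_histories, num_workers)
    # The first `extra` workers each take one additional history.
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel wall-clock time."""
    return serial_seconds / parallel_seconds


# Figures quoted in the abstract: 2.58 h locally, 3.3 min on 100 nodes.
serial = 2.58 * 3600
parallel = 3.3 * 60
nodes = 100

shares = split_histories(1_000_000, nodes)
s = speedup(serial, parallel)
efficiency = s / nodes  # relative to the local machine, not to one cloud node
print(f"histories per node: {shares[0]}, speed-up: {s:.1f}x, efficiency: {efficiency:.2f}")
```

The ratio comes out at roughly 47, matching the quoted figure. The efficiency number is measured against the local computer, whose speed may differ from that of a single cloud node, so it should not be read as a statement about scaling within the cloud.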
Performing a local barrier operation
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E
2014-03-04
Performing a local barrier operation with parallel tasks executing on a compute node including, for each task: retrieving a present value of a counter; calculating, in dependence upon the present value of the counter and a total number of tasks performing the local barrier operation, a base value, the base value representing the counter's value prior to any task joining the local barrier; calculating, in dependence upon the base value and the total number of tasks performing the local barrier operation, a target value of the counter, the target value representing the counter's value when all tasks have joined the local barrier; joining the local barrier, including atomically incrementing the value of the counter; and repetitively, until the present value of the counter is no less than the target value of the counter: retrieving the present value of the counter and determining whether the present value equals the target value.
Performing a local barrier operation
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E
2014-03-04
Performing a local barrier operation with parallel tasks executing on a compute node including, for each task: retrieving a present value of a counter; calculating, in dependence upon the present value of the counter and a total number of tasks performing the local barrier operation, a base value of the counter, the base value representing the counter's value prior to any task joining the local barrier; calculating, in dependence upon the base value and the total number of tasks performing the local barrier operation, a target value, the target value representing the counter's value when all tasks have joined the local barrier; joining the local barrier, including atomically incrementing the value of the counter; and repetitively, until the present value of the counter is no less than the target value of the counter: retrieving the present value of the counter and determining whether the present value equals the target value.
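The counter arithmetic described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the patented implementation: it assumes the base value is the present counter value rounded down to a multiple of the task count, and it uses a Python lock to stand in for a hardware atomic increment. The names `CounterBarrier` and `run_demo` are hypothetical.

```python
import threading
import time


class CounterBarrier:
    """Reusable barrier driven by a single, ever-increasing counter."""

    def __init__(self, num_tasks):
        self.num_tasks = num_tasks
        self._counter = 0
        # Assumption: a lock emulates the atomic increment named in the abstract.
        self._lock = threading.Lock()

    def _load(self):
        with self._lock:
            return self._counter

    def wait(self):
        present = self._load()
        # Counter value before any task joined the current round.
        base = present - (present % self.num_tasks)
        # Counter value once every task has joined.
        target = base + self.num_tasks
        with self._lock:
            self._counter += 1  # join the barrier
        # ">= target" rather than "== target": a fast task may already have
        # joined the next round and pushed the counter past the target.
        while self._load() < target:
            time.sleep(0)  # yield to other threads


def run_demo(num_tasks=4, rounds=3):
    """Run `rounds` barrier phases; return (final counter, ordering violations)."""
    barrier = CounterBarrier(num_tasks)
    arrivals = [0] * rounds
    violations = []
    guard = threading.Lock()

    def task():
        for r in range(rounds):
            with guard:
                arrivals[r] += 1
            barrier.wait()
            with guard:
                # After the barrier, every task must have arrived in round r.
                if arrivals[r] != num_tasks:
                    violations.append(r)

    threads = [threading.Thread(target=task) for _ in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return barrier._load(), len(violations)
```

Because the counter is never reset, the same object serves successive barrier rounds; this is why the exit condition is "no less than the target" rather than strict equality.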
NASA Astrophysics Data System (ADS)
Chang, Faliang; Liu, Chunsheng
2017-09-01
The high variability of sign colors and shapes in uncontrolled environments has made the detection of traffic signs a challenging problem in computer vision. We propose a traffic sign detection (TSD) method based on coarse-to-fine cascade and parallel support vector machine (SVM) detectors to detect Chinese warning and danger traffic signs. First, a region of interest (ROI) extraction method is proposed to extract ROIs using color contrast features in local regions. The ROI extraction can reduce scanning regions and save detection time. For multiclass TSD, we propose a structure that combines a coarse-to-fine cascaded tree with a parallel structure of histogram of oriented gradients (HOG) + SVM detectors. The cascaded tree is designed to detect different types of traffic signs in a coarse-to-fine process. The parallel HOG + SVM detectors are designed to do fine detection of different types of traffic signs. The experiments demonstrate that the proposed TSD method can rapidly detect multiclass traffic signs with different colors and shapes with high accuracy.
Integrating Cache Performance Modeling and Tuning Support in Parallelization Tools
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
With the resurgence of distributed shared memory (DSM) systems based on cache-coherent Non Uniform Memory Access (ccNUMA) architectures and increasing disparity between memory and processor speeds, data locality overheads are becoming the greatest bottlenecks in the way of realizing potential high performance of these systems. While parallelization tools and compilers help users port their sequential applications to a DSM system, a lot of time and effort is needed to tune the memory performance of these applications to achieve reasonable speedup. In this paper, we show that integrating cache performance modeling and tuning support within a parallelization environment can alleviate this problem. The Cache Performance Modeling and Prediction Tool (CPMP) employs trace-driven simulation techniques without the overhead of generating and managing detailed address traces. CPMP predicts the cache performance impact of source code level "what-if" modifications in a program to assist a user in the tuning process. CPMP is built on top of a customized version of the Computer Aided Parallelization Tools (CAPTools) environment. Finally, we demonstrate how CPMP can be applied to tune a real Computational Fluid Dynamics (CFD) application.
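To make the idea of a source-level "what-if" cache study concrete, here is a small sketch of plain trace-driven simulation. It is not CPMP, which according to the abstract avoids explicit address traces; the function names, the direct-mapped cache, and its 64-line by 32-byte geometry are all assumptions chosen for illustration. The what-if question is whether a loop interchange changes the miss count for a sweep over a row-major array.

```python
def simulate_direct_mapped(trace, num_lines=64, line_bytes=32):
    """Return the miss count for a byte-address trace on a direct-mapped cache."""
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes   # memory block holding this address
        index = block % num_lines    # the only cache line it may occupy
        if tags[index] != block:
            tags[index] = block
            misses += 1
    return misses


def matrix_trace(rows, cols, row_major, elem_bytes=8):
    """Addresses touched when sweeping a row-major-stored rows x cols array."""
    if row_major:
        order = ((i, j) for i in range(rows) for j in range(cols))
    else:
        order = ((i, j) for j in range(cols) for i in range(rows))
    return [(i * cols + j) * elem_bytes for i, j in order]


row_misses = simulate_direct_mapped(matrix_trace(128, 128, row_major=True))
col_misses = simulate_direct_mapped(matrix_trace(128, 128, row_major=False))
print(f"row-order sweep: {row_misses} misses, column-order sweep: {col_misses} misses")
```

With this geometry the row-order sweep misses once per cache line, while the column-order sweep misses on every access, which is the kind of difference a tuning tool would surface before the user edits the code.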
ERIC Educational Resources Information Center
Roberts, Jack
2008-01-01
A combination of commercialism and professionalism has become a powerful force undermining the wholesome nature of amateur athletic programs in local secondary schools of America. The growth in popularity of professional sports in America parallels the introduction of television. The pervasive influence of television on American life has driven…
NASA Astrophysics Data System (ADS)
Bazhenov, V. G.; Bragov, A. M.; Konstantinov, A. Yu.; Kotov, V. L.
2015-05-01
This paper presents an analysis of the accuracy of known and new modeling methods using the hypothesis of local and plane sections for solution of problems of the impact and plane-parallel motion of conical bodies at an angle to the free surface of the half-space occupied by elastoplastic soil. The parameters of the local interaction model that is quadratic in velocity are determined by solving the one-dimensional problem of the expansion of a spherical cavity. Axisymmetric problems for each of the meridional sections are solved simultaneously, neglecting mass and momentum transfer in the circumferential direction and using an approach based on the hypothesis of plane sections. The dynamic and kinematic parameters of oblique penetration obtained using modified models are compared with the results of computer simulation in a three-dimensional formulation. The results obtained with regard to the contact stress distribution along the generator of the pointed cone are in satisfactory agreement.
Thermo-elastic wave model of the photothermal and photoacoustic signal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meja, P.; Steiger, B.; Delsanto, P.P.
1996-12-31
By means of the thermo-elastic wave equation, the dynamical propagation of mechanical stress and temperature can be described and applied to model the photothermal and photoacoustic signal. Analytical solutions exist only in particular cases. Using massively parallel computers it is possible to simulate the photothermal and photoacoustic signal efficiently. In this paper the local interaction simulation approach (LISA) is presented and selected examples of its application are given. The advantages of this method, which is particularly suitable for parallel processing, consist in reduced computation time and simple description of the photoacoustic signal in optical materials. The present contribution introduces the authors' model, the formalism and some results in the 1D case for homogeneous nonattenuative materials. The photoacoustic wave can be understood as a wave with locally limited displacement. This displacement corresponds to a temperature variation. Both variables are usually measured in photoacoustic and photothermal measurements. Therefore the dependence of temperature and displacement on optical, elastic and thermal constants is analysed.
["Dual Guidance"? - parallel combination of ultrasound-guidance and nerve stimulation - Contra].
Maecken, Tim
2015-07-01
Sonography is a highly user-dependent technology. It presupposes a considerable degree of sonoanatomic and sonographic knowledge and requires good practical skills of the examiner. Sonography allows the examiner to identify the puncture target, observe the needle advancement and assess the spread pattern of the local anesthetic in real time. Peripheral electrical nerve stimulation (PNS) cannot offer these advantages to the same degree, but may allow nerve localization under difficult sonographic conditions. The combination of the two locating techniques is complex in its practical implementation. In part, the use of one locating technique is made even more difficult by the combination with the second. PNS in parallel to sonography serves primarily as a warning technology in the case of an invisible cannula tip. It should not be construed as a compensation technique for the lack of sonographic skills or knowledge. However, PNS may be helpful in the sense of a bridging technology as long as the user is aware of its limitations. © Georg Thieme Verlag Stuttgart · New York.
Asymptotic-preserving Lagrangian approach for modeling anisotropic transport in magnetized plasmas
NASA Astrophysics Data System (ADS)
Chacon, Luis; Del-Castillo-Negrete, Diego
2012-03-01
Modeling electron transport in magnetized plasmas is extremely challenging due to the extreme anisotropy between parallel (to the magnetic field) and perpendicular directions (the transport-coefficient ratio χ∥/χ⊥ ~ 10^10 in fusion plasmas). Recently, a novel Lagrangian Green's function method has been proposed [D. del-Castillo-Negrete, L. Chacón, PRL 106, 195004 (2011); D. del-Castillo-Negrete, L. Chacón, Phys. Plasmas, submitted (2011)] to solve the local and non-local purely parallel transport equation in general 3D magnetic fields. The approach avoids numerical pollution, is inherently positivity-preserving, and is scalable algorithmically (i.e., work per degree-of-freedom is grid-independent). In this poster, we discuss the extension of the Lagrangian Green's function approach to include perpendicular transport terms and sources. We present an asymptotic-preserving numerical formulation, which ensures a consistent numerical discretization temporally and spatially for arbitrary χ∥/χ⊥ ratios. We will demonstrate the potential of the approach with various challenging configurations, including the case of transport across a magnetic island in cylindrical geometry.
Heralded entangling quantum gate via cavity-assisted photon scattering
NASA Astrophysics Data System (ADS)
Borges, Halyne S.; Rossatto, Daniel Z.; Luiz, Fabrício S.; Villas-Boas, Celso J.
2018-01-01
We theoretically investigate the generation of heralded entanglement between two identical atoms via cavity-assisted photon scattering in two different configurations, namely, either both atoms confined in the same cavity or trapped in spatially separated ones. Our protocols are given by a very simple and elegant single-step process, the key mechanism of which is a controlled-phase-flip gate implemented by impinging a single photon on single-sided cavities. In particular, when the atoms are localized in remote cavities, we introduce a single-step parallel quantum circuit instead of the serial process extensively adopted in the literature. We also show that such a parallel circuit can be straightforwardly applied to entangle two macroscopic clouds of atoms. Both protocols proposed here predict a high entanglement degree with a success probability close to unity for state-of-the-art parameters. Among other applications, our proposal and its extension to multiple atom-cavity systems are a step toward a suitable route for quantum networking, in particular for quantum state transfer, quantum teleportation, and nonlocal quantum memory.
NASA Astrophysics Data System (ADS)
Ingraham, M. D.; Dewers, T. A.; Heath, J. E.
2016-12-01
Utilizing the localization conditions laid out in Rudnicki 2002, the failure of a series of tests performed on Mancos shale has been analyzed. Shale specimens were tested under constant mean stress conditions in an axisymmetric stress state, with specimens cored both parallel and perpendicular to bedding. Failure data indicate that for the range of pressures tested the failure surface is well represented by a Mohr-Coulomb failure surface with a friction angle of 34.4° for specimens cored parallel to bedding, and 26.5° for specimens cored perpendicular to bedding. There is no evidence of a yield cap up to 200 MPa mean stress. Comparison with the theory shows that the best agreement in terms of band angles comes from assuming normality of the plastic strain increment. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
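For orientation, the textbook Mohr-Coulomb criterion relates shear strength to normal stress through the friction angle, τ = c + σ tan φ. The sketch below evaluates it for the two friction angles reported in this abstract. The cohesion value and the 50 MPa normal stress are placeholders chosen for illustration, since the abstract gives neither, and the function name is hypothetical.

```python
import math


def mohr_coulomb_strength(normal_stress_mpa, friction_angle_deg, cohesion_mpa=0.0):
    """Shear strength (MPa) from the linear Mohr-Coulomb criterion."""
    return cohesion_mpa + normal_stress_mpa * math.tan(math.radians(friction_angle_deg))


# Friction angles quoted in the abstract; cohesion is NOT given there,
# so it is left at zero purely as a placeholder.
for label, phi in (("parallel to bedding", 34.4), ("perpendicular to bedding", 26.5)):
    tau = mohr_coulomb_strength(50.0, phi)
    print(f"cored {label}: tau = {tau:.1f} MPa at 50 MPa normal stress")
```

Under these assumptions the larger friction angle gives the higher frictional strength at a given normal stress; a real comparison would also need the cohesion for each coring direction.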
Pei, Zongrui; Max-Planck-Inst. fur Eisenforschung, Duseldorf; Eisenbach, Markus
2017-02-06
Dislocations are among the most important defects in determining the mechanical properties of both conventional alloys and high-entropy alloys. The Peierls-Nabarro model supplies an efficient pathway to their geometries and mobility. The difficulty in solving the integro-differential Peierls-Nabarro equation is how to effectively avoid the local minima in the energy landscape of a dislocation core. Among the methods available to optimize dislocation core structures, we choose Particle Swarm Optimization, an algorithm that simulates the social behaviors of organisms. By employing more particles (a bigger swarm) and more iterative steps (allowing them to explore for a longer time), the local minima can be effectively avoided, although at a higher computational cost. The advantage of this algorithm is that it is readily parallelized on modern high-performance computing architectures. We demonstrate that the performance of our parallelized algorithm scales linearly with the number of employed cores.
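The swarm update that this abstract relies on can be sketched briefly. This is a generic, serial Particle Swarm Optimization with conventional inertia and attraction coefficients, not the authors' parallel code, and the one-dimensional Rastrigin function stands in for the Peierls-Nabarro energy functional only because it has many local minima. All names and parameter values are assumptions.

```python
import math
import random


def rastrigin_1d(x):
    """Test function with many local minima; global minimum f(0) = 0."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


def particle_swarm(f, lo, hi, num_particles=30, steps=200, seed=0,
                   inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise f on [lo, hi]; return (best_x, best_f, initial_best_f)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(num_particles)]
    vel = [0.0] * num_particles
    best_pos = pos[:]                        # each particle's best position
    best_val = [f(x) for x in pos]
    g = min(range(num_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g], best_val[g]  # swarm-wide best so far
    initial_best = g_val
    for _ in range(steps):
        for i in range(num_particles):
            r1, r2 = rng.random(), rng.random()
            # Pull toward the particle's own best and the swarm's best.
            vel[i] = (inertia * vel[i]
                      + c_personal * r1 * (best_pos[i] - pos[i])
                      + c_global * r2 * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # stay inside the box
            val = f(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val, initial_best


x, fx, f0 = particle_swarm(rastrigin_1d, -5.12, 5.12)
print(f"best x = {x:.4f}, f(x) = {fx:.6f}, best initial sample = {f0:.6f}")
```

Within one step, each particle's function evaluation is independent of the others, which is the property that makes the method straightforward to distribute over cores; a parallel variant would typically evaluate all particles first and then update the swarm-wide best.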
Direct Observation of Parallel Folding Pathways Revealed Using a Symmetric Repeat Protein System
Aksel, Tural; Barrick, Doug
2014-01-01
Although progress has been made to determine the native fold of a polypeptide from its primary structure, the diversity of pathways that connect the unfolded and folded states has not been adequately explored. Theoretical and computational studies predict that proteins fold through parallel pathways on funneled energy landscapes, although experimental detection of pathway diversity has been challenging. Here, we exploit the high translational symmetry and the direct length variation afforded by linear repeat proteins to directly detect folding through parallel pathways. By comparing folding rates of consensus ankyrin repeat proteins (CARPs), we find a clear increase in folding rates with increasing size and repeat number, although the size of the transition states (estimated from denaturant sensitivity) remains unchanged. The increase in folding rate with chain length, as opposed to a decrease expected from typical models for globular proteins, is a clear demonstration of parallel pathways. This conclusion is not dependent on extensive curve-fitting or structural perturbation of protein structure. By globally fitting a simple parallel-Ising pathway model, we have directly measured nucleation and propagation rates in protein folding, and have quantified the fluxes along each path, providing a detailed energy landscape for folding. This finding of parallel pathways differs from results from kinetic studies of repeat-proteins composed of sequence-variable repeats, where modest repeat-to-repeat energy variation coalesces folding into a single, dominant channel. Thus, for globular proteins, which have much higher variation in local structure and topology, parallel pathways are expected to be the exception rather than the rule. PMID:24988356
User’s Guide for the Coupled Ocean/Atmospheric Mesoscale Prediction System (COAMPS) Version 5.0
2010-03-30
provides tools for common modeling functions, as well as regridding, data decomposition, and communication on parallel computers. NRL/MR/7320...specified gncomDir. If running COAMPS at the DSRC (e.g. BABBAGE, DAVINCI, or EINSTEIN), the global NCOM files will be copied to /scr/[user]/COAMPS/data...the site (DSRC or local) and the platform (BABBAGE, DAVINCI, EINSTEIN, or local machine) on which COAMPS is being run. site=navy_dsrc (for DSRC
2011-04-01
roll rates are estimates of projectile roll rates with respect to the sun and the local geomagnetic field respectively. The solar aspect angle is the...vector and a vector originating at the CG and parallel to the local geomagnetic field. Methodologies employed to obtain these and other airframe states...and an independent approach (POINTER) and relative magnitude information about the side moments was obtained. VAPP-24 underwent a reversal in coning
A Data Parallel Multizone Navier-Stokes Code
NASA Technical Reports Server (NTRS)
Jespersen, Dennis C.; Levit, Creon; Kwak, Dochan (Technical Monitor)
1995-01-01
We have developed a data parallel multizone compressible Navier-Stokes code on the Connection Machine CM-5. The code is set up for implicit time-stepping on single or multiple structured grids. For multiple grids and geometrically complex problems, we follow the "chimera" approach, where flow data on one zone is interpolated onto another in the region of overlap. We will describe our design philosophy and give some timing results for the current code. The design choices can be summarized as: 1. finite differences on structured grids; 2. implicit time-stepping with either distributed solves or data motion and local solves; 3. sequential stepping through multiple zones with interzone data transfer via a distributed data structure. We have implemented these ideas on the CM-5 using CMF (Connection Machine Fortran), a data parallel language which combines elements of Fortran 90 and certain extensions, and which bears a strong similarity to High Performance Fortran (HPF). One interesting feature is the issue of turbulence modeling, where the architecture of a parallel machine makes the use of an algebraic turbulence model awkward, whereas models based on transport equations are more natural. We will present some performance figures for the code on the CM-5, and consider the issues involved in transitioning the code to HPF for portability to other parallel platforms.
Split off-specular reflection and surface scattering from woven materials
NASA Astrophysics Data System (ADS)
Pont, Sylvia C.; Koenderink, Jan J.
2003-03-01
We measured radiance distributions for black lining cloth and copper gauze using the convenient technique of wrapping the materials around a circular cylinder, irradiating it with a parallel light source and collecting the scattered radiance by a digital camera. One family of parallel threads (warp or weft) was parallel to the cylinder generator. The most salient features for such glossy plane weaves are a splitting up of the reflection peak due to the wavy variations in local slopes of the threads around the cylinders and a surface scattering lobe due to the threads that run along the cylinder. These scattering characteristics are quite different from the (off-)specular peaks and lobes that were found before for random rough specular surfaces. The split off-specular reflection is due to the regular structures in our samples of man-made materials. We derived simple approximations for these reflectance characteristics using geometrical optics.
Passing in Command Line Arguments and Parallel Cluster/Multicore Batching in R with batch.
Hoffmann, Thomas J
2011-03-01
It is often useful to rerun a command line R script with some slight change in the parameters used to run it - a new set of parameters for a simulation, a different dataset to process, etc. The R package batch provides a means to pass multiple command line options, including vectors of values in the usual R format, easily into R. The same script can be set up to run things in parallel via different command line arguments. The R package batch also provides a means to simplify this parallel batching by allowing one to use R and an R-like syntax for arguments to spread a script across a cluster or local multicore/multiprocessor computer, with automated syntax for several popular cluster types. Finally, it provides a means to aggregate the results of multiple processes run on a cluster.
NASA Astrophysics Data System (ADS)
Ji, X.; Shen, C.
2017-12-01
Flood inundation presents substantial societal hazards and also changes biogeochemistry for systems like the Amazon. It is often expensive to simulate high-resolution flood inundation and propagation in a long-term watershed-scale model. Due to the Courant-Friedrichs-Lewy (CFL) restriction, high resolution and large local flow velocity both demand prohibitively small time steps even for parallel codes. Here we develop a parallel surface-subsurface process-based model enhanced by multi-resolution meshes that are adaptively switched on or off. The high-resolution overland flow meshes are enabled only when the flood wave invades floodplains. This model applies a semi-implicit, semi-Lagrangian (SISL) scheme in solving dynamic wave equations, and with the assistance of the multi-mesh method, it also adaptively chooses the dynamic wave equation only in the area of deep inundation. Therefore, the model achieves a balance between accuracy and computational cost.
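The CFL restriction mentioned in this abstract can be made concrete with a short calculation. The sketch below uses the standard explicit shallow-water bound, dt ≤ C·dx / (|u| + sqrt(g·h)); the function name and the sample cell sizes, velocity and depth are illustrative assumptions, not values from the study.

```python
import math


def cfl_time_step(dx, velocity, depth=0.0, courant=1.0, g=9.81):
    """Largest stable explicit time step (s) for cell size dx (m).

    Signal speed is |velocity| plus the shallow-water gravity-wave
    celerity sqrt(g * depth); depth=0 reduces to pure advection.
    """
    wave_speed = abs(velocity) + math.sqrt(g * depth)
    if wave_speed == 0.0:
        return math.inf
    return courant * dx / wave_speed


# Illustrative numbers only: refining the mesh tenfold cuts the step tenfold.
for dx in (100.0, 10.0):
    dt = cfl_time_step(dx, velocity=2.0, depth=1.0)
    print(f"dx = {dx:5.1f} m -> dt <= {dt:.2f} s")
```

This is the cost that motivates switching fine meshes on only where needed; semi-implicit, semi-Lagrangian schemes such as the one named in the abstract are generally used to relax this bound.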
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deslippe, Jack; da Jornada, Felipe H.; Vigil-Fowler, Derek
2016-10-06
We profile and optimize calculations performed with the BerkeleyGW code on the Xeon-Phi architecture. BerkeleyGW depends both on hand-tuned critical kernels as well as on BLAS and FFT libraries. We describe the optimization process and performance improvements achieved. We discuss a layered parallelization strategy to take advantage of vector, thread and node-level parallelism. We discuss locality changes (including the consequence of the lack of L3 cache) and effective use of the on-package high-bandwidth memory. We show preliminary results on Knights-Landing including a roofline study of code performance before and after a number of optimizations. We find that the GW method is particularly well-suited for many-core architectures due to the ability to exploit a large amount of parallelism over plane-wave components, band-pairs, and frequencies.
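A roofline study, as mentioned in this abstract, bounds a kernel's attainable performance by the lesser of peak compute and memory bandwidth times arithmetic intensity. The sketch below shows the calculation. The peak and bandwidth figures are round placeholder numbers, not measurements from the paper or vendor specifications, and the function name is hypothetical.

```python
def roofline_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s for a kernel with the given FLOPs-per-byte ratio."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


# Placeholder machine parameters (assumptions for illustration only).
PEAK = 2000.0          # GFLOP/s
FAST_MEMORY = 400.0    # GB/s, standing in for on-package memory
SLOW_MEMORY = 90.0     # GB/s, standing in for off-package memory

for ai in (0.5, 2.0, 8.0, 32.0):
    fast = roofline_gflops(ai, PEAK, FAST_MEMORY)
    slow = roofline_gflops(ai, PEAK, SLOW_MEMORY)
    print(f"intensity {ai:5.1f} FLOP/byte: {fast:7.1f} vs {slow:7.1f} GFLOP/s")
```

Kernels to the left of the ridge point (peak divided by bandwidth) are memory-bound and benefit from the faster memory; kernels to the right are compute-bound and benefit from vectorization and threading instead.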
User-Defined Data Distributions in High-Level Programming Languages
NASA Technical Reports Server (NTRS)
Diaconescu, Roxana E.; Zima, Hans P.
2006-01-01
One of the characteristic features of today's high performance computing systems is a physically distributed memory. Efficient management of locality is essential for meeting key performance requirements for these architectures. The standard technique for dealing with this issue has involved the extension of traditional sequential programming languages with explicit message passing, in the context of a processor-centric view of parallel computation. This has resulted in complex and error-prone assembly-style codes in which algorithms and communication are inextricably interwoven. This paper presents a high-level approach to the design and implementation of data distributions. Our work is motivated by the need to improve the current parallel programming methodology by introducing a paradigm supporting the development of efficient and reusable parallel code. This approach is currently being implemented in the context of a new programming language called Chapel, which is designed in the HPCS project Cascade.
Analysis techniques for diagnosing runaway ion distributions in the reversed field pinch
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, J., E-mail: jkim536@wisc.edu; Anderson, J. K.; Capecchi, W.
2016-11-15
An advanced neutral particle analyzer (ANPA) on the Madison Symmetric Torus measures deuterium ions of energy ranges 8-45 keV with an energy resolution of 2-4 keV and time resolution of 10 μs. Three different experimental configurations measure distinct portions of the naturally occurring fast ion distributions: fast ions moving parallel, anti-parallel, or perpendicular to the plasma current. On a radial-facing port, fast ions moving perpendicular to the current have the necessary pitch to be measured by the ANPA. With the diagnostic positioned on a tangent line through the plasma core, a chord integration over fast ion density, background neutral density,more » and local appropriate pitch defines the measured sample. The plasma current can be reversed to measure anti-parallel fast ions in the same configuration. Comparisons of energy distributions for the three configurations show an anisotropic fast ion distribution favoring high pitch ions.« less
NASA Astrophysics Data System (ADS)
Vera, N. C.; GMMC
2013-05-01
In this paper we present results for macrohybrid mixed Darcian flow in porous media in a general three-dimensional domain. The global problem is solved as a set of local subproblems, which are posed using a domain decomposition method. The unknown fields of the local problems, velocity and pressure, are approximated using mixed finite elements. For this application, a general three-dimensional domain is considered and discretized using tetrahedra. The discrete domain is decomposed into subdomains, and the original problem is reformulated as a set of subproblems that communicate through their interfaces. To solve this set of subproblems, we use mixed finite elements and parallel computing. Parallelizing a problem with this methodology can, in principle, fully exploit the available computing equipment and also provides results in less time, two very important elements in modeling.
Ontogeny of the sheathing leaf base in maize (Zea mays).
Johnston, Robyn; Leiboff, Samuel; Scanlon, Michael J
2015-01-01
Leaves develop from the shoot apical meristem (SAM) via recruitment of leaf founder cells. Unlike eudicots, most monocot leaves display parallel venation and sheathing bases wherein the margins overlap the stem. Here we utilized computed tomography (CT) imaging, localization of PIN-FORMED1 (PIN1) auxin transport proteins, and in situ hybridization of leaf developmental transcripts to analyze the ontogeny of monocot leaf morphology in maize (Zea mays). CT imaging of whole-mounted shoot apices illustrates the plastochron-specific stages during initiation of the basal sheath margins from the tubular disc of insertion (DOI). PIN1 localizations identify basipetal auxin transport in the SAM L1 layer at the site of leaf initiation, a process that continues reiteratively during later recruitment of lateral leaf domains. Refinement of these auxin transport domains results in multiple, parallel provascular strands within the initiating primordium. By contrast, auxin is transported from the L2 toward the L1 at the developing margins of the leaf sheath. Transcripts involved in organ boundary formation and dorsiventral patterning accumulate within the DOI, preceding the outgrowth of the overlapping margins of the sheathing leaf base. We suggest a model wherein sheathing bases and parallel veins are both patterned via the extended recruitment of lateral maize leaf domains from the SAM. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Learning, memory, and the role of neural network architecture.
Hermundstad, Ann M; Brown, Kevin S; Bassett, Danielle S; Carlson, Jean M
2011-06-01
The performance of information processing systems, from artificial neural networks to natural neuronal ensembles, depends heavily on the underlying system architecture. In this study, we compare the performance of parallel and layered network architectures during sequential tasks that require both acquisition and retention of information, thereby identifying tradeoffs between learning and memory processes. During the task of supervised, sequential function approximation, networks produce and adapt representations of external information. Performance is evaluated by statistically analyzing the error in these representations while varying the initial network state, the structure of the external information, and the time given to learn the information. We link performance to complexity in network architecture by characterizing local error landscape curvature. We find that variations in error landscape structure give rise to tradeoffs in performance; these include the ability of the network to maximize accuracy versus minimize inaccuracy and produce specific versus generalizable representations of information. Parallel networks generate smooth error landscapes with deep, narrow minima, enabling them to find highly specific representations given sufficient time. While accurate, however, these representations are difficult to generalize. In contrast, layered networks generate rough error landscapes with a variety of local minima, allowing them to quickly find coarse representations. Although less accurate, these representations are easily adaptable. The presence of measurable performance tradeoffs in both layered and parallel networks has implications for understanding the behavior of a wide variety of natural and artificial learning systems.
Ultra Wide-Band Localization and SLAM: A Comparative Study for Mobile Robot Navigation
Segura, Marcelo J.; Auat Cheein, Fernando A.; Toibero, Juan M.; Mut, Vicente; Carelli, Ricardo
2011-01-01
In this work, a comparative study between an Ultra Wide-Band (UWB) localization system and a Simultaneous Localization and Mapping (SLAM) algorithm is presented. Due to its high bandwidth and short pulse length, UWB potentially allows great accuracy in range measurements based on Time of Arrival (TOA) estimation. SLAM algorithms recursively estimate the map of an environment and the pose (position and orientation) of a mobile robot within that environment. The comparative study presented here involves the performance analysis of implementing in parallel a UWB localization-based system and a SLAM algorithm on a mobile robot navigating within an environment. Real-time results as well as error analysis are also shown in this work. PMID:22319397
Parallel Implementation of the Terrain Masking Algorithm
1994-03-01
contains behavior rules which can define a computation or an algorithm. It can communicate with other process nodes, it can contain local data, and it can...terrain masking calculation is being performed. It is this algorithm that consumes about seventy percent of the total terrain masking calculation time
ERIC Educational Resources Information Center
O'Connell, Daniel C.; Kowal, Sabine; Ageneau, Carie
2005-01-01
A psycholinguistic hypothesis regarding the use of interjections in spoken utterances, originally formulated by Ameka (1992b, 1994) for the English language, but not confirmed in the German-language research of Kowal and O'Connell (2004 a & c), was tested: The local syntactic isolation of interjections is paralleled by their articulatory isolation…
Unstructured Adaptive Meshes: Bad for Your Memory?
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Feng, Hui-Yu; VanderWijngaart, Rob
2003-01-01
This viewgraph presentation explores the need for a NASA Advanced Supercomputing (NAS) parallel benchmark for problems with irregular dynamical memory access. This benchmark is important and necessary because: 1) Problems with localized error source benefit from adaptive nonuniform meshes; 2) Certain machines perform poorly on such problems; 3) Parallel implementation may provide further performance improvement but is difficult. Some examples of problems which use irregular dynamical memory access include: 1) Heat transfer problem; 2) Heat source term; 3) Spectral element method; 4) Base functions; 5) Elemental discrete equations; 6) Global discrete equations. Nonconforming Mesh and Mortar Element Method are covered in greater detail in this presentation.
Serial multiplier arrays for parallel computation
NASA Technical Reports Server (NTRS)
Winters, Kel
1990-01-01
Arrays of systolic serial-parallel multiplier elements are proposed as an alternative to conventional SIMD mesh serial adder arrays for applications that are multiplication intensive and require few stored operands. The design and operation of a number of multiplier and array configurations featuring locality of connection, modularity, and regularity of structure are discussed. A design methodology combining top-down and bottom-up techniques is described to facilitate development of custom high-performance CMOS multiplier element arrays as well as rapid synthesis of simulation models and semicustom prototype CMOS components. Finally, a differential version of NORA dynamic circuits requiring a single-phase uncomplemented clock signal is introduced for this application.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.
2015-12-02
We present the Clenshaw–Curtis Spectral Quadrature (SQ) method for real-space O(N) Density Functional Theory (DFT) calculations. In this approach, all quantities of interest are expressed as bilinear forms or sums over bilinear forms, which are then approximated by spatially localized Clenshaw–Curtis quadrature rules. This technique is identically applicable to both insulating and metallic systems, and in conjunction with local reformulation of the electrostatics, enables the O(N) evaluation of the electronic density, energy, and atomic forces. The SQ approach also permits infinite-cell calculations without recourse to Brillouin zone integration or large supercells. We employ a finite difference representation in order to exploit the locality of electronic interactions in real space, enable systematic convergence, and facilitate large-scale parallel implementation. In particular, we derive expressions for the electronic density, total energy, and atomic forces that can be evaluated in O(N) operations. We demonstrate the systematic convergence of energies and forces with respect to quadrature order as well as truncation radius to the exact diagonalization result. In addition, we show convergence with respect to mesh size to established O(N^3) planewave results. In conclusion, we establish the efficiency of the proposed approach for high temperature calculations and discuss its particular suitability for large-scale parallel computation.
Spring Break: A Lesson in Circuits. "This Old House" College Style.
ERIC Educational Resources Information Center
Duch, Barbara
2001-01-01
Introduces students to the topics of electricity and circuits within the context of house wiring. Explores the properties of series and parallel circuits, researches local wiring codes, calculates the current used by appliances based on their power ratings, and designs circuits in a typical kitchen. (Author/ASK)
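The calculations this lesson describes can be sketched in a few lines. The following is a minimal illustration only: the 120 V supply, the 20 A breaker, and the appliance power ratings are assumed values chosen for the example, not figures from the lesson.

```python
def series_resistance(resistances):
    """Equivalent resistance of resistors in series: R = R1 + R2 + ..."""
    return sum(resistances)


def parallel_resistance(resistances):
    """Equivalent resistance of resistors in parallel: 1/R = 1/R1 + 1/R2 + ..."""
    return 1.0 / sum(1.0 / r for r in resistances)


def appliance_current(power_watts, voltage_volts=120.0):
    """Current drawn by an appliance from its power rating: I = P / V.

    The 120 V default is an assumption (typical North American outlet).
    """
    return power_watts / voltage_volts


# Hypothetical kitchen loads (watts) sharing one circuit.
kitchen = {"toaster": 1200.0, "microwave": 1000.0, "coffee maker": 900.0}
breaker_amps = 20.0  # assumed breaker rating

# Household outlets are wired in parallel, so branch currents add.
total_current = sum(appliance_current(p) for p in kitchen.values())
print(f"total current: {total_current:.2f} A")
print("breaker trips" if total_current > breaker_amps else "circuit is within rating")
```

Because the outlets on one circuit are in parallel, the currents add, which is why running several high-power appliances at once can exceed the breaker rating.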
ERIC Educational Resources Information Center
Watson, Joan Q.
These 24 self-contained competency-based modules are designed to acquaint Florida adult students with laws they will meet in everyday life; fundamentals of local, state, and federal governments; and the criminal and juvenile justice systems. (The 130 objectives are categorized in the first three levels of the Cognitive Domain and parallel the…
LOCAL AND GLOBAL DYNAMICS OF POLYLACTIDES. (R826733)
Polylactides (PLAs) are a family of degradable plastics having a component of the dipole moment both perpendicular and parallel to the polymer backbone (i.e. is a type-A polymer). We have studied the sub-glass, segmental and global chain dynamics in a series of fully amorphous...
Universities, Local Partnerships and the Promotion of Youth Entrepreneurship
ERIC Educational Resources Information Center
Bezerra, Éder D.; Borges, Cândido; Andreassi, Tales
2017-01-01
Youth entrepreneurship has gained prominence in recent years, but there are few studies which investigate the characteristics of companies created by students in the university environment (also known as "student spin-off companies") or the "ecosystem" in which these companies are incubated and "hatched". In parallel,…
Information Sharing Environment Interim Implementation Plan
2006-01-01
3.2.3 Integrating Results into the Broader ISE Implementation ... 3.3 ISE Governance ...and State, Local, and Tribal Governments, Law Enforcement Agencies, and the Private Sector...parallel with these efforts, Congress enacted three laws providing the U.S. Government with greater authority for collecting, analyzing, and disseminating
Simple Derivation of the Maxwell Stress Tensor and Electrostrictive Effects in Crystals
ERIC Educational Resources Information Center
Juretschke, H. J.
1977-01-01
Shows that local equilibrium and energy considerations in an elastic dielectric crystal lead to a simple derivation of the Maxwell stress tensor in anisotropic dielectric solids. The resulting equilibrium stress-strain relations are applied to determine the deformations of a charged parallel plate capacitor. (MLH)
Control of a simulated arm using a novel combination of Cerebellar learning mechanisms
NASA Technical Reports Server (NTRS)
Assad, C.; Hartmann, M.; Paulin, M. G.
2001-01-01
We present a model of cerebellar cortex that combines two types of learning: feedforward predictive association based on local Hebbian-type learning between granule cell ascending branch and parallel fiber inputs, and reinforcement learning with feedback error correction based on climbing fiber activity.
NASA Astrophysics Data System (ADS)
Shoemaker, C. A.; Pang, M.; Akhtar, T.; Bindel, D.
2016-12-01
New parallel surrogate global optimization algorithms are developed and applied to objective functions that are expensive simulations (possibly with multiple local minima). The algorithms can be applied to most geophysical simulations, including those with nonlinear partial differential equations. The optimization does not require that the simulations be parallelized. Asynchronous (and synchronous) parallel execution is available in the optimization toolbox "pySOT". The parallel algorithms are modified from serial to eliminate fine-grained parallelism. The optimization is computed with the open source software pySOT, a Surrogate Global Optimization Toolbox that allows the user to pick the type of surrogate (or ensembles), the search procedure on the surrogate, and the type of parallelism (synchronous or asynchronous). pySOT also allows the user to develop new algorithms by modifying parts of the code. In the applications here, the objective function takes up to 30 minutes for one simulation, and serial optimization can take over 200 hours. Results from Yellowstone (NSF) and NCSS (Singapore) supercomputers are given for groundwater contaminant hydrology simulations with applications to model parameter estimation and decontamination management. All results are compared with alternatives. The first results are for optimization of pumping at many wells to reduce cost for decontamination of groundwater at a superfund site. The optimization runs with up to 128 processors. Superlinear speedup is obtained for up to 16 processors, and efficiency with 64 processors is over 80%. Each evaluation of the objective function requires the solution of nonlinear partial differential equations to describe the impact of spatially distributed pumping and model parameters on model predictions for the spatial and temporal distribution of groundwater contaminants. The second application uses an asynchronous parallel global optimization for groundwater quality model calibration. The time for a single objective function evaluation varies unpredictably, so efficiency is improved with asynchronous parallel calculations to improve load balancing. The third application (done at NCSS) incorporates new global surrogate multi-objective parallel search algorithms into pySOT and applies it to a large watershed calibration problem.
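The speedup and efficiency figures quoted above follow the usual definitions, which can be sketched as below. The parallel wall-clock times are hypothetical values picked only to reproduce the kind of numbers the abstract reports (superlinear at 16 processors, over 80% efficiency at 64); they are not measurements from the study.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / processors


def is_superlinear(t_serial, t_parallel, processors):
    """Speedup is superlinear when S exceeds the processor count p."""
    return speedup(t_serial, t_parallel) > processors


# Hypothetical wall-clock times in hours (assumed for illustration).
t_serial = 200.0
runs = {16: 10.0, 64: 3.8}  # processors -> parallel time

for p, t_p in runs.items():
    print(
        f"p={p}: speedup={speedup(t_serial, t_p):.1f}, "
        f"efficiency={efficiency(t_serial, t_p, p):.2f}, "
        f"superlinear={is_superlinear(t_serial, t_p, p)}"
    )
```

Superlinear speedup is plausible for a search algorithm of this kind because evaluating several candidate points at once can change which points are explored, not just how fast they are evaluated.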
Bouma, Arnold H.; Feeley, Mary H.; Kindinger, Jack G.; Stelting, Charles E.; Hilde, Thomas W.C.
1981-01-01
A high-resolution seismic reflection survey was conducted in a small area of the upper Louisiana Continental Slope known as Green Canyon Area. This area includes tracts 427, 428, 471, 472, 515, and 516, that will be offered for sale in March 1982 as part of Lease Sale 67. The sea floor of this region is slightly hummocky and is underlain by salt diapirs that are mantled by early Tertiary shale. Most of the shale is overlain by younger Tertiary and Quaternary deposits, although locally some of the shale protrudes the sea floor. Because of proximity to older Mississippi River sources, the sediments are thick. The sediment cover shows an abundance of geologic phenomena such as horsts, grabens, growth faults, normal faults, and consolidation faults, zones with distinct and indistinct parallel reflections, semi-transparent zones, distorted zones, and angular unconformities. The major feature of this region is a N-S linear zone of uplifted and intruded sedimentary deposits formed due to diapiric intrusion. Small scale graben development over the crest of the structure can be attributed to extension and collapse. Large scale undulations of reflections well off the flanks of the uplifted structure suggest sediment creep and slumping. Dipping of parallel reflections shows block faulting and tilting. Air gun (5 and 40 cubic inch) records reveal at least five major sequences that show masked onlap and slumping in their lower parts grading into more distinct parallel reflections in their upper parts. Such sequences can be related to local uplift and sea level changes. Minisparker records of this area show similar sequences but on a smaller scale. The distinct parallel reflections often onlap the diapir flanks. The highly reflective parts of these sequences may represent turbidite-type deposition, possibly at times of lower sea level. The acoustically more transparent parts of each sequence may represent deposits containing primarily hemipelagic and pelagic sediment. A complex ridge system is present along the west side of the area and distinct parallel reflections onlap onto this structure primarily from the east. Much of this deposition may be ascribed to sedimentation within a submarine canyon whose position is controlled by this ridge.
A Parallel Vector Machine for the PM Programming Language
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2016-04-01
PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. 
Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.
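The masked-vector idea in this abstract can be illustrated with a small sketch. This is not the PM virtual machine: the function names and the 0.25 sparsity threshold are assumptions made for the example, and plain Python lists stand in for hardware vectors.

```python
def apply_masked(values, mask, fn, sparse_threshold=0.25):
    """Apply fn only to active cells, choosing the mask representation.

    A Boolean mask is used when many cells are active; once the active
    fraction drops below sparse_threshold (an assumed value), the mask is
    converted to a list of active locations and only those are visited.
    """
    if not values:
        return []
    active = sum(mask)
    if active / len(values) < sparse_threshold:
        # Sparse case: location-list mask.
        locations = [i for i, m in enumerate(mask) if m]
        out = list(values)
        for i in locations:
            out[i] = fn(out[i])
        return out
    # Dense case: Boolean-field mask over every cell.
    return [fn(v) if m else v for v, m in zip(values, mask)]


def vector_if_else(values, cond, then_fn, else_fn):
    """Vectorised if-then-else: compute a mask, then run both branches masked."""
    mask = [cond(v) for v in values]
    inverse = [not m for m in mask]
    out = apply_masked(values, mask, then_fn)
    # Cells touched by the 'then' branch are inactive here, so each cell
    # is transformed by exactly one branch.
    return apply_masked(out, inverse, else_fn)


data = [-3, 1, 4, -1, 5]
# Negate the negatives, double everything else.
result = vector_if_else(data, lambda v: v < 0, lambda v: -v, lambda v: 2 * v)
print(result)
```

Every loop invocation follows the same instruction stream, and the mask decides which cells an instruction actually changes; switching to a location list avoids scanning the whole vector when few cells remain active.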
The line integral approach to radarclinometry
Wildey, R.L.
1987-01-01
Radarclinometry, the invention of which has been previously reported, is a technique for deriving a topographic map from a single radar image by using the dependence upon terrain-surface orientation of the integrated signal of an individual image pixel. The radiometric calibration required for precise operation and testing does not yet exist, but the imminence of important applications justifies parallel, rather than serial, development of radarclinometry and radiometrically calibrated radar. The present investigation reports three developmental advances: (1) The solid angle of integration of back-scattered specific intensity constituting a pixel signal is more accurately accounted for in its dependence on surface orientation than in previous work. (2) The local curvature hypothesis, which removes the requirement of a ground-truth profile as a boundary condition and enables the formulation of the theory in terms of a line integral, has been expanded to include the three possibilities of Local Cylindricity, Local Biaxial Ellipsoidal Hyperbolicity, and Least-Squares Local Sphericity. (3) The theory is integrated in the cross-ground-range direction, which is ill-conditioned compared to the ground-range direction, whereas the original formulation was based on enforced isotropy in the two-dimensional power spectrum of the topography. It was found necessary to prohibit the hypothesis of Local Biaxial Ellipsoidal Hyperbolicity in the cross-range stepping, for reasons not completely clear. Variation in the proportioning between curvature assumptions had produced topographic maps that are in good mutual agreement but not realistic in appearance. They are severely banded parallel to the ground-range direction, most especially at small radar zenith angles. 
Numerical experimentation with the falsification of topography through incorrect decalibration as performed on a Gaussian hill suggests that the banding and its exaggeration at high radar incidence angles could easily be due to our lack of radiometric calibration. © 1987 D. Reidel Publishing Company.
Sorin, Clément; Musse, Maja; Mariette, François; Bouchereau, Alain; Leport, Laurent
2015-02-01
Differential palisade and spongy parenchyma structural changes in oilseed rape leaf were demonstrated. These dismantling processes were linked to early senescence events and associated to remobilization processes. During leaf senescence, an ordered cell dismantling process allows efficient nutrient remobilization. However, in Brassica napus plants, an important amount of nitrogen (N) in fallen leaves is associated with low N remobilization efficiency (NRE). The leaf is a complex organ mainly constituted of palisade and spongy parenchyma characterized by different structures and functions concerning water relations and carbon fixation. The aim of the present study was to demonstrate a specific structural evolution of these parenchyma throughout natural senescence in B. napus, probably linked to differential nutrient remobilization processes. The study was performed on 340 leaves from 32 plants during an 8-week development period under controlled growing conditions. Water distribution and status at the cellular level were investigated by low-field proton nuclear magnetic resonance (NMR), while light and electron microscopy were used to observe cell and plast structure. Physiological parameters were determined on all leaves studied and used as indicators of leaf development and remobilization progress. The results revealed a process of hydration and cell enlargement of leaf tissues associated with senescence. Wide variations were observed in the palisade parenchyma while spongy cells changed only very slightly. The major new functional information revealed was the link between the early senescence events and specific tissue dismantling processes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bloß, P.; Jüttner, G.; Jacob, S.
2014-05-15
Micro plastic parts open new fields of application, e.g., in electronics, sensor technologies, optics, and medical engineering. Before micro parts can go into mass production, there is a strong need to be able to test different designs and materials, including material combinations. Hence, flexible individual technical and technological solutions for processing are necessary. To manufacture high quality micro parts, a micro injection moulding machine named formicaPlast, based on a two-step plunger injection technology, was developed. As a result of its design, the residence time and accuracy problems of managing small shot volumes with reproducible high accuracy are uncompromisingly solved. Due to their simple geometry, with smooth transitions and non-adherent inner surfaces, the plunger units allow processing of 'all' thermoplastics from polyolefins to high performance polymers, optically clear polymers, thermally sensitive bioresorbables, highly filled systems (the so-called powder injection molding, PIM), and liquid silicone rubber (LSR, here with a special kit). The applied platform strategy in the 1K and 2K version allows integrating automation for assembling, handling and packaging. A perpendicular arrangement allows encapsulation of inserts, also partially, and integration of this machine into process chains. Considering a wide variety of different parts consisting of different materials, the high potential of the technology is demonstrated. Based on challenging industrial parts from electronic applications (2K micro MID and bump mat, both of which are highly structured parts), the technological solutions are presented in more detail.
A study of gradient strengthening based on a finite-deformation gradient crystal-plasticity model
NASA Astrophysics Data System (ADS)
Pouriayevali, Habib; Xu, Bai-Xiang
2017-11-01
A comprehensive study on a finite-deformation gradient crystal-plasticity model which has been derived based on Gurtin's framework (Int J Plast 24:702-725, 2008) is carried out here. This systematic investigation on the different roles of governing components of the model represents the strength of this framework in the prediction of a wide range of hardening behaviors as well as rate-dependent and scale-variation responses in a single crystal. The model is represented in the reference configuration for the purpose of numerical implementation and then implemented in the FEM software ABAQUS via a user-defined subroutine (UEL). Furthermore, a function of accumulation rates of dislocations is employed and viewed as a measure of formation of short-range interactions. Our simulation results reveal that the dissipative gradient strengthening can be identified as a source of isotropic-hardening behavior, which may represent the effect of irrecoverable work introduced by Gurtin and Ohno (J Mech Phys Solids 59:320-343, 2011). Here, the variation of size dependency at different magnitudes of a rate-sensitivity parameter is also discussed. Moreover, an observation of the effect of a distinctive feature in the model, which explains the effect of distortion of the crystal lattice in the reference configuration, is reported in this study for the first time. In addition, plastic flows in predefined slip systems and expansion of accumulation of GNDs are distinctly observed in varying scales and under different loading conditions.
Richardson, Sunil; Seelan, Nikkie S; Selvaraj, Dhivakar; Khandeparker, Rakshit V; Gnanamony, Sangeetha
2016-06-01
To assess speech outcomes after anterior maxillary distraction (AMD) in patients with cleft-related maxillary hypoplasia. Fifty-eight patients at least 10 years old with cleft-related maxillary hypoplasia were included in this study irrespective of gender, type of cleft lip and palate, and amount of required advancement. AMD was carried out in all patients using a tooth-borne palatal distractor by a single oral and maxillofacial surgeon. Perceptual speech assessment was performed by 2 speech language pathologists preoperatively, before placement of the distractor device, and 6 months postoperatively using the scoring system of Perkins et al (Plast Reconstr Surg 116:72, 2005); the system evaluates velopharyngeal insufficiency (VPI), resonance, nasal air emission, articulation errors, and intelligibility. The data obtained were tabulated and subjected to statistical analysis using Wilcoxon signed rank test. A P value less than .05 was considered significant. Eight patients were lost to follow-up. At 6-month follow-up, improvements of 62% (n = 31), 64% (n = 32), 50% (n = 25), 68% (n = 34), and 70% (n = 35) in VPI, resonance, nasal air emission, articulation, and intelligibility, respectively, were observed, with worsening of all parameters in 1 patient (2%). The results for all tested parameters were highly significant (P ≤ .001). AMD offers a substantial improvement in speech for all 5 parameters of perceptual speech assessment. Copyright © 2016 The American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
Long-range interactions and parallel scalability in molecular simulations
NASA Astrophysics Data System (ADS)
Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko
2007-01-01
Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.
Dynamics and control of cable-suspended parallel robots for giant telescopes
NASA Astrophysics Data System (ADS)
Zhuang, Peng; Yao, Zhengqiu
2006-06-01
A cable-suspended parallel robot utilizes the basic idea of the Stewart platform but replaces parallel links with cables and linear actuators with winches. It has many advantages over a conventional crane. The concept of applying a cable-suspended parallel robot to the construction and maintenance of giant telescopes is presented in this paper. Compared with the mass and travel of the moving platform of the robot, the mass and deformation of the cables can be disregarded. Based on these premises, the kinematic and dynamic models of the robot are built. Through simulation, the inertia and gravity of the moving platform are found to have a dominant effect on the dynamic characteristics of the robot, while the dynamics of the actuators can be disregarded, so a simplified dynamic model applicable to real-time control is obtained. Moreover, according to the control-law partitioning approach and optimization theory, a workspace model-based controller is proposed that accounts for the fact that the cables can only pull but not push. The simulation results indicate that the controller possesses good accuracy in pose and speed tracking, and keeps the cables in reliable tension by maintaining the minimum strain above a certain given value, thus ensuring smooth motion and accurate localization of the moving platform.
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Lesoinne, Michel
1993-01-01
Most of the recently proposed computational methods for solving partial differential equations on multiprocessor architectures stem from the 'divide and conquer' paradigm and involve some form of domain decomposition. For those methods which also require grids of points or patches of elements, it is often necessary to explicitly partition the underlying mesh, especially when working with local memory parallel processors. In this paper, a family of cost-effective algorithms for the automatic partitioning of arbitrary two- and three-dimensional finite element and finite difference meshes is presented and discussed in view of a domain decomposed solution procedure and parallel processing. The influence of the algorithmic aspects of a solution method (implicit/explicit computations), and the architectural specifics of a multiprocessor (SIMD/MIMD, startup/transmission time), on the design of a mesh partitioning algorithm are discussed. The impact of the partitioning strategy on load balancing, operation count, operator conditioning, rate of convergence and processor mapping is also addressed. Finally, the proposed mesh decomposition algorithms are demonstrated with realistic examples of finite element, finite volume, and finite difference meshes associated with the parallel solution of solid and fluid mechanics problems on the iPSC/2 and iPSC/860 multiprocessors.
Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.
2011-01-01
We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276
Parallel peak pruning for scalable SMP contour tree computation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carr, Hamish A.; Weber, Gunther H.; Sewell, Christopher M.
As data sets grow to exascale, automated data analysis and visualisation are increasingly important, to intermediate human understanding and to reduce demands on disk storage via in situ analysis. Trends in architecture of high performance computing systems necessitate analysis algorithms to make effective use of combinations of massively multicore and distributed systems. One of the principal analytic tools is the contour tree, which analyses relationships between contours to identify features of more than local importance. Unfortunately, the predominant algorithms for computing the contour tree are explicitly serial, and founded on serial metaphors, which has limited the scalability of this form of analysis. While there is some work on distributed contour tree computation, and separately on hybrid GPU-CPU computation, there is no efficient algorithm with strong formal guarantees on performance allied with fast practical performance. Here in this paper, we report the first shared SMP algorithm for fully parallel contour tree computation, with formal guarantees of O(lg n lg t) parallel steps and O(n lg n) work, and implementations with up to 10x parallel speed up in OpenMP and up to 50x speed up in NVIDIA Thrust.
Multisensor Parallel Largest Ellipsoid Distributed Data Fusion with Unknown Cross-Covariances
Liu, Baoyu; Zhan, Xingqun; Zhu, Zheng H.
2017-01-01
As the largest ellipsoid (LE) data fusion algorithm can only be applied to a two-sensor system, in this contribution, a parallel fusion structure is proposed to introduce the LE algorithm into a multisensor system with unknown cross-covariances, and three parallel fusion structures based on different estimate pairing methods are presented and analyzed. In order to assess the influence of fusion structure on fusion performance, two fusion performance assessment parameters are defined as Fusion Distance and Fusion Index. Moreover, the formula for calculating the upper bounds of actual fused error covariances of the presented multisensor LE fusers is also provided. Demonstrated with simulation examples, the Fusion Index indicates the fuser's actual fused accuracy and its sensitivity to the sensor orders, as well as its robustness to the accuracy of newly added sensors. Compared to the LE fuser with sequential structure, the LE fusers with the proposed parallel structures not only significantly improve their properties in these aspects, but also achieve better performance in consistency and computation efficiency. The presented multisensor LE fusers generally have better accuracies than the covariance intersection (CI) fusion algorithm and are consistent when the local estimates are weakly correlated. PMID:28661442
The force on the flex: Global parallelism and portability
NASA Technical Reports Server (NTRS)
Jordan, H. F.
1986-01-01
A parallel programming methodology, called the force, supports the construction of programs to be executed in parallel by an unspecified, but potentially large, number of processes. The methodology was originally developed on a pipelined, shared memory multiprocessor, the Denelcor HEP, and embodies the primitive operations of the force in a set of macros which expand into multiprocessor Fortran code. A small set of primitives is sufficient to write large parallel programs, and the system has been used to produce 10,000-line programs in computational fluid dynamics. The level of complexity of the force primitives is intermediate. It is high enough to mask detailed architectural differences between multiprocessors but low enough to give the user control over performance. The system is being ported to a medium scale multiprocessor, the Flex/32, which is a 20 processor system with a mixture of shared and local memory. Memory organization and the type of processor synchronization supported by the hardware on the two machines lead to some differences in efficient implementations of the force primitives, but the user interface remains the same. An initial implementation was done by retargeting the macros to Flexible Computer Corporation's ConCurrent C language. Subsequently, the macros were made to directly produce the system calls which form the basis for ConCurrent C. The implementation of the Fortran based system is in step with Flexible Computer Corporation's implementation of a Fortran system in the parallel environment.
DGDFT: A massively parallel method for large scale density functional theory calculations.
Hu, Wei; Lin, Lin; Yang, Chao
2015-09-28
We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3,500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.
Reducing Speckle In One-Look SAR Images
NASA Technical Reports Server (NTRS)
Nathan, K. S.; Curlander, J. C.
1990-01-01
Local-adaptive-filter algorithm incorporated into digital processing of synthetic-aperture-radar (SAR) echo data to reduce speckle in resulting imagery. Involves use of image statistics in vicinity of each picture element, in conjunction with original intensity of element, to estimate brightness more nearly proportional to true radar reflectance of corresponding target. Increases ratio of signal to speckle noise without substantial degradation of resolution common to multilook SAR images. Adapts to local variations of statistics within scene, preserving subtle details. Computationally simple. Lends itself to parallel processing of different segments of image, making possible increased throughput.
Ideal magnetohydrodynamic theory for localized interchange modes in toroidal anisotropic plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Tonghui, E-mail: thshi@ipp.ac.cn; Wan, B. N.; Sun, Y.
2016-08-15
Ideal magnetohydrodynamic theory for localized interchange modes is developed for toroidal plasmas with anisotropic pressure. The work extends the existing theories of Johnson and Hastie [Phys. Fluids 31, 1609 (1988)], etc., to the low n mode case, where n is the toroidal mode number. Also, the plasma compressibility is included, so that the coupling of the parallel motion to perpendicular one, i.e., the so-called apparent mass effect, is investigated in the anisotropic pressure case. The singular layer equation is obtained, and the generalized Mercier's criterion is derived.
Walenta, Albert H.
1979-01-01
Improved multiwire chamber having means for resolving the left/right ambiguity in the location of an ionizing event. The chamber includes a plurality of spaced parallel anode wires positioned between spaced planar cathodes. Associated with each of the anode wires are a pair of localizing wires, one positioned on either side of the anode wire. The localizing wires are connected to a differential amplifier whose output polarity is determined by whether the ionizing event occurs to the right or left of the anode wire.
Analog system for computing sparse codes
Rozell, Christopher John; Johnson, Don Herrick; Baraniuk, Richard Gordon; Olshausen, Bruno A.; Ortman, Robert Lowell
2010-08-24
A parallel dynamical system for computing sparse representations of data, i.e., where the data can be fully represented in terms of a small number of non-zero code elements, and for reconstructing compressively sensed images. The system is based on the principles of thresholding and local competition and solves a family of sparse approximation problems corresponding to various sparsity metrics. The system utilizes Locally Competitive Algorithms (LCAs), in which nodes in a population continually compete with neighboring units using (usually one-way) lateral inhibition to calculate coefficients representing an input in an overcomplete dictionary.
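As a rough illustration of thresholding plus lateral inhibition, the sketch below integrates the commonly published form of the LCA dynamics with a soft threshold. It is a discrete-time software approximation under stated assumptions, not the analog system the record describes: the function names, the Euler step, and the parameters `lam`, `tau`, and `dt` are all hypothetical choices made for the example.

```python
import numpy as np

def soft_threshold(u, lam):
    """Shrink internal states toward zero; states below lam produce no output."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(x, Phi, lam=0.1, tau=10.0, dt=1.0, steps=500):
    """Euler integration of LCA-style dynamics (illustrative assumption).

    x   : input signal, shape (m,)
    Phi : dictionary with unit-norm columns, shape (m, n), n > m if overcomplete
    """
    n = Phi.shape[1]
    b = Phi.T @ x                    # feed-forward drive of each node
    G = Phi.T @ Phi - np.eye(n)      # lateral inhibition between distinct nodes
    u = np.zeros(n)                  # internal node states
    for _ in range(steps):
        a = soft_threshold(u, lam)   # only nodes above threshold inhibit others
        u += (dt / tau) * (b - u - G @ a)
    return soft_threshold(u, lam)

# Usage: encode a signal built from a single dictionary element.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 40))
Phi /= np.linalg.norm(Phi, axis=0)
coeffs = lca(2.0 * Phi[:, 3], Phi)
```

Because only nodes whose state exceeds the threshold emit inhibition, most coefficients settle at exactly zero, which is what makes the resulting code sparse.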
The Response of a 2D Emulsion to Local Perturbations
NASA Astrophysics Data System (ADS)
Hong, Xia; Orellana, Carlos; Weeks, Eric
2015-03-01
We experimentally perturb a quasi-two-dimensional emulsion packing by inflating an oil droplet into the system in a controlled way. Our samples are oil-in-water emulsions confined between two closely spaced parallel plates, so that the droplets are deformed into pancake shapes. In this system, there is only viscous friction and no static friction between droplets. By imaging the droplets with video microscopy, we observe rearrangement events induced by the local perturbation. Simultaneously, we measure droplet-droplet contact forces by analyzing the outlines of each droplet in our movies. These allow us to study how packings with varying degrees of spatial order respond differently to the local perturbation.
A Parallel Fast Sweeping Method for the Eikonal Equation
NASA Astrophysics Data System (ADS)
Baker, B.
2017-12-01
Recently, there has been an exciting emergence of probabilistic methods for travel time tomography. Unlike gradient-based optimization strategies, probabilistic tomographic methods are resistant to becoming trapped in a local minimum and provide a much better quantification of parameter resolution than, say, appealing to ray density or performing checkerboard reconstruction tests. The benefits associated with random sampling methods, however, are only realized by successive computation of predicted travel times in, potentially, strongly heterogeneous media. To this end, this abstract is concerned with expediting the solution of the Eikonal equation. While many Eikonal solvers use a fast marching method, the proposed solver will use the iterative fast sweeping method because the eight fixed sweep orderings in each iteration are natural targets for parallelization. To reduce the number of iterations and grid points required, the high-accuracy finite difference stencil of Nobel et al., 2014 is implemented. A directed acyclic graph (DAG) is created with a priori knowledge of the sweep ordering and finite difference stencil. By performing a topological sort of the DAG, sets of independent nodes are identified as candidates for concurrent updating. Additionally, the proposed solver will also address scalability during earthquake relocation, a necessary step in local and regional earthquake tomography and a barrier to extending probabilistic methods from active source to passive source applications, by introducing an asynchronous parallel forward solve phase for all receivers in the network. Synthetic examples using the SEG over-thrust model will be presented.
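The DAG and topological-sort step can be sketched as follows. This is a minimal illustration, assuming a 2D grid, a single sweep ordering (increasing i, then increasing j), and a simple two-neighbour upwind stencil rather than the high-accuracy stencil named in the abstract; `sweep_dag` and `topological_levels` are hypothetical helper names.

```python
from collections import defaultdict

def sweep_dag(nx, ny):
    """Map each grid node to the nodes it depends on for one sweep ordering.

    Assumed stencil: node (i, j) reads only (i-1, j) and (i, j-1).
    """
    deps = {}
    for i in range(nx):
        for j in range(ny):
            d = []
            if i > 0:
                d.append((i - 1, j))
            if j > 0:
                d.append((i, j - 1))
            deps[(i, j)] = d
    return deps

def topological_levels(deps):
    """Kahn-style topological sort that returns nodes grouped by level.

    Nodes within one level do not depend on each other, so they are
    candidates for concurrent updating.
    """
    indegree = {node: len(d) for node, d in deps.items()}
    children = defaultdict(list)
    for node, d in deps.items():
        for parent in d:
            children[parent].append(node)
    frontier = [node for node, k in indegree.items() if k == 0]
    levels = []
    while frontier:
        levels.append(sorted(frontier))
        next_frontier = []
        for node in frontier:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_frontier.append(child)
        frontier = next_frontier
    return levels

levels = topological_levels(sweep_dag(3, 3))
```

For this stencil the levels are the anti-diagonals of the grid (all nodes with the same i + j), so a 3 x 3 grid yields five levels. A wider stencil would add edges to the DAG and change the grouping.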
ColDICE: A parallel Vlasov–Poisson solver using moving adaptive simplicial tessellation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sousbie, Thierry, E-mail: tsousbie@gmail.com; Department of Physics, The University of Tokyo, Tokyo 113-0033; Research Center for the Early Universe, School of Science, The University of Tokyo, Tokyo 113-0033
2016-09-15
Resolving numerically Vlasov–Poisson equations for initially cold systems can be reduced to following the evolution of a three-dimensional sheet evolving in six-dimensional phase-space. We describe a public parallel numerical algorithm consisting in representing the phase-space sheet with a conforming, self-adaptive simplicial tessellation of which the vertices follow the Lagrangian equations of motion. The algorithm is implemented both in six- and four-dimensional phase-space. Refinement of the tessellation mesh is performed using the bisection method and a local representation of the phase-space sheet at second order relying on additional tracers created when needed at runtime. In order to preserve in the best way the Hamiltonian nature of the system, refinement is anisotropic and constrained by measurements of local Poincaré invariants. Resolution of Poisson equation is performed using the fast Fourier method on a regular rectangular grid, similarly to particle-in-cell codes. To compute the density projected onto this grid, the intersection of the tessellation and the grid is calculated using the method of Franklin and Kankanhalli [65–67] generalised to linear order. As preliminary tests of the code, we study in four dimensional phase-space the evolution of an initially small patch in a chaotic potential and the cosmological collapse of a fluctuation composed of two sinusoidal waves. We also perform a “warm” dark matter simulation in six-dimensional phase-space that we use to check the parallel scaling of the code.
Novel Highly Parallel and Systolic Architectures Using Quantum Dot-Based Hardware
NASA Technical Reports Server (NTRS)
Fijany, Amir; Toomarian, Benny N.; Spotnitz, Matthew
1997-01-01
VLSI technology has made possible the integration of a massive number of components (processors, memory, etc.) into a single chip. In VLSI design, memory and processing power are relatively cheap and the main emphasis of the design is on reducing the overall interconnection complexity since data routing costs dominate the power, time, and area required to implement a computation. Communication is costly because wires occupy the most space on a circuit and it can also degrade clock time. In fact, much of the complexity (and hence the cost) of VLSI design results from minimization of data routing. The main difficulty in VLSI routing is due to the fact that crossing of the lines carrying data, instruction, control, etc. is not possible in a plane. Thus, in order to meet this constraint, the VLSI design aims at keeping the architecture highly regular with local and short interconnection. As a result, while the high level of integration has opened the way for massively parallel computation, practical and full exploitation of such a capability in many applications of interest has been hindered by the constraints on interconnection pattern. More precisely, the use of only localized communication significantly simplifies the design of interconnection architecture but at the expense of a somewhat restricted class of applications. For example, there are currently commercially available products integrating hundreds of simple processor elements within a single chip. However, the lack of an adequate interconnection pattern among these processing elements makes them inefficient for exploiting a large degree of parallelism in many applications.
NASA Astrophysics Data System (ADS)
Laloy, Eric; Linde, Niklas; Jacques, Diederik; Mariethoz, Grégoire
2016-04-01
The sequential geostatistical resampling (SGR) algorithm is a Markov chain Monte Carlo (MCMC) scheme for sampling from possibly non-Gaussian, complex spatially-distributed prior models such as geologic facies or categorical fields. In this work, we highlight the limits of standard SGR for posterior inference of high-dimensional categorical fields with realistically complex likelihood landscapes and benchmark a parallel tempering implementation (PT-SGR). Our proposed PT-SGR approach is demonstrated using synthetic (error corrupted) data from steady-state flow and transport experiments in categorical 7575- and 10,000-dimensional 2D conductivity fields. In both case studies, every SGR trial gets trapped in a local optimum while PT-SGR maintains a higher diversity in the sampled model states. The advantage of PT-SGR is most apparent in an inverse transport problem where the posterior distribution is made bimodal by construction. PT-SGR then converges towards the appropriate data misfit much faster than SGR and partly recovers the two modes. In contrast, for the same computational resources SGR does not fit the data to the appropriate error level and hardly produces a locally optimal solution that looks visually similar to one of the two reference modes. Although PT-SGR clearly surpasses SGR in performance, our results also indicate that using a small number (16-24) of temperatures (and thus parallel cores) may not permit complete sampling of the posterior distribution by PT-SGR within a reasonable computational time (less than 1-2 weeks).
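The parallel tempering idea can be illustrated with a toy sketch. It uses the standard replica-exchange swap rule and, as a stand-in for the SGR proposal (which is not reproduced here), a Gaussian random-walk Metropolis step on a one-dimensional bimodal target; `energy`, `swap_probability`, `parallel_tempering`, and the temperature ladder are hypothetical choices for the example.

```python
import math
import random

def energy(x):
    """Toy misfit with two minima at x = -2 and x = +2 (bimodal target exp(-E))."""
    return (x * x - 4.0) ** 2

def swap_probability(e_i, e_j, t_i, t_j):
    """Standard replica-exchange acceptance for swapping the states of two chains."""
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def parallel_tempering(temps, n_iter, rng, step=0.5):
    """Run one chain per temperature; return the samples of the coldest chain."""
    states = [2.0] * len(temps)          # every chain starts in the same mode
    cold_samples = []
    for _ in range(n_iter):
        # Within-chain update (random-walk Metropolis, not SGR).
        for k, t in enumerate(temps):
            proposal = states[k] + rng.gauss(0.0, step)
            d_e = energy(proposal) - energy(states[k])
            if d_e <= 0.0 or rng.random() < math.exp(-d_e / t):
                states[k] = proposal
        # Propose one swap between a random pair of adjacent temperatures.
        k = rng.randrange(len(temps) - 1)
        p = swap_probability(energy(states[k]), energy(states[k + 1]),
                             temps[k], temps[k + 1])
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples

samples = parallel_tempering([1.0, 2.0, 4.0, 8.0], 2000, random.Random(0))
```

Hot chains cross the barrier between modes more easily, and swaps pass those states down the ladder to the cold chain. In a real deployment each temperature would run on its own core, which is why the number of temperatures and the number of parallel cores are tied together in the abstract.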
NASA Astrophysics Data System (ADS)
Jang, W.; Engda, T. A.; Neff, J. C.; Herrick, J.
2017-12-01
Crop models are increasingly used to evaluate crop yields at regional and global scales. However, implementation of these models across large areas using fine-scale grids is limited by computational time requirements. In order to facilitate global gridded crop modeling with various scenarios (i.e., different crop, management schedule, fertilizer, and irrigation) using the Environmental Policy Integrated Climate (EPIC) model, we developed a distributed parallel computing framework in Python. Our local desktop with 14 cores (28 threads) was used to test the distributed parallel computing framework in Iringa, Tanzania, which has 406,839 grid cells. High-resolution soil data, SoilGrids (250 x 250 m), and climate data, AgMERRA (0.25 x 0.25 deg), were also used as input data for the gridded EPIC model. The framework includes a master file for parallel computing, input database, input data formatters, EPIC model execution, and output analyzers. Through the master file for parallel computing, the user-defined number of CPU threads divides the EPIC simulation into jobs. Then, using the EPIC input data formatters, the raw database is formatted into EPIC input data and the formatted data moves into the EPIC simulation jobs. Then, 28 EPIC jobs run simultaneously, and only the result files of interest are parsed and moved into the output analyzers. We applied various scenarios with seven different slopes and twenty-four fertilizer ranges. Parallelized input generators create different scenarios as a list for distributed parallel computing. After all simulations are completed, parallelized output analyzers are used to analyze all outputs according to the different scenarios. This saves significant computing time and resources, making it possible to conduct gridded modeling at regional to global scales with high-resolution data. 
For example, serial processing for the Iringa test case would require 113 hours, while using the framework developed in this study requires only approximately 6 hours, a nearly 95% reduction in computing time.
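The job division and the reported timing can be checked with a short sketch. The abstract does not say how grid cells are assigned to jobs, so the contiguous, near-equal split below is an assumption, and `split_into_jobs` and `time_reduction` are hypothetical names; only the cell count, thread count, and the two run times come from the abstract.

```python
def split_into_jobs(n_cells, n_jobs):
    """Split cell indices 0..n_cells-1 into n_jobs contiguous, near-equal ranges."""
    base, extra = divmod(n_cells, n_jobs)
    ranges, start = [], 0
    for k in range(n_jobs):
        size = base + (1 if k < extra else 0)   # first `extra` jobs take one more cell
        ranges.append((start, start + size))
        start += size
    return ranges

def time_reduction(serial_hours, parallel_hours):
    """Return (speedup factor, fractional reduction in wall-clock time)."""
    return serial_hours / parallel_hours, 1.0 - parallel_hours / serial_hours

jobs = split_into_jobs(406_839, 28)             # Iringa grid cells over 28 threads
speedup, reduction = time_reduction(113.0, 6.0)
print(f"{len(jobs)} jobs, speedup {speedup:.1f}x, reduction {reduction:.1%}")
```

With these figures the speedup is about 18.8x, which matches the "nearly 95%" reduction quoted in the abstract (94.7%). Relative to 28 threads that is roughly two thirds of ideal scaling.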
Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C
2009-01-01
Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.
Matrix Product Operator Simulations of Quantum Algorithms
2015-02-01
parallel to the Grover subspace parametrically: $(Z_i|\phi\rangle)_\parallel = s\cos\gamma\,|\alpha\rangle + s\sin\gamma\,|\beta\rangle$, $s = \sqrt{\frac{a(k)^2}{(N-1)^2} + b(k)^2}$, $\gamma = \tan^{-1}\left(\frac{b(k)(N-1)}{a(k)}\right)$ (6.32) Each...of this vector parallel to the Grover subspace in parametric form: $(X_i Z_i|\phi\rangle)_\parallel = s\cos(\gamma)\,|\alpha\rangle + s\sin(\gamma)\,|\beta\rangle$, $s = \frac{1}{\sqrt{N-1}}$, $\gamma = \tan^{-1}(\cot((k+\tfrac{1}{2})\theta$...quant-ph/0001106, 2000. [30] Jérémie Roland and Nicolas J Cerf. Quantum search by local adiabatic evolution. Physical Review A, 65(4
A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm
Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay
2012-01-01
A design of a systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm that is widely used by bioinformatics experts. In contrast to other designs that detect at most one hit per clock cycle, our design applies a Multiple Hits Detection Module, a pipelined systolic array, to search for multiple hits in a single clock cycle. Further, we designed a Hits Combination Block which combines overlapping hits from the systolic array into one hit. These implementations complete the first and second steps of the BLAST architecture and achieve significant speedup compared with previously published architectures. PMID:25969747
Gooding, Thomas Michael [Rochester, MN
2011-04-19
An analytical mechanism for a massively parallel computer system automatically analyzes data retrieved from the system, and identifies nodes which exhibit anomalous behavior in comparison to their immediate neighbors. Preferably, anomalous behavior is determined by comparing call-return stack tracebacks for each node, grouping like nodes together, and identifying neighboring nodes which do not themselves belong to the group. A node, not itself in the group, having a large number of neighbors in the group, is a likely locality of error. The analyzer preferably presents this information to the user by sorting the neighbors according to number of adjoining members of the group.
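The grouping-and-neighbour logic can be sketched in a few lines. This is an illustration only: it assumes the reference group is the largest set of nodes sharing an identical traceback, and the data layout (`tracebacks` and `neighbors` dictionaries) and the function name `likely_error_nodes` are hypothetical.

```python
from collections import Counter

def likely_error_nodes(tracebacks, neighbors):
    """Rank nodes outside the majority group by how many neighbours are inside it.

    tracebacks : dict mapping node id -> hashable call-return stack traceback
    neighbors  : dict mapping node id -> list of adjacent node ids
    """
    # Assumption: the group of interest is the most common traceback.
    majority, _ = Counter(tracebacks.values()).most_common(1)[0]
    group = {node for node, tb in tracebacks.items() if tb == majority}
    scores = {
        node: sum(1 for m in neighbors[node] if m in group)
        for node in tracebacks
        if node not in group
    }
    # Most in-group neighbours first; ties broken by node id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Usage: six nodes on a ring, node 3 stuck in a different call path.
tracebacks = {n: ("main", "mpi_wait") for n in range(6)}
tracebacks[3] = ("main", "compute")
neighbors = {n: [(n - 1) % 6, (n + 1) % 6] for n in range(6)}
ranking = likely_error_nodes(tracebacks, neighbors)
```

Here node 3 is the only node outside the group and both of its neighbours are inside it, so it heads the ranking as the likely locality of error.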
NASA Astrophysics Data System (ADS)
Battaïa, Olga; Dolgui, Alexandre; Guschinsky, Nikolai; Levin, Genrikh
2014-10-01
Solving equipment selection and line balancing problems together allows better line configurations to be reached and avoids local optimal solutions. This article considers jointly these two decision problems for mass production lines with serial-parallel workplaces. This study was motivated by the design of production lines based on machines with rotary or mobile tables. Nevertheless, the results are more general and can be applied to assembly and production lines with similar structures. The designers' objectives and the constraints are studied in order to suggest a relevant mathematical model and an efficient optimization approach to solve it. A real case study is used to validate the model and the developed approach.
Albedo of an irradiated plane-parallel atmosphere with finite optical depth
NASA Astrophysics Data System (ADS)
Fukue, Jun
2018-03-01
We analytically derive albedo for a plane-parallel atmosphere with finite optical depth, irradiated by an external source, under the local thermodynamic equilibrium approximation. Albedo is expressed as a function of the photon destruction probability ɛ and optical depth τ, with several parameters such as dilution factors of the external source. In the particular case of the infinite optical depth, albedo A is expressed as A=[1 + (1-W_J/W_H)√{3ɛ}/3]/(1+√{3ɛ}), where W_J and W_H are the dilution factors for the mean intensity and Eddington flux, respectively. An example of a model atmosphere is also presented under a gray approximation.
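The infinite-optical-depth expression quoted above is easy to evaluate numerically. The sketch below transcribes that formula directly; the function name and argument names are hypothetical, and no attempt is made to cover the finite-depth case, which the abstract does not give in closed form.

```python
import math

def albedo_infinite_depth(eps, w_j, w_h):
    """Albedo A = [1 + (1 - W_J/W_H) sqrt(3 eps)/3] / (1 + sqrt(3 eps)).

    eps : photon destruction probability (0 <= eps <= 1)
    w_j : dilution factor for the mean intensity
    w_h : dilution factor for the Eddington flux (non-zero)
    """
    root = math.sqrt(3.0 * eps)
    return (1.0 + (1.0 - w_j / w_h) * root / 3.0) / (1.0 + root)

# Pure scattering (eps = 0) reflects everything, independent of the dilution factors.
a_scatter = albedo_infinite_depth(0.0, 1.0, 0.5)
# With equal dilution factors the correction term vanishes: A = 1 / (1 + sqrt(3 eps)).
a_equal = albedo_infinite_depth(1.0 / 3.0, 1.0, 1.0)
```

The two limiting cases give A = 1 for ɛ = 0 and A = 1/2 for ɛ = 1/3 with W_J = W_H, which is a quick sanity check on the transcription.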
Parallel Transport Quantum Logic Gates with Trapped Ions.
de Clercq, Ludwig E; Lo, Hsiang-Yu; Marinelli, Matteo; Nadlinger, David; Oswald, Robin; Negnevitsky, Vlad; Kienzler, Daniel; Keitch, Ben; Home, Jonathan P
2016-02-26
We demonstrate single-qubit operations by transporting a beryllium ion with a controlled velocity through a stationary laser beam. We use these to perform coherent sequences of quantum operations, and to perform parallel quantum logic gates on two ions in different processing zones of a multiplexed ion trap chip using a single recycled laser beam. For the latter, we demonstrate individually addressed single-qubit gates by local control of the speed of each ion. The fidelities we observe are consistent with operations performed using standard methods involving static ions and pulsed laser fields. This work therefore provides a path to scalable ion trap quantum computing with reduced requirements on the optical control complexity.
Li, I-Hsum; Chen, Ming-Chang; Wang, Wei-Yen; Su, Shun-Feng; Lai, To-Wen
2014-01-27
A single-webcam distance measurement technique for indoor robot localization is proposed in this paper. The proposed localization technique uses webcams that are available in an existing surveillance environment. The developed image-based distance measurement system (IBDMS) and parallel lines distance measurement system (PLDMS) have two merits. Firstly, only one webcam is required for estimating the distance. Secondly, the set-up of IBDMS and PLDMS is easy, since only one rectangular pattern of known dimensions is needed, e.g., a ground tile. Some common and simple image processing techniques, e.g., background subtraction, are used to capture the robot in real time. Thus, for the purposes of indoor robot localization, the proposed method does not need expensive high-resolution webcams or complicated pattern recognition methods, but just a few simple estimation formulas. The experimental results show that the proposed robot localization method is reliable and effective in an indoor environment.
Boiling local heat transfer enhancement in minichannels using nanofluids
2013-01-01
This paper reports an experimental study on nanofluid convective boiling heat transfer in parallel rectangular minichannels of 800 μm hydraulic diameter. Experiments are conducted with pure water and silver nanoparticles suspended in water base fluid. Two small volume fractions of silver nanoparticles suspended in water are tested: 0.000237% and 0.000475%. The experimental results show that the local heat transfer coefficient, local heat flux, and local wall temperature are affected by silver nanoparticle concentration in water base fluid. In addition, different correlations established for boiling flow heat transfer in minichannels or macrochannels are evaluated. It is found that the correlation of Kandlikar and Balasubramanian is the closest to the water boiling heat transfer results. The boiling local heat transfer enhancement by adding silver nanoparticles in base fluid is not uniform along the channel flow. The best performance and the strongest effect of nanoparticle concentration on the heat transfer are obtained at the minichannel entrance. PMID:23506445
Infrastructure-Free Mapping and Localization for Tunnel-Based Rail Applications Using 2D Lidar
NASA Astrophysics Data System (ADS)
Daoust, Tyler
This thesis presents an infrastructure-free mapping and localization framework for rail vehicles using only a lidar sensor. The method was designed to handle modern underground tunnels: narrow, parallel, and relatively smooth concrete walls. A sliding-window algorithm was developed to estimate the train's motion, using a Renyi's Quadratic Entropy (RQE)-based point-cloud alignment system. The method was tested with datasets gathered on a subway train travelling at high speeds, with 75 km of data across 14 runs, simulating 500 km of localization. The system was capable of mapping with an average error of less than 0.6 % by distance. It was capable of continuously localizing, relative to the map, to within 10 cm in stations and at crossovers, and 2.3 m in pathological sections of tunnel. This work has the potential to improve train localization in a tunnel, which can be used to increase capacity and for automation purposes.
Li, I-Hsum; Chen, Ming-Chang; Wang, Wei-Yen; Su, Shun-Feng; Lai, To-Wen
2014-01-01
A single-webcam distance measurement technique for indoor robot localization is proposed in this paper. The proposed localization technique uses webcams that are available in an existing surveillance environment. The developed image-based distance measurement system (IBDMS) and parallel lines distance measurement system (PLDMS) have two merits. Firstly, only one webcam is required for estimating the distance. Secondly, the set-up of IBDMS and PLDMS is easy, since only one rectangular pattern of known dimensions is needed, e.g., a ground tile. Some common and simple image processing techniques, e.g., background subtraction, are used to capture the robot in real time. Thus, for the purposes of indoor robot localization, the proposed method does not need expensive high-resolution webcams or complicated pattern recognition methods, but just a few simple estimation formulas. The experimental results show that the proposed robot localization method is reliable and effective in an indoor environment. PMID:24473282
NASA Astrophysics Data System (ADS)
Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves
2009-03-01
This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the resolution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the resolution of a large sparse system of linear equations which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, this latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
Wang, Zi-Fu; Li, Ming-Hao; Chen, Wei-Wen; Hsu, Shang-Te Danny; Chang, Ta-Chau
2016-01-01
The folding topology of DNA G-quadruplexes (G4s) depends not only on their nucleotide sequences but also on environmental factors and/or ligand binding. Here, a G4 ligand, 3,6-bis(1-methyl-4-vinylpyridium iodide)-9-(1-(1-methyl-piperidinium iodide)-3,6,9-trioxaundecane) carbazole (BMVC-8C3O), can induce topological conversion of non-parallel to parallel forms in human telomeric DNA G4s. Nuclear magnetic resonance (NMR) spectroscopy with hydrogen-deuterium exchange (HDX) reveals the presence of persistent imino proton signals corresponding to the central G-quartet during topological conversion of Tel23 and Tel25 G4s from hybrid to parallel forms, implying that the transition pathway mainly involves local rearrangements. In contrast, rapid HDX was observed during the transition of 22-CTA G4 from an anti-parallel form to a parallel form, resulting in complete disappearance of all the imino proton signals, suggesting the involvement of substantial unfolding events associated with the topological transition. Site-specific imino proton NMR assignments of Tel23 G4 enable determination of the interconversion rates of individual guanine bases and detection of the presence of intermediate states. Since the rate of ligand binding is much higher than the rate of ligand-induced topological conversion, a three-state kinetic model was evoked to establish the associated energy diagram for the topological conversion of Tel23 G4 induced by BMVC-8C3O. PMID:26975658
Parallelization and implementation of approximate root isolation for nonlinear system by Monte Carlo
NASA Astrophysics Data System (ADS)
Khosravi, Ebrahim
1998-12-01
This dissertation addresses the fundamental problem of isolating the real roots of nonlinear systems of equations using the Monte Carlo algorithm published by Bush Jones. This algorithm requires only function values and can be applied readily to complicated systems of transcendental functions. The implementation of this sequential algorithm provides scientists with the means to utilize function analysis in mathematics or other fields of science. The algorithm, however, is so computationally intensive that it is limited to a very small set of variables, which makes it infeasible for large systems of equations. In addition, a computational technique was needed for investigating a method of preventing the algorithm from converging to the same root along different paths of computation. The research provides techniques for improving the efficiency and correctness of the algorithm. The sequential algorithm for this technique was corrected and a parallel algorithm is presented. This parallel method has been formally analyzed and is compared with other known methods of root isolation. The effectiveness, efficiency, and enhanced overall performance of the parallel processing of the program in comparison to sequential processing are discussed. The message passing model was used for this parallel processing, and it is presented and implemented on Intel/860 MIMD architecture. The parallel processing proposed in this research has been implemented in an ongoing high energy physics experiment: this algorithm has been used to track neutrinos in a Super-K detector. This experiment is located in Japan, and data can be processed on-line or off-line locally or remotely.
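As a rough illustration of root isolation from function values alone, the following Python sketch samples random points in an interval and reports the brackets across which the sign changes. It is a one-dimensional toy under stated assumptions, not the algorithm of Bush Jones or the parallel method of the dissertation; the name `isolate_roots`, its parameters, and the test function are hypothetical.

```python
import random

def isolate_roots(f, lo, hi, n_samples=2000, seed=0):
    """Return disjoint intervals (a, b) with f(a) * f(b) < 0.

    Only function values are used. Roots of even multiplicity, or pairs of
    roots closer together than the sample spacing, can be missed.
    """
    rng = random.Random(seed)
    # Random sample points, plus the two end points, sorted left to right.
    xs = sorted([lo, hi] + [rng.uniform(lo, hi) for _ in range(n_samples)])
    brackets = []
    for a, b in zip(xs, xs[1:]):
        if f(a) * f(b) < 0:          # sign change => at least one root inside
            brackets.append((a, b))
    return brackets

def f(x):
    # Illustrative test function with roots at -sqrt(2), 0 and sqrt(2).
    return x**3 - 2*x

brackets = isolate_roots(f, -2.0, 2.0)
```

Each bracket could then be refined independently, which is one reason sampling-based isolation lends itself to parallel execution.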
27 CFR 9.125 - Fredericksburg in the Texas Hill Country.
Code of Federal Regulations, 2010 CFR
2010-04-01
...) 1504, at the junction of a light-duty road known locally as Jung Road. (1) From the beginning point, the boundary proceeds on Jung Road in a northwesterly direction across the Pedernales River. (2) Then northwesterly approximately 1 mile along Jung Road as it parallels the Pedernales River. (3) Then north along...
Kenneth Smith
1986-01-01
The history of shortleaf pine in the South generally parallels that of the area having the largest concentration of shortleaf, the Ouachita Mountains of Arkansas and Oklahoma. There, in the nineteenth century, agricultural settlers cut trees to clear land for crops and supply local needs for wood. Around 1900, cutting greatly expanded as large sawmills began to log by...
Beam Dynamics Simulation Platform and Studies of Beam Breakup in Dielectric Wakefield Structures
NASA Astrophysics Data System (ADS)
Schoessow, P.; Kanareykin, A.; Jing, C.; Kustov, A.; Altmark, A.; Gai, W.
2010-11-01
A particle-Green's function beam dynamics code (BBU-3000) to study beam breakup effects has been incorporated into a parallel computing framework that is based on the BOINC software environment and supports both task farming on a heterogeneous cluster and local grid computing. User access to the platform is through a web browser.
ERIC Educational Resources Information Center
Adamson, Bob; Forestier, Katherine; Morris, Paul; Han, Christine
2017-01-01
Since the mid-1980s, a number of East Asian societies have consistently performed well in international tests, and their education systems have emerged as models of "best practice", including Hong Kong, which has been extensively referenced by politicians and their advisers in England. In parallel, local dissatisfaction with the…
Interface colloidal robotic manipulator
Aronson, Igor; Snezhko, Oleksiy
2015-08-04
A magnetic colloidal system confined at the interface between two immiscible liquids and energized by an alternating magnetic field dynamically self-assembles into localized asters and arrays of asters. The colloidal system exhibits locomotion and shape change. By controlling a small external magnetic field applied parallel to the interface, structures can capture, transport, and position target particles.
NASA Astrophysics Data System (ADS)
Pascoe, Stephen; Iwi, Alan; Kershaw, Philip; Stephens, Ag; Lawrence, Bryan
2014-05-01
The advent of large-scale data and the consequential analysis problems have led to two new challenges for the research community: how to share such data to get the maximum value and how to carry out efficient analysis. Solving both challenges requires a form of parallelisation: the first is social parallelisation (involving trust and information sharing), the second data parallelisation (involving new algorithms and tools). The JASMIN infrastructure supports both kinds of parallelism by providing a multi-tenant environment with petabyte-scale storage, VM provisioning and batch cluster facilities. The JASMIN Analysis Platform (JAP) is an analysis software layer for JASMIN which emphasises ease of transition from a researcher's local environment to JASMIN. JAP brings together tools traditionally used by multiple communities and configures them to work together, enabling users to move analysis from their local environment to JASMIN without rewriting code. JAP also provides facilities to exploit JASMIN's parallel capabilities whilst maintaining users' familiar analysis environment wherever possible. Modern open-source analysis tools typically have multiple dependent packages, increasing the installation burden on system administrators. When you consider a suite of tools, often with both common and conflicting dependencies, analysis pipelines can become locked to a particular installation simply because of the effort required to reconstruct the dependency tree. JAP addresses this problem by providing a consistent suite of RPMs compatible with RedHat Enterprise Linux and CentOS 6.4. Researchers can install JAP locally, either as RPMs or through a pre-built VM image, giving them the confidence to know moving analysis to JASMIN will not disrupt their environment. Analysis parallelisation is in its infancy in climate sciences, with few tools capable of exploiting any parallel environment beyond manual scripting of the use of multiple processors. 
JAP begins to bridge this gap through a variety of higher-level tools for parallelisation and job scheduling such as IPython-parallel and MPI support for interactive analysis languages. We find that enabling even simple parallelisation of workflows, together with the state of the art I/O performance of JASMIN storage, provides many users with the large increases in efficiency they need to scale their analyses to contemporary data volumes and tackle new, previously inaccessible, problems.
On the generation of double layers from ion- and electron-acoustic instabilities
NASA Astrophysics Data System (ADS)
Fu, Xiangrong; Cowee, Misa M.; Gary, S. Peter; Winske, Dan
2016-03-01
A plasma double layer (DL) is a nonlinear electrostatic structure that carries a uni-polar electric field parallel to the background magnetic field due to local charge separation. Past studies showed that DLs observed in space plasmas are mostly associated with the ion acoustic instability. Recent Van Allen Probes observations of parallel electric field structures traveling much faster than the ion acoustic speed have motivated a computational study to test the hypothesis that a new type of DLs—electron acoustic DLs—generated from the electron acoustic instability are responsible for these electric fields. Nonlinear particle-in-cell simulations yield negative results, i.e., the hypothetical electron acoustic DLs cannot be formed in a way similar to ion acoustic DLs. Linear theory analysis and the simulations show that the frequencies of electron acoustic waves are too high for ions to respond and maintain charge separation required by DLs. However, our results do show that local density perturbations in a two-electron-component plasma can result in unipolar-like electric field structures that propagate at the electron thermal speed, suggesting another potential explanation for the observations.
A High Order, Locally-Adaptive Method for the Navier-Stokes Equations
NASA Astrophysics Data System (ADS)
Chan, Daniel
1998-11-01
I have extended the FOSLS method of Cai, Manteuffel and McCormick (1997) and implemented it within the framework of a spectral element formulation using the Legendre polynomial basis function. The FOSLS method solves the Navier-Stokes equations as a system of coupled first-order equations and provides the ellipticity that is needed for fast iterative matrix solvers like multigrid to operate efficiently. Each element is treated as an object and its properties are self-contained. Only C^0 continuity is imposed across element interfaces; this design allows local grid refinement and coarsening without the burden of having an elaborate data structure, since only information along element boundaries is needed. With the FORTRAN 90 programming environment, I can maintain a high computational efficiency by employing a hybrid parallel processing model. The OpenMP directives provide loop-level parallelism, which is executed on a shared-memory SMP, and the MPI protocol allows the distribution of elements to a cluster of SMPs connected via a commodity network. This talk will provide timing results and a comparison with a second order finite difference method.
Towards a large-scale scalable adaptive heart model using shallow tree meshes
NASA Astrophysics Data System (ADS)
Krause, Dorian; Dickopf, Thomas; Potse, Mark; Krause, Rolf
2015-10-01
Electrophysiological heart models are sophisticated computational tools that place high demands on the computing hardware due to the high spatial resolution required to capture the steep depolarization front. To address this challenge, we present a novel adaptive scheme for resolving the depolarization front accurately using adaptivity in space. Our adaptive scheme is based on locally structured meshes. These tensor meshes in space are organized in a parallel forest of trees, which allows us to resolve complicated geometries and to realize high variations in the local mesh sizes with a minimal memory footprint in the adaptive scheme. We discuss both a non-conforming mortar element approximation and a conforming finite element space and present an efficient technique for the assembly of the respective stiffness matrices using matrix representations of the inclusion operators into the product space on the so-called shallow tree meshes. We analyzed the parallel performance and scalability for a two-dimensional ventricle slice as well as for a full large-scale heart model. Our results demonstrate that the method has good performance and high accuracy.
Multiple channel data acquisition system
Crawley, H. Bert; Rosenberg, Eli I.; Meyer, W. Thomas; Gorbics, Mark S.; Thomas, William D.; McKay, Roy L.; Homer, Jr., John F.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler.
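The zero suppression step mentioned above can be pictured with a small Python sketch. It assumes the common data-acquisition meaning of the term (samples at or below a threshold are dropped, and the surviving values are kept together with their channel index); the patent abstract does not spell out the scheme, so the function `zero_suppress`, the threshold, and the sample values are all hypothetical.

```python
def zero_suppress(samples, threshold=0):
    """Keep only (channel, value) pairs whose value exceeds the threshold.

    The channel index is stored with each surviving value so that the
    sparse record can still be mapped back to its input channel.
    """
    return [(ch, v) for ch, v in enumerate(samples) if v > threshold]

# One hypothetical read-out of eight channels, mostly empty.
cache = [0, 0, 17, 0, 3, 0, 0, 42]
packed = zero_suppress(cache, threshold=5)   # [(2, 17), (7, 42)]
```

Dropping empty channels before the transfer reduces the volume of data that has to cross the bus.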
Multiple channel data acquisition system
Crawley, H.B.; Rosenberg, E.I.; Meyer, W.T.; Gorbics, M.S.; Thomas, W.D.; McKay, R.L.; Homer, J.F. Jr.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler. 25 figs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki
2007-02-01
We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, the scalability can be obtained for the proportion of processor elements, where the benchmarks are carried out for parallel computation by a SGI Altix machine.
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
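The Eulerian storage scheme, in which a particle's data follow it from one local memory to the next, can be sketched in a few lines of Python. This is only an illustration of the bookkeeping with periodic boundaries on a 128 x 128 grid: a dictionary keyed by cell stands in for the per-processor memories, no fields are solved, and the names `cell_of` and `push_and_migrate` are hypothetical rather than taken from the report.

```python
GRID = 128  # grid points per side, as in the abstract

def cell_of(x, y):
    """Grid cell (and hence owning processor) for a position."""
    return int(x) % GRID, int(y) % GRID

def push_and_migrate(cells, dt=1.0):
    """Advance every particle and re-file it under the cell it ends up in.

    `cells` maps (i, j) -> list of (x, y, vx, vy) tuples. Positions wrap
    around at the edges, which implements periodic boundary conditions.
    """
    new_cells = {}
    for particles in cells.values():
        for x, y, vx, vy in particles:
            x = (x + vx * dt) % GRID
            y = (y + vy * dt) % GRID
            new_cells.setdefault(cell_of(x, y), []).append((x, y, vx, vy))
    return new_cells

# A particle near a corner leaves through two edges and re-enters opposite.
cells = {(127, 0): [(127.5, 0.5, 1.0, -1.0)]}
moved = push_and_migrate(cells)
```

On a real SIMD machine the re-filing step would be a nearest-neighbour data transfer between processors rather than a dictionary update.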
Gyrokinetic theory of turbulent acceleration and momentum conservation in tokamak plasmas
NASA Astrophysics Data System (ADS)
Lu, WANG; Shuitao, PENG; P, H. DIAMOND
2018-07-01
Understanding the generation of intrinsic rotation in tokamak plasmas is crucial for future fusion reactors such as ITER. We proposed a new mechanism named turbulent acceleration for the origin of the intrinsic parallel rotation based on gyrokinetic theory. The turbulent acceleration acts as a local source or sink of parallel rotation, i.e., volume force, which is different from the divergence of residual stress, i.e., surface force. However, the order of magnitude of turbulent acceleration can be comparable to that of the divergence of residual stress for electrostatic ion temperature gradient (ITG) turbulence. A possible theoretical explanation for the experimental observation of electron cyclotron heating induced decrease of co-current rotation was also proposed via comparison between the turbulent acceleration driven by ITG turbulence and that driven by collisionless trapped electron mode turbulence. We also extended this theory to electromagnetic ITG turbulence and investigated the electromagnetic effects on intrinsic parallel rotation drive. Finally, we demonstrated that the presence of turbulent acceleration does not conflict with momentum conservation.
Rapid, parallel path planning by propagating wavefronts of spiking neural activity
Ponulak, Filip; Hopfield, John J.
2013-01-01
Efficient path planning and navigation is critical for animals, robotics, logistics and transportation. We study a model in which spatial navigation problems can rapidly be solved in the brain by parallel mental exploration of alternative routes using propagating waves of neural activity. A wave of spiking activity propagates through a hippocampus-like network, altering the synaptic connectivity. The resulting vector field of synaptic change then guides a simulated animal to the appropriate selected target locations. We demonstrate that the navigation problem can be solved using realistic, local synaptic plasticity rules during a single passage of a wavefront. Our model can find optimal solutions for competing possible targets or learn and navigate in multiple environments. The model provides a hypothesis on the possible computational mechanisms for optimal path planning in the brain; at the same time, it is useful for neuromorphic implementations, where the parallelism of information processing proposed here can fully be harnessed in hardware. PMID:23882213
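A classical grid analogue may help to picture the idea of a single wavefront leaving behind a field that guides navigation. The Python sketch below is not the spiking network of the paper: it replaces the neural wave with a breadth-first wave on a small occupancy grid and replaces synaptic change with a table of arrival times. The grid, the function names `wavefront` and `follow`, and the four-neighbour connectivity are assumptions made for illustration.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def wavefront(grid, target):
    """Spread a wave from the target; return arrival time per free cell."""
    rows, cols = len(grid), len(grid[0])
    arrival = {target: 0}
    queue = deque([target])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            n = (r + dr, c + dc)
            inside = 0 <= n[0] < rows and 0 <= n[1] < cols
            if inside and grid[n[0]][n[1]] == 0 and n not in arrival:
                arrival[n] = arrival[(r, c)] + 1
                queue.append(n)
    return arrival

def follow(arrival, start):
    """From start, keep stepping to the neighbour the wave reached earliest.

    Assumes start was reached by the wave (otherwise a KeyError is raised).
    """
    path = [start]
    while arrival[path[-1]] > 0:
        r, c = path[-1]
        reached = [(r + dr, c + dc) for dr, dc in STEPS
                   if (r + dr, c + dc) in arrival]
        path.append(min(reached, key=arrival.get))
    return path

grid = [            # 0 = free cell, 1 = obstacle
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
arrival = wavefront(grid, target=(2, 0))
path = follow(arrival, start=(0, 0))
```

One pass of the wave is enough: every reachable cell ends up with a local cue pointing towards the target, so the path is read off without any further search.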
Aono, Masashi; Gunji, Yukio-Pegio
2003-10-01
The emergence derived from errors is of key importance for both novel computing and novel usage of the computer. In this paper, we propose an implementable experimental plan for biological computing so as to elicit the emergent property of complex systems. An individual plasmodium of the true slime mold Physarum polycephalum acts in the slime mold computer. Modifying the Elementary Cellular Automaton as it entails the global synchronization problem upon the parallel computing provides the NP-complete problem solved by the slime mold computer. The possibility to solve the problem by giving neither all possible results nor explicit prescription of solution-seeking is discussed. In slime mold computing, the distributivity in the local computing logic can change dynamically, and its parallel non-distributed computing cannot be reduced into the spatial addition of multiple serial computings. The computing system based on exhaustive absence of the super-system may produce something more than filling the vacancy.
NASA Technical Reports Server (NTRS)
Greenstadt, E. W.; Moses, S. L.; Coroniti, F. V.; Farris, M. H.; Russell, C. T.
1993-01-01
ULF waves in Earth's foreshock cause the instantaneous angle theta-B(n) between the upstream magnetic field and the shock normal to deviate from its average value. Close to the quasi-parallel (Q-parallel) shock, the transverse components of the waves become so large that the orientation of the field to the normal becomes quasi-perpendicular (Q-perpendicular) during applicable phases of each wave cycle. Large upstream pulses of B were observed completely enclosed in excursions of Theta-B(n) into the Q-perpendicular range. A recent numerical simulation included Theta-B(n) among the parameters examined in Q-parallel runs, and described a similar coincidence as intrinsic to a stage in development of the reformation process of such shocks. Thus, the natural environment of the Q-perpendicular section of Earth's bow shock seems to include an identifiable class of enlarged magnetic pulses for which local Q-perpendicular geometry is a necessary association.
Terrain types and local-scale stratigraphy of grooved terrain on Ganymede
NASA Technical Reports Server (NTRS)
Murchie, Scott L.; Head, James W.; Helfenstein, Paul; Plescia, Jeffrey B.
1986-01-01
Grooved terrain is subdivided on the basis of pervasive morphology into: (1) groove lanes - elongate parallel groove bands, (2) grooved polygons - polygonal domains of parallel grooves, (3) reticulate terrain - polygonal domains of orthogonal grooves, and (4) complex grooved terrain - polygons with several complexly cross-cutting groove sets. Detailed geologic mapping of select areas, employing previously established conventions for determining relative age relations, reveals a general three-stage sequence of grooved terrain emplacement: first, dissection of the lithosphere by throughgoing grooves, and pervasive deformation of intervening blocks; second, extensive flooding and continued deformation of the intervening blocks; third, repeated superposition of groove lanes concentrated at sites of initial throughgoing grooves. This sequence is corroborated by crater-density measurements. Dominant orientations of groove sets are parallel to relict zones of weakness that probably were reactivated during grooved terrain formation. Groove lane morphology and development consistent with that predicted for passive rifts suggests a major role of global expansion in grooved terrain formation.
Desmet, Gert
2013-11-01
The finite length parallel zone (FPZ)-model is proposed as an alternative model for the axial- or eddy-dispersion caused by the occurrence of local velocity biases or flow heterogeneities in porous media such as those used in liquid chromatography columns. The mathematical plate height expression evolving from the model shows that the A- and C-term band broadening effects that can originate from a given velocity bias should be coupled in an exponentially decaying way instead of harmonically as proposed in Giddings' coupling theory. In the low and high velocity limit both models converge, while a 12% difference can be observed in the (practically most relevant) intermediate range of reduced velocities. Explicit expressions for the A- and C-constants appearing in the exponential decay-based plate height expression have been derived for each of the different possible velocity bias levels (single through-pore and particle level, multi-particle level and trans-column level). These expressions allow one to directly relate the band broadening originating from these different levels to the local fundamental transport parameters, hence offering the possibility to include a velocity-dependent and, if needed, retention factor-dependent transversal dispersion coefficient. Having developed the mathematics for the general case wherein a difference in retention equilibrium establishes between the two parallel zones, the effect of any possible local variations in packing density and/or retention capacity on the eddy-dispersion can be explicitly accounted for as well. It is furthermore also shown that, whereas the lumped transport parameter model used in the basic variant of the FPZ-model only provides a first approximation of the true decay constant, the model can be extended by introducing a constant correction factor to correctly account for the continuous transversal dispersion transport in the velocity bias zones. Copyright © 2013 Elsevier B.V. All rights reserved.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure
NASA Astrophysics Data System (ADS)
Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei
2011-09-01
Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
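The reported speed-up can be checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted in the abstract (2.58 h locally, 3.3 min on 100 nodes); the parallel efficiency it derives additionally assumes that the local computer is comparable to a single cloud node, which the abstract does not state.

```python
serial_minutes = 2.58 * 60   # 2.58 h on the local computer, in minutes
cloud_minutes = 3.3          # same simulation on the cloud
nodes = 100                  # number of cloud nodes used

speedup = serial_minutes / cloud_minutes
# Efficiency is only meaningful if one cloud node ~ the local computer
# (an assumption made here, not a claim of the abstract).
efficiency = speedup / nodes

print(f"speed-up ~ {speedup:.1f}x, efficiency ~ {efficiency:.0%}")
```

The ratio comes out at roughly 47, in agreement with the quoted 47× figure.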
Anisotropic Behaviour of Magnetic Power Spectra in Solar Wind Turbulence.
NASA Astrophysics Data System (ADS)
Banerjee, S.; Saur, J.; Gerick, F.; von Papen, M.
2017-12-01
Introduction: High altitude fast solar wind turbulence (SWT) shows different spectral properties as a function of the angle between the flow direction and the scale dependent mean magnetic field (Horbury et al., PRL, 2008). The average magnetic power contained in the near perpendicular direction (80º-90º) was found to be approximately 5 times larger than the average power in the parallel direction (0º-10º). In addition, the parallel power spectrum was found to give a steeper (-2) power law than the perpendicular power spectral density (PSD), which followed a near Kolmogorov slope (-5/3). Similar anisotropic behaviour has also been observed (Chen et al., MNRAS, 2011) for slow solar wind (SSW), but using a different method exploiting multi-spacecraft data of Cluster. Purpose: In the current study, using Ulysses data, we investigate (i) the anisotropic behaviour of near ecliptic slow solar wind using the same methodology (described below) as that of Horbury et al. (2008) and (ii) the dependence of the anisotropic behaviour of SWT as a function of the heliospheric latitude. Method: We apply the wavelet method to calculate the turbulent power spectra of the magnetic field fluctuations parallel and perpendicular to the local mean magnetic field (LMF). According to Horbury et al., LMF for a given scale (or size) is obtained using an envelope of the envelope of that size. Results: (i) SSW intervals always show near -5/3 perpendicular spectra. Unlike the fast solar wind (FSW) intervals, for SSW, we often find intervals where power parallel to the mean field is not observed. For a few intervals with sufficient power in the parallel direction, slow wind turbulence also exhibits -2 parallel spectra similar to FSW. (ii) The behaviours of parallel and perpendicular power spectra are found to be independent of the heliospheric latitude. 
Conclusion: In the current study we do not find significant influence of the heliospheric latitude on the spectral slopes of parallel and perpendicular magnetic spectra. This indicates that the spectral anisotropy in parallel and perpendicular direction is governed by intrinsic properties of SWT.
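Spectral slopes such as -5/3 and -2 are usually read off as the gradient of a straight-line fit in log-log space. The Python sketch below shows that step on synthetic power laws; it does not use Ulysses data or the wavelet method of the study, and the function `spectral_index` and the frequency grid are hypothetical.

```python
import math

def spectral_index(freqs, power):
    """Least-squares slope of log(power) against log(frequency)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Synthetic spectra: exact power laws over about two decades in frequency.
freqs = [0.01 * 2**k for k in range(8)]
perp = [f ** (-5 / 3) for f in freqs]   # Kolmogorov-like perpendicular PSD
para = [f ** (-2) for f in freqs]       # steeper parallel PSD

slope_perp = spectral_index(freqs, perp)
slope_para = spectral_index(freqs, para)
```

With measured spectra the fit would be restricted to the inertial range, and noise would make the recovered slopes approximate rather than exact.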
NASA Astrophysics Data System (ADS)
Teddy, Livian; Hardiman, Gagoek; Nuroji; Tudjono, Sri
2017-12-01
Indonesia is an area prone to earthquake that may cause casualties and damage to buildings. The fatalities or the injured are not largely caused by the earthquake, but by building collapse. The collapse of the building is resulted from the building behaviour against the earthquake, and it depends on many factors, such as architectural design, geometry configuration of structural elements in horizontal and vertical plans, earthquake zone, geographical location (distance to earthquake center), soil type, material quality, and construction quality. One of the geometry configurations that may lead to the collapse of the building is irregular configuration of non-parallel system. In accordance with FEMA-451B, irregular configuration in non-parallel system is defined to have existed if the vertical lateral force-retaining elements are neither parallel nor symmetric with main orthogonal axes of the earthquake-retaining axis system. Such configuration may lead to torque, diagonal translation and local damage to buildings. It does not mean that non-parallel irregular configuration should not be formed on architectural design; however the designer must know the consequence of earthquake behaviour against buildings with irregular configuration of non-parallel system. The present research has the objective to identify earthquake behaviour in architectural geometry with irregular configuration of non-parallel system. The present research was quantitative with simulation experimental method. It consisted of 5 models, where architectural data and model structure data were inputted and analyzed using the software SAP2000 in order to find out its performance, and ETAB2015 to determine the eccentricity occurred. The output of the software analysis was tabulated, graphed, compared and analyzed with relevant theories. For areas of strong earthquake zones, avoid designing buildings which wholly form irregular configuration of non-parallel system. 
If a building must include parts with an irregular configuration of a non-parallel system, make those parts more rigid by forming a triangle module, and use the formula. Good collaboration between architects and structural experts is needed in creating earthquake architecture.
Smith, Roger J
2008-10-01
A novel diagnostic technique for the remote and nonperturbative sensing of the local magnetic field in reactor relevant plasmas is presented. Pulsed polarimetry [Patent No. 12/150,169 (pending)] combines optical scattering with the Faraday effect. The polarimetric light detection and ranging (LIDAR)-like diagnostic has the potential to be a local B(pol) diagnostic on ITER and can achieve spatial resolutions of millimeters on high energy density (HED) plasmas using existing lasers. The pulsed polarimetry method is based on nonlocal measurements and subtle effects are introduced that are not present in either cw polarimetry or Thomson scattering LIDAR. Important features include the capability of simultaneously measuring local T(e), n(e), and B(parallel) along the line of sight, a resiliency to refractive effects, a short measurement duration providing near instantaneous data in time, and location for real-time feedback and control of magnetohydrodynamic (MHD) instabilities and the realization of a widely applicable internal magnetic field diagnostic for the magnetic fusion energy program. The technique improves for higher n(e)B(parallel) product and higher n(e) and is well suited for diagnosing the transient plasmas in the HED program. Larger devices such as ITER and DEMO are also better suited to the technique, allowing longer pulse lengths and thereby relaxing key technology constraints making pulsed polarimetry a valuable asset for next step devices. The pulsed polarimetry technique is clarified by way of illustration on the ITER tokamak and plasmas within the magnetized target fusion program within present technological means.
The Casimir effect for parallel plates revisited
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kawakami, N. A.; Nemes, M. C.; Wreszinski, Walter F.
2007-10-15
The Casimir effect for a massless scalar field with Dirichlet and periodic boundary conditions (bc's) on infinite parallel plates is revisited in the local quantum field theory (lqft) framework introduced by Kay [Phys. Rev. D 20, 3052 (1979)]. The model displays a number of more realistic features than the ones he treated. In addition to local observables, such as the energy density, we propose to consider intensive variables, such as the energy per unit area ε, as fundamental observables. Adopting this view, lqft rejects Dirichlet (the same result may be proved for Neumann or mixed) bc, and accepts periodic bc: in the former case ε diverges, in the latter it is finite, as is shown by an expression for the local energy density obtained from lqft through the use of the Poisson summation formula. Another way to see this uses methods from the Euler summation formula: in the proof of regularization independence of the energy per unit area, a regularization-dependent surface term arises upon use of Dirichlet bc, but not periodic bc. For the conformally invariant scalar quantum field, this surface term is absent due to the condition of zero trace of the energy momentum tensor, as remarked by De Witt [Phys. Rep. 19, 295 (1975)]. The latter property does not hold in the application to the dark energy problem in cosmology, in which we argue that periodic bc might play a distinguished role.
Free-surface liquid jet impingement on rib patterned superhydrophobic surfaces
NASA Astrophysics Data System (ADS)
Maynes, D.; Johnson, M.; Webb, B. W.
2011-05-01
We report experimental results characterizing the dynamics of a liquid jet impinging normally on hydrophilic, hydrophobic, and superhydrophobic surfaces spanning the Weber number (based on the jet velocity and diameter) range from 100 to 1900. The superhydrophobic surfaces are fabricated with both hydrophobically coated silicon and polydimethylsiloxane that exhibit alternating microribs and cavities. For all surfaces a transition from a thin radially moving liquid sheet occurs. This takes the form of the classical hydraulic jump for the hydrophilic surfaces but is markedly different for the hydrophobic and superhydrophobic surfaces, where the transition is significantly influenced by surface tension and a break-up into droplets is observed at high Weber number. For the superhydrophobic surfaces, the transition exhibits an elliptical shape with the major axis being aligned parallel to the ribs, concomitant with the frictional resistance being smaller in the parallel direction than in the transverse direction. However, the total projected area of the ellipse exhibits a nearly linear dependence on the jet Weber number, and is nominally invariant with varying hydrophobicity and relative size of the ribs and cavities. For the hydrophobic and superhydrophobic scenarios, the local Weber number based on the local radial velocity and local depth of the radially moving liquid sheet is observed to be of order unity at the transition location. The results also reveal that for increasing relative size of the cavities, the ratio of the ellipse axes (major-to-minor) increases.
Burian, Agata; Uyttewaal, Magalie
2013-01-01
Cortical microtubules (CMTs) are often aligned in a particular direction in individual cells or even in groups of cells and play a central role in the definition of growth anisotropy. How the CMTs themselves are aligned is not well known, but two hypotheses have been proposed. According to the first hypothesis, CMTs align perpendicular to the maximal growth direction, and, according to the second, CMTs align parallel to the maximal stress direction. Since both hypotheses were formulated on the basis of mainly qualitative assessments, the link between CMT organization, organ geometry, and cell growth is revisited using a quantitative approach. For this purpose, CMT orientation, local curvature, and growth parameters for each cell were measured in the growing shoot apical meristem (SAM) of Arabidopsis thaliana. Using this approach, it has been shown that stable CMTs tend to be perpendicular to the direction of maximal growth in cells at the SAM periphery, but parallel in the cells at the boundary domain. When examining the local curvature of the SAM surface, no strict correlation between curvature and CMT arrangement was found, which implies that SAM geometry, and presumed geometry-derived stress distribution, is not sufficient to prescribe the CMT orientation. However, a better match between stress and CMTs was found when mechanical stress derived from differential growth was also considered. PMID:24153420
Proposal for massively parallel data storage system
NASA Technical Reports Server (NTRS)
Mansuripur, M.
1992-01-01
An architecture for integrating large numbers of data storage units (drives) to form a distributed mass storage system is proposed. The network of interconnected units consists of nodes and links. At each node there resides a controller board, a data storage unit and, possibly, a local/remote user-terminal. The links (twisted-pair wires, coax cables, or fiber-optic channels) provide the communications backbone of the network. There is no central controller for the system as a whole; all decisions regarding allocation of resources, routing of messages and data-blocks, creation and distribution of redundant data-blocks throughout the system (for protection against possible failures), frequency of backup operations, etc., are made locally at individual nodes. The system can handle as many user-terminals as there are nodes in the network. Various users compete for resources by sending their requests to the local controller-board and receiving allocations of time and storage space. In principle, each user can have access to the entire system, and all drives can be running in parallel to service the requests for one or more users. The system is expandable up to a maximum number of nodes, determined by the number of routing-buffers built into the controller boards. Additional drives, controller-boards, user-terminals, and links can be simply plugged into an existing system in order to expand its capacity.
Field aligned flows driven by neutral puffing at MAST
NASA Astrophysics Data System (ADS)
Waters, I.; Frerichs, H.; Silburn, S.; Feng, Y.; Harrison, J.; Kirk, A.; Schmitz, O.
2018-06-01
Neutral deuterium gas puffing at the high field side of the mega ampere spherical tokamak (MAST) is shown to drive carbon impurity flows that are aligned with the trajectory of the magnetic field lines in the plasma scrape-off-layer. These impurity flows were directly imaged with emissions from C2+ ions at MAST by coherence imaging spectroscopy and were qualitatively reproduced in deuterium plasmas by modeling with the EMC3-EIRENE plasma edge fluid and kinetic neutral transport code. A reduced one-dimensional momentum and particle balance shows that a localized increase in the static plasma pressure in front of the neutral gas puff yields an acceleration of the plasma due to local ionization. Perpendicular particle transport yields a decay from which a parallel length scale can be determined. Parameter scans in EMC3-EIRENE were carried out to determine the sensitivity of the deuterium plasma flow phenomena to local fueling and diffusion parameters and it is found that these flows robustly form across a wide variety of plasma conditions. Finally, efforts to couple this behavior in the background plasma directly to the impurity flows observed experimentally in MAST using a trace impurity model are discussed. These results provide insight into the fueling and exhaust features at this pivotal point of the radial and parallel particle flux balance, which is a major part of the plasma fueling and exhaust characteristics in a magnetically confined fusion device.
Burian, Agata; Ludynia, Michal; Uyttewaal, Magalie; Traas, Jan; Boudaoud, Arezki; Hamant, Olivier; Kwiatkowska, Dorota
2013-12-01
Cortical microtubules (CMTs) are often aligned in a particular direction in individual cells or even in groups of cells and play a central role in the definition of growth anisotropy. How the CMTs themselves are aligned is not well known, but two hypotheses have been proposed. According to the first hypothesis, CMTs align perpendicular to the maximal growth direction, and, according to the second, CMTs align parallel to the maximal stress direction. Since both hypotheses were formulated on the basis of mainly qualitative assessments, the link between CMT organization, organ geometry, and cell growth is revisited using a quantitative approach. For this purpose, CMT orientation, local curvature, and growth parameters for each cell were measured in the growing shoot apical meristem (SAM) of Arabidopsis thaliana. Using this approach, it has been shown that stable CMTs tend to be perpendicular to the direction of maximal growth in cells at the SAM periphery, but parallel in the cells at the boundary domain. When examining the local curvature of the SAM surface, no strict correlation between curvature and CMT arrangement was found, which implies that SAM geometry, and presumed geometry-derived stress distribution, is not sufficient to prescribe the CMT orientation. However, a better match between stress and CMTs was found when mechanical stress derived from differential growth was also considered.
Martin, Adrian; Schiavi, Emanuele; Eryaman, Yigitcan; Herraiz, Joaquin L; Gagoski, Borjan; Adalsteinsson, Elfar; Wald, Lawrence L; Guerin, Bastien
2016-06-01
A new framework for the design of parallel transmit (pTx) pulses is presented introducing constraints for local and global specific absorption rate (SAR) in the presence of errors in the radiofrequency (RF) transmit chain. The first step is the design of a pTx RF pulse with explicit constraints for global and local SAR. Then, the worst possible SAR associated with that pulse due to RF transmission errors ("worst-case SAR") is calculated. Finally, this information is used to re-calculate the pulse with lower SAR constraints, iterating this procedure until its worst-case SAR is within safety limits. Analysis of an actual pTx RF transmit chain revealed amplitude errors as high as 8% (20%) and phase errors above 3° (15°) for spokes (spiral) pulses. Simulations show that using the proposed framework, pulses can be designed with controlled "worst-case SAR" in the presence of errors of this magnitude at minor cost of the excitation profile quality. Our worst-case SAR-constrained pTx design strategy yields pulses with local and global SAR within the safety limits even in the presence of RF transmission errors. This strategy is a natural way to incorporate SAR safety factors in the design of pTx pulses. Magn Reson Med 75:2493-2504, 2016. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Fall, András; Ukar, Estibalitz; Laubach, Stephen E.
2016-09-01
Electron backscattered diffraction techniques (EBSD) show that Dauphiné twins in quartz are widespread in many tectonometamorphic environments. Our study documents that under diagenetic temperatures (< 200 °C) and burial depths < 5 km Dauphiné twins are common in isolated fracture quartz deposits spanning between fracture walls (i.e., quartz bridges) in low-porosity quartz-cemented sandstones. Using examples from East Texas and Colorado cores, we show that twins are associated with crack-seal microstructure and fluid inclusions. Fracture wall-parallel and wall-normal inclusion trails contain coexisting aqueous and hydrocarbon gas inclusions, so homogenization temperatures of aqueous inclusions record true trapping temperatures. Inclusions in alignments normal to fracture walls are large and irregularly shaped compared to those aligned parallel to walls, but both show similar liquid-to-vapor ratios. Stacking transmitted light images with scanning electron microscope cathodoluminescence (SEM-CL) and EBSD images demonstrates that Dauphiné twin boundaries are localized along wall-normal inclusion trails. Trapping temperatures for wall-normal inclusion trails are usually higher than those aligned parallel to the fracture wall. Wall-normal fluid inclusion assemblage temperatures typically match the highest temperatures of wall-parallel assemblages trapped during sequential widening, but not necessarily the most recent. In the context of burial histories for these samples, this temperature pattern implies that wall-normal assemblages form at discrete times during or after crack-seal fracture widening. Localization in isolated, potentially high-stress quartz deposits in fractures is compatible with a mechanical origin for these Dauphiné twins. Punctuated temperature values and discrepant sizes and shapes of inclusions in wall-normal trails imply that twinning is a by-product of the formation of the wall-normal inclusion assemblages.
The association of Dauphiné twins and fluid inclusion assemblages from which temperature and possibly timing can be inferred provides a way to research timing as well as magnitude of paleostress in some diagenetic settings.
Local Norms and Test Characteristics for Selected Forms of the M.A.A. Placement Test.
ERIC Educational Resources Information Center
Melancon, Janet G.; Thompson, Bruce
The psychometric integrity of selected items from the Mathematical Association of America (MAA) placement tests for college students was investigated. Two alternative and parallel versions of the test were developed (Form A and Form B) for this study. Data for 539 students seeking admission into an undergraduate mathematics curriculum at a private…
Random Walk Method for Potential Problems
NASA Technical Reports Server (NTRS)
Krishnamurthy, T.; Raju, I. S.
2002-01-01
A local Random Walk Method (RWM) for potential problems governed by Laplace's and Poisson's equations is developed for two- and three-dimensional problems. The RWM is implemented and demonstrated in a multiprocessor parallel environment on a Beowulf cluster of computers. A speed gain of 16 is achieved as the number of processors is increased from 1 to 23.
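As a rough illustration of the random-walk idea for Laplace's equation, the sketch below uses the classic fixed-grid walk: the solution at an interior node is estimated as the average boundary value reached by many independent walkers. This is not the local RWM of the abstract, whose details are not given here; the function name, grid size, and boundary function are all assumptions chosen for the example. Because the walkers are independent, the loop over walks is the part that would naturally be split across processors.

```python
import random


def random_walk_estimate(start, n, boundary_value, walks=2000, seed=0):
    """Estimate the Laplace solution at `start` on the square grid [0, n] x [0, n].

    Hypothetical helper: each walker steps to a random 4-neighbour until it
    reaches the boundary, then contributes the boundary value found there.
    """
    rng = random.Random(seed)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    total = 0.0
    for _ in range(walks):
        x, y = start
        # Walk until the walker leaves the interior of the square.
        while 0 < x < n and 0 < y < n:
            dx, dy = rng.choice(steps)
            x, y = x + dx, y + dy
        total += boundary_value(x, y)
    return total / walks


if __name__ == "__main__":
    n = 10
    # u(x, y) = x / n is harmonic, so the exact value at (3, 5) is 0.3;
    # the Monte Carlo estimate should land close to it.
    print(random_walk_estimate((3, 5), n, lambda x, y: x / n))
```

The statistical error shrinks only as the inverse square root of the number of walks, which is why distributing walkers over many processors is attractive for this family of methods.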
Boundary and object detection in real world images. [by means of algorithms
NASA Technical Reports Server (NTRS)
Yakimovsky, Y.
1974-01-01
A solution to the problem of automatic location of objects in digital pictures by computer is presented. A self-scaling local edge detector which can be applied in parallel on a picture is described. Clustering algorithms and boundary following algorithms which are sequential in nature process the edge data to locate images of objects.
ERIC Educational Resources Information Center
Wedin, Åsa; Wessman, Anneli
2017-01-01
In this article, language policy is analysed in relation to multilingual practices in primary school through an understanding of the policy on different levels--as management, perception and practice. The article is based on longitudinal ethnographic action research that was conducted parallel to local school development. Here we draw on material…
Parallel processing for pitch splitting decomposition
NASA Astrophysics Data System (ADS)
Barnes, Levi; Li, Yong; Wadkins, David; Biederman, Steve; Miloslavsky, Alex; Cork, Chris
2009-10-01
Decomposition of an input pattern in preparation for a double patterning process is an inherently global problem in which the influence of a local decomposition decision can be felt across an entire pattern. In spite of this, a large portion of the work can be massively distributed. Here, we discuss the advantages of geometric distribution for polygon operations with limited range of influence. Further, we have found that even the naturally global "coloring" step can, in large part, be handled in a geometrically local manner. In some practical cases, up to 70% of the work can be distributed geometrically. We also describe the methods for partitioning the problem into local pieces and present scaling data up to 100 CPUs. These techniques reduce DPT decomposition runtime by orders of magnitude.
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
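To make the notion of a local update with an active list concrete, here is a minimal sketch on a regular two-dimensional grid with constant speed. It is not the tetrahedral, SIMD-oriented algorithm of the paper: the grid, the first-order upwind update, and all function names are assumptions made for illustration. Each active node is recomputed from its neighbours only, and a node whose value improves reactivates its neighbours, so updates need no global ordering.

```python
import math

INF = float("inf")


def neighbours(i, j, rows, cols):
    """Return the in-bounds 4-connected neighbours of grid node (i, j)."""
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def local_update(u, i, j, slowness, h):
    """First-order upwind update of node (i, j) from its neighbours."""
    rows, cols = len(u), len(u[0])
    a = min(u[i - 1][j] if i > 0 else INF, u[i + 1][j] if i < rows - 1 else INF)
    b = min(u[i][j - 1] if j > 0 else INF, u[i][j + 1] if j < cols - 1 else INF)
    if min(a, b) == INF:
        return INF  # no neighbour carries a finite arrival time yet
    fh = slowness * h
    if abs(a - b) >= fh:
        return min(a, b) + fh  # the front arrives along one axis only
    return 0.5 * (a + b + math.sqrt(2.0 * fh * fh - (a - b) ** 2))


def solve_eikonal(rows, cols, sources, slowness=1.0, h=1.0, tol=1e-12):
    """Iterate local updates over an active set until no value improves."""
    u = [[INF] * cols for _ in range(rows)]
    fixed = set(sources)
    for i, j in fixed:
        u[i][j] = 0.0
    active = {nb for s in fixed for nb in neighbours(*s, rows, cols)} - fixed
    while active:
        next_active = set()
        for i, j in active:
            new_value = local_update(u, i, j, slowness, h)
            if new_value < u[i][j] - tol:
                u[i][j] = new_value
                # An improved node may allow its neighbours to improve too.
                next_active.update(
                    nb for nb in neighbours(i, j, rows, cols) if nb not in fixed
                )
        active = next_active
    return u


if __name__ == "__main__":
    arrival = solve_eikonal(5, 5, [(0, 0)])
    print(arrival[0][4], arrival[1][1])
```

In a parallel setting the nodes of one active set could be updated concurrently, since each update reads only neighbouring values; the sketch runs them sequentially for clarity.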
Subcostal Transverse Abdominis Plane Block for Acute Pain Management: A Review.
Soliz, Jose M; Lipski, Ian; Hancher-Hodges, Shannon; Speer, Barbra Bryce; Popat, Keyuri
2017-10-01
The subcostal transverse abdominis plane (SCTAP) block is the deposition of local anesthetic in the transverse abdominis plane inferior and parallel to the costal margin. There is a growing consensus that the SCTAP block provides better analgesia for upper abdominal incisions than the traditional transverse abdominis plane block. In addition, when used as part of a four-quadrant transverse abdominis plane block, the SCTAP block may provide adequate analgesia for major abdominal surgery. The purpose of this review is to discuss the SCTAP block, including its indications, technique, local anesthetic solutions, and outcomes.
Local Variation of Hashtag Spike Trains and Popularity in Twitter
Sanlı, Ceyda; Lambiotte, Renaud
2015-01-01
We draw a parallel between hashtag time series and neuron spike trains. In each case, the process presents complex dynamic patterns including temporal correlations, burstiness, and all other types of nonstationarity. We propose the adoption of the so-called local variation in order to uncover salient dynamical properties, while properly detrending for the time-dependent features of a signal. The methodology is tested on both real and randomized hashtag spike trains, and identifies that popular hashtags present regular and so less bursty behavior, suggesting its potential use for predicting online popularity in social media. PMID:26161650
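For readers unfamiliar with the measure, the sketch below computes the local variation in the form commonly used in spike-train analysis, Lv = 3/(n-1) · Σ((τ_i − τ_{i+1})/(τ_i + τ_{i+1}))², where τ_i are consecutive inter-event intervals. Whether the paper uses exactly this normalization is an assumption, and the function name is hypothetical. Under this definition a perfectly regular train gives 0 and a Poisson process gives a value near 1, with larger values indicating burstier activity.

```python
def local_variation(event_times):
    """Local variation of a strictly increasing sequence of event times.

    Assumes the standard spike-train definition; the input must contain at
    least three events so that two consecutive intervals exist.
    """
    intervals = [t1 - t0 for t0, t1 in zip(event_times, event_times[1:])]
    n = len(intervals)
    if n < 2:
        raise ValueError("need at least three event times")
    # Compare each interval only with its successor, which makes the
    # measure insensitive to slow changes in the overall rate.
    total = sum(((a - b) / (a + b)) ** 2 for a, b in zip(intervals, intervals[1:]))
    return 3.0 * total / (n - 1)


if __name__ == "__main__":
    print(local_variation([0, 1, 2, 3, 4]))   # evenly spaced events
    print(local_variation([0, 1, 4, 5, 8]))   # alternating short and long gaps
```

Because only adjacent intervals are compared, a hashtag whose posting rate drifts over the day is not penalized for that drift, which is the detrending property the abstract refers to.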
Dipolar particles in a double-trap confinement: Response to tilting the dipolar orientation
NASA Astrophysics Data System (ADS)
Bjerlin, J.; Bengtsson, J.; Deuretzbacher, F.; Kristinsdóttir, L. H.; Reimann, S. M.
2018-02-01
We analyze the microscopic few-body properties of dipolar particles confined in two parallel quasi-one-dimensional harmonic traps. In particular, we show that an adiabatic rotation of the dipole orientation about the trap axes can drive an initially nonlocalized few-fermion state into a localized state with strong intertrap pairing. With an instant, nonadiabatic rotation, however, localization is inhibited and a highly excited state is reached. This state may be interpreted as the few-body analog of a super-Tonks-Girardeau state, known from one-dimensional systems with contact interactions.
Parallel public spheres: distance and discourse in letters to the editor.
Perrin, Andrew J; Vaisey, Stephen
2008-11-01
This article examines letters to the editor as one of the ways citizens seek to enact a public sphere using technological mediation. Using a sample of all letters received by a metropolitan newspaper during a three-month period (N = 1,113), the authors demonstrate that the tone and argumentative styles of letters differ with the scope of the issues the letters address. Local issues evoke more reasoned, conciliatory tones, while issues beyond the local context evoke more emotional, confrontational tones, even after controlling for individual writers' characteristics and anger as a motivation to write.
Sound source tracking device for telematic spatial sound field reproduction
NASA Astrophysics Data System (ADS)
Cardenas, Bruno
This research describes an algorithm that localizes sound sources for use in telematic applications. The localization algorithm is based on amplitude differences between various channels of a microphone array of directional shotgun microphones. The amplitude differences will be used to locate multiple performers and reproduce their voices, which were recorded at close distance with lavalier microphones, spatially corrected using a loudspeaker rendering system. In order to track multiple sound sources in parallel the information gained from the lavalier microphones will be utilized to estimate the signal-to-noise ratio between each performer and the concurrent performers.
Effects of parallel planning on agreement production.
Veenstra, Alma; Meyer, Antje S; Acheson, Daniel J
2015-11-01
An important issue in current psycholinguistics is how the time course of utterance planning affects the generation of grammatical structures. The current study investigated the influence of parallel activation of the components of complex noun phrases on the generation of subject-verb agreement. Specifically, the lexical interference account (Gillespie & Pearlmutter, 2011b; Solomon & Pearlmutter, 2004) predicts more agreement errors (i.e., attraction) for subject phrases in which the head and local noun mismatch in number (e.g., the apple next to the pears) when nouns are planned in parallel than when they are planned in sequence. We used a speeded picture description task that yielded sentences such as the apple next to the pears is red. The objects mentioned in the noun phrase were either semantically related or unrelated. To induce agreement errors, pictures sometimes mismatched in number. In order to manipulate the likelihood of parallel processing of the objects and to test the hypothesized relationship between parallel processing and the rate of agreement errors, the pictures were either placed close together or far apart. Analyses of the participants' eye movements and speech onset latencies indicated slower processing of the first object and stronger interference from the related (compared to the unrelated) second object in the close than in the far condition. Analyses of the agreement errors yielded an attraction effect, with more errors in mismatching than in matching conditions. However, the magnitude of the attraction effect did not differ across the close and far conditions. Thus, spatial proximity encouraged parallel processing of the pictures, which led to interference of the associated conceptual and/or lexical representation, but, contrary to the prediction, it did not lead to more attraction errors. Copyright © 2015 Elsevier B.V. All rights reserved.
Huang, Jianhua
2012-07-01
There are three methods for calculating thermal insulation of clothing measured with a thermal manikin, i.e. the global method, the serial method, and the parallel method. Under the condition of homogeneous clothing insulation, these three methods yield the same insulation values. If the local heat flux is uniform over the manikin body, the global and serial methods provide the same insulation value. In most cases, the serial method gives a higher insulation value than the global method. There is a possibility that the insulation value from the serial method is lower than the value from the global method. The serial method always gives higher insulation value than the parallel method. The insulation value from the parallel method is higher or lower than the value from the global method, depending on the relationship between the heat loss distribution and the surface temperatures. Under the circumstance of uniform surface temperature distribution over the manikin body, the global and parallel methods give the same insulation value. If the constant surface temperature mode is used in the manikin test, the parallel method can be used to calculate the thermal insulation of clothing. If the constant heat flux mode is used in the manikin test, the serial method can be used to calculate the thermal insulation of clothing. The global method should be used for calculating thermal insulation of clothing for all manikin control modes, especially for thermal comfort regulation mode. The global method should be chosen by clothing manufacturers for labelling their products. The serial and parallel methods provide more information with respect to the different parts of clothing.
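The relationships stated above can be checked numerically. The sketch below assumes the usual segment-based definitions, which the abstract itself does not spell out: the global method divides the area-weighted mean temperature difference by the area-weighted mean heat flux, the serial method is the area-weighted sum of local insulations, and the parallel method is the inverse of the area-weighted sum of local conductances. Function names and the two-segment data are hypothetical.

```python
def global_insulation(area_frac, t_surf, heat_flux, t_air):
    """Area-weighted mean temperature difference over area-weighted mean flux."""
    mean_t = sum(a * t for a, t in zip(area_frac, t_surf))
    mean_h = sum(a * h for a, h in zip(area_frac, heat_flux))
    return (mean_t - t_air) / mean_h


def serial_insulation(area_frac, t_surf, heat_flux, t_air):
    """Area-weighted sum of the local insulations of each segment."""
    return sum(a * (t - t_air) / h for a, t, h in zip(area_frac, t_surf, heat_flux))


def parallel_insulation(area_frac, t_surf, heat_flux, t_air):
    """Inverse of the area-weighted sum of the local conductances."""
    return 1.0 / sum(a * h / (t - t_air) for a, t, h in zip(area_frac, t_surf, heat_flux))


if __name__ == "__main__":
    areas = [0.5, 0.5]          # area fractions of two manikin segments
    t_air = 20.0                # ambient temperature, degrees C
    # Uniform heat flux (W/m^2): global and serial values coincide.
    print(global_insulation(areas, [34.0, 30.0], [50.0, 50.0], t_air),
          serial_insulation(areas, [34.0, 30.0], [50.0, 50.0], t_air))
    # Uniform surface temperature: global and parallel values coincide.
    print(global_insulation(areas, [34.0, 34.0], [40.0, 80.0], t_air),
          parallel_insulation(areas, [34.0, 34.0], [40.0, 80.0], t_air))
```

Under these definitions the serial value is a weighted arithmetic mean of the local insulations and the parallel value is the corresponding weighted harmonic mean, so the serial value can never be smaller than the parallel one.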
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2014-05-01
Model Integration System (MIST) is an open-source environmental modelling programming language that directly incorporates data parallelism. The language is designed to enable straightforward programming structures, such as nested loops and conditional statements, to be directly translated into sequences of whole-array (or more generally whole data-structure) operations. MIST thus enables the programmer to use well-understood constructs, directly relating to the mathematical structure of the model, without having to explicitly vectorize code or worry about details of parallelization. A range of common modelling operations are supported by dedicated language structures operating on cell neighbourhoods rather than individual cells (e.g., the 3x3 local neighbourhood needed to implement an averaging image filter can be simply accessed from within a simple loop traversing all image pixels). This facility hides details of inter-process communication behind more mathematically relevant descriptions of model dynamics. The MIST automatic vectorization/parallelization process serves both to distribute work among available nodes and separately to control storage requirements for intermediate expressions, enabling operations on very large domains for which memory availability may be an issue. MIST is designed to facilitate efficient interpreter-based implementations. A prototype open-source interpreter is available, coded in standard FORTRAN 95, with tools to rapidly integrate existing FORTRAN 77 or 95 code libraries. The language is formally specified and thus not limited to FORTRAN implementation or to an interpreter-based approach. A MIST to FORTRAN compiler is under development and volunteers are sought to create an ANSI-C implementation. Parallel processing is currently implemented using OpenMP. However, parallelization code is fully modularised and could be replaced with implementations using other libraries. GPU implementation is potentially possible.
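The averaging-filter example mentioned above can be written as the kind of plain per-pixel loop that the abstract says MIST translates into whole-array operations. The sketch is in Python, not MIST, whose syntax is not shown in the abstract; the function name and the choice to shrink the window at the image edges are assumptions.

```python
def mean_filter_3x3(image):
    """Replace each pixel by the mean of its 3x3 neighbourhood.

    `image` is a list of equal-length rows; at the edges the window is
    clipped to the pixels that exist (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Gather the neighbourhood of the current cell only.
            window = [
                image[r][c]
                for r in range(max(0, i - 1), min(rows, i + 2))
                for c in range(max(0, j - 1), min(cols, j + 2))
            ]
            out[i][j] = sum(window) / len(window)
    return out


if __name__ == "__main__":
    picture = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    for row in mean_filter_3x3(picture):
        print(row)
```

Every output pixel depends only on a fixed neighbourhood of the input, so the iterations are independent of one another; that independence is what makes automatic vectorization or distribution of such a loop feasible.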
Echegaray, Sebastian; Bakr, Shaimaa; Rubin, Daniel L; Napel, Sandy
2017-10-06
The aim of this study was to develop an open-source, modular, locally run or server-based system for 3D radiomics feature computation that can be used on any computer system and included in existing workflows for understanding associations and building predictive models between image features and clinical data, such as survival. The QIFE exploits various levels of parallelization for use on multiprocessor systems. It consists of a managing framework and four stages: input, pre-processing, feature computation, and output. Each stage contains one or more swappable components, allowing run-time customization. We benchmarked the engine using various levels of parallelization on a cohort of CT scans presenting 108 lung tumors. Two versions of the QIFE have been released: (1) the open-source MATLAB code posted to GitHub, (2) a compiled version loaded in a Docker container, posted to DockerHub, which can be easily deployed on any computer. The QIFE processed 108 objects (tumors) in 2:12 (h:mm) using one core, and 1:04 (h:mm) using four cores with object-level parallelization. We developed the Quantitative Image Feature Engine (QIFE), an open-source feature-extraction framework that focuses on modularity, standards, parallelism, provenance, and integration. Researchers can easily integrate it with their existing segmentation and imaging workflows by creating input and output components that implement their existing interfaces. Computational efficiency can be improved by parallelizing execution at the cost of memory usage. Different parallelization levels provide different trade-offs, and the optimal setting will depend on the size and composition of the dataset to be processed.
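The reported timings allow a quick estimate of how well the object-level parallelization scales. The sketch below reads 2:12 and 1:04 as hours and minutes, which is an interpretation of the abstract's notation, and uses hypothetical helper names; it computes the speedup and the parallel efficiency (speedup divided by core count).

```python
def to_minutes(h_mm):
    """Convert an 'h:mm' string such as '2:12' to minutes."""
    hours, minutes = h_mm.split(":")
    return int(hours) * 60 + int(minutes)


def speedup(serial_time, parallel_time):
    """Ratio of single-core run time to multi-core run time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, cores):
    """Speedup per core; 1.0 would mean ideal linear scaling."""
    return speedup(serial_time, parallel_time) / cores


if __name__ == "__main__":
    one_core = to_minutes("2:12")    # 132 minutes
    four_cores = to_minutes("1:04")  # 64 minutes
    print(speedup(one_core, four_cores))
    print(efficiency(one_core, four_cores, 4))
```

On these numbers four cores roughly halve the run time, so about half of the added capacity translates into speed, in line with the abstract's remark that parallelization levels involve trade-offs.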
Parallel Programming Strategies for Irregular Adaptive Applications
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2001-01-01
Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance for such computations. In this work, we examine two typical irregular adaptive applications, Dynamic Remeshing and N-Body, under competing programming methodologies and across various parallel architectures. The Dynamic Remeshing application simulates flow over an airfoil, and refines localized regions of the underlying unstructured mesh. The N-Body experiment models two neighboring Plummer galaxies that are about to undergo a merger. Both problems demonstrate dramatic changes in processor workloads and interprocessor communication with time; thus, dynamic load balancing is a required component.
Ising versus XY anisotropy in frustrated R2Ti2O7 compounds as "Seen" by Polarized Neutrons.
Cao, H; Gukasov, A; Mirebeau, I; Bonville, P; Decorse, C; Dhalenne, G
2009-07-31
We studied the field induced magnetic order in R2Ti2O7 pyrochlore compounds with either uniaxial (R=Ho, Tb) or planar (R=Er, Yb) anisotropy, by polarized neutron diffraction. The determination of the local susceptibility tensor {χ∥, χ⊥} provides a universal description of the field induced structures in the paramagnetic phase (2-270 K), whatever the field value (1-7 T) and direction. Comparison of the thermal variations of χ∥ and χ⊥ with calculations using the rare earth crystal field shows that exchange and dipolar interactions must be taken into account. We determine the molecular field tensor in each case and show that it can be strongly anisotropic.
Compressional residual stress in Bastogne boudins revealed by synchrotron X-ray microdiffraction
Chen, Kai; Kunz, Martin; Li, Yao; ...
2016-06-22
Lattice distortions in crystals can be mapped at the micron scale using synchrotron X-ray Laue microdiffraction (μXRD). From lattice distortions the shape and orientation of the elastic strain tensor can be derived and interpreted in terms of residual stress. We apply the new method to vein quartz from the original boudinage locality at Bastogne, Belgium. A long-standing debate surrounds the kinematics of the Bastogne boudins. The μXRD measurements reveal a shortening residual elastic strain, perpendicular to the vein wall, corroborating the model that the Bastogne boudins formed by layer-parallel shortening and not by layer-parallel extension, as the geological community generally infers for the process of boudinage.
A proposed experimental search for chameleons using asymmetric parallel plates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burrage, Clare; Copeland, Edmund J.; Stevenson, James A., E-mail: Clare.Burrage@nottingham.ac.uk, E-mail: ed.copeland@nottingham.ac.uk, E-mail: james.stevenson@nottingham.ac.uk
2016-08-01
Light scalar fields coupled to matter are a common consequence of theories of dark energy and attempts to solve the cosmological constant problem. The chameleon screening mechanism is commonly invoked in order to suppress the fifth forces mediated by these scalars, sufficiently to avoid current experimental constraints, without fine tuning. The force is suppressed dynamically by allowing the mass of the scalar to vary with the local density. Recently it has been shown that near-future cold-atom experiments using atom interferometry have the ability to access a large proportion of the chameleon parameter space. In this work we demonstrate how experiments utilising asymmetric parallel plates can push deeper into the remaining parameter space available to the chameleon.
Implementation of highly parallel and large scale GW calculations within the OpenAtom software
NASA Astrophysics Data System (ADS)
Ismail-Beigi, Sohrab
The need to describe electronic excitations with better accuracy than provided by band structures produced by Density Functional Theory (DFT) has been a long-term enterprise for the computational condensed matter and materials theory communities. In some cases, appropriate theoretical frameworks have existed for some time but have been difficult to apply widely due to computational cost. For example, the GW approximation incorporates a great deal of important non-local and dynamical electronic interaction effects but has been too computationally expensive for routine use in large materials simulations. OpenAtom is an open source massively parallel ab initio density functional software package based on plane waves and pseudopotentials (http://charm.cs.uiuc.edu/OpenAtom/) that takes advantage of the Charm++ parallel framework. At present, it is developed via a three-way collaboration, funded by an NSF SI2-SSI grant (ACI-1339804), between Yale (Ismail-Beigi), IBM T. J. Watson (Glenn Martyna) and the University of Illinois at Urbana-Champaign (Laxmikant Kale). We will describe the project and our current approach towards implementing large scale GW calculations with OpenAtom. Potential applications of large scale parallel GW software for problems involving electronic excitations in semiconductor and/or metal oxide systems will also be pointed out.
A Family of ACO Routing Protocols for Mobile Ad Hoc Networks.
Rupérez Cañas, Delfín; Sandoval Orozco, Ana Lucila; García Villalba, Luis Javier; Kim, Tai-Hoon
2017-05-22
In this work, an ACO routing protocol for mobile ad hoc networks based on AntHocNet is specified. Like its predecessor, this new protocol, called AntOR, is hybrid in the sense that it contains elements from both reactive and proactive routing. Specifically, it combines a reactive route setup process with a proactive route maintenance and improvement process. Key aspects of the AntOR protocol are the disjoint-link and disjoint-node routes, separation between the regular pheromone and the virtual pheromone in the diffusion process and the exploration of routes, taking into consideration the number of hops in the best routes. In this work, a family of ACO routing protocols based on AntOR is also specified. These protocols are obtained by successive refinements of the base protocol. We also present a parallelized version of AntOR that we call PAntOR. Targeting shared-memory multiprocessor architectures, PAntOR runs tasks in parallel using threads. This parallelization is applicable in the route setup phase, route local repair process and link failure notification. In addition, a variant of PAntOR that consists of having more than one interface, which we call PAntOR-MI (PAntOR-Multiple Interface), is specified. This approach parallelizes the sending of broadcast messages by interface through threads.
Spatial processing in the auditory cortex of the macaque monkey
NASA Astrophysics Data System (ADS)
Recanzone, Gregg H.
2000-10-01
The patterns of cortico-cortical and cortico-thalamic connections of auditory cortical areas in the rhesus monkey have led to the hypothesis that acoustic information is processed in series and in parallel in the primate auditory cortex. Recent physiological experiments in the behaving monkey indicate that the response properties of neurons in different cortical areas are both functionally distinct from each other, which is indicative of parallel processing, and functionally similar to each other, which is indicative of serial processing. Thus, auditory cortical processing may be similar to the serial and parallel "what" and "where" processing by the primate visual cortex. If "where" information is serially processed in the primate auditory cortex, neurons in cortical areas along this pathway should have progressively better spatial tuning properties. This prediction is supported by recent experiments that have shown that neurons in the caudomedial field have better spatial tuning properties than neurons in the primary auditory cortex. Neurons in the caudomedial field are also better than primary auditory cortex neurons at predicting the sound localization ability across different stimulus frequencies and bandwidths in both azimuth and elevation. These data support the hypothesis that the primate auditory cortex processes acoustic information in a serial and parallel manner and suggest that this may be a general cortical mechanism for sensory perception.
Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER.
Ferreira, Miguel; Roma, Nuno; Russo, Luis M S
2014-05-30
HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar's striped processing pattern with the Intel SSE2 instruction set extension. A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a speedup of up to two times, depending on the model's size.
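For readers unfamiliar with the decoding step being vectorized, the sketch below shows a plain scalar, textbook Viterbi recursion in log space. It is not the striped SSE2 scheme, the inter-task scheme, or the COPS implementation described above, and the two-state model, its state names and its probabilities are invented purely for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for an observation sequence."""
    # scores[s] holds the best log-probability of any path ending in state s.
    scores = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict of back-pointers per step after the first
    for symbol in obs[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = scores[prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        scores = new_scores
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


def to_log(table):
    """Convert a {key: probability} table to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical toy model (not taken from HMMER).
states = ["S1", "S2"]
log_start = to_log({"S1": 0.6, "S2": 0.4})
log_trans = {"S1": to_log({"S1": 0.7, "S2": 0.3}),
             "S2": to_log({"S1": 0.4, "S2": 0.6})}
log_emit = {"S1": to_log({"a": 0.5, "b": 0.4, "c": 0.1}),
            "S2": to_log({"a": 0.1, "b": 0.3, "c": 0.6})}

log_prob, path = viterbi(["a", "b", "c"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

The inner loop over states is the part that SIMD schemes spread across vector lanes, either within one sequence or across several sequences at once.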
Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin
2013-12-01
Segmentation of carotid artery intima-media in longitudinal ultrasound images for measuring its thickness to predict cardiovascular diseases can be simplified as detecting two nearly parallel boundaries within a certain distance range, when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection, dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply the DLD to ultrasound intima-media segmentation, it is imbedded in a framework that employs an edge map obtained from multiplication of the responses of two edge detectors with different scales and a coupled snake model that simultaneously deforms the two contours for maintaining parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency. Copyright © 2013 Elsevier B.V. All rights reserved.
Deformation along the leading edge of the Maiella thrust sheet in central Italy
NASA Astrophysics Data System (ADS)
Aydin, Atilla; Antonellini, Marco; Tondi, Emanuele; Agosta, Fabrizio
2010-09-01
The eastern forelimb of the Maiella anticline above the leading edge of the underlying thrust displays a complex system of fractures, faults and a series of kink bands in the Cretaceous platform carbonates. The kink bands have steep limbs, display top-to-the-east shear, parallel to the overall transport direction, and are brecciated and faulted. A system of pervasive normal faults, trending sub-parallel to the strike of the mechanical layers, accommodates local extension generated by flexural slip. Two sets of strike-slip faults exist: one is left-lateral at a high angle to the main Maiella thrust; the other is right-lateral, intersecting the first set at an acute angle. The normal and strike-slip faults were formed by shearing across bed-parallel, strike-, and dip-parallel pressure solution seams and associated splays; the thrust faults follow the tilted mechanical layers along the steeper limb of the kink bands. The three pervasive, mutually-orthogonal pressure solution seams are pre-tilting. One set of low-angle normal faults, the oldest set in the area, is also pre-tilting. All other fault/fold structures appear to show signs of overlapping periods of activity accounting for the complex tri-shear-like deformation that developed as the front evolved during the Oligocene-Pliocene Apennine orogeny.
Adsorption and dissociation of molecular oxygen on α-Pu (0 2 0) surface: A density functional study
NASA Astrophysics Data System (ADS)
Wang, Jianguang; Ray, Asok K.
2011-09-01
Molecular and dissociative oxygen adsorptions on the α-Pu (0 2 0) surface have been systematically studied using the full-potential linearized augmented-plane-wave plus local orbitals (FP-LAPW+lo) basis method and the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional. Chemisorption energies have been optimized for the distance of the admolecule from the Pu surface and the bond length of O-O atoms for four adsorption sites and three approaches of O2 admolecule to the (0 2 0) surface. Chemisorption energies have been calculated at the scalar relativistic level with no spin-orbit coupling (NSOC) and at the fully relativistic level with spin-orbit coupling (SOC). Dissociative adsorptions are found at the two horizontal approaches (O2 is parallel to the surface and perpendicular/parallel to a lattice vector). Hor2 (O2 is parallel to the surface and perpendicular to a lattice vector) approach at the one-fold top site is the most stable adsorption site, with chemisorption energies of 8.048 and 8.415 eV for the NSOC and SOC cases, respectively, and an O-O separation of 3.70 Å. Molecular adsorption occurs at the Vert (O2 is vertical to the surface) approach of each adsorption site. The calculated work functions and net spin magnetic moments, respectively, increase and decrease in all cases upon chemisorption compared to the clean surface. The partial charges inside the muffin-tins, the difference charge density distributions, and the local density of states have been used to investigate the Pu-admolecule electronic structures and bonding mechanisms.
Generation of BBFs and DFs, Formation of Substorm Auroras and Triggers of Substorm Onset
NASA Astrophysics Data System (ADS)
Song, Y.; Lysak, R. L.
2014-12-01
Substorm onset is a dynamical response of the MI coupling system to external solar wind driving conditions and to internal dynamical processes. During the growth phase, the solar wind energy and momentum are transferred into the magnetosphere via MHD mesoscale Alfvenic interactions throughout the magnetopause current sheet. A decrease in momentum transfer from the solar wind into the magnetosphere starts a preconditioning stage, and produces a strong earthward body force acting on the whole magnetotail within a short time period. The strong earthward force will cause localized transients in the tail, such as multiple BBFs, DFs, plasma bubbles, and excited MHD waves. On auroral flux tubes, FACs carried by Alfven waves are generated by Alfvenic interactions between tail earthward flows associated with BBFs/DFs/Bubbles and the ionospheric drag. Nonlinear Alfvenic interaction between the incident and reflected Alfven wave packets in the auroral acceleration region can produce localized parallel electric fields and substorm auroral arcs. During the preconditioning stage prior to substorm onset, the generation of parallel electric fields and auroral arcs can redistribute perpendicular mechanical and magnetic stresses, "decoupling" the magnetosphere from the ionospheric drag. This will enhance the tail earthward flows and rapidly build up stronger parallel electric fields in the auroral acceleration region, leading to a sudden and violent tail energy release and substorm auroral poleward expansion. We suggest that in the preconditioning stage, the decrease in the solar wind momentum transfer is a necessary condition of the substorm onset. Additionally, "decoupling" the magnetosphere from the ionospheric drag can trigger substorm expansion onset.
Clinal variation at phenology-related genes in spruce: parallel evolution in FTL2 and Gigantea?
Chen, Jun; Tsuda, Yoshiaki; Stocks, Michael; Källman, Thomas; Xu, Nannan; Kärkkäinen, Katri; Huotari, Tea; Semerikov, Vladimir L; Vendramin, Giovanni G; Lascoux, Martin
2014-07-01
Parallel clines in different species, or in different geographical regions of the same species, are an important source of information on the genetic basis of local adaptation. We recently detected latitudinal clines in SNP frequencies and gene expression of candidate genes for growth cessation in Scandinavian populations of Norway spruce (Picea abies). Here we test whether the same clines are also present in Siberian spruce (P. obovata), a close relative of Norway spruce with a different Quaternary history. We sequenced nine candidate genes and 27 control loci and genotyped 14 SSR loci in six populations of P. obovata located along the Yenisei river from latitude 56°N to latitude 67°N. In contrast to Scandinavian Norway spruce that both departs from the standard neutral model (SNM) and shows a clear population structure, Siberian spruce populations along the Yenisei do not depart from the SNM and are genetically unstructured. Nonetheless, as in Norway spruce, growth cessation is significantly clinal. Polymorphisms in photoperiodic (FTL2) and circadian clock (Gigantea, GI, PRR3) genes also show significant clinal variation and/or evidence of local selection. In GI, one of the variants is the same as in Norway spruce. Finally, a strong cline in gene expression is observed for FTL2, but not for GI. These results, together with recent physiological studies, confirm the key role played by FTL2 and circadian clock genes in the control of growth cessation in spruce species and suggest the presence of parallel adaptation in these two species. Copyright © 2014 by the Genetics Society of America.
Heat loads on poloidal and toroidal edges of castellated plasma-facing components in COMPASS
NASA Astrophysics Data System (ADS)
Dejarnac, R.; Corre, Y.; Vondracek, P.; Gaspar, J.; Gauthier, E.; Gunn, J. P.; Komm, M.; Gardarein, J.-L.; Horacek, J.; Hron, M.; Matejicek, J.; Pitts, R. A.; Panek, R.
2018-06-01
Dedicated experiments have been performed in the COMPASS tokamak to thoroughly study the power deposition processes occurring on poloidal and toroidal edges of castellated plasma-facing components in tokamaks during steady-state L-mode conditions. Surface temperatures measured by a high resolution infra-red camera are compared with reconstructed synthetic data from a 2D thermal model using heat flux profiles derived from both the optical approximation and 2D particle-in-cell (PIC) simulations. In the case of poloidal leading edges, when the contribution from local radiation is taken into account, the parallel heat flux deduced from unperturbed, upstream measurements is fully consistent with the observed temperature increase at the leading edges of various heights, respecting power balance assuming simple projection of the parallel flux density. Smoothing of the heat flux deposition profile due to finite ion Larmor radius predicted by the PIC simulations is found to be weak and the power deposition on misaligned poloidal edges is better described by the optical approximation. This is consistent with an electron-dominated regime associated with a non-ambipolar parallel current flow. In the case of toroidal gap edges, the different contributions of the total incoming flux along the gap have been observed experimentally for the first time. They confirm the results of recent numerical studies performed for ITER showing that in specific cases the heat deposition does not necessarily follow the optical approximation. Indeed, ions can spiral onto the magnetically shadowed toroidal edge. Particle-in-cell simulations emphasize again the role played by local non-ambipolarity in the deposition pattern.
Separating figure from ground with a parallel network.
Kienker, P K; Sejnowski, T J; Hinton, G E; Schumacher, L E
1986-01-01
The differentiation of figure from ground plays an important role in the perceptual organization of visual stimuli. The rapidity with which we can discriminate the inside from the outside of a figure suggests that at least this step in the process may be performed in visual cortex by a large number of neurons in several different areas working together in parallel. We have attempted to simulate this collective computation by designing a network of simple processing units that receives two types of information: bottom-up input from the image containing the outlines of a figure, which may be incomplete, and a top-down attentional input that biases one part of the image to be the inside of the figure. No presegmentation of the image was assumed. Two methods for performing the computation were explored: gradient descent, which seeks locally optimal states, and simulated annealing, which attempts to find globally optimal states by introducing noise into the computation. For complete outlines, gradient descent was faster, but the range of input parameters leading to successful performance was very narrow. In contrast, simulated annealing was more robust: it worked over a wider range of attention parameters and a wider range of outlines, including incomplete ones. Our network model is too simplified to serve as a model of human performance, but it does demonstrate that one global property of outlines can be computed through local interactions in a parallel network. Some features of the model, such as the role of noise in escaping from nonglobal optima, may generalize to more realistic models.
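The contrast drawn above between gradient descent, which settles in locally optimal states, and simulated annealing, which uses noise to escape them, can be illustrated on a toy problem. The sketch below is not the authors' network: the one-dimensional energy landscape, the linear cooling schedule and all parameter values are assumptions chosen only to make the behaviour visible.

```python
import math
import random

# Hypothetical energy landscape: a local minimum at index 1, the global one at index 5.
landscape = [5, 3, 4, 6, 2, 0, 1, 7]


def greedy_descent(energy, start):
    """Move to the lower-energy neighbour until no neighbour improves."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n < len(energy)]
        best = min(neighbours, key=lambda n: energy[n])
        if energy[best] >= energy[x]:
            return x  # stuck in a (possibly only local) minimum
        x = best


def simulated_annealing(energy, start, steps=2000, t0=5.0, seed=0):
    """Random walk that accepts uphill moves with probability exp(-delta / T)."""
    rng = random.Random(seed)
    x = best = start
    for k in range(steps):
        temperature = t0 * (1 - k / steps) + 1e-9  # linear cooling (an assumption)
        candidate = x + rng.choice((-1, 1))
        if not 0 <= candidate < len(energy):
            continue
        delta = energy[candidate] - energy[x]
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
            if energy[x] < energy[best]:
                best = x  # remember the best state visited so far
    return best


stuck_at = greedy_descent(landscape, start=0)
annealed = simulated_annealing(landscape, start=0)
print("greedy descent stops at index", stuck_at, "with energy", landscape[stuck_at])
print("annealing best index", annealed, "with energy", landscape[annealed])
```

From a start at index 0 the greedy search stops at the local minimum at index 1, whereas the annealed walk can cross the barrier at index 3 while the temperature is high; how reliably it does so depends on the schedule and the random seed.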
A generalized plasma dispersion function for electron damping in tokamak plasmas
Berry, L. A.; Jaeger, E. F.; Phillips, C. K.; ...
2016-10-14
Radio frequency wave propagation in finite temperature, magnetized plasmas exhibits a wide range of physics phenomena. The plasma response is nonlocal in space and time, and numerous modes are possible with the potential for mode conversions and transformations. Additionally, diffraction effects are important due to finite wavelength and finite-size wave launchers. Multidimensional simulations are required to describe these phenomena, but even with this complexity, the fundamental plasma response is assumed to be the uniform plasma response with the assumption that the local plasma current for a Fourier mode can be described by the Stix conductivity. But, for plasmas with non-uniform magnetic fields, the wave vector itself is nonlocal. When resolved into components perpendicular (k⊥) and parallel (k∥) to the magnetic field, locality of the parallel component can easily be violated when the wavelength is large. The impact of this inconsistency is that estimates of the wave damping can be incorrect (typically low) due to unresolved resonances. For the case of ion cyclotron damping, this issue has already been addressed by including the effect of parallel magnetic field gradients. In this case, a modified plasma response (Z function) allows resonance broadening even when k∥ = 0, and this improves the convergence and accuracy of wave simulations. In our paper, we extend this formalism to include electron damping and find improved convergence and accuracy for parameters where electron damping is dominant, such as high harmonic fast wave heating in the NSTX-U tokamak, and helicon wave launch for off-axis current drive in the DIII-D tokamak.
Coeval emplacement and orogen-parallel transport of gold in oblique convergent orogens
NASA Astrophysics Data System (ADS)
Upton, Phaedra; Craw, Dave
2016-12-01
Gold mineralisation occurs in varying amounts in all young and active collisional mountain belts. Concurrently, these syn-orogenic hydrothermal deposits are being eroded and transported to form placer deposits. Local extension occurs in convergent orogens, especially oblique orogens, and facilitates emplacement of syn-orogenic gold-bearing deposits with or without associated magmatism. Numerical modelling has shown that extension results from directional variations in movement rates along the rock transport trajectory during convergence, and is most pronounced for highly oblique convergence with strong crustal rheology. On-going uplift during orogenesis exposes gold deposits to erosion, transport, and localised placer concentration. Drainage patterns in variably oblique convergent orogenic belts typically have an orogen-parallel or sub-parallel component, the details of which vary with convergence obliquity and the vagaries of underlying geological controls. This leads to lateral transport of eroded syn-orogenic gold on a range of scales, up to > 100 km. The presence of inherited crustal blocks with contrasting rheology in oblique orogenic collision zones can cause perturbations in drainage patterns, but numerical modelling suggests that orogen-parallel drainage is still a persistent and robust feature. The presence of an inherited block of weak crust enhances the orogen-parallel drainage by imposition of localised subsidence zones elongated along a plate boundary. Evolution and reorientation of orogen-parallel drainage can sever links between gold placer deposits and their syn-orogenic sources. Many of these modelled features of syn-orogenic gold emplacement and varying amounts of orogen-parallel detrital gold transport can be recognised in the Miocene to Recent New Zealand oblique convergent orogen. These processes contribute little gold to major placer goldfields, which require more long-term recycling and placer gold concentration. 
Most eroded syn-orogenic gold becomes diluted by abundant lithic debris in rivers and sedimentary basins except where localised concentration occurs, especially on beaches.
NASA Astrophysics Data System (ADS)
Rosin, M. S.; Schekochihin, A. A.; Rincon, F.; Cowley, S. C.
2011-05-01
Weakly collisional magnetized cosmic plasmas have a dynamical tendency to develop pressure anisotropies with respect to the local direction of the magnetic field. These anisotropies trigger plasma instabilities at scales just above the ion Larmor radius ρi and much below the mean free path λmfp. They have growth rates of a fraction of the ion cyclotron frequency, which is much faster than either the global dynamics or even local turbulence. Despite their microscopic nature, these instabilities dramatically modify the transport properties and, therefore, the macroscopic dynamics of the plasma. The non-linear evolution of these instabilities is expected to drive pressure anisotropies towards marginal stability values, controlled by the plasma beta βi. Here this non-linear evolution is worked out in an ab initio kinetic calculation for the simplest analytically tractable example - the parallel (k⊥ = 0) firehose instability in a high-beta plasma. An asymptotic theory is constructed, based on a particular physical ordering and leading to a closed non-linear equation for the firehose turbulence. In the non-linear regime, both the analytical theory and the numerical solution predict secular (∝ t) growth of magnetic fluctuations. The fluctuations develop a k∥^(-3) spectrum, extending from scales somewhat larger than ρi to the maximum scale that grows secularly with time (∝ t^(1/2)); the relative pressure anisotropy (p⊥ - p∥)/p∥ tends to the marginal value -2/βi. The marginal state is achieved via changes in the magnetic field, not particle scattering. When a parallel ion heat flux is present, the parallel firehose mutates into the new gyrothermal instability (GTI), which continues to exist up to firehose-stable values of pressure anisotropy, which can be positive and are limited by the magnitude of the ion heat flux. 
The non-linear evolution of the GTI also features secular growth of magnetic fluctuations, but the fluctuation spectrum is eventually dominated by modes around a maximal scale ~ρi lT/λmfp, where lT is the scale of the parallel temperature variation. Implications for momentum and heat transport are speculated about. This study is motivated by our interest in the dynamics of galaxy cluster plasmas (which are used as the main astrophysical example), but its relevance to solar wind and accretion flow plasmas is also briefly discussed.
Devore, Sasha; Ihlefeld, Antje; Hancock, Kenneth; Shinn-Cunningham, Barbara; Delgutte, Bertrand
2009-01-01
In reverberant environments, acoustic reflections interfere with the direct sound arriving at a listener’s ears, distorting the spatial cues for sound localization. Yet, human listeners have little difficulty localizing sounds in most settings. Because reverberant energy builds up over time, the source location is represented relatively faithfully during the early portion of a sound, but this representation becomes increasingly degraded later in the stimulus. We show that the directional sensitivity of single neurons in the auditory midbrain of anesthetized cats follows a similar time course, although onset dominance in temporal response patterns results in more robust directional sensitivity than expected, suggesting a simple mechanism for improving directional sensitivity in reverberation. In parallel behavioral experiments, we demonstrate that human lateralization judgments are consistent with predictions from a population rate model decoding the observed midbrain responses, suggesting a subcortical origin for robust sound localization in reverberant environments. PMID:19376072
Real-time image dehazing using local adaptive neighborhoods and dark-channel-prior
NASA Astrophysics Data System (ADS)
Valderrama, Jesus A.; Díaz-Ramírez, Víctor H.; Kober, Vitaly; Hernandez, Enrique
2015-09-01
A real-time algorithm for single image dehazing is presented. The algorithm is based on calculation of local neighborhoods of a hazed image inside a moving window. The local neighborhoods are constructed by computing rank-order statistics. Next, the dark-channel-prior approach is applied to the local neighborhoods to estimate the transmission function of the scene. With the suggested approach there is no need to apply a refining algorithm, such as soft matting, to the estimated transmission. To achieve high-rate signal processing, the proposed algorithm is implemented exploiting massive parallelism on a graphics processing unit (GPU). Computer simulations are carried out to test the performance of the proposed algorithm in terms of dehazing efficiency and speed of processing. These tests are performed using several synthetic and real images. The obtained results are analyzed and compared with those obtained with existing dehazing algorithms.
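The transmission estimate mentioned above can be sketched with the standard dark-channel formulation, t(x) = 1 - ω · min over a window and over colour channels of I/A. The code below is a minimal serial sketch that assumes a fixed square window rather than the rank-order adaptive neighborhoods of the abstract, and it runs on the CPU rather than a GPU; the function names, the value of `omega` and the sample pixels are illustrative.

```python
def dark_channel(image, radius):
    """Minimum over colour channels and over a (2*radius+1)^2 window.

    image is an H x W nested list of (r, g, b) tuples with values in [0, 1].
    """
    h, w = len(image), len(image[0])
    pixel_min = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rows = range(max(0, y - radius), min(h, y + radius + 1))
            cols = range(max(0, x - radius), min(w, x + radius + 1))
            dark[y][x] = min(pixel_min[j][i] for j in rows for i in cols)
    return dark


def estimate_transmission(image, airlight, radius=1, omega=0.95):
    """Estimate transmission as 1 - omega * dark_channel(image / airlight)."""
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row]
                  for row in image]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalized, radius)]


# Hypothetical 2 x 2 test image and white atmospheric light.
image = [[(0.2, 0.4, 0.6), (1.0, 1.0, 1.0)],
         [(0.5, 0.5, 0.5), (0.0, 0.3, 0.9)]]
transmission = estimate_transmission(image, airlight=(1.0, 1.0, 1.0), radius=0)
print(transmission)
```

Each output pixel depends only on its own window, which is what makes this step a natural fit for one-thread-per-pixel execution on a GPU.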
OceanXtremes: Scalable Anomaly Detection in Oceanographic Time-Series
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Armstrong, E. M.; Chin, T. M.; Gill, K. M.; Greguska, F. R., III; Huang, T.; Jacob, J. C.; Quach, N.
2016-12-01
The oceanographic community must meet the challenge to rapidly identify features and anomalies in complex and voluminous observations to further science and improve decision support. Given this data-intensive reality, we are developing an anomaly detection system, called OceanXtremes, powered by an intelligent, elastic Cloud-based analytic service backend that enables execution of domain-specific, multi-scale anomaly and feature detection algorithms across the entire archive of 15 to 30-year ocean science datasets. Our parallel analytics engine extends the NEXUS system and exploits multiple open-source technologies: Apache Cassandra as a distributed spatial "tile" cache, Apache Spark for in-memory parallel computation, and Apache Solr for spatial search and storing pre-computed tile statistics and other metadata. OceanXtremes provides these key capabilities: Parallel generation (Spark on a compute cluster) of 15 to 30-year Ocean Climatologies (e.g. sea surface temperature or SST) in hours or overnight, using simple pixel averages or customizable Gaussian-weighted "smoothing" over latitude, longitude, and time; Parallel pre-computation, tiling, and caching of anomaly fields (daily variables minus a chosen climatology) with pre-computed tile statistics; Parallel detection (over the time-series of tiles) of anomalies or phenomena by regional area-averages exceeding a specified threshold (e.g. high SST in El Nino or SST "blob" regions), or more complex, custom data mining algorithms; Shared discovery and exploration of ocean phenomena and anomalies (facet search using Solr), along with unexpected correlations between key measured variables; Scalable execution for all capabilities on a hybrid Cloud, using our on-premise OpenStack Cloud cluster or at Amazon. 
The key idea is that the parallel data-mining operations will be run "near" the ocean data archives (a local "network" hop) so that we can efficiently access the thousands of files making up a three decade time-series. The presentation will cover the architecture of OceanXtremes, parallelization of the climatology computation and anomaly detection algorithms using Spark, example results for SST and other time-series, and parallel performance metrics.
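The climatology, anomaly-field, and threshold-detection steps described in this abstract can be sketched on a single machine with NumPy. This is only an illustrative sketch, not the OceanXtremes or NEXUS implementation: the function names, the tiny synthetic grid, and the unweighted regional mean are all assumptions made for the example (a real system would run per tile on Spark and would typically weight by grid-cell area).

```python
import numpy as np

def daily_climatology(sst, day_of_year):
    """Simple pixel average: mean field for each calendar day over all years."""
    return {d: sst[day_of_year == d].mean(axis=0) for d in np.unique(day_of_year)}

def anomaly_fields(sst, day_of_year, clim):
    """Anomaly = daily field minus the climatology for that calendar day."""
    return np.stack([sst[t] - clim[day_of_year[t]] for t in range(sst.shape[0])])

def detect_events(anom, region, threshold):
    """Time indices whose regional mean anomaly exceeds the threshold.

    Hypothetical simplification: an unweighted mean over the region,
    with no latitude (cell-area) weighting.
    """
    lat_slice, lon_slice = region
    regional_mean = anom[:, lat_slice, lon_slice].mean(axis=(1, 2))
    return np.flatnonzero(regional_mean > threshold)

# Synthetic data: 3 "years" of 4 "days" each on a 4x4 grid (assumed sizes).
day_of_year = np.tile(np.arange(4), 3)
sst = np.zeros((12, 4, 4)) + day_of_year[:, None, None]  # seasonal cycle only
sst[5, 0:2, 0:2] += 2.0                                  # inject one warm event

clim = daily_climatology(sst, day_of_year)
anom = anomaly_fields(sst, day_of_year, clim)
events = detect_events(anom, (slice(0, 2), slice(0, 2)), threshold=1.0)
print(events)  # [5]
```

Because the injected event also raises the climatology for its calendar day, its anomaly is 2 - 2/3 rather than 2, which is why the threshold in the example is set to 1.0.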
Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook
NASA Astrophysics Data System (ADS)
Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.
2012-12-01
The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive Python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between the HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain-specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/Scipy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores to provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system.
Outputs can be summarised and visualised using the full power of Python's many scientific tools, including Scipy, Matplotlib, Pandas and CDAT. This rich user experience is delivered through the user's web browser, maintaining the interactive feel of a workstation-based environment with the parallel power of a remote data-centric processing facility.
Narayanaswamy, Arunachalam; Dwarakapuram, Saritha; Bjornsson, Christopher S; Cutler, Barbara M; Shain, William; Roysam, Badrinath
2010-03-01
This paper presents robust 3-D algorithms to segment vasculature that is imaged by labeling laminae, rather than the lumenal volume. The signal is weak, sparse, noisy, nonuniform, low-contrast, and exhibits gaps and spectral artifacts, so adaptive thresholding and Hessian filtering based methods are not effective. The structure deviates from a tubular geometry, so tracing algorithms are not effective. We propose a four-step approach. The first step detects candidate voxels using a robust hypothesis test based on a model that assumes Poisson noise and locally planar geometry. The second step performs an adaptive region growth to extract weakly labeled and fine vessels while rejecting spectral artifacts. To enable interactive visualization and estimation of features such as statistical confidence, local curvature, local thickness, and local normal, we perform the third step. In the third step, we construct an accurate mesh representation using marching tetrahedra, volume-preserving smoothing, and adaptive decimation algorithms. To enable topological analysis and efficient validation, we describe a method to estimate vessel centerlines using a ray casting and vote accumulation algorithm, which forms the final step of our algorithm. Our algorithm lends itself to parallel processing, and yielded an 8× speedup on a graphics processor (GPU). On synthetic data, our meshes had average error per face (EPF) values of (0.1-1.6) voxels per mesh face for peak signal-to-noise ratios from (110-28 dB). Separately, when decimating the mesh to less than 1% of its original size, the EPF was less than 1 voxel/face. When validated on real datasets, the average recall and precision values were found to be 94.66% and 94.84%, respectively.
Hybrid fiber links for accurate optical frequency comparison
NASA Astrophysics Data System (ADS)
Lee, Won-Kyu; Stefani, Fabio; Bercy, Anthony; Lopez, Olivier; Amy-Klein, Anne; Pottie, Paul-Eric
2017-05-01
We present the experimental demonstration of a local two-way optical frequency comparison over a 43-km-long urban fiber network without any requirement for measurement synchronization. We combined the local two-way scheme with a regular active noise compensation scheme that was implemented on another parallel fiber, leading to a highly reliable and robust frequency transfer. This hybrid scheme allowed us to investigate the major limiting factors of the local two-way comparison. We analyzed the contributions of the interferometers at both local and remote locations to the phase noise of the local two-way signal. Using the ability of this setup to be injected by either a single laser or two independent lasers, we measured the contributions of the demodulated laser instabilities to the long-term instability. We show that a fractional frequency instability level of 10^-20 at 10,000 s can be obtained using this simple setup after propagation over a distance of 43 km in an urban area.
Pyridine adsorption and diffusion on Pt(111) investigated with density functional theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolsbjerg, Esben L.; Groves, Michael N.; Hammer, Bjørk, E-mail: hammer@phys.au.dk
2016-04-28
The adsorption, diffusion, and dissociation of pyridine, C5H5N, on Pt(111) are investigated with van der Waals-corrected density functional theory. An elaborate search for local minima in the adsorption potential energy landscape reveals that the intact pyridine adsorbs with the aromatic ring parallel to the surface. Piecewise interconnections of the local minima in the energy landscape reveal that the most favourable diffusion path for pyridine has a barrier of 0.53 eV. In the preferred path, the pyridine remains parallel to the surface while performing small single rotational steps with a carbon-carbon double bond hinged above a single Pt atom. The origin of the diffusion pathway is discussed in terms of the C2–Pt π-bond being stronger than the corresponding CN–Pt π-bond. The energy barrier and reaction enthalpy for dehydrogenation of adsorbed pyridine into an adsorbed, upright bound α-pyridyl species are calculated to be 0.71 eV and 0.18 eV, respectively (both zero-point energy corrected). The calculations are used to rationalize previous experimental observations from the literature for pyridine on Pt(111).
Propagation of acoustic shock waves between parallel rigid boundaries and into shadow zones
DOE Office of Scientific and Technical Information (OSTI.GOV)
Desjouy, C., E-mail: cyril.desjouy@gmail.com; Ollivier, S.; Dragna, D.
2015-10-28
The study of acoustic shock propagation in complex environments is of great interest for urban acoustics, but also for source localization, an underlying problem in military applications. To give a better understanding of the phenomena taking place during the propagation of acoustic shocks, laboratory-scale experiments and numerical simulations were performed to study the propagation of weak shock waves between parallel rigid boundaries, and into shadow zones created by corners. In particular, this work focuses on the study of the local interactions taking place between incident, reflected, and diffracted waves according to the geometry in both regular and irregular – also called Von Neumann – regimes of reflection. In this latter case, an irregular reflection can lead to the formation of a Mach stem that can modify the spatial distribution of the acoustic pressure. Short duration acoustic shock waves were produced by a 20-kilovolt electric spark source and a schlieren optical method was used to visualize the incident shockfront and the reflection/diffraction patterns. Experimental results are compared to numerical simulations based on the high-order finite difference solution of the two dimensional Navier-Stokes equations.
NASA Astrophysics Data System (ADS)
Yasuda, Shugo; Yamamoto, Ryoichi
2015-11-01
The Synchronized Molecular-Dynamics (SMD) simulation, which was recently proposed by the authors, is applied to the analysis of polymer lubrication between parallel plates. In the SMD method, the MD simulations are assigned to small fluid elements to calculate the local stresses and temperatures and are synchronized at certain time intervals to satisfy the macroscopic heat- and momentum-transport equations. The rheological properties and conformation of the polymer chains coupled with local viscous heating are investigated with a non-dimensional parameter, the Nahme-Griffith number, which is defined as the ratio of the viscous heating to the thermal conduction at the characteristic temperature required to sufficiently change the viscosity. The present simulation demonstrates that strong shear thinning and a transitional behavior of the conformation of the polymer chains are exhibited with a rapid temperature rise when the Nahme-Griffith number exceeds unity. The results also clarify that the reentrant transition of the linear stress-optical relation occurs for large shear stresses due to the coupling of the conformation of polymer chains with heat generation under shear flows. This study was financially supported by JSPS KAKENHI Grant Nos. 26790080 and 26247069.
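The verbal definition of the Nahme-Griffith number in this abstract can be written as a ratio of two rates per unit volume. The expression below is a hedged sketch of one common form, not necessarily the exact definition used by the authors; every symbol is an assumption introduced here: σ is a characteristic shear stress, γ̇ the shear rate, H the gap between the plates, λ the thermal conductivity, and ΔT_0 the temperature rise needed to change the viscosity appreciably.

```latex
\mathrm{Na}
  = \frac{\text{viscous heating}}{\text{thermal conduction}}
  = \frac{\sigma \dot{\gamma}}{\lambda\, \Delta T_{0} / H^{2}}
  = \frac{\sigma \dot{\gamma} H^{2}}{\lambda\, \Delta T_{0}}
```

Under these assumptions, Na > 1 means heat is generated by shear faster than conduction across the gap can remove it, which is consistent with the rapid temperature rise the abstract reports above unity.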
Image matrix processor for fast multi-dimensional computations
Roberson, George P.; Skeate, Michael F.
1996-01-01
An apparatus for multi-dimensional computation which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination.
Structure Modulates Similarity-Based Interference in Sluicing: An Eye Tracking study
Harris, Jesse A.
2015-01-01
In cue-based content-addressable approaches to memory, a target and its competitors are retrieved in parallel from memory via a fast, associative cue-matching procedure under a severely limited focus of attention. Such a parallel matching procedure could in principle ignore the serial order or hierarchical structure characteristic of linguistic relations. I present an eye tracking while reading experiment that investigates whether the sentential position of a potential antecedent modulates the strength of similarity-based interference, a well-studied effect in which increased similarity in features between a target and its competitors results in slower and less accurate retrieval overall. The manipulation trades on an independently established Locality bias in sluiced structures to associate a wh-remnant (which ones) in clausal ellipsis with the most local correlate (some wines), as in The tourists enjoyed some wines, but I don't know which ones. The findings generally support cue-based parsing models of sentence processing that are subject to similarity-based interference in retrieval, and provide additional support to the growing body of evidence that retrieval is sensitive to both the structural position of a target antecedent and its competitors, and the specificity or diagnosticity of retrieval cues. PMID:26733893
Numerical investigation of two interacting parallel thruster-plumes and comparison to experiment
NASA Astrophysics Data System (ADS)
Grabe, Martin; Holz, André; Ziegenhagen, Stefan; Hannemann, Klaus
2014-12-01
Clusters of orbital thrusters are an attractive option to achieve graduated thrust levels and increased redundancy with available hardware, but the heavily under-expanded plumes of chemical attitude control thrusters placed in close proximity will interact, leading to a local amplification of downstream fluxes and of back-flow onto the spacecraft. The interaction of two similar, parallel, axi-symmetric cold-gas model thrusters has recently been studied in the DLR High-Vacuum Plume Test Facility STG under space-like vacuum conditions, employing a Patterson-type impact pressure probe with slot orifice. We reproduce a selection of these experiments numerically, and emphasise that a comparison of numerical results to the measured data is not straightforward. The signal of the probe used in the experiments must be interpreted according to the degree of rarefaction and local flow Mach number, and both vary dramatically throughout the flow-field. We present a procedure to reconstruct the probe signal by post-processing the numerically obtained flow-field data and show that agreement with the experimental results is then improved. Features of the investigated cold-gas thruster plume interaction are discussed on the basis of the numerical results.
Asymptotic-preserving Lagrangian approach for modeling anisotropic transport in magnetized plasmas
NASA Astrophysics Data System (ADS)
Chacon, Luis; Del-Castillo-Negrete, Diego
2011-10-01
Modeling electron transport in magnetized plasmas is extremely challenging due to the extreme anisotropy introduced by the presence of the magnetic field (χ∥/χ⊥ ~ 10^10 in fusion plasmas). Recently, a novel Lagrangian method has been proposed to solve the local and non-local purely parallel transport equation in general 3D magnetic fields. The approach avoids numerical pollution (in fact, it respects transport barriers -flux surfaces- exactly by construction), is inherently positivity-preserving, and is scalable algorithmically (i.e., work per degree-of-freedom is grid-independent). In this poster, we discuss the extension of the Lagrangian approach to include perpendicular transport and sources. We present an asymptotic-preserving numerical formulation that ensures a consistent numerical discretization temporally and spatially for arbitrary χ∥/χ⊥ ratios. This is of importance because parallel and perpendicular transport terms in the transport equation may become comparable in regions of the plasma (e.g., at incipient islands), while remaining disparate elsewhere. We will demonstrate the potential of the approach with various challenging configurations, including the case of transport across a magnetic island in cylindrical geometry. D. del-Castillo-Negrete, L. Chacón, PRL, 106, 195004 (2011); DPP11 invited talk by del-Castillo-Negrete.
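For orientation, the local anisotropic transport problem with parallel and perpendicular terms and a source is commonly written in the generic form below. This is a hedged sketch under stated assumptions rather than the authors' exact formulation (which also covers non-local parallel closures): T is the transported temperature, S a source, b the unit vector along the magnetic field B, I the identity tensor, and χ∥ and χ⊥ are taken as given diffusivities.

```latex
\partial_t T
  - \nabla \cdot \left[
      \chi_\parallel \, \mathbf{b}\mathbf{b} \cdot \nabla T
      + \chi_\perp \left( \mathbf{I} - \mathbf{b}\mathbf{b} \right) \cdot \nabla T
    \right] = S,
\qquad
\mathbf{b} = \frac{\mathbf{B}}{|\mathbf{B}|}
```

With χ∥/χ⊥ of order 10^10, even a tiny discretization error in the parallel term can swamp the perpendicular one, which is the numerical pollution the abstract refers to.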
On the generation of double layers from ion- and electron-acoustic instabilities
Fu, Xiangrong; Cowee, Misa M.; Gary, Stephen Peter; ...
2016-03-17
A plasma double layer (DL) is a nonlinear electrostatic structure that carries a uni-polar electric field parallel to the background magnetic field due to local charge separation. Past studies showed that DLs observed in space plasmas are mostly associated with the ion acoustic instability. Recent Van Allen Probes observations of parallel electric fields traveling much faster than the ion acoustic speed have motivated a computational study to test the hypothesis that a new type of DLs – electron acoustic DLs – generated from the electron acoustic instability are responsible for these electric fields. Nonlinear particle-in-cell simulations yield negative results, i.e., the hypothetical electron acoustic DLs cannot be formed in a way similar to ion acoustic DLs. We find that linear theory analysis and the simulations show that the frequencies of electron acoustic waves are too high for ions to respond and maintain charge separation required by DLs. However, our results do show that local density perturbations in a two-electron-component plasma can result in unipolar-like electric fields that propagate at the electron thermal speed, suggesting another potential explanation for the observations.
On the generation of double layers from ion- and electron-acoustic instabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fu, Xiangrong, E-mail: xrfu@lanl.gov; Cowee, Misa M.; Winske, Dan
2016-03-15
A plasma double layer (DL) is a nonlinear electrostatic structure that carries a uni-polar electric field parallel to the background magnetic field due to local charge separation. Past studies showed that DLs observed in space plasmas are mostly associated with the ion acoustic instability. Recent Van Allen Probes observations of parallel electric field structures traveling much faster than the ion acoustic speed have motivated a computational study to test the hypothesis that a new type of DLs – electron acoustic DLs – generated from the electron acoustic instability are responsible for these electric fields. Nonlinear particle-in-cell simulations yield negative results, i.e., the hypothetical electron acoustic DLs cannot be formed in a way similar to ion acoustic DLs. Linear theory analysis and the simulations show that the frequencies of electron acoustic waves are too high for ions to respond and maintain charge separation required by DLs. However, our results do show that local density perturbations in a two-electron-component plasma can result in unipolar-like electric field structures that propagate at the electron thermal speed, suggesting another potential explanation for the observations.
Nguyen, Tuan-Anh; Nakib, Amir; Nguyen, Huy-Nam
2016-06-01
The non-local means denoising filter has been established as a gold standard for image denoising in general, and in medical imaging in particular, due to its efficiency. However, its computation time has limited its use in real-world applications, especially in medical imaging. In this paper, a distributed version on a parallel hybrid architecture is proposed to solve the computation time problem, and a new method to compute the filters' coefficients is also proposed, in which we focus on the implementation and on enhancing the filters' parameters by taking the neighborhood of the current voxel more accurately into account. In terms of implementation, our key contribution consists in reducing the number of shared memory accesses. The proposed method was tested on the BrainWeb database for different levels of noise. Performance and sensitivity were quantified in terms of speedup, peak signal-to-noise ratio, execution time, and the number of floating point operations. The obtained results demonstrate the efficiency of the proposed method. Moreover, the implementation is compared to that of other techniques recently published in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
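To make the filter named in this abstract concrete, here is a minimal sketch of the classic non-local means weighting on a 2D image. It is not the distributed hybrid implementation or the new coefficient scheme proposed by the authors; it is a plain, unoptimized CPU version, and the function name and the parameters `patch_radius`, `search_radius`, and `h` are assumptions chosen for illustration.

```python
import numpy as np

def nlm_denoise(img, patch_radius=1, search_radius=2, h=0.1):
    """Classic non-local means on a 2D array (naive O(N * search * patch) loops).

    Each output pixel is a weighted mean of the pixels in its search window;
    a pixel's weight decays with the mean squared difference between the
    patch around it and the patch around the pixel being denoised.
    """
    img = np.asarray(img, dtype=float)
    p, s = patch_radius, search_radius
    padded = np.pad(img, p + s, mode="reflect")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            ci, cj = i + p + s, j + p + s          # centre in padded coordinates
            ref = padded[ci - p:ci + p + 1, cj - p:cj + p + 1]
            weights, values = [], []
            for di in range(-s, s + 1):
                for dj in range(-s, s + 1):
                    ni, nj = ci + di, cj + dj
                    cand = padded[ni - p:ni + p + 1, nj - p:nj + p + 1]
                    d2 = np.mean((ref - cand) ** 2)  # patch dissimilarity
                    weights.append(np.exp(-d2 / h ** 2))
                    values.append(padded[ni, nj])
            weights = np.array(weights)
            out[i, j] = np.dot(weights, values) / weights.sum()
    return out

# Usage on synthetic data: a step edge with additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((8, 8))
clean[:, 4:] = 1.0
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = nlm_denoise(noisy, h=0.2)
flat = nlm_denoise(np.full((8, 8), 0.5))  # a constant image is left unchanged
```

The nested loops over the search window are what make the filter expensive, and they are independent per output pixel, which is why parallel and GPU implementations such as the one described in the abstract are attractive.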
Investigation of transient melting of tungsten by ELMs in ASDEX Upgrade
NASA Astrophysics Data System (ADS)
Krieger, K.; Sieglin, B.; Balden, M.; Coenen, J. W.; Göths, B.; Laggner, F.; de Marne, P.; Matthews, G. F.; Nille, D.; Rohde, V.; Dejarnac, R.; Faitsch, M.; Giannone, L.; Herrmann, A.; Horacek, J.; Komm, M.; Pitts, R. A.; Ratynskaia, S.; Thoren, E.; Tolias, P.; ASDEX-Upgrade Team; EUROfusion MST1 Team
2017-12-01
Repetitive melting of tungsten by power transients originating from edge localized modes (ELMs) has been studied in the tokamak experiment ASDEX Upgrade. Tungsten samples were exposed to H-mode discharges at the outer divertor target plate using the Divertor Manipulator II system. The exposed sample was designed with an elevated sloped surface inclined against the incident magnetic field to increase the projected parallel power flux to a level where transient melting by ELMs would occur. Sample exposure was controlled by moving the outer strike point to the sample location. As an extension to previous melt studies, in the new experiment both the current flow from the sample to vessel potential and the local surface temperature were measured with sufficient time resolution to resolve individual ELMs. The experiment provided for the first time a direct link between current flow and surface temperature during transient ELM events. This makes it possible to further constrain the MEMOS melt motion code predictions and to improve the validation of its underlying model assumptions. Post-exposure ex situ analysis of the retrieved samples confirms the decreased melt motion observed at shallower magnetic field line to surface angles compared to that at leading edges exposed to the parallel power flux.
Field gradients can control the alignment of nanorods.
Ooi, Chinchun; Yellen, Benjamin B
2008-08-19
This work is motivated by the unexpected experimental observation that field gradients can control the alignment of nonmagnetic nanorods immersed inside magnetic fluids. In the presence of local field gradients, nanorods were observed to align perpendicular to the external field at low field strengths, but parallel to the external field at high field strengths. The switching behavior results from the competition between a preference to align with the external field (orientational potential energy) and preference to move into regions of minimum magnetic field (positional potential energy). A theoretical model is developed to explain this experimental behavior by investigating the statistics of nanorod alignment as a function of both the external uniform magnetic field strength and the local magnetic field variation above a periodic array of micromagnets. Computational phase diagrams are developed which indicate that the relative population of nanorods in parallel and perpendicular states can be adjusted through several control parameters. However, an energy barrier to rotation was discovered to influence the rate kinetics and restrict the utility of this assembly technique to nanorods which are slightly shorter than the micromagnet length. Experimental results concerning the orientation of nanorods inside magnetic fluid are also presented and shown to be in strong agreement with the theoretical work.
Macrophages and cellular immunity in Drosophila melanogaster.
Gold, Katrina S; Brückner, Katja
2015-12-01
The invertebrate Drosophila melanogaster has been a powerful model for understanding blood cell development and immunity. Drosophila is a holometabolous insect, which transitions through a series of life stages from embryo, larva and pupa to adulthood. In spite of this, remarkable parallels exist between Drosophila and vertebrate macrophages, both in terms of development and function. More than 90% of Drosophila blood cells (hemocytes) are macrophages (plasmatocytes), making this highly tractable genetic system attractive for studying a variety of questions in macrophage biology. In vertebrates, recent findings revealed that macrophages have two independent origins: self-renewing macrophages, which reside and proliferate in local microenvironments in a variety of tissues, and macrophages of the monocyte lineage, which derive from hematopoietic stem or progenitor cells. Like vertebrates, Drosophila possesses two macrophage lineages with a conserved dual ontogeny. These parallels allow us to take advantage of the Drosophila model when investigating macrophage lineage specification, maintenance and amplification, and the induction of macrophages and their progenitors by local microenvironments and systemic cues. Beyond macrophage development, Drosophila further serves as a paradigm for understanding the mechanisms underlying macrophage function and cellular immunity in infection, tissue homeostasis and cancer, throughout development and adult life. Copyright © 2016. Published by Elsevier Ltd.
Macrophages and cellular immunity in Drosophila melanogaster
Gold, Katrina S.; Brückner, Katja
2016-01-01
The invertebrate Drosophila melanogaster has been a powerful model for understanding blood cell development and immunity. Drosophila is a holometabolous insect, which transitions through a series of life stages from embryo, larva and pupa to adulthood. In spite of this, remarkable parallels exist between Drosophila and vertebrate macrophages, both in terms of development and function. More than 90% of Drosophila blood cells (hemocytes) are macrophages (plasmatocytes), making this highly tractable genetic system attractive for studying a variety of questions in macrophage biology. In vertebrates, recent findings revealed that macrophages have two independent origins: self-renewing macrophages, which reside and proliferate in local microenvironments in a variety of tissues, and macrophages of the monocyte lineage, which derive from hematopoietic stem or progenitor cells. Like vertebrates, Drosophila possesses two macrophage lineages with a conserved dual ontogeny. These parallels allow us to take advantage of the Drosophila model when investigating macrophage lineage specification, maintenance and amplification, and the induction of macrophages and their progenitors by local microenvironments and systemic cues. Beyond macrophage development, Drosophila further serves as a paradigm for understanding the mechanisms underlying macrophage function and cellular immunity in infection, tissue homeostasis and cancer, throughout development and adult life. PMID:27117654
Interconnected subsets of memory follicular helper T cells have different effector functions.
Asrir, Assia; Aloulou, Meryem; Gador, Mylène; Pérals, Corine; Fazilleau, Nicolas
2017-10-10
Follicular helper T cells regulate high-affinity antibody production. Memory follicular helper T cells can be local in draining lymphoid organs and circulate in the blood, but the underlying mechanisms of this subdivision are unresolved. Here we show that both memory follicular helper T subsets sustain B-cell responses after reactivation. Local cells promote more plasma cell differentiation, whereas circulating cells promote more secondary germinal centers. In parallel, local memory B cells are homogeneous and programmed to become plasma cells, whereas circulating memory B cells are able to rediversify. Local memory follicular helper T cells have higher affinity T-cell receptors, which correlates with expression of peptide MHC-II at the surface of local memory B cells only. Blocking T-cell receptor-peptide MHC-II interactions induces the release of local memory follicular helper T cells in the circulating compartment. Our studies show that memory follicular helper T localization is highly intertwined with memory B cells, a finding that has important implications for vaccine design. Tfh cells can differentiate into memory cells. Here the authors describe distinct functional and phenotypic profiles of these memory Tfh cells dependent on their anatomical localization to the lymphoid organs or to the circulation.
Leonardy, Simone; Freymark, Gerald; Hebener, Sabrina; Ellehauge, Eva; Søgaard-Andersen, Lotte
2007-01-01
Myxococcus xanthus cells harbor two motility machineries, type IV pili (Tfp) and the A-engine. During reversals, the two machineries switch polarity synchronously. We present a mechanism that synchronizes this polarity switching. We identify the required for motility response regulator (RomR) as essential for A-motility. RomR localizes in a bipolar, asymmetric pattern with a large cluster at the lagging cell pole. The large RomR cluster relocates to the new lagging pole in parallel with cell reversals. Dynamic RomR localization is essential for cell reversals, suggesting that RomR relocalization induces the polarity switching of the A-engine. The analysis of RomR mutants shows that the output domain targets RomR to the poles and the receiver domain is essential for dynamic localization. The small GTPase MglA establishes correct RomR polarity, and the Frz two-component system regulates dynamic RomR localization. FrzS localizes with Tfp at the leading pole and relocates in an Frz-dependent manner to the opposite pole during reversals; FrzS and RomR localize and oscillate independently. The Frz system synchronizes these oscillations and thus the synchronous polarity switching of the motility machineries. PMID:17932488
The use of prophylactic antibiotics in plastic surgery: update in 2010.
Hauck, Randy M; Nogan, Stephen
2013-01-01
The indications for prophylactic antibiotics in plastic surgery remain controversial. No recent survey has been reported on the use of prophylactic antibiotics by plastic surgeons in clinical practice. This survey was designed to assess the current use of prophylactic antibiotics by plastic surgeons and to compare trends with previous studies. All members of the American Society of Plastic Surgeons with an e-mail address on the Society's website were contacted via e-mail and sent a link to a SurveyMonkey questionnaire. To restrict the survey to the subspecialty areas in which they practice, surgeons were queried only on the procedures that they perform. Within each section, a list of common representative procedures was included, with questions about the use of antibiotic prophylaxis. A total of 3824 American Society of Plastic Surgeons members were contacted. Of the 3613 with working e-mail addresses, 910 responded to the survey, for a response rate of 25%. Of these, 833 (91.5%) completed the survey. Survey data cover the percentage of surgeons reporting their use of antibiotics in procedures that they currently perform. The percentage of plastic surgeons who use prophylactic antibiotics in almost all procedures studied has increased significantly when compared with earlier studies. The use of prophylactic antibiotics by plastic surgeons has increased considerably since the prior studies by Krizek et al (Plast Reconstr Surg. 1975;55:21-32 and 1985;76:953-963). Some of these uses are appropriate because they involve implants and longer operations. The elevated rates for clean procedures are not part of evidence-based practice.
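The response and completion percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below reads 3613 as the number of members with working e-mail addresses, an interpretation that is assumed here because it is the denominator that reproduces the reported 25%; the variable names are illustrative only.

```python
# Figures taken from the abstract; the role of 3613 is an assumption (see above).
contacted = 3824          # members contacted
working_addresses = 3613  # assumed: members with working e-mail addresses
responded = 910
completed = 833

response_rate = responded / working_addresses
completion_rate = completed / responded

print(f"response rate:   {response_rate:.1%}")    # 25.2%
print(f"completion rate: {completion_rate:.1%}")  # 91.5%
```

Using all 3824 contacted members as the denominator instead would give about 23.8%, which does not match the reported figure.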
NASA Astrophysics Data System (ADS)
McNeill, L.; Moore, J. C.; Yamada, Y.; Chang, C.; Tobin, H.; Kinoshita, M.; Gulick, S.; Moore, G.; IODP Exp. 314/315/316 Science Party
2008-12-01
Borehole breakouts are commonly observed in borehole images shortly after drilling of continental margin sites. This study aims to compile and compare these results to determine what in situ shallow stress measurements can tell us about the larger scale tectonic regime. Recent Logging While Drilling resistivity images across the Kumano transect of the Nankai subduction zone, during Expedition 314, Stage 1 of the IODP NanTroSEIZE project, add to this dataset. Expedition 314 site data within the prism (C0001, C0004, C0006, including the megasplay fault system which may overlie the seismogenic updip limit) suggest maximum compressive stress (SHmax) is perpendicular to the margin (not parallel to the convergence vector) but is rotated through 90° at the forearc basin site (C0002). These results may point to changes in stress state of the shallow forearc from east to west: compression in the aseismic active prism (with evidence of strain partitioning of oblique convergence); and extension above the updip seismogenic zone suggesting focus of plate coupling at the plate boundary and not in the shallow forearc. Further south, ODP Leg 196 drilled the prism toe (808) with breakouts indicating SHmax parallel to the convergence vector, in contrast to Exp. 314 results. The stress state in the shallow prism at Site 808 may be affected by nearby seamount subduction or may represent differences in strain partitioning. On the Cascadia margin, two drilling legs have collected LWD borehole images (Leg 204 and Exp. 311). Leg 204 drilled 3 sites at hydrate ridge in the C Cascadia outer prism with breakout orientations variable between closely spaced sites. Prism fold axes are parallel to the margin so we might expect SHmax perpendicular to the margin as in Exp. 314. Deviations from this orientation may reflect local and surface effects (Goldberg and Janik, 2006). Exp. 311, N Cascadia, drilled 5 sites across the prism with breakouts in LWD images. 
Subduction is not oblique here, in contrast to the other sites discussed, and most sites indicate SHmax almost parallel to convergence and normal to major fold axes. In one case, the in situ stress orientation is also compatible with shallow normal faulting from seismic data. Site 1325, in a slope basin, deviates from this orientation and may reflect local processes. Borehole breakouts within the shallow forearc of convergent margins are often in agreement with other indications of regional tectonic stress and may be indicative of processes at depth. Deviations may represent local stresses due to gravitational processes.
Localization of basic fibroblast growth factor binding sites in the chick embryonic neural retina.
Cirillo, A; Arruti, C; Courtois, Y; Jeanny, J C
1990-12-01
We have investigated the localization of basic fibroblast growth factor (bFGF) binding sites during the development of the neural retina in the chick embryo. The specificity of the affinity of bFGF for its receptors was assessed by competition experiments with unlabelled growth factor or with heparin, as well as by heparitinase treatment of the samples. Two different types of binding sites were observed in the neural retina by light-microscopic autoradiography. The first type, localized mainly to basement membranes, was highly sensitive to heparitinase digestion and to competition with heparin. It was not developmentally regulated. The second type of binding site, resistant to heparin competition, appeared to be associated with retinal cells from the earliest stages studied (3-day-old embryo, stages 21-22 of Hamburger and Hamilton). Its distribution was found to vary during embryonic development, paralleling layering of the neural retina. Binding of bFGF to the latter sites was observed throughout the retinal neuroepithelium at early stages but displayed a distinct pattern at the time when the inner and outer plexiform layers were formed. During the development of the inner plexiform layer, a banded pattern of bFGF binding was observed. These bands, lying parallel to the vitreal surface, seemed to codistribute with the synaptic bands existing in the inner plexiform layer. The presence of intra-retinal bFGF binding sites whose distribution varies with embryonic development suggests a regulatory mechanism involving differential actions of bFGF on neural retinal cells.
NASA Astrophysics Data System (ADS)
Maneva, Y. G.; Poedts, S.
2018-05-01
The power spectra of magnetic field fluctuations in the solar wind typically follow a power-law dependence with respect to the observed frequencies and wave-numbers. The background magnetic field often influences the plasma properties, setting a preferential direction for plasma heating and acceleration. At the same time the evolution of the solar-wind turbulence at the ion and electron scales is influenced by the plasma properties through local micro-instabilities and wave-particle interactions. The solar-wind-plasma temperature and the solar-wind turbulence at sub- and sup-ion scales simultaneously show anisotropic features, with different components and fluctuation power in parallel with and perpendicular to the orientation of the background magnetic field. The ratio between the power of the magnetic field fluctuations in parallel and perpendicular direction at the ion scales may vary with the heliospheric distance and depends on various parameters, including the local wave properties and nonthermal plasma features, such as temperature anisotropies and relative drift speeds. In this work we have performed two-and-a-half-dimensional hybrid simulations to study the generation and evolution of anisotropic turbulence in a drifting multi-ion species plasma. We investigate the evolution of the turbulent spectral slopes along and across the background magnetic field for the cases of initially isotropic and anisotropic turbulence. Finally, we show the effect of the various turbulent spectra for the local ion heating in the solar wind.
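As an illustration of what a "spectral slope" means for a power-law spectrum, the minimal sketch below estimates the exponent by a least-squares fit in log-log space. The function name `spectral_slope` and the synthetic spectrum with exponent -5/3 are assumptions made for this example; they are not taken from the simulations described above.

```python
import math

def spectral_slope(freqs, power):
    """Least-squares slope of log10(power) versus log10(frequency)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic spectrum P(f) = f^(-5/3); the exponent is an illustrative assumption.
freqs = [0.01 * 2 ** i for i in range(10)]
power = [f ** (-5.0 / 3.0) for f in freqs]
slope = spectral_slope(freqs, power)
print(round(slope, 3))
```

Fitting the parallel and perpendicular spectra separately with such a routine would give one slope per direction, which is one simple way to quantify spectral anisotropy.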
Wargo, Christopher J.; Gore, John C.
2013-01-01
Localized high-resolution diffusion tensor images (DTI) from the midbrain were obtained using reduced field-of-view (rFOV) methods combined with SENSE parallel imaging and single-shot echo planar (EPI) acquisitions at 7 T. This combination aimed to diminish sensitivities of DTI to motion, susceptibility variations, and EPI artifacts at ultra-high field. Outer-volume suppression (OVS) was applied in DTI acquisitions at 2- and 1-mm2 resolutions, b=1000 s/mm2, and six diffusion directions, resulting in scans of 7- and 14-min durations. Mean apparent diffusion coefficient (ADC) and fractional anisotropy (FA) values were measured in various fiber tract locations at the two resolutions and compared. Geometric distortion and signal-to-noise ratio (SNR) were additionally measured and compared for reduced-FOV and full-FOV DTI scans. Up to an eight-fold data reduction was achieved using DTI-OVS with SENSE at 1 mm2, and geometric distortion was halved. The localization of fiber tracts was improved, enabling targeted FA and ADC measurements. Significant differences in diffusion properties were observed between resolutions for a number of regions suggesting that FA values are impacted by partial volume effects even at a 2-mm2 resolution. The combined SENSE DTI-OVS approach allows large reductions in DTI data acquisition and provides improved quality for high-resolution diffusion studies of the human brain. PMID:23541390
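For readers unfamiliar with the two diffusion metrics compared above, the following sketch computes mean diffusivity (the eigenvalue-based counterpart of a mean ADC) and fractional anisotropy from the three eigenvalues of a diffusion tensor, using the standard textbook definitions. The function names and the example eigenvalues are hypothetical; the abstract does not describe the authors' processing pipeline.

```python
import math

def mean_diffusivity(l1, l2, l3):
    """Mean of the three diffusion tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0

def fractional_anisotropy(l1, l2, l3):
    """Standard FA definition: 0 for isotropic diffusion, 1 for purely linear diffusion."""
    md = mean_diffusivity(l1, l2, l3)
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0.0:
        return 0.0  # no diffusion signal; define FA as zero
    return math.sqrt(1.5 * num / den)

# Hypothetical eigenvalues in units of 1e-3 mm^2/s, chosen only for illustration.
print(mean_diffusivity(1.7, 0.3, 0.3))
print(fractional_anisotropy(1.7, 0.3, 0.3))
```

Because FA depends on the spread of the eigenvalues, averaging signal from differently oriented fibers within one voxel lowers it, which is the partial volume effect the abstract refers to.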
NASA Astrophysics Data System (ADS)
Denneulin, T.; Wollschläger, N.; Everhardt, A. S.; Farokhipoor, S.; Noheda, B.; Snoeck, E.; Hÿtch, M.
2018-05-01
Lead zirconate titanate samples are used for their piezoelectric and ferroelectric properties in various types of micro-devices. Epitaxial layers of tetragonal perovskites have a tendency to relax by forming ferroelastic domains. The accommodation of the a/c/a/c polydomain structure on a flat substrate leads to nanoscale deformation gradients which locally influence the polarization through the flexoelectric effect. Here, we investigated the deformation fields in epitaxial layers of Pb(Zr0.2Ti0.8)O3 grown on SrTiO3 substrates using transmission electron microscopy (TEM). We found that the deformation gradients depend on the domain walls inclination ( or to the substrate interface) of the successive domains and we describe three different a/c/a domain configurations: one configuration with parallel a-domains and two configurations with perpendicular a-domains (V-shaped and hat-shaped). In the parallel configuration, the c-domains contain horizontal and vertical gradients of out-of-plane deformation. In the V-shaped and hat-shaped configurations, the c-domains exhibit a bending deformation field with vertical gradients of in-plane deformation. Each of these configurations is expected to have a different influence on the polarization and therefore on the local properties of the film. The deformation gradients were measured using dark-field electron holography, a TEM technique which offers good sensitivity (0.1%) and a large field of view (hundreds of nanometers). The measurements are compared with finite element simulations.
Ecological adaptation of diverse honey bee (Apis mellifera) populations.
Parker, Robert; Melathopoulos, Andony P; White, Rick; Pernal, Stephen F; Guarna, M Marta; Foster, Leonard J
2010-06-15
Honey bees are complex eusocial insects that provide a critical contribution to human agricultural food production. Their natural migration has selected for traits that increase fitness within geographical areas, but in parallel their domestication has selected for traits that enhance productivity and survival under local conditions. Elucidating the biochemical mechanisms of these local adaptive processes is a key goal of evolutionary biology. Proteomics provides tools unique among the major 'omics disciplines for identifying the mechanisms employed by an organism in adapting to environmental challenges. Through proteome profiling of adult honey bee midgut from geographically dispersed, domesticated populations combined with multiple parallel statistical treatments, the data presented here suggest some of the major cellular processes involved in adapting to different climates. These findings provide insight into the molecular underpinnings that may confer an advantage to honey bee populations. Significantly, the major energy-producing pathways of the mitochondria, the organelle most closely involved in heat production, were consistently higher in bees that had adapted to colder climates. In opposition, up-regulation of protein metabolism capacity, from biosynthesis to degradation, had been selected for in bees from warmer climates. Overall, our results present a proteomic interpretation of expression polymorphisms between honey bee ecotypes and provide insight into molecular aspects of local adaptation or selection with consequences for honey bee management and breeding. The implications of our findings extend beyond apiculture as they underscore the need to consider the interdependence of animal populations and their agro-ecological context.
An Exact Model-Based Method for Near-Field Sources Localization with Bistatic MIMO System.
Singh, Parth Raj; Wang, Yide; Chargé, Pascal
2017-03-30
In this paper, we propose an exact model-based method for near-field sources localization with a bistatic multiple input, multiple output (MIMO) radar system, and compare it with an approximated model-based method. The aim of this paper is to propose an efficient way to use the exact model of the received signals of near-field sources in order to eliminate the systematic error introduced by the use of approximated model in most existing near-field sources localization techniques. The proposed method uses parallel factor (PARAFAC) decomposition to deal with the exact model. Thanks to the exact model, the proposed method has better precision and resolution than the compared approximated model-based method. The simulation results show the performance of the proposed method.
NASA Astrophysics Data System (ADS)
Yang, Chang; Xiao, Fuliang; He, Yihua; Liu, Si; Zhou, Qinghua; Guo, Mingyue; Zhao, Wanli
2018-03-01
During the 13-14 November 2012 storm, Van Allen Probe A simultaneously observed a 10 h period of enhanced chorus (including quasi-parallel and oblique propagation components) and relativistic electron fluxes over a broad range of L = 3-6 and magnetic local time = 2-10 within a complete orbit cycle. By adopting a Gaussian fit to the observed wave spectra, we obtain the wave parameters and calculate the bounce-averaged diffusion coefficients. We solve the Fokker-Planck diffusion equation to simulate flux evolutions of relativistic (1.8-4.2 MeV) electrons during two intervals when Probe A passed the location L = 4.3 along its orbit. The simulation results show that chorus with combined quasi-parallel and oblique components can produce a more pronounced flux enhancement in the pitch angle range ~45°-80°, in good agreement with the observation. The current results provide the first evidence of how relativistic electron fluxes vary when driven by almost continuously distributed chorus with both quasi-parallel and oblique components within a complete orbit of Van Allen Probe.
GPU-Based Point Cloud Superpositioning for Structural Comparisons of Protein Binding Sites.
Leinweber, Matthias; Fober, Thomas; Freisleben, Bernd
2018-01-01
In this paper, we present a novel approach to solve the labeled point cloud superpositioning problem for performing structural comparisons of protein binding sites. The solution is based on a parallel evolution strategy that operates on large populations and runs on GPU hardware. The proposed evolution strategy reduces the likelihood of getting stuck in a local optimum of the multimodal real-valued optimization problem represented by labeled point cloud superpositioning. The performance of the GPU-based parallel evolution strategy is compared to a previously proposed CPU-based sequential approach for labeled point cloud superpositioning, indicating that the GPU-based parallel evolution strategy leads to qualitatively better results and significantly shorter runtimes, with speed improvements of up to a factor of 1,500 for large populations. Binary classification tests based on the ATP, NADH, and FAD protein subsets of CavBase, a database containing putative binding sites, show average classification rate improvements from about 92 percent (CPU) to 96 percent (GPU). Further experiments indicate that the proposed GPU-based labeled point cloud superpositioning approach can be superior to traditional protein comparison approaches based on sequence alignments.
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Adomavicius, G.
2000-01-01
Preliminary verification and validation of an efficient Euler solver for adaptively refined Cartesian meshes with embedded boundaries is presented. The parallel, multilevel method makes use of a new on-the-fly parallel domain decomposition strategy based upon the use of space-filling curves, and automatically generates a sequence of coarse meshes for processing by the multigrid smoother. The coarse mesh generation algorithm produces grids which completely cover the computational domain at every level in the mesh hierarchy. A series of examples on realistically complex three-dimensional configurations demonstrate that this new coarsening algorithm reliably achieves mesh coarsening ratios in excess of 7 on adaptively refined meshes. Numerical investigations of the scheme's local truncation error demonstrate an achieved order of accuracy between 1.82 and 1.88. Convergence results for the multigrid scheme are presented for both subsonic and transonic test cases and demonstrate W-cycle multigrid convergence rates between 0.84 and 0.94. Preliminary parallel scalability tests on both simple wing and complex complete aircraft geometries show a computational speedup of 52 on 64 processors using the run-time mesh partitioner.
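The idea behind partitioning with a space-filling curve can be sketched as follows: order the cells along the curve, then cut the ordered list into equal contiguous pieces. This example assumes a Morton (Z-order) curve on a small uniform grid; the abstract does not say which curve the solver uses, and the helper names `morton_key`, `partition_cells` and `parallel_efficiency` are hypothetical.

```python
def morton_key(x, y, z, bits=10):
    """Interleave the bits of integer cell coordinates into a Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key

def partition_cells(cells, n_parts):
    """Sort cells along the curve and split them into n_parts contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton_key(*c))
    n = len(ordered)
    return [ordered[(p * n) // n_parts:((p + 1) * n) // n_parts]
            for p in range(n_parts)]

def parallel_efficiency(speedup, n_procs):
    """Speedup divided by processor count."""
    return speedup / n_procs

# A 4 x 4 x 4 uniform grid split across 8 partitions (illustrative sizes).
cells = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
parts = partition_cells(cells, 8)
print([len(p) for p in parts])
print(parallel_efficiency(52, 64))
```

Since neighbouring keys on the curve tend to be neighbouring cells in space, each contiguous chunk is spatially compact, which keeps inter-partition communication low. With the figures quoted above, a speedup of 52 on 64 processors corresponds to a parallel efficiency of about 81%.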
NASA Astrophysics Data System (ADS)
Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; Pask, John E.
2018-03-01
We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for O(N) Kohn-Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw-Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw-Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near perfect O(N) scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.
NASA Astrophysics Data System (ADS)
Xing, F.; Masson, R.; Lopez, S.
2017-09-01
This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so called hybrid-dimensional model using a 2D model in the fractures coupled with a 3D model in the matrix is first derived rigorously starting from the equi-dimensional matrix fracture model. Then, it is discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton type iteration of the simulation. The numerical efficiency of our approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
Alignment between Protostellar Outflows and Filamentary Structure
NASA Astrophysics Data System (ADS)
Stephens, Ian W.; Dunham, Michael M.; Myers, Philip C.; Pokhrel, Riwaj; Sadavoy, Sarah I.; Vorobyov, Eduard I.; Tobin, John J.; Pineda, Jaime E.; Offner, Stella S. R.; Lee, Katherine I.; Kristensen, Lars E.; Jørgensen, Jes K.; Goodman, Alyssa A.; Bourke, Tyler L.; Arce, Héctor G.; Plunkett, Adele L.
2017-09-01
We present new Submillimeter Array (SMA) observations of CO(2-1) outflows toward young, embedded protostars in the Perseus molecular cloud as part of the Mass Assembly of Stellar Systems and their Evolution with the SMA (MASSES) survey. For 57 Perseus protostars, we characterize the orientation of the outflow angles and compare them with the orientation of the local filaments as derived from Herschel observations. We find that the relative angles between outflows and filaments are inconsistent with purely parallel or purely perpendicular distributions. Instead, the observed distribution of outflow-filament angles is more consistent with either randomly aligned angles or a mix of projected parallel and perpendicular angles. A mix of parallel and perpendicular angles requires perpendicular alignment to be more common by a factor of ~3. Our results show that the observed distributions probably hold regardless of the protostar’s multiplicity, age, or the host core’s opacity. These observations indicate that the angular momentum axis of a protostar may be independent of the large-scale structure. We discuss the significance of independent protostellar rotation axes in the general picture of filament-based star formation.
Improved interior wall detection using designated dictionaries in compressive urban sensing problems
NASA Astrophysics Data System (ADS)
Lagunas, Eva; Amin, Moeness G.; Ahmad, Fauzia; Nájar, Montse
2013-05-01
In this paper, we address sparsity-based imaging of building interior structures for through-the-wall radar imaging and urban sensing applications. The proposed approach utilizes information about common building construction practices to form an appropriate sparse representation of the building layout. With a ground based SAR system, and considering that interior walls are either parallel or perpendicular to the exterior walls, the antenna at each position would receive reflections from the walls parallel to the radar's scan direction as well as from the corners between two meeting walls. We propose a two-step approach for wall detection and localization. In the first step, a dictionary of possible wall locations is used to recover the positions of both interior and exterior walls that are parallel to the scan direction. A follow-on step uses a dictionary of possible corner reflectors to locate wall-wall junctions along the detected wall segments, thereby determining the true wall extents and detecting walls perpendicular to the scan direction. The utility of the proposed approach is demonstrated using simulated data.
DC currents collected by a RF biased electrode quasi-parallel to the magnetic field
NASA Astrophysics Data System (ADS)
Faudot, E.; Devaux, S.; Moritz, J.; Bobkov, V.; Heuraux, S.
2017-10-01
Local plasma biasings due to RF sheaths close to ICRF antennas result mainly in a negative DC current collection on the antenna structure. In some specific cases, we may observe positive currents when the ion mobility (seen from the collecting surface) exceeds the electron mobility and/or when the collecting surface on the antenna side becomes larger than the other end of the flux tube connected to the wall. The typical configuration is when the antenna surface is almost parallel to the magnetic field lines and the other side perpendicular. To test the optimal case where the magnetic field is quasi-parallel to the electrode surface, one needs a linear magnetic configuration such as our magnetized RF discharge experiment, called Aline. The magnetic field angle is in our case lower than 1 relative to the RF biased surface. The DC current flowing through the discharge has been measured as a function of the magnetic field strength, neutral gas (He) pressure and RF power. The main result is the reversal of the DC current depending on the magnetic field, collision frequency and RF power level.
3D brain tumor localization and parameter estimation using thermographic approach on GPU.
Bousselham, Abdelmajid; Bouattane, Omar; Youssfi, Mohamed; Raihani, Abdelhadi
2018-01-01
The aim of this paper is to present a GPU parallel algorithm for brain tumor detection that estimates tumor size and location from the surface temperature distribution obtained by thermography. The normal brain tissue is modeled as a rectangular cube containing a spherical tumor. The temperature distribution is calculated using the forward three-dimensional Pennes bioheat transfer equation, which is solved using a massively parallel Finite Difference Method (FDM) implemented on a Graphics Processing Unit (GPU). A Genetic Algorithm (GA) was used to solve the inverse problem and estimate the tumor size and location by minimizing an objective function that compares the temperatures measured on the surface with those obtained by numerical simulation. The parallel implementation of the Finite Difference Method significantly reduces the computation time of the bioheat transfer simulation and greatly accelerates the inverse identification of brain tumor thermophysical and geometrical properties. Experimental results show significant gains in computational speed on the GPU, with a speedup of around 41 compared to the CPU. An analysis of the estimation performance as a function of tumor size inside the brain tissue is also presented.
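For reference, the Pennes bioheat equation mentioned above is commonly written in the form below. The symbols are the usual textbook ones and are assumptions here, since the abstract does not give the authors' notation: $\rho$, $c$ and $k$ are the tissue density, specific heat and thermal conductivity; $\rho_b$, $c_b$ and $\omega_b$ are the blood density, specific heat and perfusion rate; $T_a$ is the arterial blood temperature; and $Q_m$ is the metabolic heat generation.

```latex
\rho c \,\frac{\partial T}{\partial t}
  = \nabla \cdot \left( k \,\nabla T \right)
  + \rho_b c_b \,\omega_b \left( T_a - T \right)
  + Q_m
```

In a forward model of this kind, a tumor is typically represented by locally different values of $\omega_b$ and $Q_m$, which is what makes it visible in the surface temperature.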
Parallel Geographic Variation in Drosophila melanogaster
Reinhardt, Josie A.; Kolaczkowski, Bryan; Jones, Corbin D.; Begun, David J.; Kern, Andrew D.
2014-01-01
Drosophila melanogaster, an ancestrally African species, has recently spread throughout the world, associated with human activity. The species has served as the focus of many studies investigating local adaptation relating to latitudinal variation in non-African populations, especially those from the United States and Australia. These studies have documented the existence of shared, genetically determined phenotypic clines for several life history and morphological traits. However, there are no studies designed to formally address the degree of shared latitudinal differentiation at the genomic level. Here we present our comparative analysis of such differentiation. Not surprisingly, we find evidence of substantial, shared selection responses on the two continents, probably resulting from selection on standing ancestral variation. The polymorphic inversion In(3R)P has an important effect on this pattern, but considerable parallelism is also observed across the genome in regions not associated with inversion polymorphism. Interestingly, parallel latitudinal differentiation is observed even for variants that are not particularly strongly differentiated, which suggests that very large numbers of polymorphisms are targets of spatially varying selection in this species. PMID:24610860
Proton core-beam system in the expanding solar wind: Hybrid simulations
NASA Astrophysics Data System (ADS)
Hellinger, Petr; Trávníček, Pavel M.
2011-11-01
Results of a two-dimensional hybrid expanding box simulation of a proton beam-core system in the solar wind are presented. The expansion with a strictly radial magnetic field leads to a decrease of the ratio between the proton perpendicular and parallel temperatures as well as to an increase of the ratio between the beam-core differential velocity and the local Alfvén velocity creating a free energy for many different instabilities. The system is indeed most of the time marginally stable with respect to the parallel magnetosonic, oblique Alfvén, proton cyclotron and parallel fire hose instabilities which determine the system evolution counteracting some effects of the expansion and interacting with each other. Nonlinear evolution of these instabilities leads to large modifications of the proton velocity distribution function. The beam and core protons are slowed with respect to each other and heated, and at later stages of the evolution the two populations are not clearly distinguishable. On the macroscopic level the instabilities cause large departures from the double adiabatic prediction leading to an efficient isotropization of effective proton temperatures in agreement with Helios observations.
A parallel reaction-transport model applied to cement hydration and microstructure development
NASA Astrophysics Data System (ADS)
Bullard, Jeffrey W.; Enjolras, Edith; George, William L.; Satterfield, Steven G.; Terrill, Judith E.
2010-03-01
A recently described stochastic reaction-transport model on three-dimensional lattices is parallelized and is used to simulate the time-dependent structural and chemical evolution in multicomponent reactive systems. The model, called HydratiCA, uses probabilistic rules to simulate the kinetics of diffusion, homogeneous reactions and heterogeneous phenomena such as solid nucleation, growth and dissolution in complex three-dimensional systems. The algorithms require information only from each lattice site and its immediate neighbors, and this localization enables the parallelized model to exhibit near-linear scaling up to several hundred processors. Although applicable to a wide range of material systems, including sedimentary rock beds, reacting colloids and biochemical systems, validation is performed here on two minerals that are commonly found in Portland cement paste, calcium hydroxide and ettringite, by comparing their simulated dissolution or precipitation rates far from equilibrium to standard rate equations, and also by comparing simulated equilibrium states to thermodynamic calculations, as a function of temperature and pH. Finally, we demonstrate how HydratiCA can be used to investigate microstructure characteristics, such as spatial correlations between different condensed phases, in more complex microstructures.
Message Passing and Shared Address Space Parallelism on an SMP Cluster
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Singh, Jaswinder P.; Oliker, Leonid; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2002-01-01
Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.
Towards implementation of cellular automata in Microbial Fuel Cells.
Tsompanas, Michail-Antisthenis I; Adamatzky, Andrew; Sirakoulis, Georgios Ch; Greenman, John; Ieropoulos, Ioannis
2017-01-01
The Microbial Fuel Cell (MFC) is a bio-electrochemical transducer converting waste products into electricity using microbial communities. Cellular Automaton (CA) is a uniform array of finite-state machines that update their states in discrete time depending on states of their closest neighbors by the same rule. Arrays of MFCs could, in principle, act as massive-parallel computing devices with local connectivity between elementary processors. We provide a theoretical design of such a parallel processor by implementing CA in MFCs. We have chosen Conway's Game of Life as the 'benchmark' CA because this is the most popular CA which also exhibits an enormously rich spectrum of patterns. Each cell of the Game of Life CA is realized using two MFCs. The MFCs are linked electrically and hydraulically. The model is verified via simulation of an electrical circuit demonstrating equivalent behaviours. The design is a first step towards future implementations of fully autonomous biological computing devices with massive parallelism. The energy independence of such devices counteracts their somewhat slow transitions, compared to silicon circuitry, between the different states during computation.
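The update rule of Conway's Game of Life, the benchmark automaton named above, can be sketched in software as follows. This is a plain Python illustration on an unbounded grid (an assumption made for brevity), with the hypothetical function name `life_step`; it says nothing about how the rule is mapped onto pairs of MFCs.

```python
def life_step(live):
    """Advance one generation; `live` is a set of (row, col) cells that are alive."""
    counts = {}
    for (r, c) in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:  # skip the cell itself
                    key = (r + dr, c + dc)
                    counts[key] = counts.get(key, 0) + 1
    # A cell is alive next step with exactly 3 live neighbours,
    # or with exactly 2 if it is already alive.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A horizontal "blinker" flips to vertical and back again.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = life_step(blinker)
after_two = life_step(after_one)
print(sorted(after_one))
```

Each cell's next state depends only on its eight nearest neighbours, which is the local connectivity that makes the automaton a natural fit for an array of simple, locally linked processing elements.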
Towards implementation of cellular automata in Microbial Fuel Cells
Adamatzky, Andrew; Sirakoulis, Georgios Ch.; Greenman, John; Ieropoulos, Ioannis
2017-01-01
The Microbial Fuel Cell (MFC) is a bio-electrochemical transducer converting waste products into electricity using microbial communities. Cellular Automaton (CA) is a uniform array of finite-state machines that update their states in discrete time depending on states of their closest neighbors by the same rule. Arrays of MFCs could, in principle, act as massive-parallel computing devices with local connectivity between elementary processors. We provide a theoretical design of such a parallel processor by implementing CA in MFCs. We have chosen Conway’s Game of Life as the ‘benchmark’ CA because this is the most popular CA which also exhibits an enormously rich spectrum of patterns. Each cell of the Game of Life CA is realized using two MFCs. The MFCs are linked electrically and hydraulically. The model is verified via simulation of an electrical circuit demonstrating equivalent behaviours. The design is a first step towards future implementations of fully autonomous biological computing devices with massive parallelism. The energy independence of such devices counteracts their somewhat slow transitions—compared to silicon circuitry—between the different states during computation. PMID:28498871
Su, Zhong; Zhang, Lisha; Ramakrishnan, V; Hagan, Michael; Anscher, Mitchell
2011-05-01
To evaluate both the Calypso Systems' (Calypso Medical Technologies, Inc., Seattle, WA) localization accuracy in the presence of wireless metal-oxide-semiconductor field-effect transistor (MOSFET) dosimeters of dose verification system (DVS, Sicel Technologies, Inc., Morrisville, NC) and the dosimeters' reading accuracy in the presence of wireless electromagnetic transponders inside a phantom. A custom-made, solid-water phantom was fabricated with space for transponders and dosimeters. Two inserts were machined with positioning grooves precisely matching the dimensions of the transponders and dosimeters and were arranged in orthogonal and parallel orientations, respectively. To test the transponder localization accuracy with/without presence of dosimeters (hypothesis 1), multivariate analyses were performed on transponder-derived localization data with and without dosimeters at each preset distance to detect statistically significant localization differences between the control and test sets. To test dosimeter dose-reading accuracy with/without presence of transponders (hypothesis 2), an approach of alternating the transponder presence in seven identical fraction dose (100 cGy) deliveries and measurements was implemented. Two-way analysis of variance was performed to examine statistically significant dose-reading differences between the two groups and the different fractions. A relative-dose analysis method was also used to evaluate transponder impact on dose-reading accuracy after dose-fading effect was removed by a second-order polynomial fit. Multivariate analysis indicated that hypothesis 1 was false; there was a statistically significant difference between the localization data from the control and test sets. 
However, the upper and lower bounds of the 95% confidence intervals of the localized positional differences between the control and test sets were less than 0.1 mm, which was significantly smaller than the minimum clinical localization resolution of 0.5 mm. For hypothesis 2, analysis of variance indicated that there was no statistically significant difference between the dosimeter readings with and without the presence of transponders. In both the orthogonal and parallel configurations, the polynomial-fit dose values were within 1.75% of the measured dose values. The phantom study indicated that the Calypso System's localization accuracy was not clinically affected by the presence of DVS wireless MOSFET dosimeters and that the dosimeter-measured doses were not affected by the presence of transponders. Thus, the same patients could be implanted with both transponders and dosimeters to benefit from the improved accuracy of radiotherapy treatments offered by combined use of the two systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Unat, Didem; Dubey, Anshu; Hoefler, Torsten
The cost of data movement has always been an important concern in high performance computing (HPC) systems. It has now become the dominant factor in terms of both energy consumption and performance. Support for expression of data locality has been explored in the past, but those efforts have had only modest success in being adopted in HPC applications for various reasons. However, with the increasing complexity of the memory hierarchy and higher parallelism in emerging HPC systems, locality management has acquired a new urgency. Developers can no longer limit themselves to low-level solutions and ignore the potential for productivity and performance portability obtained by using locality abstractions. Fortunately, the trend emerging in recent literature on the topic alleviates many of the concerns that got in the way of their adoption by application developers. Data locality abstractions are available in the forms of libraries, data structures, languages and runtime systems; a common theme is increasing productivity without sacrificing performance. Furthermore, this paper examines these trends and identifies commonalities that can combine various locality concepts to develop a comprehensive approach to expressing and managing data locality on future large-scale high-performance computing systems.
Dense flow around a sphere moving into a cloud of grains
NASA Astrophysics Data System (ADS)
Gondret, Philippe; Faure, Sylvain; Lefebvre-Lepot, Aline; Seguin, Antoine
2017-06-01
A bidimensional simulation of a sphere moving at constant velocity into a cloud of smaller spherical grains without gravity is presented with a non-smooth contact dynamics method. A dense granular "cluster" zone of about constant solid fraction builds progressively around the moving sphere until a stationary regime appears with a constant upstream cluster size that increases with the initial solid fraction ϕ0 of the cloud. A detailed analysis of the local strain rate and local stress fields inside the cluster reveals that, despite different spatial variations of strain and stresses, the local friction coefficient μ appears to depend only on the local inertial number I and the local solid fraction ϕ, which means that a local rheology does exist in the present non-parallel flow. The key point is that the spatial variations of I inside the cluster do not depend on the sphere velocity and explore only a small range between about 10^-2 and 10^-1. The influence of sidewalls on the flow and the forces is then investigated.
Anderson localization of shear waves observed by magnetic resonance imaging
NASA Astrophysics Data System (ADS)
Papazoglou, S.; Klatt, D.; Braun, J.; Sack, I.
2010-07-01
In this letter we present for the first time an experimental investigation of shear wave localization using motion-sensitive magnetic resonance imaging (MRI). Shear wave localization was studied in gel phantoms containing arrays of randomly positioned parallel glass rods. The phantoms were exposed to continuous harmonic vibrations in a frequency range from 25 to 175 Hz, yielding wavelengths on the order of the elastic mean free path, i.e. the Ioffe-Regel criterion of Anderson localization was satisfied. The experimental setup was further chosen such that purely shear horizontal waves were induced to avoid effects due to mode conversion and pressure waves. Analysis of the distribution of shear wave intensity in experiments and simulations revealed a significant deviation from Rayleigh statistics indicating that shear wave energy is localized. This observation is further supported by experiments on weakly scattering samples exhibiting Rayleigh statistics and an analysis of the multifractality of wave functions. Our results suggest that motion-sensitive MRI is a promising tool for studying Anderson localization of time-harmonic shear waves, which are increasingly used in dynamic elastography.
Core localization and σ* delocalization in the O 1s core-excited sulfur dioxide molecule
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lindgren, Andreas; Kivimaeki, Antti; Sorensen, Stacey L.
Electron-ion-ion coincidence measurements of sulfur dioxide at discrete resonances near the O 1s ionization edge are reported. The spectra are analyzed using a model based upon molecular symmetry and on the geometry of the molecule. We find clear evidence for molecular alignment that can be ascribed to symmetry properties of the ground and core-excited states. Configuration interaction (CI) calculations indicate geometry changes in accord with the measured spectra. For the SO2 molecule, however, we find that the localized core hole does not produce measurable evidence for valence localization, since the transition dipole moment is not parallel to a breaking σ* O-S bond, in contrast to the case of ozone. The dissociation behavior based upon the CI calculations using symmetry-broken orbitals while fixing a localized core-hole site is found to be nearly equivalent to that using symmetry-adapted orbitals. This implies that the core-localization effect is not strong enough to localize the σ* valence orbital.
NASA Astrophysics Data System (ADS)
Tanaka, Kenta K.; Ichioka, Masanori; Onari, Seiichiro
2018-04-01
Local NMR relaxation rates in the vortex state of chiral and helical p-wave superconductors are investigated by the quasiclassical Eilenberger theory. We calculate the spatial and resonance frequency dependences of the local NMR spin-lattice relaxation rate T1^-1 and spin-spin relaxation rate T2^-1. Depending on the relation between the NMR relaxation direction and the d-vector symmetry, the local T1^-1 and T2^-1 in the vortex core region show different behaviors. When the NMR relaxation direction is parallel to the d-vector component, the local NMR relaxation rate is anomalously suppressed by the negative coherence effect due to the spin dependence of the odd-frequency s-wave spin-triplet Cooper pairs. The difference between the local T1^-1 and T2^-1 in the site-selective NMR measurement is expected to be a method to examine the d-vector symmetry of candidate materials for spin-triplet superconductors.
ION ACCELERATION AT THE QUASI-PARALLEL BOW SHOCK: DECODING THE SIGNATURE OF INJECTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sundberg, Torbjörn; Haynes, Christopher T.; Burgess, D.
Collisionless shocks are efficient particle accelerators. At Earth, ions with energies exceeding 100 keV are seen upstream of the bow shock when the magnetic geometry is quasi-parallel, and large-scale supernova remnant shocks can accelerate ions into cosmic-ray energies. This energization is attributed to diffusive shock acceleration; however, for this process to become active, the ions must first be sufficiently energized. How and where this initial acceleration takes place has been one of the key unresolved issues in shock acceleration theory. Using Cluster spacecraft observations, we study the signatures of ion reflection events in the turbulent transition layer upstream of the terrestrial bow shock, and with the support of a hybrid simulation of the shock, we show that these reflection signatures are characteristic of the first step in the ion injection process. These reflection events develop in particular in the region where the trailing edge of large-amplitude upstream waves intercepts the local shock ramp and the upstream magnetic field changes from quasi-perpendicular to quasi-parallel. The dispersed ion velocity signature observed can be attributed to a rapid succession of ion reflections at this wave boundary. After the ions' initial interaction with the shock, they flow upstream along the quasi-parallel magnetic field. Each subsequent wavefront in the upstream region will sweep the ions back toward the shock, where they gain energy with each transition between the upstream and the shock wave frames. Within three to five gyroperiods, some ions have gained enough parallel velocity to escape upstream, thus completing the injection process.
Characterizing parallel file-access patterns on a large-scale multiprocessor
NASA Technical Reports Server (NTRS)
Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael
1994-01-01
Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests, predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs, appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.
Megavolt parallel potentials arising from double-layer streams in the Earth's outer radiation belt.
Mozer, F S; Bale, S D; Bonnell, J W; Chaston, C C; Roth, I; Wygant, J
2013-12-06
Huge numbers of double layers carrying electric fields parallel to the local magnetic field line have been observed on the Van Allen probes in connection with in situ relativistic electron acceleration in the Earth's outer radiation belt. For one case with adequate high time resolution data, 7000 double layers were observed in an interval of 1 min to produce a 230,000 V net parallel potential drop crossing the spacecraft. Lower resolution data show that this event lasted for 6 min and that more than 1,000,000 volts of net parallel potential crossed the spacecraft during this time. A double layer traverses the length of a magnetic field line in about 15 s and the orbital motion of the spacecraft perpendicular to the magnetic field was about 700 km during this 6 min interval. Thus, the instantaneous parallel potential along a single magnetic field line was on the order of tens of kilovolts. Electrons on the field line might experience many such potential steps in their lifetimes to accelerate them to energies where they serve as the seed population for relativistic acceleration by coherent, large amplitude whistler mode waves. Because the double-layer speed of 3100 km/s is on the order of the electron acoustic speed (and not the ion acoustic speed) of a 25 eV plasma, the double layers may result from a new electron acoustic mode. Acceleration mechanisms involving double layers may also be important in planetary radiation belts such as those of Jupiter, Saturn, Uranus, and Neptune, in the solar corona during flares, and in astrophysical objects.
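The "tens of kilovolts" figure can be roughly reproduced from the numbers quoted in the abstract. The sketch below is a back-of-envelope reading, not the authors' calculation: it assumes that double layers stream along a single field line at the same rate at which they crossed the spacecraft, and it ignores the spacecraft's cross-field motion. All variable names are illustrative.

```python
# Figures quoted in the abstract.
n_layers = 7000            # double layers seen in the high-resolution interval
interval_s = 60.0          # length of that interval (1 min)
net_potential_v = 230_000  # net parallel potential drop over the interval (V)
transit_s = 15.0           # time for one double layer to traverse a field line (s)

# Average potential step carried by a single double layer.
mean_step_v = net_potential_v / n_layers

# Assumption: layers enter the field line at the observed rate, so the number
# present at any instant is (rate) x (transit time).
layers_in_flight = n_layers * transit_s / interval_s

# Potential summed over all layers present on the line at one instant.
instantaneous_v = mean_step_v * layers_in_flight

print(f"mean step per double layer: {mean_step_v:.1f} V")
print(f"layers on the field line at once: {layers_in_flight:.0f}")
print(f"instantaneous parallel potential: {instantaneous_v / 1e3:.1f} kV")
```

Under these assumptions the estimate falls in the tens-of-kilovolts range, consistent with the statement in the abstract.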
Constructing Neuronal Network Models in Massively Parallel Environments.
Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus
2017-01-01
Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.
MMS observations and hybrid simulations of rippled and reforming quasi-parallel shocks
NASA Astrophysics Data System (ADS)
Gingell, I.; Schwartz, S. J.; Burgess, D.; Johlander, A.; Russell, C. T.; Burch, J. L.; Ergun, R.; Fuselier, S. A.; Gershman, D. J.; Giles, B. L.; Goodrich, K.; Khotyaintsev, Y. V.; Lavraud, B.; Lindqvist, P. A.; Strangeway, R. J.; Trattner, K. J.; Torbert, R. B.; Wilder, F. D.
2017-12-01
Surface ripples, i.e. deviations in the nominal local shock orientation, are expected to propagate in the ramp and overshoot of collisionless shocks. These ripples have typically been associated with observations and simulations of quasi-perpendicular shocks. We present observations of a crossing of Earth's marginally quasi-parallel (θBn ~ 45°) bow shock by the MMS spacecraft on 2015-11-27 06:01:44 UTC, for which we identify signatures consistent with a propagating surface ripple. In order to demonstrate the differences between ripples at quasi-perpendicular and quasi-parallel shocks, we also present two-dimensional hybrid simulations over a range of shock normal angles θBn under the observed solar wind conditions. We show that in the quasi-parallel cases surface ripples are transient phenomena modulated by the cyclic reformation of the shock front. These ripples develop faster than an ion gyroperiod and only during the period of the reformation cycle when a newly developed shock ramp is unaffected by turbulence in the foot. We conclude that the change of properties of the surface ripple observed by MMS while crossing Earth's quasi-parallel bow shock is consistent with the influence of cyclic reformation on shock structure. Given that both surface ripples and cyclic reformation are expected to affect the acceleration of electrons within the shock, the interaction of these phenomena and any other sources of shock non-stationarity are important for models of particle acceleration. We therefore discuss signatures of electron heating and acceleration in several rippled shocks observed by MMS.
Constructing Neuronal Network Models in Massively Parallel Environments
Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus
2017-01-01
Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hirata, So
2003-11-20
We develop a symbolic manipulation program and program generator (Tensor Contraction Engine or TCE) that automatically derives the working equations of a well-defined model of second-quantized many-electron theories and synthesizes efficient parallel computer programs on the basis of these equations. Provided an ansatz of a many-electron theory model, TCE performs valid contractions of creation and annihilation operators according to Wick's theorem, consolidates identical terms, and reduces the expressions into the form of multiple tensor contractions acted by permutation operators. Subsequently, it determines the binary contraction order for each multiple tensor contraction with the minimal operation and memory cost, factorizes common binary contractions (defines intermediate tensors), and identifies reusable intermediates. The resulting ordered list of binary tensor contractions, additions, and index permutations is translated into an optimized program that is combined with the NWChem and UTChem computational chemistry software packages. The programs synthesized by TCE take advantage of spin symmetry, Abelian point-group symmetry, and index permutation symmetry at every stage of calculations to minimize the number of arithmetic operations and storage requirement, adjust the peak local memory usage by index range tiling, and support parallel I/O interfaces and dynamic load balancing for parallel executions. We demonstrate the utility of TCE through automatic derivation and implementation of parallel programs for various models of configuration-interaction theory (CISD, CISDT, CISDTQ), many-body perturbation theory [MBPT(2), MBPT(3), MBPT(4)], and coupled-cluster theory (LCCD, CCD, LCCSD, CCSD, QCISD, CCSDT, and CCSDTQ).
Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul
2010-11-23
A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Nodes vary a choice of routing policy for routing data in the network in a semi-random manner, so that similarly situated packets are not always routed along the same path. Semi-random variation of the routing policy tends to avoid certain local hot spots of network activity, which might otherwise arise using more consistent routing determinations. Preferably, the originating node chooses a routing policy for a packet, and all intermediate nodes in the path route the packet according to that policy. Policies may be rotated on a round-robin basis, selected by generating a random number, or otherwise varied.
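The policy-variation idea described above can be illustrated with a toy sketch. The 2-D mesh, the two dimension-order policies, and all names below (`POLICIES`, `route`, `round_robin_chooser`, `random_chooser`) are hypothetical stand-ins chosen for illustration; the abstract does not specify the network topology or the concrete policies.

```python
import itertools
import random

# Hypothetical policy names; the source does not list concrete policies.
POLICIES = ("x_then_y", "y_then_x")

def route(src, dst, policy):
    """Return the nodes visited on a 2-D mesh under a dimension-order policy.

    The originating node picks `policy`; every hop then follows it.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    order = "xy" if policy == "x_then_y" else "yx"
    for axis in order:
        if axis == "x":
            while x != dx:
                x += 1 if dx > x else -1
                path.append((x, y))
        else:
            while y != dy:
                y += 1 if dy > y else -1
                path.append((x, y))
    return path

def round_robin_chooser():
    """Rotate through the policies in a fixed order."""
    cycle = itertools.cycle(POLICIES)
    return lambda: next(cycle)

def random_chooser(seed=None):
    """Pick a policy by generating a random number."""
    rng = random.Random(seed)
    return lambda: rng.choice(POLICIES)

# Two similarly situated packets take different paths under round-robin rotation.
choose = round_robin_chooser()
paths = [route((0, 0), (2, 2), choose()) for _ in range(2)]
```

Because consecutive packets between the same endpoints traverse different intermediate links, traffic is spread over more of the mesh, which is the hot-spot-avoidance effect the abstract describes.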
Ion distribution effects of turbulence on a kinetic auroral arc model
NASA Technical Reports Server (NTRS)
Cornwall, J. M.; Chiu, Y. T.
1982-01-01
An inverted-V auroral arc structure plasma-kinetic model is extended to phenomenologically include the effects of electrostatic turbulence, with k-parallel/k-perpendicular being much less than unity. It is shown that, unless plasma sheet ions are very much more energetic than the electrons, anomalous resistivity is not a large contributor to parallel electrostatic potential drops, since the support of the observed potential drop requires a greater dissipation of energy than can be provided by the plasma sheet. Wave turbulence can, however, be present, with the ion cyclotron turbulence levels suggested by the ion resonance broadening saturation mechanism of Dum and Dupree (1970) being comparable to those observed on auroral field lines. The diffusion coefficient and net growth rate are much smaller than estimates based solely on local plasma properties.
Particle Based Simulations of Complex Systems with MP2C : Hydrodynamics and Electrostatics
NASA Astrophysics Data System (ADS)
Sutmann, Godehard; Westphal, Lidia; Bolten, Matthias
2010-09-01
Particle based simulation methods are well established paths to explore system behavior on microscopic to mesoscopic time and length scales. With the development of new computer architectures it becomes more and more important to concentrate on local algorithms which do not need global data transfer or reorganisation of large arrays of data across processors. This requirement strongly addresses long-range interactions in particle systems, i.e. mainly hydrodynamic and electrostatic contributions. In this article, emphasis is given to the implementation and parallelization of the Multi-Particle Collision Dynamics method for hydrodynamic contributions and a splitting scheme based on Multigrid for electrostatic contributions. Implementations are done for massively parallel architectures and are demonstrated for the IBM Blue Gene/P architecture Jugene in Jülich.
Mode structure symmetry breaking of energetic particle driven beta-induced Alfvén eigenmode
NASA Astrophysics Data System (ADS)
Lu, Z. X.; Wang, X.; Lauber, Ph.; Zonca, F.
2018-01-01
The mode structure symmetry breaking of energetic particle driven Beta-induced Alfvén Eigenmode (BAE) is studied based on global theory and simulation. The weak coupling formula gives a reasonable estimate of the local eigenvalue compared with global hybrid simulation using XHMGC. The non-perturbative effect of energetic particles on global mode structure symmetry breaking in radial and parallel (along B) directions is demonstrated. With the contribution from energetic particles, two dimensional (radial and poloidal) BAE mode structures with symmetric/asymmetric tails are produced using an analytical model. It is demonstrated that the symmetry breaking in radial and parallel directions is intimately connected. The effects of mode structure symmetry breaking on nonlinear physics, energetic particle transport, and the possible insight for experimental studies are discussed.
The Scalable Checkpoint/Restart Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, A.
The Scalable Checkpoint/Restart (SCR) library provides an interface that codes may use to write out and read in application-level checkpoints in a scalable fashion. In the current implementation, checkpoint files are cached in local storage (hard disk or RAM disk) on the compute nodes. This technique provides scalable aggregate bandwidth and uses storage resources that are fully dedicated to the job. This approach addresses the two common drawbacks of checkpointing a large-scale application to a shared parallel file system, namely, limited bandwidth and file system contention. In fact, on current platforms, SCR scales linearly with the number of compute nodes. It has been benchmarked as high as 720 GB/s on 1094 nodes of Atlas, which is nearly two orders of magnitude faster than the parallel file system.
Phase space simulation of collisionless stellar systems on the massively parallel processor
NASA Technical Reports Server (NTRS)
White, Richard L.
1987-01-01
A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two dimensional phase space grid (with one space and one velocity dimension). Some results from calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray-XMP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem.
Turbine airfoil to shroud attachment method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campbell, Christian X; Kulkarni, Anand A; James, Allister W
2014-12-23
Bi-casting a platform (50) onto an end portion (42) of a turbine airfoil (31) after forming a coating of a fugitive material (56) on the end portion. After bi-casting the platform, the coating is dissolved and removed to relieve differential thermal shrinkage stress between the airfoil and platform. The thickness of the coating is varied around the end portion in proportion to varying amounts of local differential process shrinkage. The coating may be sprayed (76A, 76B) onto the end portion in opposite directions parallel to a chord line (41) of the airfoil or parallel to a mid-platform length (80) of the platform to form respective layers tapering in thickness from the leading (32) and trailing (34) edges along the suction side (36) of the airfoil.
Anisotropic pitch angle distribution of 50 eV to 50 keV particles at synchronous altitude.
NASA Technical Reports Server (NTRS)
Deforest, S. E.; Mcilwain, C. F.
1972-01-01
At times, the electron pitch angle distributions at synchronous orbit have been observed to be highly anisotropic. In the local morning region, distributions concentrated near 90 deg are often observed in particles of less than approximately 2000 V. This anisotropy decreases with increasing energy from 1 keV to the detector's limit at 50 keV. The time development of anisotropy is consistent with production by pitch angle scattering processes which are not effective on electrons with small velocities parallel to the magnetic field. Another type of distribution has been observed with the low-energy (below 1000 V) electrons concentrated parallel and antiparallel to the magnetic field. These distributions are only seen in the dusk sector, but this may be an orbital artifact.
NASA Technical Reports Server (NTRS)
Langtry, R. B.; Menter, F. R.; Likki, S. R.; Suzen, Y. B.; Huang, P. G.; Volker, S.
2006-01-01
A new correlation-based transition model has been developed, which is built strictly on local variables. As a result, the transition model is compatible with modern computational fluid dynamics (CFD) methods using unstructured grids and massive parallel execution. The model is based on two transport equations, one for the intermittency and one for the transition onset criteria in terms of momentum thickness Reynolds number. The proposed transport equations do not attempt to model the physics of the transition process (unlike, e.g., turbulence models), but form a framework for the implementation of correlation-based models into general-purpose CFD methods.
A Correlation-Based Transition Model using Local Variables. Part 1; Model Formation
NASA Technical Reports Server (NTRS)
Menter, F. R.; Langtry, R. B.; Likki, S. R.; Suzen, Y. B.; Huang, P. G.; Volker, S.
2006-01-01
A new correlation-based transition model has been developed, which is based strictly on local variables. As a result, the transition model is compatible with modern computational fluid dynamics (CFD) approaches, such as unstructured grids and massive parallel execution. The model is based on two transport equations, one for intermittency and one for the transition onset criteria in terms of momentum thickness Reynolds number. The proposed transport equations do not attempt to model the physics of the transition process (unlike, e.g., turbulence models) but form a framework for the implementation of correlation-based models into general-purpose CFD methods.
[Pathogenesis of apical periodontitis and its effects on the body].
Márton, I; Bágyi, K; Radics, T; Kiss, C
1998-01-01
During the last 25 years there have been major advances in understanding the etiology, pathogenesis and maintenance of inflammatory processes taking place in the periapical space. Polymicrobial infection of the pulp chamber is of primary importance in initiating periapical inflammation. Egress of bacteria and their antigens stimulates the immune system to form a granulation tissue around the apical area. The local immune response eliminates excess numbers of invading organisms. However, in parallel with protective reactions, the local activity of immunocompetent cells and their soluble products also contributes to tissue damage, bone resorption and perpetuation of inflammation. Present data indicate that the interaction of T-lymphocytes and macrophages is crucial in this process.
Livermore Big Artificial Neural Network Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Essen, Brian Van; Jacobs, Sam; Kim, Hyojin
2016-07-01
LBANN is a toolkit that is designed to train artificial neural networks efficiently on high performance computing architectures. It is optimized to take advantage of key High Performance Computing features to accelerate neural network training. Specifically, it is optimized for low-latency, high-bandwidth interconnects, node-local NVRAM, node-local GPU accelerators, and high-bandwidth parallel file systems. It is built on top of the open source Elemental distributed-memory dense and sparse-direct linear algebra and optimization library that is released under the BSD license. The algorithms contained within LBANN are drawn from the academic literature and implemented to work within a distributed-memory framework.
A Laboratory Facility for Research in Parallel Computation: Project Final Report.
1987-07-01
UNCLASSIFIED ... "A software tool for Building Supercomputer Applications" ... processors may display different behaviors. For example, assume we have a processor g with a "good" local structure and a processor b with a "bad" local ...
NASA Astrophysics Data System (ADS)
Yang, Dikun; Oldenburg, Douglas W.; Haber, Eldad
2014-03-01
Airborne electromagnetic (AEM) methods are highly efficient tools for assessing the Earth's conductivity structures in a large area at low cost. However, the configuration of AEM measurements, which typically have widely distributed transmitter-receiver pairs, makes the rigorous modelling and interpretation extremely time-consuming in 3-D. Excessive overcomputing can occur when working on a large mesh covering the entire survey area and inverting all soundings in the data set. We propose two improvements. The first is to use a locally optimized mesh for each AEM sounding for the forward modelling and calculation of sensitivity. This dedicated local mesh is small with fine cells near the sounding location and coarse cells far away in accordance with EM diffusion and the geometric decay of the signals. Once the forward problem is solved on the local meshes, the sensitivity for the inversion on the global mesh is available through quick interpolation. Using local meshes for AEM forward modelling avoids unnecessary computing on fine cells on a global mesh that are far away from the sounding location. Since local meshes are highly independent, the forward modelling can be efficiently parallelized over an array of processors. The second improvement is random and dynamic down-sampling of the soundings. Each inversion iteration only uses a random subset of the soundings, and the subset is reselected for every iteration. The number of soundings in the random subset, determined by an adaptive algorithm, is tied to the degree of model regularization. This minimizes the overcomputing caused by working with redundant soundings. Our methods are compared against conventional methods and tested with a synthetic example. We also invert a field data set that was previously considered to be too large to be practically inverted in 3-D. These examples show that our methodology can dramatically reduce the processing time of 3-D inversion to a practical level without losing resolution. 
Any existing modelling technique can be included into our framework of mesh decoupling and adaptive sampling to accelerate large-scale 3-D EM inversions.
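The random, dynamic down-sampling described above can be illustrated with a minimal sketch on a toy linear problem. This is not the authors' implementation: the function name `invert_with_random_subsets`, the cooling schedule for the regularization weight `beta`, and the rule `n_sub = n_min / beta` linking subset size to regularization are all assumptions made for illustration, and a dense matrix `G` stands in for the AEM forward operator.

```python
import numpy as np

def invert_with_random_subsets(G, d, n_iter=50, beta0=1.0, cooling=0.8,
                               n_min=8, seed=0):
    """Toy linear inversion that reselects a random subset of soundings
    (rows of G) at every iteration.

    Assumed, hypothetical adaptive rule: the subset grows as the
    regularization weight beta is relaxed.
    """
    rng = np.random.default_rng(seed)
    n_data, n_model = G.shape
    m = np.zeros(n_model)
    beta = beta0
    subset_sizes = []
    for _ in range(n_iter):
        # Subset size tied to the current degree of regularization.
        n_sub = int(min(n_data, max(n_min, round(n_min / beta))))
        # Fresh random subset for this iteration only.
        rows = rng.choice(n_data, size=n_sub, replace=False)
        Gs, ds = G[rows], d[rows]
        # Regularized Gauss-Newton step on the subset objective
        # ||Gs m - ds||^2 / n_sub + beta ||m||^2.
        A = Gs.T @ Gs / n_sub + beta * np.eye(n_model)
        g = Gs.T @ (Gs @ m - ds) / n_sub + beta * m
        m = m - np.linalg.solve(A, g)
        beta *= cooling  # relax regularization
        subset_sizes.append(n_sub)
    return m, subset_sizes

# Usage on synthetic, noise-free data (all values hypothetical).
rng = np.random.default_rng(1)
G = rng.standard_normal((200, 5))
m_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
m_est, sizes = invert_with_random_subsets(G, G @ m_true)
print(sizes[0], sizes[-1])
```

Early iterations, where regularization dominates, touch only a few soundings; the full data set is used only once the model is allowed to fit fine detail.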
Experimental implementation of parallel riverbed erosion to study vegetation uprooting by flow
NASA Astrophysics Data System (ADS)
Perona, Paolo; Edmaier, Katharina; Crouzy, Benoît
2014-05-01
In nature, flow erosion leading to the uprooting of vegetation is often a delayed process that gradually reduces anchoring by root exposure and correspondingly increases drag on the exposed biomass. The process determining scouring or deposition of the riverbed, and consequently plant root exposure, is complex and scale dependent. At the local scale, it is hydrodynamically driven and depends on obstacle porosity, as well as on the sediment-to-obstacle size ratio. At a larger scale it results from morphodynamic conditions, which mostly depend on riverbed topography and stream bedload transport capacity. In the latter case, ablation of sediment gradually reduces local bed elevation around the obstacle at a scale larger than the obstacle size, and uprooting eventually occurs when flow drag exceeds the residual anchoring. Ideally, one would study the timescales of vegetation uprooting by flow by inducing parallel bed erosion. This condition is not trivial to obtain experimentally because bed elevation adjustments occur in relation to longitudinal changes in sediment apportion as described by Exner's equation. In this work, we study the physical conditions leading to parallel bed erosion by reducing the Exner equation, closed for bedload transport, to a nonlinear partial differential equation, and showing that this is a particular "boundary value" problem. Finally, we use the data of Edmaier (2014) from a small-scale mobile-bed flume setup to verify the proposed theoretical framework, and to show how such a simple experiment can provide useful insights into the timescales of the uprooting process (Edmaier et al., 2011). REFERENCES - Edmaier, K., P. Burlando, and P. Perona (2011). Mechanisms of vegetation uprooting by flow in alluvial non-cohesive sediment. Hydrology and Earth System Sciences, vol. 15, p. 1615-1627. - Edmaier, K. Uprooting mechanisms of juvenile vegetation by flow. PhD thesis, EPFL, in preparation.
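As a brief illustration of why parallel bed erosion constrains the sediment flux, the standard one-dimensional form of the Exner equation can be written out. The symbols below (bed elevation \eta, bed porosity \lambda_p, volumetric bedload flux per unit width q_s, erosion rate E) are assumed notation; the abstract does not give the authors' bedload closure, so none is used here.

```latex
% 1-D Exner equation (sediment mass balance of the bed)
(1-\lambda_p)\,\frac{\partial \eta}{\partial t} + \frac{\partial q_s}{\partial x} = 0

% Parallel bed erosion: the bed lowers at the same rate E(t) \ge 0 at every x
\frac{\partial \eta}{\partial t} = -E(t)
\quad\Longrightarrow\quad
\frac{\partial q_s}{\partial x} = (1-\lambda_p)\,E(t)

% Integrating from the upstream boundary x = 0
q_s(x,t) = q_s(0,t) + (1-\lambda_p)\,E(t)\,x
```

Under these assumptions the bedload flux must increase linearly downstream, which is consistent with the abstract's remark that the condition is set by what is imposed at the boundaries.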
Effect of nanostructures orientation on electroosmotic flow in a microfluidic channel
NASA Astrophysics Data System (ADS)
Eng Lim, An; Lim, Chun Yee; Cheong Lam, Yee; Taboryski, Rafael; Rui Wang, Shu
2017-06-01
Electroosmotic flow (EOF) is an electric-field-induced fluid flow that has numerous micro-/nanofluidic applications, ranging from pumping to chemical and biomedical analyses. Nanoscale networks/structures are often integrated in microchannels for a broad range of applications, such as electrophoretic separation of biomolecules, high reaction efficiency catalytic microreactors, and enhancement of heat transfer and sensing. Their introduction has been known to reduce EOF. To date, a systematic study of the effect of nanostructures orientation on EOF in a microfluidic channel has yet to be carried out. In this investigation, we present a novel fabrication method for nanostructure designs that possess maximum orientation difference, i.e. parallel versus perpendicular indented nanolines, to examine the effect of nanostructures orientation on EOF. It consists of four phases: fabrication of a silicon master, creation of a mold insert via electroplating, injection molding with cyclic olefin copolymer, and thermal bonding and integration of practical inlet/outlet ports. The effect of nanostructures orientation on EOF was studied experimentally by the current monitoring method. The experimental results show that nanolines which are perpendicular to the microchannel reduce the EOF velocity significantly (approximately 20%). This flow velocity reduction is due to the distortion of the local electric field by the perpendicular nanolines at the nanostructured surface, as demonstrated by finite element simulation. In contrast, nanolines which are parallel to the microchannel have no effect on EOF, as it can be deduced that the parallel nanolines do not distort the local electric field. The outcomes of this investigation contribute to the precise control of EOF in lab-on-chip devices, and to a fundamental understanding of EOF in devices which utilize nanostructured surfaces for chemical and biological analyses.
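For readers unfamiliar with the current monitoring method, a minimal sketch of the arithmetic may help. In its usual form, one electrolyte displaces another of slightly different conductivity, and the average EOF velocity is taken as the channel length divided by the time for the current to reach its new plateau. The function names and all numerical values below are hypothetical and are not data from the study.

```python
def eof_velocity(channel_length_m, displacement_time_s):
    """Average EOF velocity from a current-monitoring run:
    channel length divided by the time for one electrolyte to
    fully displace the other (current reaches a plateau)."""
    if displacement_time_s <= 0:
        raise ValueError("displacement time must be positive")
    return channel_length_m / displacement_time_s

def relative_reduction(v_reference, v_test):
    """Fractional reduction of v_test relative to v_reference."""
    return 1.0 - v_test / v_reference

# Hypothetical numbers: a 20 mm channel, displaced in 40 s on a
# reference surface and in 50 s with perpendicular nanolines.
v_ref = eof_velocity(0.020, 40.0)
v_perp = eof_velocity(0.020, 50.0)
print(v_ref, v_perp, relative_reduction(v_ref, v_perp))
```

With these illustrative inputs the longer displacement time corresponds to a velocity reduction of about 20%, the same order as the effect reported for perpendicular nanolines.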
Yamamoto, Takumi; Yoshimatsu, Hidehiko; Hayashi, Akitatsu; Koshima, Isao
2015-10-01
The treatment of a deep pressure ulcer with wide undermining of the wound edge (a pocket) is challenging, especially when conservative treatments are ineffective. As most patients with a pressure ulcer suffer from systemic comorbidities, invasive surgery cannot be performed on all patients, and less invasive treatment is required. A less invasive surgical intervention for deep pressure ulcers, parallel pocket incision (PPI), was performed on 10 patients with intractable pressure ulcers with pocket formation. In PPI procedures, two parallel skin incisions were made to open up the deepest fold of the pocket and to preserve the skin overlying the pocket lesion; through the created incisions, the necrotic tissues around the deepest fold of the undermining could be easily removed, which facilitated spontaneous wound healing. Postoperative results and complications were evaluated. All PPI procedures were safely performed under local infiltration anesthesia without major postoperative complications; minor bleeding was seen intraoperatively in three patients, which could be easily controlled with electric cautery coagulation. Nine of 10 ulcers were cured after PPI, and one could not be followed up because of the patient's death, which was unrelated to the pressure ulcer. For the nine cured patients, the average time to cure was 14.9 weeks, and no recurrence was observed at 6 months postoperatively. PPI is a simple, technically easy, and less invasive surgical intervention for an intractable pressure ulcer with a pocket, which can be safely performed under local infiltration anesthesia even on a patient with severe systemic comorbidities. Copyright © 2015 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.