origin compute node: Topics by Science.gov

Sample records for origin compute node

Low latency, high bandwidth data communications between compute nodes in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2010-11-02

Methods, parallel computers, and computer program products are disclosed for low latency, high bandwidth data communications between compute nodes in a parallel computer. Embodiments include receiving, by an origin direct memory access (`DMA`) engine of an origin compute node, data for transfer to a target compute node; sending, by the origin DMA engine of the origin compute node to a target DMA engine on the target compute node, a request to send (`RTS`) message; transferring, by the origin DMA engine, a predetermined portion of the data to the target compute node using a memory FIFO operation; determining, by the origin DMA engine whether an acknowledgement of the RTS message has been received from the target DMA engine; if the acknowledgement of the RTS message has not been received, transferring, by the origin DMA engine, another predetermined portion of the data to the target compute node using a memory FIFO operation; and if the acknowledgement of the RTS message has been received by the origin DMA engine, transferring, by the origin DMA engine, any remaining portion of the data to the target compute node using a direct put operation.
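The origin-side sequence in this abstract can be sketched as a small Python simulation. This is a minimal illustration under stated assumptions, not the patented implementation: `origin_send`, `chunk_size`, and `ack_after_chunks` are hypothetical names, and `ack_after_chunks` stands in for network timing by treating the RTS acknowledgement as arriving once that many memory FIFO portions have been sent.

```python
def origin_send(data, chunk_size, ack_after_chunks):
    """Return a log of (operation, offset, length) tuples for one transfer.

    Hypothetical model: the RTS acknowledgement is considered received
    once `ack_after_chunks` memory FIFO portions have gone out.
    """
    log = [("rts", 0, 0)]  # the request-to-send message goes first
    offset = 0
    chunks_sent = 0
    while offset < len(data):
        # Send one predetermined portion with a memory FIFO operation.
        length = min(chunk_size, len(data) - offset)
        log.append(("memory_fifo", offset, length))
        offset += length
        chunks_sent += 1
        # Check whether the acknowledgement has arrived in the meantime.
        if chunks_sent >= ack_after_chunks:
            break
    if offset < len(data):
        # Acknowledged: everything left goes in a single direct put.
        log.append(("direct_put", offset, len(data) - offset))
    return log
```

For example, `origin_send(bytes(10), 3, 2)` logs the RTS, two 3-byte memory FIFO portions, and one direct put for the remaining 4 bytes. If the acknowledgement never arrives in time, the whole payload goes out through memory FIFO portions.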
Controlling data transfers from an origin compute node to a target compute node

DOEpatents

Archer, Charles J [Rochester, MN]; Blocksome, Michael A [Rochester, MN]; Ratterman, Joseph D [Rochester, MN]; Smith, Brian E [Rochester, MN]

2011-06-21

Methods, apparatus, and products are disclosed for controlling data transfers from an origin compute node to a target compute node that include: receiving, by an application messaging module on the target compute node, an indication of a data transfer from an origin compute node to the target compute node; and administering, by the application messaging module on the target compute node, the data transfer using one or more messaging primitives of a system messaging module in dependence upon the indication.
Pacing a data transfer operation between compute nodes on a parallel computer

DOEpatents

Blocksome, Michael A [Rochester, MN]

2011-09-13

Methods, systems, and products are disclosed for pacing a data transfer between compute nodes on a parallel computer that include: transferring, by an origin compute node, a chunk of an application message to a target compute node; sending, by the origin compute node, a pacing request to a target direct memory access (`DMA`) engine on the target compute node using a remote get DMA operation; determining, by the origin compute node, whether a pacing response to the pacing request has been received from the target DMA engine; and transferring, by the origin compute node, a next chunk of the application message if the pacing response to the pacing request has been received from the target DMA engine.
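The pacing loop in this abstract can be illustrated with a short Python sketch. It assumes a hypothetical `TargetDMA` class that answers each pacing request only after a fixed number of polls; the class, its methods, and the `delay` parameter are invented for illustration and do not correspond to a real DMA interface.

```python
class TargetDMA:
    """Hypothetical target engine: answers a pacing request after `delay` polls."""

    def __init__(self, delay):
        self.delay = delay
        self.received = bytearray()
        self._polls_left = 0

    def deliver(self, chunk):
        self.received += chunk

    def pacing_request(self):
        self._polls_left = self.delay

    def poll_response(self):
        # False until the pacing response has "arrived".
        if self._polls_left > 0:
            self._polls_left -= 1
            return False
        return True


def paced_send(message, chunk_size, target):
    """Send `message` chunk by chunk; return how many polls came back empty."""
    empty_polls = 0
    for start in range(0, len(message), chunk_size):
        target.deliver(message[start:start + chunk_size])
        target.pacing_request()
        # The next chunk is held back until the pacing response is seen.
        while not target.poll_response():
            empty_polls += 1
    return empty_polls
```

With `delay=2` and a 10-byte message in 4-byte chunks, the origin sends three chunks and waits through six empty polls in total, so the target is never handed a chunk before it has answered the previous pacing request.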
Chaining direct memory access data transfer operations for compute nodes in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.

2010-09-28

Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.
Self-pacing direct memory access data transfer operations for compute nodes in a parallel computer

DOEpatents

Blocksome, Michael A

2015-02-17

Methods, apparatus, and products are disclosed for self-pacing DMA data transfer operations for nodes in a parallel computer that include: transferring, by an origin DMA on an origin node, a RTS message to a target node, the RTS message specifying a message on the origin node for transfer to the target node; receiving, in an origin injection FIFO for the origin DMA from a target DMA on the target node in response to transferring the RTS message, a target RGET descriptor followed by a DMA transfer operation descriptor, the DMA descriptor for transmitting a message portion to the target node, the target RGET descriptor specifying an origin RGET descriptor on the origin node that specifies an additional DMA descriptor for transmitting an additional message portion to the target node; processing, by the origin DMA, the target RGET descriptor; and processing, by the origin DMA, the DMA transfer operation descriptor.
Low latency, high bandwidth data communications between compute nodes in a parallel computer

DOEpatents

Blocksome, Michael A

2014-04-01

Methods, systems, and products are disclosed for data transfers between nodes in a parallel computer that include: receiving, by an origin DMA on an origin node, a buffer identifier for a buffer containing data for transfer to a target node; sending, by the origin DMA to the target node, a RTS message; transferring, by the origin DMA, a data portion to the target node using a memory FIFO operation that specifies one end of the buffer from which to begin transferring the data; receiving, by the origin DMA, an acknowledgement of the RTS message from the target node; and transferring, by the origin DMA in response to receiving the acknowledgement, any remaining data portion to the target node using a direct put operation that specifies the other end of the buffer from which to begin transferring the data, including initiating the direct put operation without invoking an origin processing core.
Low latency, high bandwidth data communications between compute nodes in a parallel computer

DOEpatents

Blocksome, Michael A

2014-04-22

Methods, systems, and products are disclosed for data transfers between nodes in a parallel computer that include: receiving, by an origin DMA on an origin node, a buffer identifier for a buffer containing data for transfer to a target node; sending, by the origin DMA to the target node, a RTS message; transferring, by the origin DMA, a data portion to the target node using a memory FIFO operation that specifies one end of the buffer from which to begin transferring the data; receiving, by the origin DMA, an acknowledgement of the RTS message from the target node; and transferring, by the origin DMA in response to receiving the acknowledgement, any remaining data portion to the target node using a direct put operation that specifies the other end of the buffer from which to begin transferring the data, including initiating the direct put operation without invoking an origin processing core.
Low latency, high bandwidth data communications between compute nodes in a parallel computer

DOEpatents

Blocksome, Michael A

2013-07-02

Methods, systems, and products are disclosed for data transfers between nodes in a parallel computer that include: receiving, by an origin DMA on an origin node, a buffer identifier for a buffer containing data for transfer to a target node; sending, by the origin DMA to the target node, a RTS message; transferring, by the origin DMA, a data portion to the target node using a memory FIFO operation that specifies one end of the buffer from which to begin transferring the data; receiving, by the origin DMA, an acknowledgement of the RTS message from the target node; and transferring, by the origin DMA in response to receiving the acknowledgement, any remaining data portion to the target node using a direct put operation that specifies the other end of the buffer from which to begin transferring the data, including initiating the direct put operation without invoking an origin processing core.
Signaling completion of a message transfer from an origin compute node to a target compute node

DOEpatents

Blocksome, Michael A [Rochester, MN]; Parker, Jeffrey J [Rochester, MN]

2011-05-24

Signaling completion of a message transfer from an origin node to a target node includes: sending, by an origin DMA engine, an RTS message, the RTS message specifying an application message for transfer to the target node from the origin node; receiving, by the origin DMA engine, a remote get message containing a data descriptor for the message and a completion notification descriptor, the completion notification descriptor specifying a local direct put transfer operation for transferring data locally on the origin node; inserting, by the origin DMA engine in an injection FIFO buffer, the data descriptor followed by the completion notification descriptor; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that transfer of the message is complete in dependence upon the completion notification descriptor.
Signaling completion of a message transfer from an origin compute node to a target compute node

DOEpatents

Blocksome, Michael A [Rochester, MN]

2011-02-15

Signaling completion of a message transfer from an origin node to a target node includes: sending, by an origin DMA engine, an RTS message, the RTS message specifying an application message for transfer to the target node from the origin node; receiving, by the origin DMA engine, a remote get message containing a data descriptor for the message and a completion notification descriptor, the completion notification descriptor specifying a local memory FIFO data transfer operation for transferring data locally on the origin node; inserting, by the origin DMA engine in an injection FIFO buffer, the data descriptor followed by the completion notification descriptor; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that transfer of the message is complete in dependence upon the completion notification descriptor.
Remote direct memory access

DOEpatents

Archer, Charles J.; Blocksome, Michael A.

2012-12-11

Methods, parallel computers, and computer program products are disclosed for remote direct memory access. Embodiments include transmitting, from an origin DMA engine on an origin compute node to a plurality of target DMA engines on target compute nodes, a request to send message, the request to send message specifying data to be transferred from the origin DMA engine to data storage on each target compute node; receiving, by each target DMA engine on each target compute node, the request to send message; preparing, by each target DMA engine, to store data according to the data storage reference and the data length, including assigning a base storage address for the data storage reference; sending, by one or more of the target DMA engines, an acknowledgement message acknowledging that all the target DMA engines are prepared to receive a data transmission from the origin DMA engine; receiving, by the origin DMA engine, the acknowledgement message from the one or more of the target DMA engines; and transferring, by the origin DMA engine, data to data storage on each of the target compute nodes according to the data storage reference using a single direct put operation.
Direct memory access transfer completion notification

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Parker, Jeffrey J.

2010-08-17

Methods, apparatus, and products are disclosed for DMA transfer completion notification that include: inserting, by an origin DMA engine on an origin compute node in an injection FIFO buffer, a data descriptor for an application message to be transferred to a target compute node on behalf of an application on the origin compute node; inserting, by the origin DMA engine, a completion notification descriptor in the injection FIFO buffer after the data descriptor for the message, the completion notification descriptor specifying an address of a completion notification field in application storage for the application; transferring, by the origin DMA engine to the target compute node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that the transfer of the message is complete, including performing a local direct put operation to store predesignated notification data at the address of the completion notification field.
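The ordering idea in this abstract, a completion notification descriptor queued directly behind the data descriptor, can be sketched in Python as follows. The descriptor tuples, `NOTIFY_VALUE`, and the dictionary standing in for application storage are all assumptions made for illustration.

```python
from collections import deque

NOTIFY_VALUE = 1  # hypothetical "predesignated notification data"


def run_injection_fifo(message, app_storage, notify_addr, target_memory):
    """Process a data descriptor, then the completion descriptor behind it."""
    fifo = deque()
    fifo.append(("data", message))            # descriptor for the message
    fifo.append(("completion", notify_addr))  # queued after the data descriptor
    while fifo:
        kind, arg = fifo.popleft()
        if kind == "data":
            # Transfer the message to the target node.
            target_memory.extend(arg)
        else:
            # Local direct put: write the notification data into the
            # application's completion notification field.
            app_storage[arg] = NOTIFY_VALUE
```

Because the FIFO is drained in order, the notification field is written only after the data descriptor has been processed, so the application can watch `app_storage[notify_addr]` rather than interrupting a processor core.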
Dispatching packets on a global combining network of a parallel computer

DOEpatents

Almasi, Gheorghe [Ardsley, NY]; Archer, Charles J [Rochester, MN]

2011-07-19

Methods, apparatus, and products are disclosed for dispatching packets on a global combining network of a parallel computer comprising a plurality of nodes connected for data communications using the network capable of performing collective operations and point to point operations that include: receiving, by an origin system messaging module on an origin node from an origin application messaging module on the origin node, a storage identifier and an operation identifier, the storage identifier specifying storage containing an application message for transmission to a target node, and the operation identifier specifying a message passing operation; packetizing, by the origin system messaging module, the application message into network packets for transmission to the target node, each network packet specifying the operation identifier and an operation type for the message passing operation specified by the operation identifier; and transmitting, by the origin system messaging module, the network packets to the target node.
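The packetizing step can be sketched in Python as below. The `Packet` fields mirror the abstract (operation identifier and operation type carried in every packet); the `seq` field, the `payload_size` parameter, and the example operation type string are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    op_id: int      # operation identifier, repeated in every packet
    op_type: str    # operation type, e.g. collective or point to point
    seq: int        # hypothetical sequence index for reassembly
    payload: bytes


def packetize(message, op_id, op_type, payload_size):
    """Split an application message into fixed-size network packets."""
    return [
        Packet(op_id, op_type, seq, message[offset:offset + payload_size])
        for seq, offset in enumerate(range(0, len(message), payload_size))
    ]
```

For instance, `packetize(b"abcdefg", 7, "collective", 3)` yields three packets with payloads `b"abc"`, `b"def"`, and `b"g"`. Carrying the identifier and type in each packet lets the target dispatch any packet without consulting separate per-message state.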
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2013-10-29

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.
Direct memory access transfer completion notification

DOEpatents

Chen, Dong; Giampapa, Mark E.; Heidelberger, Philip; Kumar, Sameer; Parker, Jeffrey J.; Steinmacher-Burow, Burkhard D.; Vranas, Pavlos

2010-07-27

Methods, compute nodes, and computer program products are provided for direct memory access (`DMA`) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (`FIFO`) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.
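The sequence-number test in this abstract reduces to a range check, sketched here in Python. The sketch assumes sequence numbers increase by one per descriptor, never wrap around, and that the FIFO drains strictly in order; the function name is hypothetical.

```python
def descriptor_in_fifo(seq, count, newest_seq):
    """Return True if the descriptor numbered `seq` is still queued.

    `count` is the number of descriptors currently in the injection FIFO
    and `newest_seq` is the sequence number of the newest one, so the
    queued descriptors are exactly newest_seq - count + 1 .. newest_seq.
    """
    oldest_pending = newest_seq - count + 1
    return oldest_pending <= seq <= newest_seq
```

With `count=3` and `newest_seq=10`, descriptors 8 to 10 are still queued, so a message whose descriptor carries sequence number 7 can be reported as sent.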
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-02-11

Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution of a compute node, including specification of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications instruction, the instruction characterized by instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.
Administering an epoch initiated for remote memory access

DOEpatents

Blocksome, Michael A; Miller, Douglas R

2014-03-18

Methods, systems, and products are disclosed for administering an epoch initiated for remote memory access that include: initiating, by an origin application messaging module on an origin compute node, one or more data transfers to a target compute node for the epoch; initiating, by the origin application messaging module after initiating the data transfers, a closing stage for the epoch, including rejecting any new data transfers after initiating the closing stage for the epoch; determining, by the origin application messaging module, whether the data transfers have completed; and closing, by the origin application messaging module, the epoch if the data transfers have completed.
Administering an epoch initiated for remote memory access

DOEpatents

Blocksome, Michael A; Miller, Douglas R

2012-10-23

Methods, systems, and products are disclosed for administering an epoch initiated for remote memory access that include: initiating, by an origin application messaging module on an origin compute node, one or more data transfers to a target compute node for the epoch; initiating, by the origin application messaging module after initiating the data transfers, a closing stage for the epoch, including rejecting any new data transfers after initiating the closing stage for the epoch; determining, by the origin application messaging module, whether the data transfers have completed; and closing, by the origin application messaging module, the epoch if the data transfers have completed.
Administering an epoch initiated for remote memory access

DOEpatents

Blocksome, Michael A.; Miller, Douglas R.

2013-01-01

Methods, systems, and products are disclosed for administering an epoch initiated for remote memory access that include: initiating, by an origin application messaging module on an origin compute node, one or more data transfers to a target compute node for the epoch; initiating, by the origin application messaging module after initiating the data transfers, a closing stage for the epoch, including rejecting any new data transfers after initiating the closing stage for the epoch; determining, by the origin application messaging module, whether the data transfers have completed; and closing, by the origin application messaging module, the epoch if the data transfers have completed.
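The epoch life cycle described in these three records can be sketched as a small state machine in Python. The class, its method names, and the three state labels are assumptions for illustration; the abstracts do not specify an interface.

```python
class Epoch:
    """Hypothetical epoch tracker: open -> closing -> closed."""

    def __init__(self):
        self.state = "open"
        self.pending = set()
        self._next_id = 0

    def start_transfer(self):
        # New data transfers are rejected once the closing stage has begun.
        if self.state != "open":
            raise RuntimeError("epoch is closing; new transfers are rejected")
        transfer_id = self._next_id
        self._next_id += 1
        self.pending.add(transfer_id)
        return transfer_id

    def complete(self, transfer_id):
        self.pending.discard(transfer_id)

    def begin_close(self):
        if self.state == "open":
            self.state = "closing"

    def try_close(self):
        # The epoch closes only when every initiated transfer has completed.
        if self.state == "closing" and not self.pending:
            self.state = "closed"
        return self.state == "closed"
```

A caller would start its transfers, call `begin_close()`, and then poll `try_close()` as completions arrive. The separate closing stage is what lets outstanding transfers finish while new ones are refused.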
Cooperation Helps Power Saving

DTIC Science & Technology

2009-04-07

the destination node hears the poll, the link between the two nodes is activated. In the original STEM, two radios working on two separate channels are used: one radio is... Computer and Communications Societies. Proceedings. IEEE, vol. 3, pp. 1548–1557, 2001. [2] R. Kravets and P. Krishnan, “Application-driven power

Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by semi-randomly varying routing policies for different packets

DOEpatents

Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

2010-11-23

A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Nodes vary a choice of routing policy for routing data in the network in a semi-random manner, so that similarly situated packets are not always routed along the same path. Semi-random variation of the routing policy tends to avoid certain local hot spots of network activity, which might otherwise arise using more consistent routing determinations. Preferably, the originating node chooses a routing policy for a packet, and all intermediate nodes in the path route the packet according to that policy. Policies may be rotated on a round-robin basis, selected by generating a random number, or otherwise varied.
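The two policy-selection schemes named in this abstract, round-robin rotation and random selection, can be sketched in Python as below. The policy names and the dictionary used as a packet are hypothetical; the only property carried over from the abstract is that the originating node picks the policy and intermediate nodes follow it.

```python
import itertools
import random

POLICIES = ("x_first", "y_first", "z_first")  # hypothetical policy names


def round_robin_chooser(policies=POLICIES):
    """Return a function that rotates through the policies in order."""
    cycle = itertools.cycle(policies)
    return lambda: next(cycle)


def random_chooser(policies=POLICIES, seed=None):
    """Return a function that picks a policy pseudo-randomly."""
    rng = random.Random(seed)
    return lambda: rng.choice(policies)


def make_packet(payload, choose_policy):
    # The originating node stamps its chosen policy into the packet.
    return {"payload": payload, "policy": choose_policy()}


def policy_at_intermediate_node(packet):
    # Intermediate nodes do not choose; they apply the stamped policy.
    return packet["policy"]
```

Four packets built with `round_robin_chooser()` carry `x_first`, `y_first`, `z_first`, `x_first` in turn, so similarly situated packets do not all take the same path.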
Near real-time traffic routing

NASA Technical Reports Server (NTRS)

Yang, Chaowei (Inventor); Xie, Jibo (Inventor); Zhou, Bin (Inventor); Cao, Ying (Inventor)

2012-01-01

A near real-time physical transportation network routing system comprising: a traffic simulation computing grid and a dynamic traffic routing service computing grid. The traffic simulator produces traffic network travel time predictions for a physical transportation network using a traffic simulation model and common input data. The physical transportation network is divided into multiple sections. Each section has a primary zone and a buffer zone. The traffic simulation computing grid includes multiple traffic simulation computing nodes. The common input data includes static network characteristics, an origin-destination data table, dynamic traffic information data and historical traffic data. The dynamic traffic routing service computing grid includes multiple dynamic traffic routing computing nodes and generates traffic route(s) using the traffic network travel time predictions.
BridgeRank: A novel fast centrality measure based on local structure of the network

NASA Astrophysics Data System (ADS)

Salavati, Chiman; Abdollahpouri, Alireza; Manbari, Zhaleh

2018-04-01

Ranking nodes in complex networks has become an important task in many application domains. In a complex network, influential nodes are those that have the most spreading ability. Thus, identifying influential nodes based on their spreading ability is a fundamental task in different applications such as viral marketing. One of the most important centrality measures for ranking nodes is closeness centrality, which is efficient but suffers from high computational complexity, O(n^3). This paper tries to improve closeness centrality by utilizing the local structure of nodes and presents a new ranking algorithm, called BridgeRank centrality. The proposed method computes a local centrality value for each node. For this purpose, at first, communities are detected and the relationship between communities is completely ignored. Then, by applying a centrality in each community, only one best critical node from each community is extracted. Finally, the nodes are ranked based on computing the sum of the shortest path length of nodes to the obtained critical nodes. We have also modified the proposed method by weighting the original BridgeRank and selecting several nodes from each community based on the density of that community. Our method can find the best nodes with high spread ability and low time complexity, which makes it applicable to large-scale networks. To evaluate the performance of the proposed method, we use the SIR diffusion model. Finally, experiments on real and artificial networks show that our method is able to identify influential nodes efficiently, and achieves better performance compared to other recent methods.
Monte Carlo simulation of photon migration in a cloud computing environment with MapReduce

PubMed Central

Pratx, Guillem; Xing, Lei

2011-01-01

Monte Carlo simulation is considered the most reliable method for modeling photon migration in heterogeneous media. However, its widespread use is hindered by the high computational cost. The purpose of this work is to report on our implementation of a simple MapReduce method for performing fault-tolerant Monte Carlo computations in a massively-parallel cloud computing environment. We ported the MC321 Monte Carlo package to Hadoop, an open-source MapReduce framework. In this implementation, Map tasks compute photon histories in parallel while a Reduce task scores photon absorption. The distributed implementation was evaluated on a commercial compute cloud. The simulation time was found to be linearly dependent on the number of photons and inversely proportional to the number of nodes. For a cluster size of 240 nodes, the simulation of 100 billion photon histories took 22 min, a 1258 × speed-up compared to the single-threaded Monte Carlo program. The overall computational throughput was 85,178 photon histories per node per second, with a latency of 100 s. The distributed simulation produced the same output as the original implementation and was resilient to hardware failure: the correctness of the simulation was unaffected by the shutdown of 50% of the nodes. PMID:22191916
Critical phenomena in communication/computation networks with various topologies and suboptimal to optimal resource allocation

NASA Astrophysics Data System (ADS)

Cogoni, Marco; Busonera, Giovanni; Anedda, Paolo; Zanetti, Gianluigi

2015-01-01

We generalize previous studies on critical phenomena in communication networks [1,2] by adding computational capabilities to the nodes. In our model, a set of tasks with random origin, destination and computational structure is distributed on a computational network, modeled as a graph. By varying the temperature of a Metropolis Montecarlo, we explore the global latency for an optimal to suboptimal resource assignment at a given time instant. By computing the two-point correlation function for the local overload, we study the behavior of the correlation distance (both for links and nodes) while approaching the congested phase: a transition from peaked to spread g(r) is seen above a critical (Montecarlo) temperature Tc. The average latency trend of the system is predicted by averaging over several network traffic realizations while maintaining a spatially detailed information for each node: a sharp decrease of performance is found over Tc independently of the workload. The globally optimized computational resource allocation and network routing defines a baseline for a future comparison of the transition behavior with respect to existing routing strategies [3,4] for different network topologies.
Performance and scalability evaluation of "Big Memory" on Blue Gene Linux.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoshii, K.; Iskra, K.; Naik, H.

2011-05-01

We address memory performance issues observed in Blue Gene Linux and discuss the design and implementation of 'Big Memory' - an alternative, transparent memory space introduced to eliminate the memory performance issues. We evaluate the performance of Big Memory using custom memory benchmarks, NAS Parallel Benchmarks, and the Parallel Ocean Program, at a scale of up to 4,096 nodes. We find that Big Memory successfully resolves the performance issues normally encountered in Blue Gene Linux. For the ocean simulation program, we even find that Linux with Big Memory provides better scalability than does the lightweight compute node kernel designed solely for high-performance applications. Originally intended exclusively for compute node tasks, our new memory subsystem dramatically improves the performance of certain I/O node applications as well. We demonstrate this performance using the central processor of the LOw Frequency ARray radio telescope as an example.
Localization Algorithm Based on a Spring Model (LASM) for Large Scale Wireless Sensor Networks.

PubMed

Chen, Wanming; Mei, Tao; Meng, Max Q-H; Liang, Huawei; Liu, Yumei; Li, Yangming; Li, Shuai

2008-03-15

A navigation method for a lunar rover based on large scale wireless sensor networks is proposed. To obtain high navigation accuracy and large exploration area, high node localization accuracy and large network scale are required. However, the computational and communication complexity and time consumption are greatly increased with the increase of the network scales. A localization algorithm based on a spring model (LASM) method is proposed to reduce the computational complexity, while maintaining the localization accuracy in large scale sensor networks. The algorithm simulates the dynamics of a physical spring system to estimate the positions of nodes. The sensor nodes are set as particles with masses and connected with neighbor nodes by virtual springs. The virtual springs will force the particles to move to the original positions, the node positions correspondingly, from the randomly set positions. Therefore, a blind node position can be determined from the LASM algorithm by calculating the related forces with the neighbor nodes. The computational and communication complexity are O(1) for each node, since the number of the neighbor nodes does not increase proportionally with the network scale size. Three patches are proposed to avoid local optimization, kick out bad nodes and deal with node variation. Simulation results show that the computational and communication complexity are almost constant despite the increase of the network scale size. The time consumption has also been proven to remain almost constant since the calculation steps are almost unrelated with the network scale size.
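The spring idea can be illustrated with a much-simplified Python sketch: a single blind node in 2D, connected by virtual springs to neighbors whose positions are already known. This is not the LASM algorithm from the paper (it omits masses, the three patches, and the distributed setting); the function name, unit spring stiffness, step size, and iteration count are all assumptions.

```python
import math


def locate(neighbors, distances, start, step=0.5, iterations=200):
    """Relax one blind node under virtual spring forces.

    Each spring connects the blind node to a neighbor at a known position
    and has a rest length equal to the measured distance to that neighbor.
    """
    x, y = start
    for _ in range(iterations):
        fx = fy = 0.0
        for (nx, ny), rest_length in zip(neighbors, distances):
            dx, dy = x - nx, y - ny
            length = math.hypot(dx, dy)
            if length == 0.0:
                continue  # direction undefined; skip this spring for one step
            stretch = length - rest_length
            # Hooke's law with unit stiffness: the force opposes the stretch.
            fx -= stretch * dx / length
            fy -= stretch * dy / length
        x += step * fx
        y += step * fy
    return x, y
```

Each update touches only the node's own neighbors, which is the property behind the O(1) per-node cost claimed in the abstract.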
Aggregating job exit statuses of a plurality of compute nodes executing a parallel application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregating each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.
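The aggregation step performed by the job leader can be sketched in Python as follows. The abstract does not say how statuses are combined, so the rule used here (report the highest exit code and list the nodes that exited non-zero) is an assumption, as are the function name and the result layout.

```python
def aggregate_exit_statuses(statuses):
    """Combine per-node exit codes into one summary.

    `statuses` maps a compute node id to its integer exit code.
    """
    failed = sorted(node for node, code in statuses.items() if code != 0)
    return {
        "nodes": len(statuses),
        "failed_nodes": failed,
        # Assumed rule: the job's status is the worst (highest) exit code.
        "exit_status": max(statuses.values(), default=0),
    }
```

For example, `aggregate_exit_statuses({0: 0, 1: 0, 2: 139})` reports three nodes, one failure on node 2, and an overall exit status of 139, so only one summary has to leave the subset rather than one status per node.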
Distributing an executable job load file to compute nodes in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gooding, Thomas M.

Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications link over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.
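The route-building rule in this abstract, that a node reports its parent link when it or any descendant participates, can be sketched in Python over a tree given as parent pointers. The function name and the `(child, parent)` link representation are assumptions for illustration.

```python
def class_route(parent, participants):
    """Return (nodes, links) needed to reach every participating node.

    `parent` maps each node to its parent, with None for the root.
    A link is a (child, parent) pair identifying where the child
    receives data from its parent.
    """
    nodes = set()
    for node in participants:
        # Walk toward the root; stop early where the route already exists.
        while node is not None and node not in nodes:
            nodes.add(node)
            node = parent.get(node)
    links = {(n, parent[n]) for n in nodes if parent.get(n) is not None}
    return nodes, links
```

In a tree where node 3 hangs under node 1 and node 5 under node 2, both under root 0, a job on nodes 3 and 5 yields a route through nodes 0, 1, 2, 3, and 5, while the non-participating leaf 4 is left off the broadcast.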
Parallel Calculations in LS-DYNA

NASA Astrophysics Data System (ADS)

Vartanovich Mkrtychev, Oleg; Aleksandrovich Reshetov, Andrey

2017-11-01

Nowadays, structural mechanics exhibits a trend towards numeric solutions being found for increasingly extensive and detailed tasks, which requires that capacities of computing systems be enhanced. Such enhancement can be achieved by different means. E.g., in case a computing system is represented by a workstation, its components can be replaced and/or extended (CPU, memory etc.). In essence, such modification eventually entails replacement of the entire workstation, i.e. replacement of certain components necessitates exchange of others (faster CPUs and memory devices require buses with higher throughput etc.). Special consideration must be given to the capabilities of modern video cards. They constitute powerful computing systems capable of running data processing in parallel. Interestingly, the tools originally designed to render high-performance graphics can be applied for solving problems not immediately related to graphics (CUDA, OpenCL, Shaders etc.). However, not all software suites utilize video cards’ capacities. Another way to increase capacity of a computing system is to implement a cluster architecture: to add cluster nodes (workstations) and to increase the network communication speed between the nodes. The advantage of this approach is extensive growth due to which a quite powerful system can be obtained by combining not particularly powerful nodes. Moreover, separate nodes may possess different capacities. This paper considers the use of a clustered computing system for solving problems of structural mechanics with LS-DYNA software. To establish a range of dependencies a mere 2-node cluster has proven sufficient.
Time Series Analysis for Spatial Node Selection in Environment Monitoring Sensor Networks

PubMed Central

Bhandari, Siddhartha; Jurdak, Raja; Kusy, Branislav

2017-01-01

Wireless sensor networks are widely used in environmental monitoring. The number of sensor nodes to be deployed will vary depending on the desired spatio-temporal resolution. Selecting an optimal number, position and sampling rate for an array of sensor nodes in environmental monitoring is a challenging question. Most of the current solutions are either theoretical or simulation-based where the problems are tackled using random field theory, computational geometry or computer simulations, limiting their specificity to a given sensor deployment. Using an empirical dataset from a mine rehabilitation monitoring sensor network, this work proposes a data-driven approach where co-integrated time series analysis is used to select the number of sensors from a short-term deployment of a larger set of potential node positions. Analyses conducted on temperature time series show 75% of sensors are co-integrated. Using only 25% of the original nodes can generate a complete dataset within a 0.5 °C average error bound. Our data-driven approach to sensor position selection is applicable for spatiotemporal monitoring of spatially correlated environmental parameters to minimize deployment cost without compromising data resolution. PMID:29271880
Scheduling applications for execution on a plurality of compute nodes of a parallel computer to manage temperature of the nodes during execution

DOEpatents

Archer, Charles J; Blocksome, Michael A; Peters, Amanda E; Ratterman, Joseph D; Smith, Brian E

2012-10-16

Methods, apparatus, and products are disclosed for scheduling applications for execution on a plurality of compute nodes of a parallel computer to manage temperature of the plurality of compute nodes during execution that include: identifying one or more applications for execution on the plurality of compute nodes; creating a plurality of physically discontiguous node partitions in dependence upon temperature characteristics for the compute nodes and a physical topology for the compute nodes, each discontiguous node partition specifying a collection of physically adjacent compute nodes; and assigning, for each application, that application to one or more of the discontiguous node partitions for execution on the compute nodes specified by the assigned discontiguous node partitions.
Synchronizing compute node time bases in a parallel computer

DOEpatents

Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip

2015-01-27

Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
Synchronizing compute node time bases in a parallel computer

DOEpatents

Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip

2014-12-30

Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
Fragmenting networks by targeting collective influencers at a mesoscopic level.

PubMed

Kobayashi, Teruyoshi; Masuda, Naoki

2016-11-25

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.
Fragmenting networks by targeting collective influencers at a mesoscopic level

NASA Astrophysics Data System (ADS)

Kobayashi, Teruyoshi; Masuda, Naoki

2016-11-01

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.
Fragmenting networks by targeting collective influencers at a mesoscopic level

PubMed Central

Kobayashi, Teruyoshi; Masuda, Naoki

2016-01-01

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure. PMID:27886251
Machine learning based Intelligent cognitive network using fog computing

NASA Astrophysics Data System (ADS)

Lu, Jingyang; Li, Lun; Chen, Genshe; Shen, Dan; Pham, Khanh; Blasch, Erik

2017-05-01

In this paper, a Cognitive Radio Network (CRN) based on artificial intelligence is proposed to distribute the limited radio spectrum resources more efficiently. The CRN framework can analyze the time-sensitive signal data close to the signal source using fog computing with different types of machine learning techniques. Depending on the computational capabilities of the fog nodes, different features and machine learning techniques are chosen to optimize spectrum allocation. Also, the computing nodes send the periodic signal summary which is much smaller than the original signal to the cloud so that the overall system spectrum source allocation strategies are dynamically updated. Applying fog computing, the system is more adaptive to the local environment and robust to spectrum changes. As most of the signal data is processed at the fog level, it further strengthens the system security by reducing the communication burden of the communications network.
Providing full point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer

DOEpatents

Archer, Charles J; Faraj, Ahmad A; Inglett, Todd A; Ratterman, Joseph D

2013-04-16

Methods, apparatus, and products are disclosed for providing full point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: receiving a network packet in a compute node, the network packet specifying a destination compute node; selecting, in dependence upon the destination compute node, at least one of the links for the compute node along which to forward the network packet toward the destination compute node; and forwarding the network packet along the selected link to the adjacent compute node connected to the compute node through the selected link.

Providing full point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J.; Faraj, Daniel A.; Inglett, Todd A.

Methods, apparatus, and products are disclosed for providing full point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: receiving a network packet in a compute node, the network packet specifying a destination compute node; selecting, in dependence upon the destination compute node, at least one of the links for the compute node along which to forward the network packet toward the destination compute node; and forwarding the network packet along the selected link to the adjacent compute node connected to the compute node through the selected link.
Performing process migration with allreduce operations

DOEpatents

Archer, Charles Jens; Peters, Amanda; Wallenfelt, Brian Paul

2010-12-14

Compute nodes perform allreduce operations that swap processes at nodes. A first allreduce operation generates a first result and uses a first process from a first compute node, a second process from a second compute node, and zeros from other compute nodes. The first compute node replaces the first process with the first result. A second allreduce operation generates a second result and uses the first result from the first compute node, the second process from the second compute node, and zeros from others. The second compute node replaces the second process with the second result, which is the first process. A third allreduce operation generates a third result and uses the first result from first compute node, the second result from the second compute node, and zeros from others. The first compute node replaces the first result with the third result, which is the second process.
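The three-round exchange described above behaves like an XOR swap carried out through a collective. The sketch below is only an illustration under stated assumptions: the abstract does not name the reduction operator, so bitwise XOR is assumed here; process images are modeled as plain integers; and `allreduce_xor` and `swap_processes` are hypothetical names that simulate the collective in a single Python process rather than calling any real messaging library.

```python
from functools import reduce
from operator import xor


def allreduce_xor(contributions):
    """Simulated allreduce: XOR one integer per node; every node sees the result."""
    return reduce(xor, contributions, 0)


def swap_processes(state, a, b):
    """Swap the values held by nodes a and b using three allreduce rounds.

    state[i] is the (integer-encoded) process image on node i. Nodes other
    than a and b contribute zeros, as in the abstract.
    """
    state = list(state)
    n = len(state)

    def contributions():
        # Only the two swapping nodes contribute real data.
        return [state[i] if i in (a, b) else 0 for i in range(n)]

    # Round 1: node a replaces its process with the first result (Pa ^ Pb).
    state[a] = allreduce_xor(contributions())
    # Round 2: node b replaces its process with (Pa ^ Pb) ^ Pb == Pa.
    state[b] = allreduce_xor(contributions())
    # Round 3: node a replaces the first result with (Pa ^ Pb) ^ Pa == Pb.
    state[a] = allreduce_xor(contributions())
    return state


print(swap_processes([0b1010, 0b0110, 7, 9], 0, 1))  # prints [6, 10, 7, 9]
```

XOR is a convenient choice for such a sketch because it is its own inverse, so no node needs scratch storage for a second process image.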
Fault tolerant hypercube computer system architecture

NASA Technical Reports Server (NTRS)

Madan, Herb S. (Inventor); Chow, Edward (Inventor)

1989-01-01

A fault-tolerant multiprocessor computer system of the hypercube type comprising a hierarchy of computers of like kind which can be functionally substituted for one another as necessary is disclosed. Communication between the working nodes is via one communications network while communications between the working nodes and watch dog nodes and load balancing nodes higher in the structure is via another communications network separate from the first. A typical branch of the hierarchy reporting to a master node or host computer comprises, a plurality of first computing nodes; a first network of message conducting paths for interconnecting the first computing nodes as a hypercube. The first network provides a path for message transfer between the first computing nodes; a first watch dog node; and a second network of message connecting paths for connecting the first computing nodes to the first watch dog node independent from the first network, the second network provides an independent path for test message and reconfiguration affecting transfers between the first computing nodes and the first switch watch dog node. There is additionally, a plurality of second computing nodes; a third network of message conducting paths for interconnecting the second computing nodes as a hypercube. The third network provides a path for message transfer between the second computing nodes; a fourth network of message conducting paths for connecting the second computing nodes to the first watch dog node independent from the third network. 
The fourth network provides an independent path for test message and reconfiguration affecting transfers between the second computing nodes and the first watch dog node; and a first multiplexer disposed between the first watch dog node and the second and fourth networks for allowing the first watch dog node to selectively communicate with individual ones of the computing nodes through the second and fourth networks; as well as, a second watch dog node operably connected to the first multiplexer whereby the second watch dog node can selectively communicate with individual ones of the computing nodes through the second and fourth networks. The branch is completed by a first load balancing node; and a second multiplexer connected between the first load balancing node and the first and second watch dog nodes, allowing the first load balancing node to selectively communicate with the first and second watch dog nodes.
Locating hardware faults in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

2010-01-12

Locating hardware faults in a data communications network of a parallel computer. Such a parallel computer includes a plurality of compute nodes and a data communications network that couples the compute nodes for data communications and organizes the compute nodes as a tree. Locating hardware faults includes identifying a next compute node as a parent node and a root of a parent test tree, identifying for each child compute node of the parent node a child test tree having the child compute node as root, running a same test suite on the parent test tree and each child test tree, and identifying the parent compute node as having a defective link connected from the parent compute node to a child compute node if the test suite fails on the parent test tree and succeeds on all the child test trees.
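The decision rule in this abstract (the suite fails on the parent test tree but succeeds on every child test tree) can be sketched as a tree walk. This is a minimal illustration, not the patented method: the tree is a plain dictionary, the "test suite" is a stand-in that fails whenever a subtree contains a known-bad link, leaf nodes are skipped by assumption, and all function names are hypothetical.

```python
def subtree_links(children, node):
    """Collect every (parent, child) link in the subtree rooted at node."""
    links = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            links.add((current, child))
            stack.append(child)
    return links


def make_suite(children, bad_links):
    """Build a stand-in test suite: it fails if the subtree holds a bad link."""
    def suite_passes(node):
        return not (subtree_links(children, node) & bad_links)
    return suite_passes


def locate_defective_parents(children, root, suite_passes):
    """Return parents whose own test tree fails while all child trees pass."""
    suspects = []
    stack = [root]
    while stack:
        parent = stack.pop()
        kids = children.get(parent, [])
        # Assumption: a node without children has no outgoing link to blame.
        if kids and not suite_passes(parent) and all(suite_passes(k) for k in kids):
            suspects.append(parent)
        stack.extend(kids)
    return suspects


children = {0: [1, 2], 1: [3, 4]}
suite = make_suite(children, {(1, 3)})
print(locate_defective_parents(children, 0, suite))  # prints [1]
```

In the example, the root's test tree also fails, but so does the child tree rooted at node 1, so the rule attributes the defective link to node 1 rather than to the root.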
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-01-10

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-04-17

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Identifying failure in a tree network of a parallel computer

DOEpatents

Archer, Charles J.; Pinnow, Kurt W.; Wallenfelt, Brian P.

2010-08-24

Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.
A three-dimensional ground-water-flow model modified to reduce computer-memory requirements and better simulate confining-bed and aquifer pinchouts

USGS Publications Warehouse

Leahy, P.P.

1982-01-01

The Trescott computer program for modeling groundwater flow in three dimensions has been modified to (1) treat aquifer and confining bed pinchouts more realistically and (2) reduce the computer memory requirements needed for the input data. Using the original program, simulation of aquifer systems with nonrectangular external boundaries may result in a large number of nodes that are not involved in the numerical solution of the problem, but require computer storage. (USGS)
GAS eleven node thermal model (GEM)

NASA Technical Reports Server (NTRS)

Butler, Dan

1988-01-01

The Eleven Node Thermal Model (GEM) of the Get Away Special (GAS) container was originally developed based on the results of thermal tests of the GAS container. The model was then used in the thermal analysis and design of several NASA/GSFC GAS experiments, including the Flight Verification Payload, the Ultraviolet Experiment, and the Capillary Pumped Loop. The model description details the five cu ft container both with and without an insulated end cap. Mass specific heat values are also given so that transient analyses can be performed. A sample problem for each configuration is included as well so that GEM users can verify their computations. The model can be run on most personal computers with a thermal analyzer solution routine.
Providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer

DOEpatents

Archer, Charles J.; Faraj, Ahmad A.; Inglett, Todd A.; Ratterman, Joseph D.

2012-10-23

Methods, apparatus, and products are disclosed for providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: identifying each link in the global combining network for each compute node of the operational group; designating one of a plurality of point-to-point class routing identifiers for each link such that no compute node in the operational group is connected to two adjacent compute nodes in the operational group with links designated for the same class routing identifiers; and configuring each compute node of the operational group for point-to-point communications with each adjacent compute node in the global combining network through the link between that compute node and that adjacent compute node using that link's designated class routing identifier.
Identifying a largest logical plane from a plurality of logical planes formed of compute nodes of a subcommunicator in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davis, Kristan D.; Faraj, Daniel A.

In a parallel computer, a largest logical plane from a plurality of logical planes formed of compute nodes of a subcommunicator may be identified by: identifying, by each compute node of the subcommunicator, all logical planes that include the compute node; calculating, by each compute node for each identified logical plane that includes the compute node, an area of the identified logical plane; initiating, by a root node of the subcommunicator, a gather operation; receiving, by the root node from each compute node of the subcommunicator, each node's calculated areas as contribution data to the gather operation; and identifying, by the root node in dependence upon the received calculated areas, a logical plane of the subcommunicator having the greatest area.
Configuring compute nodes of a parallel computer in an operational group into a plurality of independent non-overlapping collective networks

DOEpatents

Archer, Charles J.; Inglett, Todd A.; Ratterman, Joseph D.; Smith, Brian E.

2010-03-02

Methods, apparatus, and products are disclosed for configuring compute nodes of a parallel computer in an operational group into a plurality of independent non-overlapping collective networks, the compute nodes in the operational group connected together for data communications through a global combining network, that include: partitioning the compute nodes in the operational group into a plurality of non-overlapping subgroups; designating one compute node from each of the non-overlapping subgroups as a master node; and assigning, to the compute nodes in each of the non-overlapping subgroups, class routing instructions that organize the compute nodes in that non-overlapping subgroup as a collective network such that the master node is a physical root.
Collectively loading an application in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
Node Self-Deployment Algorithm Based on an Uneven Cluster with Radius Adjusting for Underwater Sensor Networks

PubMed Central

Jiang, Peng; Xu, Yiming; Wu, Feng

2016-01-01

Existing move-restricted node self-deployment algorithms are based on a fixed node communication radius, evaluate the performance based on network coverage or the connectivity rate and do not consider the number of nodes near the sink node and the energy consumption distribution of the network topology, thereby degrading network reliability and the energy consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins the uneven clustering based on the distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, the cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm originally considers the network reliability and energy consumption balance during node deployment and considers the coverage redundancy rate of all positions that a node may reach during the node position adjustment. Simulation results show, compared to the connected dominating set (CDS) based depth computation algorithm, that the proposed algorithm can increase the number of the nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve network coverage rate and reduce energy consumption. PMID:26784193
Paging memory from random access memory to backing storage in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Inglett, Todd A; Ratterman, Joseph D; Smith, Brian E

2013-05-21

Paging memory from random access memory (`RAM`) to backing storage in a parallel computer that includes a plurality of compute nodes, including: executing a data processing application on a virtual machine operating system in a virtual machine on a first compute node; providing, by a second compute node, backing storage for the contents of RAM on the first compute node; and swapping, by the virtual machine operating system in the virtual machine on the first compute node, a page of memory from RAM on the first compute node to the backing storage on the second compute node.
Direct memory access transfer completion notification

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Parker, Jeffrey J. [Rochester, MN]

2011-02-15

Methods, systems, and products are disclosed for DMA transfer completion notification that include: inserting, by an origin DMA on an origin node in an origin injection FIFO, a data descriptor for an application message; inserting, by the origin DMA, a reflection descriptor in the origin injection FIFO, the reflection descriptor specifying a remote get operation for injecting a completion notification descriptor in a reflection injection FIFO on a reflection node; transferring, by the origin DMA to a target node, the message in dependence upon the data descriptor; in response to completing the message transfer, transferring, by the origin DMA to the reflection node, the completion notification descriptor in dependence upon the reflection descriptor; receiving, by the origin DMA from the reflection node, a completion packet; and notifying, by the origin DMA in response to receiving the completion packet, the origin node's processing core that the message transfer is complete.
Dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job

DOE Office of Scientific and Technical Information (OSTI.GOV)

Budnik, Thomas A; Knudson, Brant L; Megerian, Mark G

Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job.
Link failure detection in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Megerian, Mark G.; Smith, Brian E.

2010-11-09

Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
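The grouping and test-message steps in this abstract can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: nodes are colored by coordinate-sum parity (one way to guarantee that adjacent mesh nodes land in different groups), and `broken_links` is a hypothetical set of directed (sender, receiver) pairs standing in for real hardware faults.

```python
from itertools import product

def group_of(coord):
    """Checkerboard grouping (an assumption): adjacent mesh nodes differ in parity."""
    return sum(coord) % 2  # 0 = first group, 1 = second group

def neighbors(coord, dims):
    """Yield the mesh neighbours of a node (rectangular mesh, no wraparound)."""
    for axis in range(len(dims)):
        for step in (-1, 1):
            n = list(coord)
            n[axis] += step
            if 0 <= n[axis] < dims[axis]:
                yield tuple(n)

def detect_failed_links(dims, broken_links):
    """Return the (sender, receiver) links on which no first test message arrived.

    broken_links is a hypothetical set of directed pairs used only for simulation.
    """
    missing = set()
    for node in product(*(range(d) for d in dims)):
        if group_of(node) != 1:
            continue  # only second-group nodes check for arrivals in this phase
        for sender in neighbors(node, dims):
            # Every neighbour of a second-group node belongs to the first group.
            if (sender, node) in broken_links:
                missing.add((sender, node))
    return missing
```

Only links from the first group to the second group are exercised here, matching the single test message the abstract describes; the reverse direction would need a second, mirrored phase.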
Administering truncated receive functions in a parallel messaging interface

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-12-09

Administering truncated receive functions in a parallel messaging interface (`PMI`) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.
Broadcasting collective operation contributions throughout a parallel computer

DOEpatents

Faraj, Ahmad [Rochester, MN]

2012-02-21

Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.

Multiple node remote messaging

DOEpatents

Blumrich, Matthias A.; Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Ohmacht, Martin; Salapura, Valentina; Steinmacher-Burow, Burkhard; Vranas, Pavlos

2010-08-31

A method for passing remote messages in a parallel computer system formed as a network of interconnected compute nodes includes that a first compute node (A) sends a single remote message to a remote second compute node (B) in order to control the remote second compute node (B) to send at least one remote message. The method includes various steps including controlling a DMA engine at first compute node (A) to prepare the single remote message to include a first message descriptor and at least one remote message descriptor for controlling the remote second compute node (B) to send at least one remote message, including putting the first message descriptor into an injection FIFO at the first compute node (A) and sending the single remote message and the at least one remote message descriptor to the second compute node (B).
Design analysis and computer-aided performance evaluation of shuttle orbiter electrical power system. Volume 2: SYSTID user's guide

NASA Technical Reports Server (NTRS)

1974-01-01

The manual for the use of the computer program SYSTID under the Univac operating system is presented. The computer program is used in the simulation and evaluation of the space shuttle orbiter electric power supply. The models described in the handbook are those which were available in the original versions of SYSTID. The subjects discussed are: (1) program description, (2) input language, (3) node typing, (4) problem submission, and (5) basic and power system SYSTID libraries.
False-Positive Cases of Fluorodeoxyglucose-Positron Emission Tomography/Computed Tomographic Scans in Metastasis of Esophageal Cancer

PubMed Central

Yamatsuji, Tomoki; Ishida, Naomasa; Takaoka, Munenori; Hayashi, Jiro; Yoshida, Kazuhiro; Shigemitsu, Kaori; Urakami, Atsushi; Haisa, Minoru; Naomoto, Yoshio

2017-01-01

Of 129 esophagectomies at our institute from June 2010 to March 2015, we experienced three preoperative positron emission tomography-computed tomographic (PET/CT) false positives. Bone metastasis was originally suspected in 2 cases, but they were later found to be bone metastasis negative after a preoperative bone biopsy and clinical course observation. The other case, suspected of mediastinal lymph node metastasis, was diagnosed as inflammatory lymphadenopathy by a pathological examination of the removed lymph nodes. Conducting a PET/CT is useful when diagnosing esophageal cancer metastasis, but we need to be aware of the possibility of false positives. Therapeutic decisions should be made based on appropriate and accurate diagnoses, with pathological diagnosis actively introduced if necessary. PMID:28469502
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad [Rochester, MN]

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
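The two-stage structure described here (a global allreduce per logical ring, then a local allreduce per node) can be illustrated with a toy simulation. This sketch assumes a sum reduction, assumes that ring k contains core k of every node, and uses a hypothetical `ring_allreduce` helper that only models the result of a ring allreduce, not its message passing.

```python
def ring_allreduce(values):
    """Stand-in for a ring allreduce: every participant ends with the sum."""
    total = sum(values)
    return [total] * len(values)

def two_level_allreduce(contrib):
    """contrib[node][core] is that core's contribution; returns result[node][core]."""
    num_nodes, cores = len(contrib), len(contrib[0])
    # Stage 1: one logical ring per core index, spanning all nodes.
    global_results = [[0] * cores for _ in range(num_nodes)]
    for k in range(cores):
        ring = ring_allreduce([contrib[n][k] for n in range(num_nodes)])
        for n in range(num_nodes):
            global_results[n][k] = ring[n]
    # Stage 2: each node combines the per-ring results held by its own cores.
    return [ring_allreduce(global_results[n]) for n in range(num_nodes)]
```

With three nodes of two cores each, every core ends up holding the sum of all six contributions, because each ring total is added exactly once in the local stage.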
Methods and apparatus using commutative error detection values for fault isolation in multiple node computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Almasi, Gheorghe; Blumrich, Matthias Augustin; Chen, Dong

Methods and apparatus perform fault isolation in multiple node computing systems using commutative error detection values (for example, checksums) to identify and to isolate faulty nodes. When information associated with a reproducible portion of a computer program is injected into a network by a node, a commutative error detection value is calculated. At intervals, node fault detection apparatus associated with the multiple node computer system retrieves commutative error detection values associated with the node and stores them in memory. When the computer program is executed again by the multiple node computer system, new commutative error detection values are created and stored in memory. The node fault detection apparatus identifies faulty nodes by comparing commutative error detection values associated with reproducible portions of the application program generated by a particular node from different runs of the application program. Differences in values indicate a possible faulty node.
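The idea of comparing order-independent checksums across runs can be sketched as follows. This is an illustration only: the choice of summing per-packet CRC-32 values modulo 2^32 is an assumption (any commutative combination would do), and `commutative_checksum` and `suspect_nodes` are hypothetical names.

```python
import zlib

MASK = (1 << 32) - 1

def commutative_checksum(packets):
    """Sum of per-packet CRC-32 values modulo 2**32; packet order does not matter."""
    total = 0
    for p in packets:
        total = (total + zlib.crc32(p)) & MASK
    return total

def suspect_nodes(run_a, run_b):
    """Each run maps node id -> list of packets (bytes) injected in a reproducible section.

    Returns the sorted ids of nodes whose checksums differ between the two runs.
    """
    return sorted(
        n for n in run_a
        if commutative_checksum(run_a[n]) != commutative_checksum(run_b.get(n, []))
    )
```

Because addition is commutative, a node that injects the same packets in a different order across runs produces the same value, so only a change in packet content (or in the set of packets) flags the node.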
Identifying logical planes formed of compute nodes of a subcommunicator in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davis, Kristan D.; Faraj, Daniel

In a parallel computer, a plurality of logical planes formed of compute nodes of a subcommunicator may be identified by: for each compute node of the subcommunicator and for a number of dimensions beginning with a first dimension: establishing, by a plane building node, in a positive direction of the first dimension, all logical planes that include the plane building node and compute nodes of the subcommunicator in a positive direction of a second dimension, where the second dimension is orthogonal to the first dimension; and establishing, by the plane building node, in a negative direction of the first dimension, all logical planes that include the plane building node and compute nodes of the subcommunicator in the positive direction of the second dimension.
Constructing a logical, regular axis topology from an irregular topology

DOEpatents

Faraj, Daniel A.

2014-07-22

Constructing a logical regular topology from an irregular topology including, for each axial dimension and recursively, for each compute node in a subcommunicator until returning to a first node: adding to a logical line of the axial dimension a neighbor specified in a nearest neighbor list; calling the added compute node; determining, by the called node, whether any neighbor in the node's nearest neighbor list is available to add to the logical line; if a neighbor in the called compute node's nearest neighbor list is available to add to the logical line, adding, by the called compute node to the logical line, any neighbor in the called compute node's nearest neighbor list for the axial dimension not already added to the logical line; and, if no neighbor in the called compute node's nearest neighbor list is available to add to the logical line, returning to the calling compute node.
Constructing a logical, regular axis topology from an irregular topology

DOEpatents

Faraj, Daniel A.

2014-07-01

Constructing a logical regular topology from an irregular topology including, for each axial dimension and recursively, for each compute node in a subcommunicator until returning to a first node: adding to a logical line of the axial dimension a neighbor specified in a nearest neighbor list; calling the added compute node; determining, by the called node, whether any neighbor in the node's nearest neighbor list is available to add to the logical line; if a neighbor in the called compute node's nearest neighbor list is available to add to the logical line, adding, by the called compute node to the logical line, any neighbor in the called compute node's nearest neighbor list for the axial dimension not already added to the logical line; and, if no neighbor in the called compute node's nearest neighbor list is available to add to the logical line, returning to the calling compute node.
Identifying messaging completion in a parallel computer by checking for change in message received and transmitted count at each node

DOEpatents

Archer, Charles J [Rochester, MN]; Hardwick, Camesha R [Fayetteville, NC]; McCarthy, Patrick J [Rochester, MN]; Wallenfelt, Brian P [Eden Prairie, MN]

2009-06-23

Methods, parallel computers, and products are provided for identifying messaging completion on a parallel computer. The parallel computer includes a plurality of compute nodes, the compute nodes coupled for data communications by at least two independent data communications networks including a binary tree data communications network optimal for collective operations that organizes the nodes as a tree and a torus data communications network optimal for point to point operations that organizes the nodes as a torus. Embodiments include reading all counters at each node of the torus data communications network; calculating at each node a current node value in dependence upon the values read from the counters at each node; and determining for all nodes whether the current node value for each node is the same as a previously calculated node value for each node. If the current node value is the same as the previously calculated node value for all nodes of the torus data communications network, embodiments include determining that messaging is complete; if the current node value is not the same as the previously calculated node value for all nodes of the torus data communications network, embodiments include determining that messaging is currently incomplete.
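The completion check described above amounts to comparing a per-node value between two successive polls. A minimal sketch, assuming the node value is simply the sum of that node's counters (the abstract does not say how the value is calculated) and using a hypothetical `messaging_complete` helper:

```python
def messaging_complete(read_counters, previous_values):
    """Compare each node's current value with the one from the previous poll.

    read_counters: dict node -> list of counter readings for this poll.
    previous_values: dict node -> node value from the previous poll (may be empty).
    Returns (complete, current_values); pass current_values into the next poll.
    """
    # Assumption: the node value is the sum of the node's counters.
    current = {node: sum(counters) for node, counters in read_counters.items()}
    complete = all(previous_values.get(node) == value for node, value in current.items())
    return complete, current
```

A caller would poll repeatedly, feeding each returned `current_values` back in, and treat messaging as complete on the first poll where no node's value has changed.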
Executing a gather operation on a parallel computer

DOEpatents

Archer, Charles J [Rochester, MN]; Ratterman, Joseph D [Rochester, MN]

2012-03-20

Methods, apparatus, and computer program products are disclosed for executing a gather operation on a parallel computer according to embodiments of the present invention. Embodiments include configuring, by the logical root, a result buffer on the logical root, the result buffer having positions, each position corresponding to a ranked node in the operational group and for storing contribution data gathered from that ranked node. Embodiments also include repeatedly for each position in the result buffer: determining, by each compute node of an operational group, whether the current position in the result buffer corresponds with the rank of the compute node; if the current position in the result buffer corresponds with the rank of the compute node, contributing, by that compute node, the compute node's contribution data; if the current position in the result buffer does not correspond with the rank of the compute node, contributing, by that compute node, a value of zero for the contribution data; and storing, by the logical root in the current position in the result buffer, results of a bitwise OR operation of all the contribution data by all compute nodes of the operational group for the current position, the results received through the global combining network.
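The gather works because OR-ing a value with zeros leaves it unchanged. A minimal sketch of the per-position loop, assuming contributions are non-negative integers and modeling the global combining network with a plain `reduce` (the `gather_by_or` name is hypothetical):

```python
from functools import reduce
from operator import or_

def gather_by_or(contributions):
    """contributions[rank] is the non-negative integer held by the node of that rank."""
    ranks = range(len(contributions))
    result_buffer = []
    for position in ranks:
        # The node whose rank matches the position offers its data; all others offer 0.
        offered = [contributions[r] if r == position else 0 for r in ranks]
        # The logical root stores the bitwise OR of everything offered.
        result_buffer.append(reduce(or_, offered, 0))
    return result_buffer
```

For example, `gather_by_or([5, 0, 12, 7])` returns `[5, 0, 12, 7]`: each position ends up holding exactly the matching rank's contribution.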
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.

PubMed

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-12-01

Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from 4DCT scans were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10^-7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment

PubMed Central

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-01-01

Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from 4DCT scans were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10^-7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
Securing Provenance of Distributed Processes in an Untrusted Environment

NASA Astrophysics Data System (ADS)

Syalim, Amril; Nishide, Takashi; Sakurai, Kouichi

Recently, there has been much concern about the provenance of distributed processes, that is, the documentation of the origin of an object and of the processes that produced it in a distributed system. Provenance has many applications in the forms of medical records, documentation of processes in computer systems, recording the origin of data in the cloud, and also documentation of human-executed processes. The provenance of distributed processes can be modeled by a directed acyclic graph (DAG) where each node represents an entity, and an edge represents the origin and causal relationship between entities. Without sufficient security mechanisms, the provenance graph suffers from integrity and confidentiality problems, for example changes or deletions of the correct nodes, additions of fake nodes and edges, and unauthorized accesses to the sensitive nodes and edges. In this paper, we propose an integrity mechanism for the provenance graph using digital signatures involving three parties: the process executors who are responsible for the nodes' creation, a provenance owner that records the nodes to the provenance store, and a trusted party that we call the Trusted Counter Server (TCS) that records the number of nodes stored by the provenance owner. We show that the mechanism can detect integrity problems in the provenance graph, namely unauthorized and malicious “authorized” updates, even if all the parties, except the TCS, collude to update the provenance. In this scheme, the TCS needs only minimal storage (linear in the number of provenance owners). To protect confidentiality and to allow efficient access control administration, we propose a method to encrypt the provenance graph that allows access by paths and compartments in the provenance graph. We argue that encryption is important as a mechanism to protect provenance data stored in an untrusted environment. We analyze the security of the integrity mechanism, and perform experiments to measure the performance of both mechanisms.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2015-02-03

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-11-18

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.
Hybrid data storage system in an HPC exascale environment

DOEpatents

Bent, John M.; Faibish, Sorin; Gupta, Uday K.; Tzelnic, Percy; Ting, Dennis P. J.

2015-08-18

A computer-executable method, system, and computer program product for managing I/O requests from a compute node in communication with a data storage system, including a first burst buffer node and a second burst buffer node, the computer-executable method, system, and computer program product comprising striping data on the first burst buffer node and the second burst buffer node, wherein a first portion of the data is communicated to the first burst buffer node and a second portion of the data is communicated to the second burst buffer node, processing the first portion of the data at the first burst buffer node, and processing the second portion of the data at the second burst buffer node.
A phantom axon setup for validating models of action potential recordings.

PubMed

Rossel, Olivier; Soulier, Fabien; Bernard, Serge; Guiraud, David; Cathébras, Guy

2016-08-01

Electrode designs and strategies for electroneurogram recordings are often tested first by computer simulations and then by animal models, but they are rarely implanted for long-term evaluation in humans. The models show that the amplitude of the potential at the surface of an axon is higher in front of the nodes of Ranvier than at the internodes; however, this has not been investigated through in vivo measurements. An original experimental method is presented to emulate a single fiber action potential in an infinite conductive volume, allowing the potential of an axon to be recorded at both the nodes of Ranvier and the internodes, for a wide range of electrode-to-fiber radial distances. The paper particularly investigates the differences in the action potential amplitude along the longitudinal axis of an axon. At a short radial distance, the action potential amplitude measured in front of a node of Ranvier is two times larger than in the middle of two nodes. Moreover, farther from the phantom axon, the measured action potential amplitude is almost constant along the longitudinal axis. The results of this new method confirm the computer simulations, with a correlation of 97.6 %.
Executing scatter operation to parallel computer nodes by repeatedly broadcasting content of send buffer partition corresponding to each node upon bitwise OR operation

DOEpatents

Archer, Charles J [Rochester, MN]; Ratterman, Joseph D [Rochester, MN]

2009-11-06

Executing a scatter operation on a parallel computer includes: configuring a send buffer on a logical root, the send buffer having positions, each position corresponding to a ranked node in an operational group of compute nodes and for storing contents scattered to that ranked node; and repeatedly for each position in the send buffer: broadcasting, by the logical root to each of the other compute nodes on a global combining network, the contents of the current position of the send buffer using a bitwise OR operation, determining, by each compute node, whether the current position in the send buffer corresponds with the rank of that compute node, if the current position corresponds with the rank, receiving the contents and storing the contents in a reception buffer of that compute node, and if the current position does not correspond with the rank, discarding the contents.
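The scatter loop can be sketched in the same spirit: the root pushes one send-buffer position at a time through an OR-combining broadcast, and each node keeps only the position that matches its rank. This is a simulation under assumptions: contents are non-negative integers, non-root nodes contribute zero to the OR (the abstract does not spell this out), and `scatter_by_broadcast` is a hypothetical name.

```python
def scatter_by_broadcast(send_buffer, root=0):
    """send_buffer[rank] holds the value destined for that rank.

    Returns a dict rank -> contents of that rank's reception buffer.
    """
    ranks = range(len(send_buffer))
    reception = {}
    for position in ranks:
        # OR-combine: the root offers the contents, every other node offers zero
        # (an assumption), so all nodes observe exactly the root's contents.
        combined = 0
        for r in ranks:
            combined |= send_buffer[position] if r == root else 0
        for r in ranks:
            if r == position:
                reception[r] = combined  # this rank keeps the contents
            # every other rank discards what it observed for this position
    return reception
```

The cost of this scheme is one full broadcast per rank, which trades bandwidth for the simplicity of using a single collective primitive.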
Internode data communications in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Miller, Douglas R.; Parker, Jeffrey J.; Ratterman, Joseph D.; Smith, Brian E.

2013-09-03

Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.
Internode data communications in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

2014-02-11

Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

Profiling an application for power consumption during execution on a compute node

DOEpatents

Archer, Charles J; Blocksome, Michael A; Peters, Amanda E; Ratterman, Joseph D; Smith, Brian E

2013-09-17

Methods, apparatus, and products are disclosed for profiling an application for power consumption during execution on a compute node that include: receiving an application for execution on a compute node; identifying a hardware power consumption profile for the compute node, the hardware power consumption profile specifying power consumption for compute node hardware during performance of various processing operations; determining a power consumption profile for the application in dependence upon the application and the hardware power consumption profile for the compute node; and reporting the power consumption profile for the application.
MOLA: a bootable, self-configuring system for virtual screening using AutoDock4/Vina on computer clusters.

PubMed

Abreu, Rui M. V.; Froufe, Hugo J. C.; Queiroz, Maria João R. P.; Ferreira, Isabel C. F. R.

2010-10-28

Virtual screening of small molecules using molecular docking has become an important tool in drug discovery. However, large-scale virtual screening is time-demanding and usually requires dedicated computer clusters. There are a number of software tools that perform virtual screening using AutoDock4, but they require access to dedicated Linux computer clusters. Also, no software is available for performing virtual screening with Vina using computer clusters. In this paper we present MOLA, an easy-to-use graphical user interface tool that automates parallel virtual screening using AutoDock4 and/or Vina in bootable non-dedicated computer clusters. MOLA automates several tasks, including ligand preparation, parallel AutoDock4/Vina job distribution, and result analysis. When the virtual screening project finishes, an OpenOffice spreadsheet file opens with the ligands ranked by binding energy and distance to the active site. All result files can automatically be recorded on a USB flash drive or on the hard-disk drive using VirtualBox. MOLA works inside a customized Live CD GNU/Linux operating system, developed by us, that bypasses the original operating system installed on the computers used in the cluster. This operating system boots from a CD on the master node and then clusters other computers as slave nodes via Ethernet connections. MOLA is an ideal virtual screening tool for non-experienced users with a limited number of multi-platform heterogeneous computers available and no access to dedicated Linux computer clusters. When a virtual screening project finishes, the computers can just be restarted to their original operating system. The originality of MOLA lies in the fact that any platform-independent computer available can be added to the cluster, without ever using the computer's hard-disk drive and without interfering with the installed operating system. With a cluster of 10 processors, and a potential maximum speed-up of 10×, the parallel algorithm of MOLA performed with a speed-up of 8.64× using AutoDock4 and 8.60× using Vina.
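The reported speed-ups can be read as parallel efficiency, i.e. measured speed-up divided by the ideal linear speed-up. The short sketch below only restates the abstract's figures; the function name `parallel_efficiency` is illustrative and not part of MOLA.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return the measured speed-up as a fraction of the ideal linear speed-up."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the abstract: 10 processors, 8.64x (AutoDock4), 8.60x (Vina).
for tool, speedup in {"AutoDock4": 8.64, "Vina": 8.60}.items():
    print(f"{tool}: {parallel_efficiency(speedup, 10):.1%} of ideal")
```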
Network of dedicated processors for finding lowest-cost map path

NASA Technical Reports Server (NTRS)

Eberhardt, Silvio P. (Inventor)

1991-01-01

A method and associated apparatus are disclosed for finding the lowest cost path of several variable paths. The paths are comprised of a plurality of linked cost-incurring areas existing between an origin point and a destination point. The method comprises the steps of connecting a plurality of nodes together in the manner of the cost-incurring areas; programming each node to have a cost associated therewith corresponding to one of the cost-incurring areas; injecting a signal into one of the nodes representing the origin point; propagating the signal through the plurality of nodes from inputs to outputs; reducing the signal in magnitude at each node as a function of the respective cost of the node; and, starting at one of the nodes representing the destination point and following a path having the least reduction in magnitude of the signal from node to node back to one of the nodes representing the origin point whereby the lowest cost path from the origin point to the destination point is found.
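A software analogue of the propagate-then-backtrack idea can be sketched as follows. This is not the patented hardware: it assumes a rectangular grid of strictly positive node costs with four-way links, models the signal reduction as an accumulated loss, and uses a priority queue in place of parallel signal propagation. All names are illustrative.

```python
import heapq


def lowest_cost_path(cost, origin, destination):
    """Return (path, total_loss) on a grid of positive per-node costs."""
    rows, cols = len(cost), len(cost[0])

    def neighbors(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    # Forward phase: the injected signal loses cost[node] on entering a node.
    loss = {origin: 0}
    heap = [(0, origin)]
    while heap:
        current, node = heapq.heappop(heap)
        if current > loss[node]:
            continue  # stale queue entry
        for nxt in neighbors(*node):
            candidate = current + cost[nxt[0]][nxt[1]]
            if candidate < loss.get(nxt, float("inf")):
                loss[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))

    # Backward phase: from the destination, step to the neighbor whose signal
    # was reduced the least until the origin is reached.
    path = [destination]
    while path[-1] != origin:
        path.append(min(neighbors(*path[-1]), key=loss.__getitem__))
    path.reverse()
    return path, loss[destination]


grid = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
path, total = lowest_cost_path(grid, (0, 0), (0, 2))
print(path, total)
```

With positive costs the loss strictly decreases at every backward step, so the backtracking loop always terminates at the origin.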
Development of the Large-Scale Statistical Analysis System of Satellites Observations Data with Grid Datafarm Architecture

NASA Astrophysics Data System (ADS)

Yamamoto, K.; Murata, K.; Kimura, E.; Honda, R.

2006-12-01

In the Solar-Terrestrial Physics (STP) field, the amount of satellite observation data has been increasing every year. Three problems must be solved to achieve large-scale statistical analyses of such large volumes of data. (i) More CPU power and larger memory and disk size are required. However, the total power of personal computers is not enough to analyze such an amount of data. Supercomputers provide high-performance CPUs and ample memory, but they are usually separated from the Internet or connected only for the purpose of programming or data file transfer. (ii) Most of the observation data files are managed at distributed data sites over the Internet. Users have to know where the data files are located. (iii) Since no common data format in the STP field is available now, users have to prepare a reading program for each data set themselves. To overcome problems (i) and (ii), we constructed a parallel and distributed data analysis environment based on the Gfarm reference implementation of the Grid Datafarm architecture. Gfarm both shares computational resources and performs parallel distributed processing. In addition, Gfarm provides the Gfarm filesystem, which can serve as a virtual directory tree among nodes. The Gfarm environment is composed of three parts: a metadata server that manages distributed file information, filesystem nodes that provide computational resources, and a client that submits jobs to the metadata server and manages data-processing scheduling. In the present study, both data files and data processes are parallelized on Gfarm with 6 filesystem nodes: each node has a Pentium V 1GHz CPU, 256MB memory and 40GB disk. To evaluate the performance of the present Gfarm system, we scanned many data files, each about 300MB in size, using three processing methods: sequential processing in one node, sequential processing by each node and parallel processing by each node. As a result, comparing the number of files with the elapsed time, parallel and distributed processing shortened the elapsed time to 1/5 of that of sequential processing. On the other hand, sequential processing times were shorter in another experiment, in which the file size was smaller than 100KB. In this case, the elapsed time to scan one file is within one second. It implies that disk swap took place in the case of parallel processing by each node. We note that the operation became unstable when the number of files exceeded 1000. To overcome problem (iii), we developed an original data class. This class supports reading data files in various data formats and converts them into an original data format, since it defines schemata for every type of data and encapsulates the structure of data files. In addition, since this class provides a time re-sampling function, users can easily convert multiple data arrays with different time resolutions into arrays with the same time resolution. Finally, using Gfarm, we achieved a high-performance environment for large-scale statistical data analyses. It should be noted that the present method is effective only when each data file is large enough. At present, we are restructuring the new Gfarm environment with 8 nodes: each node has an Athlon 64 x2 Dual Core 2GHz CPU, 2GB memory and 1.2TB disk (using RAID0). Our original class is to be implemented on the new Gfarm environment. In this talk, we show the latest results of applying the present system to data analyses with a huge number of satellite observation data files.
Line-plane broadcasting in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Berg, Jeremy E.; Blocksome, Michael A.; Smith, Brian E.

2010-06-08

Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
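The three phases described above (line, then plane, then the full volume) can be simulated on a small three-dimensional mesh. This is a minimal sketch under stated assumptions: nodes are addressed by integer `(x, y, z)` coordinates, every send succeeds, and the function name and phase numbering are illustrative rather than taken from the patent.

```python
def line_plane_broadcast(dims, root):
    """Return {node: phase in which the node first holds the message}."""
    size_x, size_y, size_z = dims
    _, root_y, root_z = root
    received = {root: 0}  # the broadcasting node holds the message at phase 0

    # Phase 1: the root sends along the axis of the first dimension.
    line = [(x, root_y, root_z) for x in range(size_x)]
    for node in line:
        received.setdefault(node, 1)

    # Phase 2: every node on that line sends along the second dimension.
    plane = [(x, y, root_z) for (x, _, _) in line for y in range(size_y)]
    for node in plane:
        received.setdefault(node, 2)

    # Phase 3: every node in that plane sends along the third dimension.
    for (x, y, _) in plane:
        for z in range(size_z):
            received.setdefault((x, y, z), 3)
    return received


result = line_plane_broadcast((4, 3, 2), root=(1, 2, 0))
print(len(result), "nodes reached")
```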
Line-plane broadcasting in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Berg, Jeremy E.; Blocksome, Michael A.; Smith, Brian E.

2010-11-23

Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
Introducing the slime mold graph repository

NASA Astrophysics Data System (ADS)

Dirnberger, M.; Mehlhorn, K.; Mehlhorn, T.

2017-07-01

We introduce the slime mold graph repository or SMGR, a novel data collection promoting the visibility, accessibility and reuse of experimental data revolving around network-forming slime molds. By making data readily available to researchers across multiple disciplines, the SMGR promotes novel research as well as the reproduction of original results. While SMGR data may take various forms, we stress the importance of graph representations of slime mold networks due to their ease of handling and their large potential for reuse. Data added to the SMGR stands to gain impact beyond initial publications or even beyond its domain of origin. We initiate the SMGR with the comprehensive Kist Europe data set focusing on the slime mold Physarum polycephalum, which we obtained in the course of our original research. It contains sequences of images documenting growth and network formation of the organism under constant conditions. Suitable image sequences depicting the typical P. polycephalum network structures are used to compute sequences of graphs faithfully capturing them. Given such sequences, node identities are computed, tracking the development of nodes over time. Based on this information we demonstrate two out of many possible ways to begin exploring the data. The entire data set is well-documented, self-contained and ready for inspection at http://smgr.mpi-inf.mpg.de.
Profiling an application for power consumption during execution on a plurality of compute nodes

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Peters, Amanda E.; Ratterman, Joseph D.; Smith, Brian E.

2012-08-21

Methods, apparatus, and products are disclosed for profiling an application for power consumption during execution on a compute node that include: receiving an application for execution on a compute node; identifying a hardware power consumption profile for the compute node, the hardware power consumption profile specifying power consumption for compute node hardware during performance of various processing operations; determining a power consumption profile for the application in dependence upon the application and the hardware power consumption profile for the compute node; and reporting the power consumption profile for the application.
Minimally buffered data transfers between nodes in a data communications network

DOEpatents

Miller, Douglas R.

2015-06-23

Methods, apparatus, and products for minimally buffered data transfers between nodes in a data communications network are disclosed that include: receiving, by a messaging module on an origin node, a storage identifier, an origin data type, and a target data type, the storage identifier specifying application storage containing data, the origin data type describing a data subset contained in the origin application storage, the target data type describing an arrangement of the data subset in application storage on a target node; creating, by the messaging module, origin metadata describing the origin data type; selecting, by the messaging module from the origin application storage in dependence upon the origin metadata and the storage identifier, the data subset; and transmitting, by the messaging module to the target node, the selected data subset for storing in the target application storage in dependence upon the target data type without temporarily buffering the data subset.
Broadcasting a message in a parallel computer

DOEpatents

Berg, Jeremy E [Rochester, MN; Faraj, Ahmad A [Rochester, MN

2011-08-02

Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
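One simple way to obtain a Hamiltonian path over a rectangular plane of nodes is a serpentine (row-by-row, alternating direction) ordering. The sketch below assumes a full `width` by `height` plane in which every node belongs to the operational group, and lets the root forward the message in both directions along the path. The patent does not prescribe this particular path; the function names are illustrative.

```python
def snake_path(width, height):
    """Visit every (x, y) once, moving only between adjacent nodes."""
    path = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        path.extend((x, y) for x in xs)
    return path


def broadcast_along_path(path, root):
    """Return {node: number of hops from the root along the path}."""
    i = path.index(root)
    hops = {root: 0}
    for step, node in enumerate(path[i + 1:], start=1):
        hops[node] = step  # forwarded toward the end of the path
    for step, node in enumerate(reversed(path[:i]), start=1):
        hops[node] = step  # forwarded toward the start of the path
    return hops


path = snake_path(3, 2)
print(path)
print(broadcast_along_path(path, root=(2, 0)))
```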
Secure anonymous mutual authentication for star two-tier wireless body area networks.

PubMed

Ibrahim, Maged Hamada; Kumari, Saru; Das, Ashok Kumar; Wazid, Mohammad; Odelu, Vanga

2016-10-01

Mutual authentication is a very important service that must be established between sensor nodes in a wireless body area network (WBAN) to ensure the originality and integrity of the patient's data sent by sensors distributed on different parts of the body. However, a mutual authentication service is not enough. An adversary can benefit from monitoring the traffic and knowing which sensor is transmitting the patient's data. Observing the traffic (even without disclosing the context) and knowing its origin can reveal to the adversary information about the patient's medical conditions. Therefore, anonymity of the communicating sensors is an important service as well. Few works have been conducted in the area of mutual authentication among sensor nodes in WBAN. However, none of them has considered anonymity among body sensor nodes. To our knowledge, our protocol is the first attempt to consider this service in a two-tier WBAN. We propose a new secure protocol to realize anonymous mutual authentication and confidential transmission for star two-tier WBAN topology. The proposed protocol uses simple cryptographic primitives. We prove the security of the proposed protocol using the widely accepted Burrows-Abadi-Needham (BAN) logic, and also through rigorous informal security analysis. In addition, to demonstrate the practicality of our protocol, we evaluate it using the NS-2 simulator. BAN logic and informal security analysis prove that our proposed protocol achieves the necessary security requirements and goals of an authentication service. The simulation results show the impact on various network parameters, such as end-to-end delay and throughput. The nodes in the network need to store only a few hundred bits. Nodes need to perform very few hash invocations, which are computationally very efficient. The communication cost of the proposed protocol is a few hundred bits in one round of communication. Due to the low computation cost, the energy consumed by the nodes is also low. Our proposed protocol is a lightweight anonymous mutual authentication protocol that mutually authenticates the sensor nodes with the controller node (hub) in a star two-tier WBAN topology. Results show that our protocol is more efficient than previously proposed protocols and, at the same time, achieves the necessary security requirements for a secure anonymous mutual authentication scheme. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Performing a global barrier operation in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-12-09

Executing computing tasks on a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
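The ordering rule in this abstract (non-master tasks join only the local barrier; the master joins the global barrier and then the local barrier, and only after its peers have arrived) can be imitated with threads. This is a sketch, not the patented implementation: tasks are Python threads, rank 0 on each node is assumed to be the master, and a semaphore stands in for the master's knowledge that its peers have reached the local barrier.

```python
import threading


def run_global_barrier(num_nodes, tasks_per_node):
    """Run all tasks once through the barrier and return the event log."""
    global_barrier = threading.Barrier(num_nodes)
    local_barriers = [threading.Barrier(tasks_per_node) for _ in range(num_nodes)]
    arrivals = [threading.Semaphore(0) for _ in range(num_nodes)]
    events, lock = [], threading.Lock()

    def record(event):
        with lock:
            events.append(event)

    def task(node, rank):
        if rank != 0:
            # Non-master: announce arrival, then block in the local barrier.
            arrivals[node].release()
            local_barriers[node].wait()
        else:
            # Master: wait until every other task on this node has arrived.
            for _ in range(tasks_per_node - 1):
                arrivals[node].acquire()
            record(("master_joins_global", node))
            global_barrier.wait()
            # Joining the local barrier last releases the node's other tasks.
            local_barriers[node].wait()
        record(("exit", node, rank))

    threads = [
        threading.Thread(target=task, args=(node, rank))
        for node in range(num_nodes)
        for rank in range(tasks_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


log = run_global_barrier(num_nodes=3, tasks_per_node=4)
print(len(log), "events recorded")
```

Because each master joins its local barrier only after the global barrier completes, no task on any node exits before every master has joined the global barrier.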
DOE Office of Scientific and Technical Information (OSTI.GOV)

Davis, Kristan D.; Faraj, Daniel A.

In a parallel computer, a plurality of logical planes formed of compute nodes of a subcommunicator may be identified by: for each compute node of the subcommunicator and for a number of dimensions beginning with a first dimension: establishing, by a plane building node, in a positive direction of the first dimension, all logical planes that include the plane building node and compute nodes of the subcommunicator in a positive direction of a second dimension, where the second dimension is orthogonal to the first dimension; and establishing, by the plane building node, in a negative direction of the first dimension, all logical planes that include the plane building node and compute nodes of the subcommunicator in the positive direction of the second dimension.
Message passing with a limited number of DMA byte counters

DOEpatents

Blocksome, Michael [Rochester, MN; Chen, Dong [Croton on Hudson, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kumar, Sameer [White Plains, NY; Parker, Jeffrey J [Rochester, MN

2011-10-04

A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node. Using one DMA FIFO at the source compute node, the RTS descriptors are maintained for rendezvous messages destined for the destination compute node to ensure proper message data ordering thereat. Using a reception counter at a DMA engine, the destination compute node tracks reception of the RTS and associated message data and sends a clear to send (CTS) message to the source node in a rendezvous protocol form of a remote get to accept the RTS message and message data and processing the remote get (CTS) by the source compute node DMA engine to provide the message data to be sent.
Determining when a set of compute nodes participating in a barrier operation on a parallel computer are ready to exit the barrier operation

DOEpatents

Blocksome, Michael A [Rochester, MN

2011-12-20

Methods, apparatus, and products are disclosed for determining when a set of compute nodes participating in a barrier operation on a parallel computer are ready to exit the barrier operation that includes, for each compute node in the set: initializing a barrier counter with no counter underflow interrupt; configuring, upon entering the barrier operation, the barrier counter with a value in dependence upon a number of compute nodes in the set; broadcasting, by a DMA engine on the compute node to each of the other compute nodes upon entering the barrier operation, a barrier control packet; receiving, by the DMA engine from each of the other compute nodes, a barrier control packet; modifying, by the DMA engine, the value for the barrier counter in dependence upon each of the received barrier control packets; exiting the barrier operation if the value for the barrier counter matches the exit value.
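The counter-based exit test can be illustrated with a small simulation. The abstract does not give the initial counter value, the direction of modification, or the exit value, so the sketch assumes the counter starts at the number of other nodes, is decremented once per received barrier control packet, and that the exit value is 0. Class and function names are illustrative.

```python
class BarrierNode:
    EXIT_VALUE = 0  # assumed exit value; the abstract leaves it unspecified

    def __init__(self, node_id, set_size):
        self.node_id = node_id
        # Configured on entering the barrier: one expected packet per peer.
        self.counter = set_size - 1

    def receive_control_packet(self):
        # Stands in for the DMA engine modifying the counter per packet.
        self.counter -= 1

    def ready_to_exit(self):
        return self.counter == self.EXIT_VALUE


def simulate_barrier(set_size):
    """Deliver every node's broadcast packet and log readiness after each."""
    nodes = [BarrierNode(i, set_size) for i in range(set_size)]
    readiness_log = []
    for src in range(set_size):
        for dst in range(set_size):
            if src != dst:
                nodes[dst].receive_control_packet()
                readiness_log.append([n.ready_to_exit() for n in nodes])
    return nodes, readiness_log


nodes, log = simulate_barrier(3)
print([n.ready_to_exit() for n in nodes])
```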
Reducing power consumption while performing collective operations on a plurality of compute nodes

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2011-10-18

Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.
I/O routing in a multidimensional torus network

DOEpatents

Chen, Dong; Eisley, Noel A.; Heidelberger, Philip

2017-02-07

A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
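The final routing decision described above can be sketched as a lookup performed when a packet arrives at its destination compute node. The field layout, the value `TOIO_SET = 1`, and the mapping from compute nodes to I/O nodes are assumptions made for illustration; the abstract only states that a `toio` value is compared against a specified value.

```python
from dataclasses import dataclass

TOIO_SET = 1  # hypothetical "specified value" that requests I/O forwarding


@dataclass
class Packet:
    destination: tuple  # coordinates of the destination compute node
    toio: int
    payload: bytes = b""


def final_hop(packet, io_node_for):
    """Decide where a packet goes once it has reached its destination."""
    if packet.toio == TOIO_SET:
        # Forward from the destination compute node to its attached I/O node.
        return ("io", io_node_for[packet.destination])
    return ("compute", packet.destination)


# Hypothetical attachment of one I/O node to compute node (0, 0, 0).
io_map = {(0, 0, 0): "io-0"}
print(final_hop(Packet((0, 0, 0), toio=1), io_map))
print(final_hop(Packet((0, 0, 0), toio=0), io_map))
```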
I/O routing in a multidimensional torus network

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Dong; Eisley, Noel A.; Heidelberger, Philip

A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
GRAPE-6A: A Single-Card GRAPE-6 for Parallel PC-GRAPE Cluster Systems

NASA Astrophysics Data System (ADS)

Fukushige, Toshiyuki; Makino, Junichiro; Kawai, Atsushi

2005-12-01

In this paper, we describe the design and performance of GRAPE-6A, a special-purpose computer for gravitational many-body simulations. It was designed to be used with a PC cluster, in which each node has one GRAPE-6A. Such a configuration is particularly cost-effective in running parallel tree algorithms. Though the use of parallel tree algorithms was possible with the original GRAPE-6 hardware, it was not very cost-effective since a single GRAPE-6 board was still too fast and too expensive. Therefore, we designed GRAPE-6A as a single PCI card to minimize the reproduction cost and to optimize the computing speed. The peak performance is 130 Gflops for one GRAPE-6A board and 3.1 Tflops for our 24-node cluster. We describe the implementation of the tree, TreePM and individual timestep algorithms on both a single GRAPE-6A system and a GRAPE-6A cluster. Using the tree algorithm on our 16-node GRAPE-6A system, we can complete a collisionless simulation with 100 million particles (8000 steps) within 10 days.
Reducing power consumption during execution of an application on a plurality of compute nodes

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-06-05

Methods, apparatus, and products are disclosed for reducing power consumption during execution of an application on a plurality of compute nodes that include: executing, by each compute node, an application, the application including power consumption directives corresponding to one or more portions of the application; identifying, by each compute node, the power consumption directives included within the application during execution of the portions of the application corresponding to those identified power consumption directives; and reducing power, by each compute node, to one or more components of that compute node according to the identified power consumption directives during execution of the portions of the application corresponding to those identified power consumption directives.

Reducing power consumption during execution of an application on a plurality of compute nodes

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Peters, Amanda E.; Ratterman, Joseph D.; Smith, Brian E.

2013-09-10

Methods, apparatus, and products are disclosed for reducing power consumption during execution of an application on a plurality of compute nodes that include: powering up, during compute node initialization, only a portion of computer memory of the compute node, including configuring an operating system for the compute node in the powered up portion of computer memory; receiving, by the operating system, an instruction to load an application for execution; allocating, by the operating system, additional portions of computer memory to the application for use during execution; powering up the additional portions of computer memory allocated for use by the application during execution; and loading, by the operating system, the application into the powered up additional portions of computer memory.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad

2013-07-09

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: establishing, for each node, a plurality of logical rings, each ring including a different set of at least one core on that node, each ring including the cores on at least two of the nodes; iteratively for each node: assigning each core of that node to one of the rings established for that node to which the core has not previously been assigned, and performing, for each ring for that node, a global allreduce operation using contribution data for the cores assigned to that ring or any global allreduce results from previous global allreduce operations, yielding current global allreduce results for each core; and performing, for each node, a local allreduce operation using the global allreduce results.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad

2013-02-12

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
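The four steps in this abstract (local reduction, ring formation, global allreduce over the ring, local broadcast) can be traced with plain lists. The sketch below makes simplifying assumptions: the reduction is a sum, each node has exactly one representative core, there is a single logical ring, and the ring allreduce is the naive version in which each member forwards what it last received to its right-hand neighbour. Names are illustrative.

```python
def ring_allreduce(values):
    """Sum-allreduce over a logical ring using len(values) - 1 forwarding steps."""
    n = len(values)
    totals = list(values)
    in_flight = list(values)
    for _ in range(n - 1):
        # Each member receives what its left neighbour held in the last step.
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        totals = [t + v for t, v in zip(totals, in_flight)]
    return totals


def hierarchical_allreduce(contributions):
    """contributions[node][core] -> per-core results after the allreduce."""
    # Step 1: local reduction onto one representative core per node.
    local = [sum(cores) for cores in contributions]
    # Steps 2-3: global allreduce over a ring of the representatives.
    global_results = ring_allreduce(local)
    # Step 4: local broadcast of the global result to every core on the node.
    return [[g] * len(cores) for g, cores in zip(global_results, contributions)]


print(hierarchical_allreduce([[1, 2], [3, 4], [5, 6]]))
```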
Determining collective barrier operation skew in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faraj, Daniel A.

2015-11-24

Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.
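The skew calculation is a maximum-minus-minimum over per-node completion times, with one measurement per node taken while that node is the delayed one. In the sketch below, `run_barrier_with_delayed` is a hypothetical callback that stands in for actually running the barrier and timing it; the sample times are made up for illustration and are not measurements.

```python
def measure_barrier_skew(nodes, run_barrier_with_delayed):
    """Select each node once as the delayed node and return (skew, times)."""
    completion_times = {
        node: run_barrier_with_delayed(node)  # completion time for that node
        for node in nodes
    }
    skew = max(completion_times.values()) - min(completion_times.values())
    return skew, completion_times


# Illustrative stand-in for real timing data, in microseconds.
fake_times_us = {0: 120, 1: 135, 2: 128}
skew, times = measure_barrier_skew([0, 1, 2], fake_times_us.__getitem__)
print(f"skew = {skew} us")
```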
Determining collective barrier operation skew in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faraj, Daniel A.

Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.
DOE Office of Scientific and Technical Information (OSTI.GOV)

None

Performing a global barrier operation in a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
Thread selection according to power characteristics during context switching on compute nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J.; Blocksome, Michael A.; Randles, Amanda E.

Methods, apparatus, and products are disclosed for thread selection during context switching on a plurality of compute nodes that includes: executing, by a compute node, an application using a plurality of threads of execution, including executing one or more of the threads of execution; selecting, by the compute node from a plurality of available threads of execution for the application, a next thread of execution in dependence upon power characteristics for each of the available threads; determining, by the compute node, whether criteria for a thread context switch are satisfied; and performing, by the compute node, the thread context switch if the criteria for a thread context switch are satisfied, including executing the next thread of execution.
Thread selection according to predefined power characteristics during context switching on compute nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

None, None

Methods, apparatus, and products are disclosed for thread selection during context switching on a plurality of compute nodes that includes: executing, by a compute node, an application using a plurality of threads of execution, including executing one or more of the threads of execution; selecting, by the compute node from a plurality of available threads of execution for the application, a next thread of execution in dependence upon power characteristics for each of the available threads; determining, by the compute node, whether criteria for a thread context switch are satisfied; and performing, by the compute node, the thread context switch if the criteria for a thread context switch are satisfied, including executing the next thread of execution.
First Case of the Cervical Lymph Node as the Only Site of Metastasis from Anal Cancer.

PubMed

Wang, Bo; Jaiswal, Sunny; Saif, Muhammad W

2017-05-30

Anal squamous cell carcinoma was a previously uncommon malignancy that has steadily increased in incidence with the increased prevalence of human papillomavirus (HPV) and human immunodeficiency virus (HIV). Anal squamous cell carcinoma is typically characterized by local and regional involvement and distant metastases are far less common. Here, we report a case of a 36-year-old female initially diagnosed with anal squamous cell carcinoma manifesting as an anal mass along with an enlarged inguinal lymph node. After receiving chemoradiation therapy, she remained disease-free until recently, when she presented with an isolated left infraclavicular lymph node found on physical examination followed by a biopsy that was consistent with recurrent anal squamous cell carcinoma. The positron emission tomography-computed tomography (PET-CT) uptake of her original left inguinal lymph node was decreased, suggesting improved regional disease, and no other metastases were found. Our case represents a rare occurrence of metastatic anal squamous cell carcinoma to an isolated distal lymph node and reminds physicians not to forget an unusual site of metastasis and to prevent any delay in treatment.
Method and apparatus for obtaining stack traceback data for multiple computing nodes of a massively parallel computer system

DOEpatents

Gooding, Thomas Michael; McCarthy, Patrick Joseph

2010-03-02

A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction address being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
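The grouping step described in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the sample addresses, and the choice of the lowest node id as the representative are all hypothetical.

```python
from collections import defaultdict

def group_by_instruction_address(partial):
    """Group node ids into subsets that share the same instruction address."""
    subsets = defaultdict(list)
    for node_id, address in partial.items():
        subsets[address].append(node_id)
    return dict(subsets)

def pick_representatives(subsets):
    """Choose one node per subset (assumption: simply the lowest node id)."""
    return {address: min(nodes) for address, nodes in subsets.items()}

# Hypothetical partial traceback data: node id -> current instruction address.
partial = {0: 0x4005A0, 1: 0x4005A0, 2: 0x400710, 3: 0x4005A0, 4: 0x400710}
subsets = group_by_instruction_address(partial)
representatives = pick_representatives(subsets)
# Full stack tracebacks would then be requested only from the representatives.
```

With five nodes and two distinct addresses, only two full tracebacks would need to be retrieved instead of five.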
Performance Analysis and Optimization on the UCLA Parallel Atmospheric General Circulation Model Code

NASA Technical Reports Server (NTRS)

Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos

1996-01-01

An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, interprocessor communication cost, load-balance and issues affecting single-node code performance are discussed.
Patterns of regional head and neck lymph node metastasis in primary conjunctival malignant melanoma

PubMed Central

Lim, M; Tatla, T; Hersh, D; Hungerford, J

2006-01-01

Objective To correlate patterns of regional lymph node metastasis with the site of origin in primary conjunctival malignant melanoma. Design Retrospective analysis (1990–2003) of clinical data. Setting Two London tertiary referral centres. Participants 12 patients presenting with regional metastases after failed local treatment for conjunctival malignant melanoma. Results 6 cases predominantly involving the temporal conjunctiva metastasised to the pre‐auricular lymph nodes. Two cases predominantly involving the nasal conjunctiva metastasised to the submandibular nodes. Of the two cases with purely multifocal disease, one metastasised to the pre‐auricular nodes and another to both submandibular and parotid nodes. One primary conjunctival malignant melanoma had its origin in temporal conjunctiva but metastasised to submandibular nodes, and another case originating from nasal conjunctiva metastasised to pre‐auricular nodes. Conclusions Temporal conjunctival melanotic lesions tend to metastasise clinically to pre‐auricular lymph nodes and nasal conjunctival melanotic lesions metastasise to the submandibular lymph nodes. Patterns appear consistent with laboratory‐based anatomically mapped lymphatic drainage basins of the conjunctiva. PMID:16928703
SEAODV: A Security Enhanced AODV Routing Protocol for Wireless Mesh Networks

NASA Astrophysics Data System (ADS)

Li, Celia; Wang, Zhuang; Yang, Cungang

In this paper, we propose a Security Enhanced AODV routing protocol (SEAODV) for wireless mesh networks (WMN). SEAODV employs Blom's key pre-distribution scheme to compute the pairwise transient key (PTK) through the flooding of enhanced HELLO message and subsequently uses the established PTK to distribute the group transient key (GTK). PTK and GTK authenticate unicast and broadcast routing messages respectively. In WMN, a unique PTK is shared by each pair of nodes, while GTK is shared secretly between the node and all its one-hop neighbours. A message authentication code (MAC) is attached as the extension to the original AODV routing message to guarantee the message's authenticity and integrity in a hop-by-hop fashion. Security analysis and performance evaluation show that SEAODV is more effective in preventing identified routing attacks and outperforms ARAN and SAODV in terms of computation cost and route acquisition latency.
Replenishing data descriptors in a DMA injection FIFO buffer

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Cernohous, Bob R [Rochester, MN; Heidelberger, Philip [Cortlandt Manor, NY; Kumar, Sameer [White Plains, NY; Parker, Jeffrey J [Rochester, MN

2011-10-11

Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (`DMA`) injection first-in-first-out (`FIFO`) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.
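The queue-then-replenish flow in this record can be sketched in a few lines. This is an illustrative model only: the class name `MessagingModule`, the `low_water_mark` parameter, and the use of a low-water mark as the interrupt criterion are assumptions, since the abstract does not say what the interrupt criteria are.

```python
from collections import deque

class MessagingModule:
    """Toy model of an injection FIFO backed by a pending descriptor queue."""

    def __init__(self, threshold, low_water_mark):
        self.injection_fifo = deque()
        self.pending = deque()
        self.threshold = threshold
        # Assumed interrupt criterion: replenish once the FIFO drains to this size.
        self.low_water_mark = low_water_mark

    def submit(self, descriptors):
        """Queue descriptors as pending if the FIFO already exceeds the threshold."""
        if len(self.injection_fifo) > self.threshold:
            self.pending.extend(descriptors)
        else:
            self.injection_fifo.extend(descriptors)

    def dma_consume(self):
        """Process one descriptor, then replenish if the criterion is met."""
        descriptor = self.injection_fifo.popleft()
        if self.pending and len(self.injection_fifo) <= self.low_water_mark:
            self.injection_fifo.extend(self.pending)
            self.pending.clear()
        return descriptor
```

In this model, descriptors submitted while the FIFO is over the threshold wait in the pending queue and are injected in their original order once the FIFO drains to the low-water mark.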
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT, a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed a high performance computing code, THC-MP, for massively parallel computers, which greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in the THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes and the core solving module using the hybrid parallel iterative and direct solver. Numerical accuracy of the THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from the parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field scale carbon sequestration applications on the multicore cluster. The results successfully demonstrate the enhanced performance using the THC-MP on parallel computing facilities.
GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition

NASA Astrophysics Data System (ADS)

Zhen, Z.; Jia, X.

2014-12-01

Element-free method (EFM) has been applied to seismic modeling and migration. Compared with finite element method (FEM) and finite difference method (FDM), it is much cheaper and more flexible because only the information of the nodes and the boundary of the study area are required in computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus when we increase the nodes of velocity model in order to obtain higher resolution, we find that the size of the computer's memory will be a bottleneck. The original EFM can deal with at most 81×81 nodes in the case of 2G memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition (GPP), and utilize the GPUs to improve the computation efficiency. Considering the characteristics of the Gaussian points, the GPP method doesn't influence the propagation of seismic wave in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use the GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and try to simplify the operations by solving the linear equations with the CULA Sparse's Conjugate Gradient (CG) solver instead of the linear sparse solver 'PARDISO'. It is observed that our strategy can significantly reduce the computational time of K and M compared with the algorithm based on CPU. The model tested is the Marmousi model. The length of the model is 7425m and the depth is 2990m. We discretize the model with 595x298 nodes, 300x300 Gauss cells and 3x3 Gauss points in each cell. In contrast to the computational time of the conventional EFM, the GPUs-GPP approach can substantially improve the efficiency.
The speedup ratio of time consumption of computing K, M is 120 and the speedup ratio of time consumption of RTM is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPUs-GPP method is its easy applications in other numerical methods such as the FEM. Finally, in the GPUs-GPP method, the arrays require quite limited memory storage, which makes the method promising in dealing with large-scale 3D problems.
Budget-based power consumption for application execution on a plurality of compute nodes

DOEpatents

Archer, Charles J; Blocksome, Michael A; Peters, Amanda E; Ratterman, Joseph D; Smith, Brian E

2013-02-05

Methods, apparatus, and products are disclosed for budget-based power consumption for application execution on a plurality of compute nodes that include: assigning an execution priority to each of one or more applications; executing, on the plurality of compute nodes, the applications according to the execution priorities assigned to the applications at an initial power level provided to the compute nodes until a predetermined power consumption threshold is reached; and applying, upon reaching the predetermined power consumption threshold, one or more power conservation actions to reduce power consumption of the plurality of compute nodes during execution of the applications.
Budget-based power consumption for application execution on a plurality of compute nodes

DOEpatents

Archer, Charles J; Inglett, Todd A; Ratterman, Joseph D

2012-10-23

Methods, apparatus, and products are disclosed for budget-based power consumption for application execution on a plurality of compute nodes that include: assigning an execution priority to each of one or more applications; executing, on the plurality of compute nodes, the applications according to the execution priorities assigned to the applications at an initial power level provided to the compute nodes until a predetermined power consumption threshold is reached; and applying, upon reaching the predetermined power consumption threshold, one or more power conservation actions to reduce power consumption of the plurality of compute nodes during execution of the applications.
Extended precision data types for the development of the original computer aided engineering applications

NASA Astrophysics Data System (ADS)

Pescaru, A.; Oanta, E.; Axinte, T.; Dascalescu, A.-D.

2015-11-01

Computer aided engineering is based on models of the phenomena which are expressed as algorithms. The implementations of the algorithms are usually software applications which process a large volume of numerical data, regardless of the size of the input data. For example, finite element method applications used to have an input data generator which created the entire volume of geometrical data, starting from the initial geometrical information and the parameters stored in the input data file. Moreover, there were several data processing stages, such as: renumbering of the nodes meant to minimize the band length of the system of equations to be solved, computation of the equivalent nodal forces, computation of the element stiffness matrix, assembly of the system of equations, solving the system of equations, computation of the secondary variables. Modern software applications use pre-processing and post-processing programs to easily handle the information. Besides this example, CAE applications use various stages of complex computation, and the accuracy of the final results is of particular interest. Over time, the development of CAE applications was a constant concern of the authors and the accuracy of the results was a very important target. The paper presents the various computing techniques which were devised and implemented in the resulting applications: finite element method programs, finite difference element method programs, applied general numerical methods applications, data generators, graphical applications, experimental data reduction programs. In this context, the use of the extended precision data types was one of the solutions, the limitations being imposed by the size of the memory which may be allocated. To avoid the memory-related problems the data was stored in files. To minimize the execution time, part of the file was accessed using the dynamic memory allocation facilities.
One of the most important consequences of the paper is the design of a library which includes the optimized solutions previously tested, that may be used for the easy development of original CAE cross-platform applications. Last but not least, besides the generality of the data type solutions, the paper targets the development of a software library which may be used for the easy development of node-based CAE applications, each node having several known or unknown parameters, the system of equations being automatically generated and solved.
Direct memory access transfer completion notification

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Parker, Jeffrey J [Rochester, MN

2011-02-15

DMA transfer completion notification includes: inserting, by an origin DMA engine on an origin node in an injection first-in-first-out (`FIFO`) buffer, a data descriptor for an application message to be transferred to a target node on behalf of an application on the origin node; inserting, by the origin DMA engine, a completion notification descriptor in the injection FIFO buffer after the data descriptor for the message, the completion notification descriptor specifying a packet header for a completion notification packet; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; sending, by the origin DMA engine, the completion notification packet to a local reception FIFO buffer using a local memory FIFO transfer operation; and notifying, by the origin DMA engine, the application that transfer of the message is complete in response to receiving the completion notification packet in the local reception FIFO buffer.
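The ordering idea in this record, a data descriptor followed by a completion notification descriptor in the same injection FIFO, can be sketched as below. This is a simplified single-process model under stated assumptions: the function name, the tuple layout of the descriptors, and the dictionary standing in for target memory are hypothetical.

```python
from collections import deque

def transfer_with_notification(msg_id, payload, target_memory):
    """Process a data descriptor, then a completion descriptor, in FIFO order."""
    injection_fifo = deque([
        ("data", msg_id, payload),     # descriptor for the application message
        ("completion", msg_id, None),  # inserted after the data descriptor
    ])
    local_reception_fifo = deque()

    while injection_fifo:
        kind, mid, body = injection_fifo.popleft()
        if kind == "data":
            # Stand-in for the transfer to the target node.
            target_memory[mid] = body
        else:
            # Stand-in for the local memory FIFO transfer of the completion packet.
            local_reception_fifo.append({"header": "completion", "msg_id": mid})

    # Receiving the packet locally is what tells the application the send is done.
    return [packet["msg_id"] for packet in local_reception_fifo]
```

Because the FIFO is processed in order, the completion packet can only arrive in the local reception FIFO after the data descriptor has been handled.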

A Fast SVD-Hidden-nodes based Extreme Learning Machine for Large-Scale Data Analytics.

PubMed

Deng, Wan-Yu; Bai, Zuo; Huang, Guang-Bin; Zheng, Qing-Hua

2016-05-01

Big dimensional data is a growing trend that is emerging in many real world contexts, extending from web mining, gene expression analysis, protein-protein interaction to high-frequency financial data. Nowadays, there is a growing consensus that the increasing dimensionality poses impeding effects on the performances of classifiers, which is termed as the "peaking phenomenon" in the field of machine intelligence. To address the issue, dimensionality reduction is commonly employed as a preprocessing step on the Big dimensional data before building the classifiers. In this paper, we propose an Extreme Learning Machine (ELM) approach for large-scale data analytics. In contrast to existing approaches, we embed hidden nodes that are designed using singular value decomposition (SVD) into the classical ELM. These SVD nodes in the hidden layer are shown to capture the underlying characteristics of the Big dimensional data well, exhibiting excellent generalization performances. The drawback of using SVD on the entire dataset, however, is the high computational complexity involved. To address this, a fast divide and conquer approximation scheme is introduced to maintain computational tractability on high volume data. The resultant algorithm proposed is labeled here as Fast Singular Value Decomposition-Hidden-nodes based Extreme Learning Machine or FSVD-H-ELM in short. In FSVD-H-ELM, instead of identifying the SVD hidden nodes directly from the entire dataset, SVD hidden nodes are derived from multiple random subsets of data sampled from the original dataset. Comprehensive experiments and comparisons are conducted to assess the FSVD-H-ELM against other state-of-the-art algorithms. The results obtained demonstrated the superior generalization performance and efficiency of the FSVD-H-ELM. Copyright © 2016 Elsevier Ltd. All rights reserved.
DMA engine for repeating communication patterns

DOEpatents

Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Steinmacher-Burow, Burkhard; Vranas, Pavlos

2010-09-21

A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each of the compute nodes includes one or more individual processors with memories which run local instances of the global message-passing application operating at each compute node to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs where each Injection FIFO may contain an arbitrary number of message descriptors in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.
Parallel compression of data chunks of a shared data object using a log-structured file system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-10-25

Techniques are provided for parallel compression of data chunks being written to a shared object. A client executing on a compute node or a burst buffer node in a parallel computing system stores a data chunk generated by the parallel computing system to a shared data object on a storage node by compressing the data chunk; and providing the compressed data chunk to the storage node that stores the shared object. The client and storage node may employ Log-Structured File techniques. The compressed data chunk can be de-compressed by the client when the data chunk is read. A storage node stores a data chunk as part of a shared object by receiving a compressed version of the data chunk from a compute node; and storing the compressed version of the data chunk to the shared data object on the storage node.
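The write and read paths in this record can be sketched with the standard `zlib` module. This is a minimal single-process illustration, not the patented system: the `StorageNode` class, its append-only `log` plus `index` layout, and the `client_write` / `client_read` helpers are assumptions chosen to mimic a log-structured store.

```python
import zlib

class StorageNode:
    """Toy log-structured store: chunks are appended and located via an index."""

    def __init__(self):
        self.log = []    # append-only list of compressed chunks
        self.index = {}  # (object name, offset) -> position in the log

    def put(self, name, offset, compressed):
        self.index[(name, offset)] = len(self.log)
        self.log.append(compressed)

    def get(self, name, offset):
        return self.log[self.index[(name, offset)]]

def client_write(store, name, offset, chunk):
    """Compress on the client, so the storage node only sees compressed bytes."""
    store.put(name, offset, zlib.compress(chunk))

def client_read(store, name, offset):
    """Decompress on the client when the chunk is read back."""
    return zlib.decompress(store.get(name, offset))
```

Each client compresses its own chunks independently, which is what allows the compression work to proceed in parallel across compute nodes.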
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, R.E.; Gustafson, J.L.; Montry, G.R.

1999-08-10

A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1999-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
A family of position- and orientation-independent embedded boundary methods for viscous flow and fluid-structure interaction problems

NASA Astrophysics Data System (ADS)

Huang, Daniel Z.; De Santis, Dante; Farhat, Charbel

2018-07-01

The Finite Volume method with Exact two-material Riemann Problems (FIVER) is both a computational framework for multi-material flows characterized by large density jumps, and an Embedded Boundary Method (EBM) for computational fluid dynamics and highly nonlinear Fluid-Structure Interaction (FSI) problems. This paper deals with the EBM aspect of FIVER. For FSI problems, this EBM has already demonstrated the ability to address viscous effects along wall boundaries, and large deformations and topological changes of such boundaries. However, like for most EBMs - also known as immersed boundary methods - the performance of FIVER in the vicinity of a wall boundary can be sensitive with respect to the position and orientation of this boundary relative to the embedding mesh. This is mainly due to ill-conditioning issues that arise when an embedded interface becomes too close to a node of the embedding mesh, which may lead to spurious oscillations in the computed solution gradients at the wall boundary. This paper resolves these issues by introducing an alternative definition of the active/inactive status of a mesh node that leads to the removal of all sources of potential ill-conditioning from all spatial approximations performed by FIVER in the vicinity of a fluid-structure interface. It also makes two additional contributions. The first one is a new procedure for constructing the fluid-structure half Riemann problem underlying the semi-discretization by FIVER of the convective fluxes. This procedure eliminates one extrapolation from the conventional treatment of the wall boundary conditions and replaces it by an interpolation, which improves robustness. The second contribution is a post-processing algorithm for computing quantities of interest at the wall that achieves smoothness in the computed solution and its gradients. 
Lessons learned from these enhancements and contributions that are triggered by the new definition of the status of a mesh node are then generalized and exploited to eliminate from the original version of the FIVER method its sensitivities with respect to both of the position and orientation of the wall boundary relative to the embedding mesh, while maintaining the original definition of the status of a mesh node. This leads to a family of second-generation FIVER methods whose performance is illustrated in this paper for several flow and FSI problems. These include a challenging flow problem over a bird wing characterized by a feather-induced surface roughness, and a complex flexible flapping wing problem for which experimental data is available.
Design & implementation of distributed spatial computing node based on WPS

NASA Astrophysics Data System (ADS)

Liu, Liping; Li, Guoqing; Xie, Jibo

2014-03-01

Currently, research on SIG (Spatial Information Grid) technology mostly emphasizes spatial data sharing in the grid environment, while the importance of spatial computing resources is ignored. In order to implement the sharing and cooperation of spatial computing resources in the grid environment, this paper systematically studies the key technologies needed to construct a Spatial Computing Node based on the WPS (Web Processing Service) specification by OGC (Open Geospatial Consortium). A framework of the Spatial Computing Node is designed according to the features of spatial computing resources. Finally, a prototype of the Spatial Computing Node is implemented and the relevant verification work under the environment is completed.
Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

NASA Astrophysics Data System (ADS)

Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter

2015-12-01

AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
Method and apparatus for offloading compute resources to a flash co-processing appliance

DOEpatents

Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing -bung

2015-10-13

Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage with asynchronous migration between the burst buffer nodes and slower more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.
A Scheduling Algorithm for Cloud Computing System Based on the Driver of Dynamic Essential Path.

PubMed

Xie, Zhiqiang; Shao, Xia; Xin, Yu

2016-01-01

To solve the problem of task scheduling in the cloud computing system, this paper proposes a scheduling algorithm for cloud computing based on the driver of dynamic essential path (DDEP). This algorithm applies a predecessor-task layer priority strategy to solve the problem of constraint relations among task nodes. The strategy assigns different priority values to every task node based on the scheduling order of task node as affected by the constraint relations among task nodes, and the task node list is generated by the different priority value. To address the scheduling order problem in which task nodes have the same priority value, the dynamic essential long path strategy is proposed. This strategy computes the dynamic essential path of the pre-scheduling task nodes based on the actual computation cost and communication cost of task node in the scheduling process. The task node that has the longest dynamic essential path is scheduled first as the completion time of task graph is indirectly influenced by the finishing time of task nodes in the longest dynamic essential path. Finally, we demonstrate the proposed algorithm via simulation experiments using Matlab tools. The experimental results indicate that the proposed algorithm can effectively reduce the task Makespan in most cases and meet a high quality performance objective.
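The two-level ordering described in this record (layer priority first, longest path as the tie-breaker) can be illustrated with a simplified static stand-in. This is not the DDEP algorithm itself: the paper recomputes the essential path dynamically during scheduling, whereas the sketch below computes path lengths once. The function name, the task graph, and all costs are hypothetical.

```python
from functools import lru_cache

def schedule_order(cost, edges):
    """cost: task -> computation cost; edges: (u, v) -> communication cost."""
    succ = {t: [] for t in cost}
    pred = {t: [] for t in cost}
    for (u, v), c in edges.items():
        succ[u].append((v, c))
        pred[v].append(u)

    @lru_cache(maxsize=None)
    def layer(t):
        # Layer = length of the longest chain of predecessors above the task.
        return 0 if not pred[t] else 1 + max(layer(p) for p in pred[t])

    @lru_cache(maxsize=None)
    def path_length(t):
        # Longest computation-plus-communication path from the task to an exit.
        return cost[t] + max((c + path_length(v) for v, c in succ[t]), default=0)

    # Lower layers first; within a layer, the longest remaining path goes first.
    return sorted(cost, key=lambda t: (layer(t), -path_length(t), t))

# Hypothetical diamond-shaped task graph.
cost = {"A": 2, "B": 3, "C": 5, "D": 2}
edges = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 1}
order = schedule_order(cost, edges)
```

Here B and C share a layer, and C is ordered first because its path to the exit (5 + 1 + 2 = 8) is longer than B's (3 + 2 + 2 = 7).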
A Scheduling Algorithm for Cloud Computing System Based on the Driver of Dynamic Essential Path

PubMed Central

Xie, Zhiqiang; Shao, Xia; Xin, Yu

2016-01-01

To solve the problem of task scheduling in the cloud computing system, this paper proposes a scheduling algorithm for cloud computing based on the driver of dynamic essential path (DDEP). This algorithm applies a predecessor-task layer priority strategy to solve the problem of constraint relations among task nodes. The strategy assigns different priority values to every task node based on the scheduling order of task node as affected by the constraint relations among task nodes, and the task node list is generated by the different priority value. To address the scheduling order problem in which task nodes have the same priority value, the dynamic essential long path strategy is proposed. This strategy computes the dynamic essential path of the pre-scheduling task nodes based on the actual computation cost and communication cost of task node in the scheduling process. The task node that has the longest dynamic essential path is scheduled first as the completion time of task graph is indirectly influenced by the finishing time of task nodes in the longest dynamic essential path. Finally, we demonstrate the proposed algorithm via simulation experiments using Matlab tools. The experimental results indicate that the proposed algorithm can effectively reduce the task Makespan in most cases and meet a high quality performance objective. PMID:27490901
Asynchronous broadcast for ordered delivery between compute nodes in a parallel computing system where packet header space is limited

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, Sameer

Disclosed is a mechanism on receiving processors in a parallel computing system for providing order to data packets received from a broadcast call and to distinguish data packets received at nodes from several incoming asynchronous broadcast messages where header space is limited. In the present invention, processors at lower leafs of a tree do not need to obtain a broadcast message by directly accessing the data in a root processor's buffer. Instead, each subsequent intermediate node's rank id information is squeezed into the software header of packet headers. In turn, the entire broadcast message is not transferred from the root processor to each processor in a communicator but instead is replicated on several intermediate nodes which then replicate the message to nodes in lower leafs. Hence, the intermediate compute nodes become "virtual root compute nodes" for the purpose of replicating the broadcast message to lower levels of a tree.
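The "virtual root" replication pattern in this record can be sketched as a tree broadcast in which each packet records the rank that forwarded it. This is a toy model under stated assumptions: the function name, the k-ary rank layout, and the `(forwarding rank, payload)` tuple standing in for the software header are hypothetical, and the ordering of multiple concurrent broadcasts is not modeled.

```python
def tree_broadcast(num_ranks, message, fanout=2):
    """Replicate a message down a k-ary tree of ranks rooted at rank 0."""
    # rank -> (forwarding rank carried in the header, payload)
    received = {0: (None, message)}
    sends_per_rank = {}
    frontier = [0]
    while frontier:
        next_frontier = []
        for parent in frontier:
            children = [parent * fanout + k for k in range(1, fanout + 1)]
            children = [c for c in children if c < num_ranks]
            sends_per_rank[parent] = len(children)
            for child in children:
                # The intermediate node acts as a virtual root for its subtree.
                received[child] = (parent, message)
                next_frontier.append(child)
        frontier = next_frontier
    return received, sends_per_rank
```

With seven ranks and a fanout of two, the root sends only two copies; ranks 1 and 2 replicate the message to the remaining four ranks.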
Exploring the use of I/O nodes for computation in a MIMD multiprocessor

NASA Technical Reports Server (NTRS)

Kotz, David; Cai, Ting

1995-01-01

As parallel systems move into the production scientific-computing world, the emphasis will be on cost-effective solutions that provide high throughput for a mix of applications. Cost effective solutions demand that a system make effective use of all of its resources. Many MIMD multiprocessors today, however, distinguish between 'compute' and 'I/O' nodes, the latter having attached disks and being dedicated to running the file-system server. This static division of responsibilities simplifies system management but does not necessarily lead to the best performance in workloads that need a different balance of computation and I/O. Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we begin to examine this issue experimentally. We found that high performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.
Treecode with a Special-Purpose Processor

NASA Astrophysics Data System (ADS)

Makino, Junichiro

1991-08-01

We describe an implementation of the modified Barnes-Hut tree algorithm for a gravitational N-body calculation on a GRAPE (GRAvity PipE) backend processor. GRAPE is a special-purpose computer for N-body calculations. It receives the positions and masses of particles from a host computer and then calculates the gravitational force at each coordinate specified by the host. To use this GRAPE processor with the hierarchical tree algorithm, the host computer must maintain a list of all nodes that exert force on a particle. If we create this list for each particle of the system at each timestep, the number of floating-point operations on the host and that on GRAPE would become comparable, and the increased speed obtained by using GRAPE would be small. In our modified algorithm, we create a list of nodes for many particles. Thus, the amount of the work required of the host is significantly reduced. This algorithm was originally developed by Barnes in order to vectorize the force calculation on a Cyber 205. With this algorithm, the computing time of the force calculation becomes comparable to that of the tree construction, if the GRAPE backend processor is sufficiently fast. The obtained speed-up factor is 30 to 50 for a RISC-based host computer and GRAPE-1A with a peak speed of 240 Mflops.
Computer hardware fault administration

DOEpatents

Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

2010-09-14

Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.
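The rerouting idea in the abstract above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the two networks are modelled as plain adjacency dictionaries, breadth-first search stands in for whatever routing the patent actually uses, and the names shortest_path and route are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, src, dst):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def route(primary, secondary, defective_links, src, dst):
    """Use the first network unless its path crosses a defective link.

    defective_links is a set of frozenset({a, b}) pairs (an assumed encoding).
    Returns (network_name, path).
    """
    path = shortest_path(primary, src, dst)
    if path is not None:
        hops = zip(path, path[1:])
        if not any(frozenset(hop) in defective_links for hop in hops):
            return "primary", path
    # Route around the defective link through the second, independent network.
    return "secondary", shortest_path(secondary, src, dst)


primary = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
secondary = {"a": ["c"], "b": ["c"], "c": ["a", "b"]}
defective = {frozenset(("a", "b"))}
print(route(primary, secondary, defective, "a", "b"))
print(route(primary, secondary, defective, "b", "c"))
```

In this toy setup the traffic from a to b moves to the second network because the a-b link is marked defective, while traffic from b to c stays on the first network.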
SU-E-T-222: Computational Optimization of Monte Carlo Simulation On 4D Treatment Planning Using the Cloud Computing Technology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chow, J

Purpose: This study evaluated the efficiency of 4D lung radiation treatment planning using Monte Carlo simulation on the cloud. The EGSnrc Monte Carlo code was used for dose calculation on the 4D-CT image set. Methods: A 4D lung radiation treatment plan was created by the DOSCTP linked to the cloud, based on the Amazon elastic compute cloud platform. Dose calculation was carried out by Monte Carlo simulation on the 4D-CT image set on the cloud, and results were sent to the FFD4D image deformation program for dose reconstruction. The dependence of computing time for the treatment plan on the number of compute nodes was optimized with variations of the number of CT image sets in the breathing cycle and the dose reconstruction time of the FFD4D. Results: The dependence of computing time on the number of compute nodes was affected by the diminishing return of the number of nodes used in Monte Carlo simulation. Moreover, the performance of the 4D treatment planning could be optimized by using fewer than 10 compute nodes on the cloud. The effects of the number of image sets and dose reconstruction time on the dependence of computing time on the number of nodes were not significant when more than 15 compute nodes were used in Monte Carlo simulations. Conclusion: The issue of long computing time in 4D treatment planning, which requires Monte Carlo dose calculations in all CT image sets in the breathing cycle, can be solved using cloud computing technology. It is concluded that the optimized number of compute nodes selected in simulation should be between 5 and 15, as the dependence of computing time on the number of nodes is significant.
Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by routing through transporter nodes

DOEpatents

Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

2010-11-16

A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a destination. Some packets are constrained to be routed through respective designated transporter nodes, the automated routing strategy determining a path from a respective source node to a respective transporter node, and from a respective transporter node to a respective destination node. Preferably, the source node chooses a routing policy from among multiple possible choices, and that policy is followed by all intermediate nodes. The use of transporter nodes allows greater flexibility in routing.
Secure chaotic map based block cryptosystem with application to camera sensor networks.

PubMed

Guo, Xianfeng; Zhang, Jiashu; Khan, Muhammad Khurram; Alghathbar, Khaled

2011-01-01

Recently, Wang et al. presented an efficient logistic map based block encryption system. The encryption system employs feedback ciphertext to achieve plaintext dependence of sub-keys. Unfortunately, we discovered that their scheme is unable to withstand key stream attack. To improve its security, this paper proposes a novel chaotic map based block cryptosystem. At the same time, a secure architecture for camera sensor network is constructed. The network comprises a set of inexpensive camera sensors to capture the images, a sink node equipped with sufficient computation and storage capabilities and a data processing server. The transmission security between the sink node and the server is gained by utilizing the improved cipher. Both theoretical analysis and simulation results indicate that the improved algorithm can overcome the flaws and maintain all the merits of the original cryptosystem. In addition, computational costs and efficiency of the proposed scheme are encouraging for the practical implementation in the real environment as well as camera sensor network.
Secure Chaotic Map Based Block Cryptosystem with Application to Camera Sensor Networks

PubMed Central

Guo, Xianfeng; Zhang, Jiashu; Khan, Muhammad Khurram; Alghathbar, Khaled

2011-01-01

Recently, Wang et al. presented an efficient logistic map based block encryption system. The encryption system employs feedback ciphertext to achieve plaintext dependence of sub-keys. Unfortunately, we discovered that their scheme is unable to withstand key stream attack. To improve its security, this paper proposes a novel chaotic map based block cryptosystem. At the same time, a secure architecture for camera sensor network is constructed. The network comprises a set of inexpensive camera sensors to capture the images, a sink node equipped with sufficient computation and storage capabilities and a data processing server. The transmission security between the sink node and the server is gained by utilizing the improved cipher. Both theoretical analysis and simulation results indicate that the improved algorithm can overcome the flaws and maintain all the merits of the original cryptosystem. In addition, computational costs and efficiency of the proposed scheme are encouraging for the practical implementation in the real environment as well as camera sensor network. PMID:22319371
Broadcasting a message in a parallel computer

DOEpatents

Archer, Charles J; Faraj, Ahmad A

2013-04-16

Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
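The forwarding rules in the abstract above can be simulated in a few lines. This is a minimal sketch, assuming the parallel computer's nodes form a tree described by a parent map; the function name broadcast and the returned sender map are illustrative choices, not taken from the patent.

```python
from collections import deque


def broadcast(parent, logical_root):
    """Simulate the broadcast; returns {node: node_it_received_from}.

    parent maps every node to its parent in the physical tree; the physical
    root maps to None. The logical root may be any node of the tree.
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    received_from = {logical_root: None}
    queue = deque()

    def send(src, dst):
        received_from[dst] = src
        queue.append(dst)

    # The logical root transmits to every node directly connected to it.
    for child in children[logical_root]:
        send(logical_root, child)
    if parent[logical_root] is not None:
        send(logical_root, parent[logical_root])

    while queue:
        node = queue.popleft()
        sender = received_from[node]
        if sender == parent[node]:
            # Received from the parent: pass on to all children (none if leaf).
            for child in children[node]:
                send(node, child)
        else:
            # Received from a child: pass on to the other children, and to
            # the parent unless this node is the physical root.
            for child in children[node]:
                if child != sender:
                    send(node, child)
            if parent[node] is not None:
                send(node, parent[node])
    return received_from


tree = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
print(broadcast(tree, logical_root=3))
```

With node 3 as the logical root, the message climbs through node 1 to the physical root 0 and fans out to the remaining subtrees, so every node receives it exactly once.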

Broadcasting a message in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
The origin of spurious solutions in computational electromagnetics

NASA Technical Reports Server (NTRS)

Jiang, Bo-Nan; Wu, Jie; Povinelli, L. A.

1995-01-01

The origin of spurious solutions in computational electromagnetics, which violate the divergence equations, is deeply rooted in a misconception about the first-order Maxwell's equations and in an incorrect derivation and use of the curl-curl equations. The divergence equations must be always included in the first-order Maxwell's equations to maintain the ellipticity of the system in the space domain and to guarantee the uniqueness of the solution and/or the accuracy of the numerical solutions. The div-curl method and the least-squares method provide rigorous derivation of the equivalent second-order Maxwell's equations and their boundary conditions. The node-based least-squares finite element method (LSFEM) is recommended for solving the first-order full Maxwell equations directly. Examples of the numerical solutions by LSFEM for time-harmonic problems are given to demonstrate that the LSFEM is free of spurious solutions.
Cloud computing method for dynamically scaling a process across physical machine boundaries

DOEpatents

Gillen, Robert E.; Patton, Robert M.; Potok, Thomas E.; Rojas, Carlos C.

2014-09-02

A cloud computing platform includes first device having a graph or tree structure with a node which receives data. The data is processed by the node or communicated to a child node for processing. A first node in the graph or tree structure determines the reconfiguration of a portion of the graph or tree structure on a second device. The reconfiguration may include moving a second node and some or all of its descendant nodes. The second and descendant nodes may be copied to the second device.
Method for simultaneous overlapped communications between neighboring processors in a multiple

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1991-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
Fault tolerant features and experiments of ANTS distributed real-time system

NASA Astrophysics Data System (ADS)

Dominic-Savio, Patrick; Lo, Jien-Chung; Tufts, Donald W.

1995-01-01

The ANTS project at the University of Rhode Island introduces the concept of Active Nodal Task Seeking (ANTS) as a way to efficiently design and implement dependable, high-performance, distributed computing. This paper presents the fault tolerant design features that have been incorporated in the ANTS experimental system implementation. The results of performance evaluations and fault injection experiments are reported. The fault-tolerant version of ANTS categorizes all computing nodes into three groups. They are: the up-and-running green group, the self-diagnosing yellow group and the failed red group. Each available computing node will be placed in the yellow group periodically for a routine diagnosis. In addition, for long-life missions, ANTS uses a monitoring scheme to identify faulty computing nodes. In this monitoring scheme, the communication pattern of each computing node is monitored by two other nodes.
Architecture and method for a burst buffer using flash technology

DOEpatents

Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung

2016-03-15

A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.
Hierarchical sequencing of online social graphs

NASA Astrophysics Data System (ADS)

Andjelković, Miroslav; Tadić, Bosiljka; Maletić, Slobodan; Rajković, Milan

2015-10-01

In online communications, patterns of conduct of individual actors and use of emotions in the process can lead to a complex social graph exhibiting multilayered structure and mesoscopic communities. Using simplicial complexes representation of graphs, we investigate in-depth topology of the online social network constructed from MySpace dialogs which exhibits original community structure. A simulation of emotion spreading in this network leads to the identification of two emotion-propagating layers. Three topological measures are introduced, referred to as the structure vectors, which quantify graph's architecture at different dimension levels. Notably, structures emerging through shared links, triangles and tetrahedral faces, frequently occur and range from tree-like to maximal 5-cliques and their respective complexes. On the other hand, the structures which spread only negative or only positive emotion messages appear to have much simpler topology consisting of links and triangles. The node's structure vector represents the number of simplices at each topology level in which the node resides and the total number of such simplices determines what we define as the node's topological dimension. The presented results suggest that the node's topological dimension provides a suitable measure of the social capital which measures the actor's ability to act as a broker in compact communities, the so called Simmelian brokerage. We also generalize the results to a wider class of computer-generated networks. Investigating components of the node's vector over network layers reveals that same nodes develop different socio-emotional relations and that the influential nodes build social capital by combining their connections in different layers.
Checkpointing for a hybrid computing node

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cher, Chen-Yong

2016-03-08

According to an aspect, a method for checkpointing in a hybrid computing node includes executing a task in a processing accelerator of the hybrid computing node. A checkpoint is created in a local memory of the processing accelerator. The checkpoint includes state data to restart execution of the task in the processing accelerator upon a restart operation. Execution of the task is resumed in the processing accelerator after creating the checkpoint. The state data of the checkpoint are transferred from the processing accelerator to a main processor of the hybrid computing node while the processing accelerator is executing the task.
Managing internode data communications for an uninitialized process in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R

2014-05-20

A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
Managing internode data communications for an uninitialized process in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

2014-05-20

A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
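The buffer hand-off described in the abstract above might look roughly like the following sketch. The class and method names are hypothetical, the fixed capacity is an assumed parameter, and initialize_process is an assumption about what happens once the process starts, which the abstract does not describe.

```python
class UninitializedProcessMailbox:
    """Toy model of one MU message buffer plus its spill-over buffer."""

    def __init__(self, mu_capacity):
        self.mu_capacity = mu_capacity  # assumed fixed size of the MU buffer
        self.mu_buffer = []             # stands in for memory inside the MU
        self.temp_buffer = None         # established in main memory on demand

    def mu_receive(self, message):
        """The MU stores an incoming message for the uninitialized process."""
        self.mu_buffer.append(message)
        self._agent_check()

    def _agent_check(self):
        """Application agent: if the MU buffer is full, move its contents."""
        if len(self.mu_buffer) < self.mu_capacity:
            return
        if self.temp_buffer is None:
            self.temp_buffer = []       # establish the temporary buffer
        self.temp_buffer.extend(self.mu_buffer)
        self.mu_buffer.clear()

    def initialize_process(self):
        """Assumption: on initialization, hand over all messages in order."""
        pending = (self.temp_buffer or []) + self.mu_buffer
        self.temp_buffer, self.mu_buffer = None, []
        return pending


mailbox = UninitializedProcessMailbox(mu_capacity=2)
for number in range(1, 6):
    mailbox.mu_receive(f"msg{number}")
print(mailbox.temp_buffer, mailbox.mu_buffer)
```

Moving messages in arrival order keeps delivery order intact, so the process sees the same sequence it would have seen had the MU buffer never filled.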
Optimizing R with SparkR on a commodity cluster for biomedical research.

PubMed

Sedlmayr, Martin; Würfl, Tobias; Maier, Christian; Häberle, Lothar; Fasching, Peter; Prokosch, Hans-Ulrich; Christoph, Jan

2016-12-01

Medical researchers are challenged today by the enormous amount of data collected in healthcare. Analysis methods such as genome-wide association studies (GWAS) are often computationally intensive and thus require enormous resources to be performed in a reasonable amount of time. While dedicated clusters and public clouds may deliver the desired performance, their use requires upfront financial efforts or anonymous data, which is often not possible for preliminary or occasional tasks. We explored the possibilities to build a private, flexible cluster for processing scripts in R based on commodity, non-dedicated hardware of our department. For this, a GWAS calculation in R on a single desktop computer, a Message Passing Interface (MPI)-cluster, and a SparkR-cluster were compared with regard to performance, scalability, quality, and simplicity. The original script had a projected runtime of three years on a single desktop computer. Optimizing the script in R already yielded a significant reduction in computing time (2 weeks). By using R-MPI and SparkR, we were able to parallelize the computation and reduce the time to less than three hours (2.6 h) on already available, standard office computers. While MPI is a proven approach in high-performance clusters, it requires rather static, dedicated nodes. SparkR and its Hadoop siblings allow for a dynamic, elastic environment with automated failure handling. SparkR also scales better with the number of nodes in the cluster than MPI due to optimized data communication. R is a popular environment for clinical data analysis. The new SparkR solution offers elastic resources and allows supporting big data analysis using R even on non-dedicated resources with minimal change to the original code. To unleash the full potential, additional efforts should be invested to customize and improve the algorithms, especially with regard to data distribution. Copyright © 2016 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Hyperswitch Network For Hypercube Computer

NASA Technical Reports Server (NTRS)

Chow, Edward; Madan, Herbert; Peterson, John

1989-01-01

Data-driven dynamic switching enables high speed data transfer. Proposed hyperswitch network based on mixed static and dynamic topologies. Routing header modified in response to congestion or faults encountered as path established. Static topology meets requirement if nodes have switching elements that perform necessary routing header revisions dynamically. Hypercube topology now being implemented with switching element in each computer node aimed at designing very-richly-interconnected multicomputer system. Interconnection network connects great number of small computer nodes, using fixed hypercube topology, characterized by point-to-point links between nodes.
A Loader for Executing Multi-Binary Applications on the Thinking Machines CM-5: It's Not Just for SPMD Anymore

NASA Technical Reports Server (NTRS)

Becker, Jeffrey C.

1995-01-01

The Thinking Machines CM-5 platform was designed to run single program, multiple data (SPMD) applications, i.e., to run a single binary across all nodes of a partition, with each node possibly operating on different data. Certain classes of applications, such as multi-disciplinary computational fluid dynamics codes, are facilitated by the ability to have subsets of the partition nodes running different binaries. In order to extend the CM-5 system software to permit such applications, a multi-program loader was developed. This system is based on the dld loader which was originally developed for workstations. This paper provides a high level description of dld, and describes how it was ported to the CM-5 to provide support for multi-binary applications. Finally, it elaborates how the loader has been used to implement the CM-5 version of MPIRUN, a portable facility for running multi-disciplinary/multi-zonal MPI (Message-Passing Interface Standard) codes.
Running Batch Jobs on Peregrine | High-Performance Computing | NREL

Science.gov Websites

Using Resource Feature to Request Different Node Types Peregrine has several types of compute nodes incompatibility and get the job running. More information about requesting different node types in Peregrine is available. Queues In order to meet the needs of different types of jobs, nodes on Peregrine are available
Balancing computation and communication power in power constrained clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Piga, Leonardo; Paul, Indrani; Huang, Wei

Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
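A rough sketch of the threshold rule and the power reassignment described above follows. Everything concrete here is an assumption made for illustration: the function name reassign_power, the per-node wattage budgets, the residual draw of a gated node, and the even split of freed power among busy nodes.

```python
def reassign_power(budgets, predicted_wait, threshold_s, gated_watts=0.0):
    """Return (new_budgets, gated_nodes).

    budgets: {node: watts currently assigned}
    predicted_wait: {node: predicted idle seconds, or None if still busy}
    A node is gated when its predicted wait exceeds threshold_s.
    """
    gated = {
        node for node, wait in predicted_wait.items()
        if wait is not None and wait > threshold_s
    }
    busy = [node for node, wait in predicted_wait.items() if wait is None]
    new_budgets = dict(budgets)
    if not gated or not busy:
        return new_budgets, set()   # nothing to gate, or nobody to benefit

    freed = 0.0
    for node in gated:
        freed += budgets[node] - gated_watts
        new_budgets[node] = gated_watts
    # Assumed policy: the cluster agent splits freed power evenly.
    for node in busy:
        new_budgets[node] += freed / len(busy)
    return new_budgets, gated


budgets = {"n0": 100.0, "n1": 100.0, "n2": 100.0, "n3": 100.0}
waits = {"n0": None, "n1": None, "n2": 5.0, "n3": 0.5}
print(reassign_power(budgets, waits, threshold_s=1.0, gated_watts=10.0))
```

Node n3 has also finished early, but its predicted wait is below the threshold, so it keeps its budget. The total across all nodes stays at 400 W, which is the point of working under a fixed power cap.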
Protocol for multiple node network

NASA Technical Reports Server (NTRS)

Kirkham, Harold (Inventor)

1995-01-01

The invention is a multiple interconnected network of intelligent message-repeating remote nodes which employs an antibody recognition message termination process performed by all remote nodes and a remote node polling process performed by other nodes which are master units controlling remote nodes in respective zones of the network assigned to respective master nodes. Each remote node repeats only those messages originated in the local zone, to provide isolation among the master nodes.
Protocol for multiple node network

NASA Technical Reports Server (NTRS)

Kirkham, Harold (Inventor)

1994-01-01

The invention is a multiple interconnected network of intelligent message-repeating remote nodes which employs an antibody recognition message termination process performed by all remote nodes and a remote node polling process performed by other nodes which are master units controlling remote nodes in respective zones of the network assigned to respective master nodes. Each remote node repeats only those messages originated in the local zone, to provide isolation among the master nodes.
Message communications of particular message types between compute nodes using DMA shadow buffers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blocksome, Michael A.; Parker, Jeffrey J.

Message communications of particular message types between compute nodes using DMA shadow buffers includes: receiving a buffer identifier specifying an application buffer having a message of a particular type for transmission to a target compute node through a network; selecting one of a plurality of shadow buffers for a DMA engine on the compute node for storing the message, each shadow buffer corresponding to a slot of an injection FIFO buffer maintained by the DMA engine; storing the message in the selected shadow buffer; creating a data descriptor for the message stored in the selected shadow buffer; injecting the data descriptor into the slot of the injection FIFO buffer corresponding to the selected shadow buffer; selecting the data descriptor from the injection FIFO buffer; and transmitting the message specified by the selected data descriptor through the data communications network to the target compute node.
Parallel file system with metadata distributed across partitioned key-value store c

DOEpatents

Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron

2017-09-19

Improved techniques are provided for storing metadata associated with a plurality of sub-files associated with a single shared file in a parallel file system. The shared file is generated by a plurality of applications executing on a plurality of compute nodes. A compute node implements a Parallel Log Structured File System (PLFS) library to store at least one portion of the shared file generated by an application executing on the compute node and metadata for the at least one portion of the shared file on one or more object storage servers. The compute node is also configured to implement a partitioned data store for storing a partition of the metadata for the shared file, wherein the partitioned data store communicates with partitioned data stores on other compute nodes using a message passing interface. The partitioned data store can be implemented, for example, using Multidimensional Data Hashing Indexing Middleware (MDHIM).
Error recovery to enable error-free message transfer between nodes of a computer network

DOEpatents

Blumrich, Matthias A.; Coteus, Paul W.; Chen, Dong; Gara, Alan; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Takken, Todd; Steinmacher-Burow, Burkhard; Vranas, Pavlos M.

2016-01-26

An error-recovery method to enable error-free message transfer between nodes of a computer network. A first node of the network sends a packet to a second node of the network over a link between the nodes, and the first node keeps a copy of the packet on a sending end of the link until the first node receives acknowledgment from the second node that the packet was received without error. The second node tests the packet to determine if the packet is error free. If the packet is not error free, the second node sets a flag to mark the packet as corrupt. The second node returns acknowledgement to the first node specifying whether the packet was received with or without error. When the packet is received with error, the link is returned to a known state and the packet is sent again to the second node.
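The retain-until-acknowledged scheme in the abstract above can be sketched as follows. This is an illustration under assumptions: the abstract does not say how the second node tests a packet, so a CRC-32 checksum from Python's zlib module is used here, the link is modelled as a function that may corrupt data, and resetting the link to a known state is reduced to a comment.

```python
import zlib


def make_packet(payload: bytes) -> dict:
    """Build a packet carrying a CRC-32 of its payload (assumed error test)."""
    return {"payload": payload, "crc": zlib.crc32(payload)}


def receive(packet: dict) -> bool:
    """Second node: test the packet; the result is the acknowledgement."""
    return zlib.crc32(packet["payload"]) == packet["crc"]


def send_reliably(payload: bytes, link, max_attempts: int = 5):
    """First node: keep a copy until an error-free acknowledgement arrives."""
    retained = make_packet(payload)          # copy kept at the sending end
    for attempt in range(1, max_attempts + 1):
        delivered = link(dict(retained))
        if receive(delivered):
            return attempt, delivered["payload"]
        # Acknowledged with error: the link would be returned to a known
        # state here, then the retained copy is sent again.
    raise RuntimeError("packet could not be delivered without error")


def flaky_link(failures: int):
    """Hypothetical link that corrupts the first `failures` transmissions."""
    state = {"remaining": failures}

    def link(packet):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            data = packet["payload"]
            corrupted = bytes([data[0] ^ 0xFF]) + data[1:]
            return {"payload": corrupted, "crc": packet["crc"]}
        return packet

    return link


attempts, data = send_reliably(b"hello", flaky_link(2))
print(attempts, data)
```

Because the sender never discards its copy before a clean acknowledgement, a corrupted transmission costs only a resend, not the data.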

Peregrine System Configuration | High-Performance Computing | NREL

Science.gov Websites

nodes and storage are connected by a high speed InfiniBand network. Compute nodes are diskless with an directories are mounted on all nodes, along with a file system dedicated to shared projects. A brief processors with 64 GB of memory. All nodes are connected to the high speed Infiniband network and and a
Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

DOEpatents

Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

2012-01-10

The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
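
The transform-redistribute-transform sequence in this abstract can be illustrated with a small single-process simulation. It is a sketch under several assumptions, not the patented implementation: the array is square with one row per simulated node, a naive O(n^2) DFT stands in for a real FFT routine, and the function names are hypothetical. The shuffled send list only mimics the random-order "all-to-all" step; no real network is involved.

```python
import cmath
import random

def dft(values):
    """Naive one-dimensional DFT, standing in for a real FFT routine."""
    n = len(values)
    return [sum(values[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def all_to_all_transpose(rows_by_node, rng):
    """Redistribute so that node c ends up holding column c.

    Every node sends one element to every node; the order of the sends
    is shuffled to mimic a random-order all-to-all exchange.
    """
    n = len(rows_by_node)
    sends = [(src, dst) for src in range(n) for dst in range(n)]
    rng.shuffle(sends)
    cols_by_node = [[None] * n for _ in range(n)]
    for src, dst in sends:
        cols_by_node[dst][src] = rows_by_node[src][dst]
    return cols_by_node

def distributed_fft2(matrix, seed=0):
    """Two-dimensional DFT of a square matrix, one row per simulated node."""
    rng = random.Random(seed)
    n = len(matrix)
    after_first = [dft(row) for row in matrix]         # first 1-D transform
    columns = all_to_all_transpose(after_first, rng)   # second-dimension layout
    after_second = [dft(col) for col in columns]       # second 1-D transform
    # after_second[c][r] is output element (r, c); transpose back for the caller.
    return [[after_second[c][r] for c in range(n)] for r in range(n)]

def direct_dft2(matrix):
    """Reference 2-D DFT taken straight from the definition, for checking."""
    n = len(matrix)
    return [[sum(matrix[a][b] * cmath.exp(-2j * cmath.pi * (u * a + v * b) / n)
                 for a in range(n) for b in range(n))
             for v in range(n)] for u in range(n)]
```

Because every element has a fixed destination slot, the shuffle changes only the order of the exchange and not its result, so `distributed_fft2` agrees with `direct_dft2` up to rounding error.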
Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

DOEpatents

Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY]

2008-01-01

The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
Disruption Tolerant Network Technology Flight Validation Report: DINET

NASA Technical Reports Server (NTRS)

Jones, Ross M.

2009-01-01

In October and November of 2008, the Jet Propulsion Laboratory installed and tested essential elements of Delay/Disruption Tolerant Networking (DTN) technology on the Deep Impact spacecraft. This experiment, called Deep Impact Network Experiment (DINET), was performed in close cooperation with the EPOXI project which has responsibility for the spacecraft. During DINET some 300 images were transmitted from the JPL nodes to the spacecraft. Then, they were automatically forwarded from the spacecraft back to the JPL nodes, exercising DTN's bundle origination, transmission, acquisition, dynamic route computation, congestion control, prioritization, custody transfer, and automatic retransmission procedures, both on the spacecraft and on the ground, over a period of 27 days. All transmitted bundles were successfully received, without corruption. The DINET experiment demonstrated DTN readiness for operational use in space missions.
Disruption Tolerant Network Technology Flight Validation Report: DINET

NASA Technical Reports Server (NTRS)

Jones, Ross M.

2009-01-01

In October and November of 2008, the Jet Propulsion Laboratory installed and tested essential elements of Delay/Disruption Tolerant Networking (DTN) technology on the Deep Impact spacecraft. This experiment, called Deep Impact Network Experiment (DINET), was performed in close cooperation with the EPOXI project which has responsibility for the spacecraft. During DINET some 300 images were transmitted from the JPL nodes to the spacecraft. Then, they were automatically forwarded from the spacecraft back to the JPL nodes, exercising DTN's bundle origination, transmission, acquisition, dynamic route computation, congestion control, prioritization, custody transfer, and automatic retransmission procedures, both on the spacecraft and on the ground, over a period of 27 days. All transmitted bundles were successfully received, without corruption. The DINET experiment demonstrated DTN readiness for operational use in space missions.
Need for speed: An optimized gridding approach for spatially explicit disease simulations.

PubMed

Sellman, Stefan; Tsao, Kimberly; Tildesley, Michael J; Brommesson, Peter; Webb, Colleen T; Wennergren, Uno; Keeling, Matt J; Lindström, Tom

2018-04-01

Numerical models for simulating outbreaks of infectious diseases are powerful tools for informing surveillance and control strategy decisions. However, large-scale spatially explicit models can be limited by the amount of computational resources they require, which poses a problem when multiple scenarios need to be explored to provide policy recommendations. We introduce an easily implemented method that can reduce computation time in a standard Susceptible-Exposed-Infectious-Removed (SEIR) model without introducing any further approximations or truncations. It is based on a hierarchical infection process that operates on entire groups of spatially related nodes (cells in a grid) in order to efficiently filter out large volumes of susceptible nodes that would otherwise have required expensive calculations. After the filtering of the cells, only a subset of the nodes that were originally at risk are then evaluated for actual infection. The increase in efficiency is sensitive to the exact configuration of the grid, and we describe a simple method to find an estimate of the optimal configuration of a given landscape as well as a method to partition the landscape into a grid configuration. To investigate its efficiency, we compare the introduced methods to other algorithms and evaluate computation time, focusing on simulated outbreaks of foot-and-mouth disease (FMD) on the farm population of the USA, the UK and Sweden, as well as on three randomly generated populations with varying degree of clustering. The introduced method provided up to 500 times faster calculations than pairwise computation, and consistently performed as well or better than other available methods. This enables large scale, spatially explicit simulations such as for the entire continental USA without sacrificing realism or predictive power.
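
The idea of filtering whole grid cells with an over-estimated probability and then evaluating only the survivors can be sketched as a rejection ("thinning") step. The code below is a simplified variant written for illustration and is not claimed to be the authors' algorithm: the kernel, the value of `beta`, the cell layout and all function names are assumptions. Each cell carries a lower bound on its distance to the infectious node, which yields an upper bound `p_max` on the infection probability of any node in that cell.

```python
import math
import random

def kernel(d):
    """Hypothetical distance kernel: risk falls off with distance."""
    return 1.0 / (1.0 + d * d)

def infection_prob(src, dst, beta=0.5):
    """Per-pair infection probability for one time step (illustrative form)."""
    d = math.hypot(src[0] - dst[0], src[1] - dst[1])
    return 1.0 - math.exp(-beta * kernel(d))

def bernoulli_hits(n, p, rng):
    """Positions in range(n) that pass independent Bernoulli(p) trials.

    Jumps geometrically from one hit to the next, so the cost scales with
    the number of hits rather than with n.
    """
    if p <= 0.0:
        return []
    if p >= 1.0:
        return list(range(n))
    hits, i = [], -1
    log_q = math.log(1.0 - p)
    while True:
        u = 1.0 - rng.random()                 # uniform on (0, 1]
        i += 1 + int(math.log(u) / log_q)      # skip the failures before the next hit
        if i >= n:
            return hits
        hits.append(i)

def cell_filtered_step(source, cells, nodes, rng, beta=0.5):
    """Filter by cell first, then evaluate only the surviving candidates.

    `cells` maps a cell id to (lower bound on distance to source, node indices).
    """
    infected = set()
    for min_dist, members in cells.values():
        p_max = 1.0 - math.exp(-beta * kernel(min_dist))   # bound for the whole cell
        for k in bernoulli_hits(len(members), p_max, rng):
            i = members[k]
            # Accept with p_true / p_max so the overall probability is p_true.
            if rng.random() < infection_prob(source, nodes[i], beta) / p_max:
                infected.add(i)
    return infected
```

Since a node passes the filter with probability `p_max` and is then accepted with probability `p_true / p_max`, each node is infected with exactly `p_true`, which is why no further approximation is introduced as long as the distance bound for each cell is valid.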
Need for speed: An optimized gridding approach for spatially explicit disease simulations

PubMed Central

Tildesley, Michael J.; Brommesson, Peter; Webb, Colleen T.; Wennergren, Uno; Lindström, Tom

2018-01-01

Numerical models for simulating outbreaks of infectious diseases are powerful tools for informing surveillance and control strategy decisions. However, large-scale spatially explicit models can be limited by the amount of computational resources they require, which poses a problem when multiple scenarios need to be explored to provide policy recommendations. We introduce an easily implemented method that can reduce computation time in a standard Susceptible-Exposed-Infectious-Removed (SEIR) model without introducing any further approximations or truncations. It is based on a hierarchical infection process that operates on entire groups of spatially related nodes (cells in a grid) in order to efficiently filter out large volumes of susceptible nodes that would otherwise have required expensive calculations. After the filtering of the cells, only a subset of the nodes that were originally at risk are then evaluated for actual infection. The increase in efficiency is sensitive to the exact configuration of the grid, and we describe a simple method to find an estimate of the optimal configuration of a given landscape as well as a method to partition the landscape into a grid configuration. To investigate its efficiency, we compare the introduced methods to other algorithms and evaluate computation time, focusing on simulated outbreaks of foot-and-mouth disease (FMD) on the farm population of the USA, the UK and Sweden, as well as on three randomly generated populations with varying degree of clustering. The introduced method provided up to 500 times faster calculations than pairwise computation, and consistently performed as well or better than other available methods. This enables large scale, spatially explicit simulations such as for the entire continental USA without sacrificing realism or predictive power. PMID:29624574
Send-side matching of data communications messages

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-06-17

Send-side matching of data communications messages in a distributed computing system comprising a plurality of compute nodes, including: issuing by a receiving node to source nodes a receive message that specifies receipt of a single message to be sent from any source node, the receive message including message matching information, a specification of a hardware-level mutual exclusion device, and an identification of a receive buffer; matching by two or more of the source nodes the receive message with pending send messages in the two or more source nodes; operating by one of the source nodes having a matching send message the mutual exclusion device, excluding messages from other source nodes with matching send messages and identifying to the receiving node the source node operating the mutual exclusion device; and sending to the receiving node from the source node operating the mutual exclusion device a matched pending message.
Locating hardware faults in a parallel computer

DOEpatents

Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

2010-04-13

Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.
Dynamic resource allocation scheme for distributed heterogeneous computer systems

NASA Technical Reports Server (NTRS)

Liu, Howard T. (Inventor); Silvester, John A. (Inventor)

1991-01-01

This invention relates to resource allocation in computer systems, and more particularly, to a method and associated apparatus for shortening response time and improving efficiency of a heterogeneous distributed networked computer system by reallocating the jobs queued up for busy nodes to idle, or less-busy nodes. In accordance with the algorithm (SIDA for short), the load-sharing is initiated by the server device in a manner such that extra overhead is not imposed on the system during heavily-loaded conditions. The algorithm employed in the present invention uses a dual-mode, server-initiated approach. Jobs are transferred from heavily burdened nodes (i.e., over a high threshold limit) to low burdened nodes at the initiation of the receiving node when: (1) a job finishes at a node which is burdened below a pre-established threshold level, or (2) a node is idle for a period of time as established by a wakeup timer at the node. The invention uses a combination of the local queue length and the local service rate ratio at each node as the workload indicator.
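
A receiver-initiated transfer of this kind might look roughly like the sketch below. The abstract does not give the exact workload formula, so the indicator used here (queue length divided by service rate) is an assumption, as are the threshold values and the names `Node` and `pull_work`.

```python
from collections import deque

class Node:
    def __init__(self, name, service_rate, jobs=()):
        self.name = name
        self.service_rate = service_rate       # jobs per unit time
        self.queue = deque(jobs)

    def load(self):
        """Assumed workload indicator: expected time to drain the local queue."""
        return len(self.queue) / self.service_rate

def pull_work(receiver, nodes, low=1.0, high=4.0):
    """Receiver-initiated transfer: a lightly loaded node asks for one job.

    Intended to be called when `receiver` finishes a job or when its wakeup
    timer fires. Returns the donor's name, or None if nothing was moved.
    """
    if receiver.load() >= low:
        return None                            # receiver is not lightly loaded
    donors = [n for n in nodes if n is not receiver and n.load() > high]
    if not donors:
        return None                            # nobody exceeds the high threshold
    donor = max(donors, key=Node.load)         # pick the most heavily loaded node
    receiver.queue.append(donor.queue.pop())   # move its most recently queued job
    return donor.name
```

Because only lightly loaded nodes ever start a transfer, heavily loaded nodes spend no effort searching for help, which matches the stated aim of avoiding extra overhead under heavy load.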
Intranode data communications in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

2014-01-07

Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.
Intranode data communications in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

2013-07-23

Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.
Performing an allreduce operation using shared memory

DOEpatents

Archer, Charles J [Rochester, MN; Dozsa, Gabor [Ardsley, NY; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN]

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
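
The job status object and the "next available core takes the next work unit" pattern can be sketched with threads standing in for cores. This is an illustrative sketch only: the class `JobStatus`, the chunking scheme and the choice of summation as the reduction are assumptions, not details taken from the patent.

```python
import threading

class JobStatus:
    """Shared record of which work units of the allreduce are still unclaimed."""
    def __init__(self, num_units):
        self._next = 0
        self._num_units = num_units
        self._lock = threading.Lock()

    def claim_next(self):
        """Hand out the next unclaimed work unit, or None when all are taken."""
        with self._lock:
            if self._next >= self._num_units:
                return None
            unit = self._next
            self._next += 1
            return unit

def shared_memory_allreduce(buffers, num_cores=4, chunk=2):
    """Element-wise sum over `buffers`, written back into every buffer.

    Each work unit reduces one chunk of indices; available cores (threads
    here) keep claiming units until none remain.
    """
    length = len(buffers[0])
    job = JobStatus((length + chunk - 1) // chunk)

    def core():
        while True:
            unit = job.claim_next()
            if unit is None:
                return
            for i in range(unit * chunk, min((unit + 1) * chunk, length)):
                total = sum(buf[i] for buf in buffers)
                for buf in buffers:
                    buf[i] = total

    threads = [threading.Thread(target=core) for _ in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers
```

Each index belongs to exactly one work unit, so the units never write to the same element and only the claim step needs a lock.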
Performing an allreduce operation using shared memory

DOEpatents

Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E

2014-06-10

Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
Distributed computation of graphics primitives on a transputer network

NASA Technical Reports Server (NTRS)

Ellis, Graham K.

1988-01-01

A method is developed for distributing the computation of graphics primitives on a parallel processing network. Off-the-shelf transputer boards are used to perform the graphics transformations and scan-conversion tasks that would normally be assigned to a single transputer based display processor. Each node in the network performs a single graphics primitive computation. Frequently requested tasks can be duplicated on several nodes. The results indicate that the current distribution of commands on the graphics network shows a performance degradation when compared to the graphics display board alone. A change to more computation per node for every communication (perform more complex tasks on each node) may cause the desired increase in throughput.
Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

DOEpatents

Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

2010-03-16

A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.
Efficient DV-HOP Localization for Wireless Cyber-Physical Social Sensing System: A Correntropy-Based Neural Network Learning Scheme

PubMed Central

Xu, Yang; Luo, Xiong; Wang, Weiping; Zhao, Wenbing

2017-01-01

Integrating wireless sensor network (WSN) into the emerging computing paradigm, e.g., cyber-physical social sensing (CPSS), has witnessed a growing interest, and WSN can serve as a social network while receiving more attention from the social computing research field. Then, the localization of sensor nodes has become an essential requirement for many applications over WSN. Meanwhile, the localization information of unknown nodes has strongly affected the performance of WSN. The received signal strength indication (RSSI) as a typical range-based algorithm for positioning sensor nodes in WSN could achieve accurate location with hardware saving, but is sensitive to environmental noises. Moreover, the original distance vector hop (DV-HOP) as an important range-free localization algorithm is simple, inexpensive and not related to the environment factors, but performs poorly when lacking anchor nodes. Motivated by these, various improved DV-HOP schemes with RSSI have been introduced, and we present a new neural network (NN)-based node localization scheme, named RHOP-ELM-RCC, through the use of DV-HOP, RSSI and a regularized correntropy criterion (RCC)-based extreme learning machine (ELM) algorithm (ELM-RCC). Firstly, the proposed scheme employs both RSSI and DV-HOP to evaluate the distances between nodes to enhance the accuracy of distance estimation at a reasonable cost. Then, with the help of ELM featured with a fast learning speed with a good generalization performance and minimal human intervention, a single hidden layer feedforward network (SLFN) on the basis of ELM-RCC is used to implement the optimization task for obtaining the location of unknown nodes. Since the RSSI may be influenced by the environmental noises and may bring estimation error, the RCC instead of the mean square error (MSE) estimation, which is sensitive to noises, is exploited in ELM. Hence, it may make the estimation more robust against outliers. Additionally, the least square estimation (LSE) in ELM is replaced by the half-quadratic optimization technique. Simulation results show that our proposed scheme outperforms other traditional localization schemes. PMID:28085084
Embedding global barrier and collective in torus network with each node combining input from receivers according to class map for output to senders

DOEpatents

Chen, Dong; Coteus, Paul W; Eisley, Noel A; Gara, Alan; Heidelberger, Philip; Senger, Robert M; Salapura, Valentina; Steinmacher-Burow, Burkhard; Sugawara, Yutaka; Takken, Todd E

2013-08-27

Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.
High Fidelity Simulations of Unsteady Flow through Turbopumps and Flowliners

NASA Technical Reports Server (NTRS)

Kiris, Cetin C.; Kwak, Dochan; Chan, William; Housman, Jeff

2006-01-01

High fidelity computations were carried out to analyze the orbiter LH2 feedline flowliner. Computations were performed on the Columbia platform which is a 10,240-processor supercluster consisting of 20 Altix nodes with 512 processors each. Various computational models were used to characterize the unsteady flow features in the turbopump, including the orbiter Low-Pressure-Fuel-Turbopump (LPFTP) inducer, the orbiter manifold and a test article used to represent the manifold. Unsteady flow originating from the orbiter LPFTP inducer is one of the major contributors to the high frequency cyclic loading that results in high cycle fatigue damage to the gimbal flowliners just upstream of the LPFTP. The flow fields for the orbiter manifold and representative test article are computed and analyzed for similarities and differences. The incompressible Navier-Stokes flow solver INS3D, based on the artificial compressibility method, was used to compute the flow of liquid hydrogen in each test article.
Community Cloud Computing

NASA Astrophysics Data System (ADS)

Marinos, Alexandros; Briscoe, Gerard

Cloud Computing is rising fast, with its data centres growing at an unprecedented rate. However, this has come with concerns over privacy, efficiency at the expense of resilience, and environmental sustainability, because of the dependence on Cloud vendors such as Google, Amazon and Microsoft. Our response is an alternative model for the Cloud conceptualisation, providing a paradigm for Clouds in the community, utilising networked personal computers for liberation from the centralised vendor model. Community Cloud Computing (C3) offers an alternative architecture, created by combining the Cloud with paradigms from Grid Computing, principles from Digital Ecosystems, and sustainability from Green Computing, while remaining true to the original vision of the Internet. It is more technically challenging than Cloud Computing, having to deal with distributed computing issues, including heterogeneous nodes, varying quality of service, and additional security constraints. However, these are not insurmountable challenges, and with the need to retain control over our digital lives and the potential environmental consequences, it is a challenge we must pursue.

Three layers multi-granularity OCDM switching system based on learning-stateful PCE

NASA Astrophysics Data System (ADS)

Wang, Yubao; Liu, Yanfei; Sun, Hao

2017-10-01

In the existing three-layer multi-granularity OCDM switching system (TLMG-OCDMSS), F-LSP, L-LSP and OC-LSP can be bundled as switching granularities. In a CPU-intensive network, each node must not only compute paths but also bundle the switching granularities, so the load on a single node is heavy. A node can be paralyzed when its traffic becomes too heavy, which seriously degrades the performance of the whole network. Introducing a stateful PCE (S-PCE) effectively addresses these problems. A PCE is composed of two parts, namely the path computation element and the database (TED and LSPDB); after a path computation client (PCC) sends it a path computation request, it returns the result of the path computation to the PCC. In this way, the burden of distributed path computation at each node is reduced. In this paper, we propose the concept of a Learning PCE (L-PCE), which uses the existing LSPDB as the data source for the PCE's learning. By this means, we can simplify path computation and reduce network delay, thereby improving network performance.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2013-11-12

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.
Deadlock-free class routes for collective communications embedded in a multi-dimensional torus network

DOEpatents

Chen, Dong; Eisley, Noel A.; Steinmacher-Burow, Burkhard; Heidelberger, Philip

2013-01-29

A computer implemented method and a system for routing data packets in a multi-dimensional computer network. The method comprises routing a data packet among nodes along one dimension towards a root node, each node having input and output communication links, said root node not having any outgoing uplinks, and determining at each node if the data packet has reached a predefined coordinate for the dimension or an edge of the subrectangle for the dimension, and if the data packet has reached the predefined coordinate for the dimension or the edge of the subrectangle for the dimension, determining if the data packet has reached the root node, and if the data packet has not reached the root node, routing the data packet among nodes along another dimension towards the root node.
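
Routing one dimension at a time toward a root can be sketched as follows. This simplified illustration assumes a plain mesh with integer coordinates and treats the root's own coordinate as the "predefined coordinate" in each dimension; torus wraparound links and subrectangle edges, which the abstract also covers, are deliberately left out. The function names are hypothetical.

```python
def next_hop(node, root, dim_order=(0, 1, 2)):
    """Return the neighbour one step closer to `root`, or None at the root.

    Dimensions are corrected one at a time in `dim_order`; a packet moves
    on to the next dimension only once it matches the root's coordinate
    in the current one.
    """
    for dim in dim_order:
        if node[dim] != root[dim]:
            step = 1 if root[dim] > node[dim] else -1
            hop = list(node)
            hop[dim] += step
            return tuple(hop)
    return None

def route_to_root(node, root, dim_order=(0, 1, 2)):
    """Follow next_hop until the root is reached and return the full path."""
    path = [node]
    while True:
        hop = next_hop(path[-1], root, dim_order)
        if hop is None:
            return path
        path.append(hop)
```

For example, a packet starting at `(0, 2, 1)` with the root at `(2, 0, 1)` first travels along dimension 0 and then along dimension 1. Since every node uses the same dimension order, all packets heading for the root turn between dimensions in the same order, which is the usual argument for why dimension-ordered routes avoid cyclic waits.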
Anatomical classification of breast sentinel lymph nodes using computed tomography-lymphography.

PubMed

Fujita, Tamaki; Miura, Hiroyuki; Seino, Hiroko; Ono, Shuichi; Nishi, Takashi; Nishimura, Akimasa; Hakamada, Kenichi; Aoki, Masahiko

2018-05-03

To evaluate the anatomical classification and location of breast sentinel lymph nodes, preoperative computed tomography-lymphography examinations were retrospectively reviewed for sentinel lymph nodes in 464 cases clinically diagnosed with node-negative breast cancer between July 2007 and June 2016. Anatomical classification was performed based on the numbers of lymphatic routes and sentinel lymph nodes, the flow direction of lymphatic routes, and the location of sentinel lymph nodes. Of the 464 cases reviewed, anatomical classification could be performed in 434 (93.5 %). The largest number of cases showed single route/single sentinel lymph node (n = 296, 68.2 %), followed by multiple routes/multiple sentinel lymph nodes (n = 59, 13.6 %), single route/multiple sentinel lymph nodes (n = 53, 12.2 %), and multiple routes/single sentinel lymph node (n = 26, 6.0 %). Classification based on the flow direction of lymphatic routes showed that 429 cases (98.8 %) had outward flow on the superficial fascia toward axillary lymph nodes, whereas classification based on the height of sentinel lymph nodes showed that 323 cases (74.4 %) belonged to the upper pectoral group of axillary lymph nodes. There was wide variation in the number of lymphatic routes and their branching patterns and in the number, location, and direction of flow of sentinel lymph nodes. It is clinically very important to preoperatively understand the anatomical morphology of lymphatic routes and sentinel lymph nodes for optimal treatment of breast cancer, and computed tomography-lymphography is suitable for this purpose.
Fixed node diffusion Monte Carlo using a genetic algorithm: a study of the CO-(4)He(N) complex, N = 1…10.

PubMed

Ramilowski, Jordan A; Farrelly, David

2012-06-14

The diffusion Monte Carlo (DMC) method is a widely used algorithm for computing both ground and excited states of many-particle systems; for states without nodes the algorithm is numerically exact. In the presence of nodes approximations must be introduced, for example, the fixed-node approximation. Recently we have developed a genetic algorithm (GA) based approach which allows the computation of nodal surfaces on-the-fly [Ramilowski and Farrelly, Phys. Chem. Chem. Phys., 2010, 12, 12450]. Here GA-DMC is applied to the computation of rovibrational states of CO-(4)He(N) complexes with N≤ 10. These complexes have been the subject of recent high resolution microwave and millimeter-wave studies which traced the onset of microscopic superfluidity in a doped (4)He droplet, one atom at a time, up to N = 10 [Surin et al., Phys. Rev. Lett., 2008, 101, 233401; Raston et al., Phys. Chem. Chem. Phys., 2010, 12, 8260]. The frequencies of the a-type (microwave) series, which correlate with end-over-end rotation in the CO-(4)He dimer, decrease from N = 1 to 3 and then smoothly increase. This signifies the transition from a molecular complex to a quantum solvated system. The frequencies of the b-type (millimeter-wave) series, which evolves from free rotation of the rigid CO molecule, initially increase from N = 0 to N∼ 6 before starting to decrease with increasing N. An interesting feature of the b-type series, originally observed in the high resolution infra-red (IR) experiments of Tang and McKellar [J. Chem. Phys., 2003, 119, 754] is that, for N = 7, two lines are observed. The GA-DMC algorithm is found to be in good agreement with experimental results and possibly detects the small (∼0.7 cm(-1)) splitting in the b-series line at N = 7. Advantages and disadvantages of GA-DMC are discussed.
Network Coding for Function Computation

ERIC Educational Resources Information Center

Appuswamy, Rathinakumar

2011-01-01

In this dissertation, the following "network computing problem" is considered. Source nodes in a directed acyclic network generate independent messages and a single receiver node computes a target function f of the messages. The objective is to maximize the average number of times f can be computed per network usage, i.e., the "computing…
Computed tomographic atlas for the new international lymph node map for lung cancer: A radiation oncologist perspective.

PubMed

Lynch, Rod; Pitson, Graham; Ball, David; Claude, Line; Sarrut, David

2013-01-01

To develop a reproducible definition for each mediastinal lymph node station based on the new TNM classification for lung cancer. This paper proposes an atlas using the new international lymph node map used in the seventh edition of the TNM classification for lung cancer. Four radiation oncologists and 1 diagnostic radiologist were involved in the project to put forward a reproducible radiologic description for the lung lymph node stations. The International Association for the Study of Lung Cancer lymph node definitions for stations 1 to 11 have been described and illustrated on axial computed tomographic scan images using a certified radiotherapy planning system. This atlas will assist both diagnostic radiologists and radiation oncologists in accurately defining the lymph node stations on computed tomographic scan in patients diagnosed with lung cancer. Copyright © 2013 American Society for Radiation Oncology. Published by Elsevier Inc. All rights reserved.
Parallel checksumming of data chunks of a shared data object using a log-structured file system

DOEpatents

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-09-06

Checksum values are generated and used to verify the data integrity. A client executing in a parallel computing system stores a data chunk to a shared data object on a storage node in the parallel computing system. The client determines a checksum value for the data chunk; and provides the checksum value with the data chunk to the storage node that stores the shared object. The data chunk can be stored on the storage node with the corresponding checksum value as part of the shared object. The storage node may be part of a Parallel Log-Structured File System (PLFS), and the client may comprise, for example, a Log-Structured File System client on a compute node or burst buffer. The checksum value can be evaluated when the data chunk is read from the storage node to verify the integrity of the data that is read.
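The write and read paths described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes CRC-32 (Python's `zlib.crc32`) as the checksum, and `StorageNode`, `client_write` and `client_read` are hypothetical names standing in for a PLFS storage node and client, neither of which the abstract specifies in detail.

```python
import zlib


class StorageNode:
    """Hypothetical in-memory stand-in for a storage node (assumption)."""

    def __init__(self):
        # Maps (object name, chunk offset) -> (chunk bytes, checksum).
        self.chunks = {}

    def write(self, obj, offset, data, checksum):
        # Store the chunk together with the checksum supplied by the client.
        self.chunks[(obj, offset)] = (data, checksum)

    def read(self, obj, offset):
        return self.chunks[(obj, offset)]


def client_write(node, obj, offset, data):
    # The client computes the checksum and sends it along with the chunk.
    node.write(obj, offset, data, zlib.crc32(data))


def client_read(node, obj, offset):
    # Recompute the checksum on read and compare it with the stored value.
    data, stored = node.read(obj, offset)
    if zlib.crc32(data) != stored:
        raise IOError(f"checksum mismatch for {obj} at offset {offset}")
    return data
```

Because each client checksums only its own chunks, the work is spread across all writers of the shared object rather than concentrated on the storage node.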
Comparing Networks from a Data Analysis Perspective

NASA Astrophysics Data System (ADS)

Li, Wei; Yang, Jing-Yu

To probe network characteristics, two predominant ways of network comparison are global property statistics and subgraph enumeration. However, they suffer from limited information and exhaustible computing. Here, we present an approach to compare networks from the perspective of data analysis. Initially, the approach projects each node of original network as a high-dimensional data point, and the network is seen as clouds of data points. Then the dispersion information of the principal component analysis (PCA) projection of the generated data clouds can be used to distinguish networks. We applied this node projection method to the yeast protein-protein interaction networks and the Internet Autonomous System networks, two types of networks with several similar higher properties. The method can efficiently distinguish one from the other. The identical result of different datasets from independent sources also indicated that the method is a robust and universal framework.
Bad data packet capture device

DOEpatents

Chen, Dong; Gara, Alan; Heidelberger, Philip; Vranas, Pavlos

2010-04-20

An apparatus and method for capturing data packets for analysis on a network computing system includes a sending node and a receiving node connected by a bi-directional communication link. The sending node sends a data transmission to the receiving node on the bi-directional communication link, and the receiving node receives the data transmission and verifies the data transmission to determine valid data and invalid data and verify retransmissions of invalid data as corresponding valid data. A memory device communicates with the receiving node for storing the invalid data and the corresponding valid data. A computing node communicates with the memory device and receives and performs an analysis of the invalid data and the corresponding valid data received from the memory device.
Calculus domains modelled using an original bool algebra based on polygons

NASA Astrophysics Data System (ADS)

Oanta, E.; Panait, C.; Raicu, A.; Barhalescu, M.; Axinte, T.

2016-08-01

Analytical and numerical computer-based models require analytical definitions of the calculus domains. The paper presents a method to model a calculus domain based on a Boolean algebra that uses solid and hollow polygons. The general calculus relations of the geometrical characteristics that are widely used in mechanical engineering are tested on several shapes of the calculus domain in order to draw conclusions regarding the most effective methods to discretize the domain. The paper also tests the results of several commercial CAD software applications that are able to compute the geometrical characteristics, and interesting conclusions are drawn. The tests also targeted the accuracy of the results versus the number of nodes on the curved boundary of the cross section. The study required the development of original software consisting of more than 1700 lines of code. In comparison with other calculus methods, discretization using convex polygons is a simpler approach. Moreover, this method does not lead to large numbers as the spline approximation did, which required special software packages offering multiple, arbitrary precision. The knowledge resulting from this study may be used to develop complex computer-based models in engineering.
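One way to picture a domain built from solid and hollow polygons is the sketch below, which computes area and centroid with the shoelace formula and lets hollow polygons contribute negatively. This is an illustration under assumptions, not the authors' software: the function names and the `(vertices, is_solid)` representation are hypothetical, and the algebra in the paper may differ.

```python
def polygon_area_centroid(vertices):
    """Area and centroid of a simple polygon via the shoelace formula."""
    a = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    # Dividing by the signed area keeps the centroid independent of
    # whether the vertices are listed clockwise or counter-clockwise.
    return abs(a), (cx / (6.0 * a), cy / (6.0 * a))


def domain_area_centroid(parts):
    """Combine (vertices, is_solid) parts; hollow parts count negatively."""
    total = sx = sy = 0.0
    for vertices, is_solid in parts:
        area, (cx, cy) = polygon_area_centroid(vertices)
        sign = 1.0 if is_solid else -1.0
        total += sign * area
        sx += sign * area * cx
        sy += sign * area * cy
    return total, (sx / total, sy / total)
```

For example, a 4 x 4 solid square with a 2 x 2 hollow square cut from one corner can be passed as two parts, the second with `is_solid=False`.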
Matching pursuit parallel decomposition of seismic data

NASA Astrophysics Data System (ADS)

Li, Chuanhui; Zhang, Fanchang

2017-07-01

In order to improve the computation speed of matching pursuit decomposition of seismic data, a matching pursuit parallel algorithm is designed in this paper. In every iteration, we pick a fixed number of envelope peaks from the current signal according to the number of compute nodes and distribute them evenly among the compute nodes, which search for the optimal Morlet wavelets in parallel. Using parallel computer systems and the Message Passing Interface, the parallel algorithm exploits the advantages of parallel computing to significantly improve the computation speed of the matching pursuit decomposition, and it also has good expandability. In addition, having every compute node search for only one optimal Morlet wavelet in every iteration is the most efficient implementation.
Hypercluster - Parallel processing for computational mechanics

NASA Technical Reports Server (NTRS)

Blech, Richard A.

1988-01-01

An account is given of the development status, performance capabilities and implications for further development of NASA-Lewis' testbed 'hypercluster' parallel computer network, in which multiple processors communicate through a shared memory. Processors have local as well as shared memory; the hypercluster is expanded in the same manner as the hypercube, with processor clusters replacing the normal single processor node. The NASA-Lewis machine has three nodes with a vector personality and one node with a scalar personality. Each of the vector nodes uses four board-level vector processors, while the scalar node uses four general-purpose microcomputer boards.
Method and apparatus for analyzing error conditions in a massively parallel computer system by identifying anomalous nodes within a communicator set

DOEpatents

Gooding, Thomas Michael [Rochester, MN

2011-04-19

An analytical mechanism for a massively parallel computer system automatically analyzes data retrieved from the system, and identifies nodes which exhibit anomalous behavior in comparison to their immediate neighbors. Preferably, anomalous behavior is determined by comparing call-return stack tracebacks for each node, grouping like nodes together, and identifying neighboring nodes which do not themselves belong to the group. A node, not itself in the group, having a large number of neighbors in the group, is a likely locality of error. The analyzer preferably presents this information to the user by sorting the neighbors according to number of adjoining members of the group.
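The grouping and ranking steps described here can be sketched as follows. This is a simplified illustration, not the patented analyzer: it assumes tracebacks are available as tuples of frame names, that the neighbor relation is given as a dictionary, and that the largest traceback group is the group of interest. The function name `rank_suspect_nodes` is hypothetical.

```python
from collections import defaultdict


def rank_suspect_nodes(tracebacks, neighbors):
    """Rank nodes outside the largest traceback group by in-group neighbors.

    tracebacks: dict node -> tuple of stack frames (hypothetical format)
    neighbors:  dict node -> iterable of adjacent nodes
    """
    # Group nodes whose call-return stack tracebacks are identical.
    groups = defaultdict(set)
    for node, trace in tracebacks.items():
        groups[trace].add(node)

    # Assumption: the largest group is the one of interest.
    group = max(groups.values(), key=len)

    # For every node outside the group, count its neighbors inside the group.
    counts = {
        node: sum(1 for nb in neighbors.get(node, ()) if nb in group)
        for node in tracebacks
        if node not in group
    }
    # Nodes with the most in-group neighbors come first.
    return sorted(
        (n for n in counts if counts[n] > 0),
        key=lambda n: (-counts[n], n),
    )
```

A node that sits at the top of the returned list is surrounded by nodes that all stopped in the same place while it did not, which matches the abstract's notion of a likely locality of error.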
The efficacy of preoperative positron emission tomography-computed tomography (PET-CT) for detection of lymph node metastasis in cervical and endometrial cancer: clinical and pathological factors influencing it.

PubMed

Nogami, Yuya; Banno, Kouji; Irie, Haruko; Iida, Miho; Kisu, Iori; Masugi, Yohei; Tanaka, Kyoko; Tominaga, Eiichiro; Okuda, Shigeo; Murakami, Koji; Aoki, Daisuke

2015-01-01

We studied the diagnostic performance of (18)F-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography in cervical and endometrial cancers with particular focus on lymph node metastases. Seventy patients with cervical cancer and 53 with endometrial cancer were imaged with (18)F-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography before lymphadenectomy. We evaluated the diagnostic performance of (18)F-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography using the final pathological diagnoses as the gold standard. We calculated the sensitivity, specificity, positive predictive value and negative predictive value of (18)F-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography. In cervical cancer, the results evaluated by cases were 33.3, 92.7, 55.6 and 83.6%, respectively. When evaluated by the area of lymph nodes, the results were 30.6, 98.9, 55.0 and 97.0%, respectively. As for endometrial cancer, the results evaluated by cases were 50.0, 93.9, 40.0 and 95.8%, and by area of lymph nodes, 45.0, 99.4, 64.3 and 98.5%, respectively. The limitations of the efficacy were identified by analyzing it by the region of the lymph node, the size of the metastatic node, the histological type of tumor in cervical cancer and the prevalence of lymph node metastasis. The efficacy of positron emission tomography/computed tomography regarding the detection of lymph node metastasis in cervical and endometrial cancer is not established and has limitations associated with the region of the lymph node, the size of the metastatic lesion in the lymph node and the pathological type of the primary tumor. The indication for the imaging and the interpretation of the results require consideration for each case by the pretest probability based on the information obtained preoperatively. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
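For readers unfamiliar with the four measures reported in this abstract, the sketch below shows how they follow from a 2 x 2 table of true/false positives and negatives. The counts used are an assumption: the abstract does not give them, and they were chosen only because they sum to 70 patients and are consistent with the case-based cervical cancer percentages.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }


# Assumed counts (not given in the abstract), summing to 70 patients.
metrics = diagnostic_metrics(tp=5, fp=4, fn=10, tn=51)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

The low sensitivity alongside high specificity illustrates why the abstract stresses pretest probability: positive and negative predictive values depend on how common metastasis is in the group being imaged.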
Queues on a Dynamically Evolving Graph

NASA Astrophysics Data System (ADS)

Mandjes, Michel; Starreveld, Nicos J.; Bekker, René

2018-04-01

This paper considers a population process on a dynamically evolving graph, which can be alternatively interpreted as a queueing network. The queues are of infinite-server type, entailing that at each node all customers present are served in parallel. The links that connect the queues have the special feature that they are unreliable, in the sense that their status alternates between 'up' and 'down'. If a link between two nodes is down, with a fixed probability each of the clients attempting to use that link is lost; otherwise the client remains at the origin node and reattempts using the link (and jumps to the destination node when it finds the link restored). For these networks we present the following results: (a) a system of coupled partial differential equations that describes the joint probability generating function corresponding to the queues' time-dependent behavior (and a system of ordinary differential equations for its stationary counterpart), (b) an algorithm to evaluate the (time-dependent and stationary) moments, and procedures to compute user-perceived performance measures which facilitate the quantification of the impact of the links' outages, (c) a diffusion limit for the joint queue length process. We include explicit results for a series of relevant special cases, such as tandem networks and symmetric fully connected networks.
Cytological Diagnosis of an Uncommon High Grade Malignant Thyroid Tumour: A Case Report.

PubMed

Nagpal, Ruchi; Kaushal, Manju; Kumar, Sawan

2017-07-01

Anaplastic Thyroid Carcinoma (ATC) is a relatively uncommon, highly malignant tumour originating from the follicular cells of the thyroid gland and has a poor prognosis. It accounts for 2% to 5% of all thyroid carcinomas and patients typically present with a rapidly growing anterior neck mass with aggressive symptoms. A 53-year-old male presented with a diffuse neck swelling measuring 8x6 cm and a right cervical lymph node measuring 2x2 cm of one month's duration, associated with dyspepsia and dyspnoea. Ultrasound and Contrast Enhanced Computed Tomography (CECT) of the neck revealed an enlarged right lobe of the thyroid and multiple enlarged cervical lymph nodes with soft tissue density nodules in bilateral lungs. Fine Needle Aspiration (FNA) from the swelling revealed giant cell, spindle cell and squamoid patterns. Focal areas showed follicular epithelial cells arranged in a repeated microfollicular pattern suggesting an underlying follicular neoplasm. FNAC smears from the lymph node also revealed similar findings. Based on the cytomorphological and radiological findings, a final diagnosis of ATC probably arising from underlying follicular carcinoma with cervical lymph node and lung metastasis was given. FNAC leads to prompt and definitive diagnosis, so that therapy can be initiated as soon as possible for a better outcome. Multimodality therapy (surgery, external beam radiation, and chemotherapy) is the mainstay of treatment.
Cytological Diagnosis of an Uncommon High Grade Malignant Thyroid Tumour: A Case Report

PubMed Central

Kaushal, Manju; Kumar, Sawan

2017-01-01

Anaplastic Thyroid Carcinoma (ATC) is a relatively uncommon, highly malignant tumour originating from the follicular cells of the thyroid gland and has a poor prognosis. It accounts for 2% to 5% of all thyroid carcinomas and patients typically present with a rapidly growing anterior neck mass with aggressive symptoms. A 53-year-old male presented with a diffuse neck swelling measuring 8x6 cm and a right cervical lymph node measuring 2x2 cm of one month's duration, associated with dyspepsia and dyspnoea. Ultrasound and Contrast Enhanced Computed Tomography (CECT) of the neck revealed an enlarged right lobe of the thyroid and multiple enlarged cervical lymph nodes with soft tissue density nodules in bilateral lungs. Fine Needle Aspiration (FNA) from the swelling revealed giant cell, spindle cell and squamoid patterns. Focal areas showed follicular epithelial cells arranged in a repeated microfollicular pattern suggesting an underlying follicular neoplasm. FNAC smears from the lymph node also revealed similar findings. Based on the cytomorphological and radiological findings, a final diagnosis of ATC probably arising from underlying follicular carcinoma with cervical lymph node and lung metastasis was given. FNAC leads to prompt and definitive diagnosis, so that therapy can be initiated as soon as possible for a better outcome. Multimodality therapy (surgery, external beam radiation, and chemotherapy) is the mainstay of treatment. PMID:28892908
Global tree network for computing structures enabling global processing operations

DOEpatents

Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

2010-01-19

A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node. The global tree network is configurable to provide global barrier and interrupt functionality in an asynchronous or synchronized manner, and is physically and logically partitionable.
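The two collective operations named in this abstract, reduction upstream and broadcast downstream, can be pictured with a small software simulation. This is only a sketch under assumptions: the patent describes hardware routers, whereas here the virtual tree is a hypothetical dictionary mapping each node to its children, and `reduce_up` and `broadcast_down` are illustrative names.

```python
def reduce_up(children, values, root, op):
    """Combine values from the leaves up to the root (reduction upstream)."""
    result = values[root]
    for child in children.get(root, ()):
        # Each subtree is reduced first, then folded into its parent.
        result = op(result, reduce_up(children, values, child, op))
    return result


def broadcast_down(children, root, message):
    """Deliver a message from the root to every node (broadcast downstream)."""
    received = {root: message}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            received[child] = message
            stack.append(child)
    return received
```

Chaining the two, by reducing to the root and then broadcasting the result, gives every node the global value, which is how an all-reduce or a barrier can be layered on a tree.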
ERDC MSRC Resource. High Performance Computing for the Warfighter. Spring 2006

DTIC Science & Technology

2006-01-01

named Ruby, and the HP/Compaq SC45, named Emerald, continue to add their unique sparkle to the ERDC MSRC computer infrastructure. ERDC invited the...configuration on B-52H purchased additional memory for the login nodes so that this part of the solution process could be done as a preprocessing step. On...application and system services. Of the service nodes, 10 are login nodes and 23 are input/output (I/O) server nodes for the Lustre file system (i.e., the

A Search Strategy of Level-Based Flooding for the Internet of Things

PubMed Central

Qiu, Tie; Ding, Yanhong; Xia, Feng; Ma, Honglian

2012-01-01

This paper deals with the query problem in the Internet of Things (IoT). Flooding is an important query strategy. However, original flooding is prone to cause heavy network loads. To address this problem, we propose a variant of flooding, called Level-Based Flooding (LBF). With LBF, the whole network is divided into several levels according to the distances (i.e., hops) between the sensor nodes and the sink node. The sink node knows the level information of each node. Query packets are broadcast in the network according to the levels of nodes. Upon receiving a query packet, sensor nodes decide how to process it according to the percentage of neighbors that have processed it. When the target node receives the query packet, it sends its data back to the sink node via random walk. We show by extensive simulations that the performance of LBF in terms of cost and latency is much better than that of original flooding, and LBF can be used in IoT of different scales. PMID:23112594
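The level assignment that LBF relies on, where each node's level is its hop distance from the sink, can be computed with a breadth-first search. The sketch below covers only that step, assuming the topology is known as an adjacency dictionary; the forwarding rule based on the percentage of neighbors that have processed a packet is not reproduced because the abstract does not give its details.

```python
from collections import deque


def assign_levels(adjacency, sink):
    """Return {node: hop distance from the sink} using breadth-first search."""
    levels = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in levels:
                # First visit is along a shortest path, so this is the level.
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels
```

Nodes that cannot reach the sink never appear in the result, which makes disconnected sensors easy to detect.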
Dense, Efficient Chip-to-Chip Communication at the Extremes of Computing

ERIC Educational Resources Information Center

Loh, Matthew

2013-01-01

The scalability of CMOS technology has driven computation into a diverse range of applications across the power consumption, performance and size spectra. Communication is a necessary adjunct to computation, and whether this is to push data from node-to-node in a high-performance computing cluster or from the receiver of wireless link to a neural…
Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

PubMed

Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

2018-04-20

A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
Waggle: A Framework for Intelligent Attentive Sensing and Actuation

NASA Astrophysics Data System (ADS)

Sankaran, R.; Jacob, R. L.; Beckman, P. H.; Catlett, C. E.; Keahey, K.

2014-12-01

Advances in sensor-driven computation and computationally steered sensing will greatly enable future research in fields including environmental and atmospheric sciences. We will present "Waggle," an open-source hardware and software infrastructure developed with two goals: (1) reducing the separation and latency between sensing and computing and (2) improving the reliability and longevity of sensing-actuation platforms in challenging and costly deployments. Inspired by "deep-space probe" systems, the Waggle platform design includes features that can support longitudinal studies, deployments with varying communication links, and remote management capabilities. Waggle lowers the barrier for scientists to incorporate real-time data from their sensors into their computations and to manipulate the sensors or provide feedback through actuators. A standardized software and hardware design allows quick addition of new sensors/actuators and associated software in the nodes and enables them to be coupled with computational codes both in situ and on external compute infrastructure. The Waggle framework currently drives the deployment of two observational systems - a portable and self-sufficient weather platform for study of small-scale effects in Chicago's urban core and an open-ended distributed instrument in Chicago that aims to support several research pursuits across a broad range of disciplines including urban planning, microbiology and computer science. Built around open-source software, hardware, and Linux OS, the Waggle system comprises two components - the Waggle field-node and Waggle cloud-computing infrastructure. The Waggle field-node affords a modular, scalable, fault-tolerant, secure, and extensible platform for hosting sensors and actuators in the field. It supports in situ computation and data storage, and integration with cloud-computing infrastructure.
The Waggle cloud infrastructure is designed with the goal of scaling to several hundreds of thousands of Waggle nodes. It supports aggregating data from sensors hosted by the nodes, staging computation, relaying feedback to the nodes and serving data to end-users. We will discuss the Waggle design principles and their applicability to various observational research pursuits, and demonstrate its capabilities.
Scalable computing for evolutionary genomics.

PubMed

Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert

2012-01-01

Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. 
Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
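The "poor man's parallelization" mentioned in this abstract, running whole unmodified programs side by side as separate processes, can be sketched with the Python standard library. This is an illustration, not one of BioNode's configuration scripts: the placeholder jobs are trivial Python one-liners standing in for real tools, and a production pipeline would normally hand the commands to a job scheduler instead.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(cmd):
    """Run one whole program as a separate process and capture its output."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def run_all(commands, max_workers=4):
    """Run independent commands concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


# Placeholder jobs (assumption): each stands in for an unmodified tool run.
jobs = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(4)]

if __name__ == "__main__":
    print(run_all(jobs))
```

Threads are used here only to wait on the child processes; the actual work runs in the separate programs, so no change to the programs themselves is needed.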
Dedicated heterogeneous node scheduling including backfill scheduling

DOEpatents

Wood, Robert R [Livermore, CA; Eckert, Philip D [Livermore, CA; Hommes, Gregg [Pleasanton, CA

2006-07-25

A method and system for job backfill scheduling of dedicated heterogeneous nodes in a multi-node computing environment. Heterogeneous nodes are grouped into homogeneous node sub-pools. For each sub-pool, a free node schedule (FNS) is created to chart the number of free nodes over time. For each prioritized job, the FNS of the sub-pools having nodes usable by that job is used to determine the earliest time range (ETR) capable of running the job. Once the ETR is determined for a particular job, the job is scheduled to run in that ETR. If the ETR determined for a lower priority job (LPJ) has a start time earlier than a higher priority job (HPJ), then the LPJ is scheduled in that ETR if it would not disturb the anticipated start times of any HPJ previously scheduled for a future time. Thus, efficient utilization and throughput of such computing environments may be increased by utilizing resources otherwise remaining idle.
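The interaction between the free node schedule and the earliest time range can be sketched for the simplest case. The example assumes a single homogeneous sub-pool and discrete time slots, neither of which is stated in the abstract, and the function names are hypothetical. Jobs are placed in priority order and their nodes are reserved immediately, so a lower priority job can only start earlier in capacity that higher priority jobs left idle.

```python
def earliest_time_range(free, nodes, duration):
    """Return the first start slot where `nodes` are free for `duration` slots."""
    for start in range(len(free) - duration + 1):
        if all(free[t] >= nodes for t in range(start, start + duration)):
            return start
    return None


def backfill_schedule(free, jobs):
    """Schedule jobs (highest priority first) against a free node schedule.

    free: list of free-node counts per time slot for one sub-pool
    jobs: list of (name, nodes, duration), already sorted by priority
    """
    free = list(free)  # work on a copy so the caller's schedule is untouched
    starts = {}
    for name, nodes, duration in jobs:
        start = earliest_time_range(free, nodes, duration)
        if start is None:
            continue  # does not fit within the planning horizon
        for t in range(start, start + duration):
            free[t] -= nodes  # reserve the nodes for this job
        starts[name] = start
    return starts
```

With a schedule of `[2, 2, 4, 4, 4, 4]` free nodes, a high priority job needing 4 nodes for 2 slots must wait until slot 2, while a lower priority job needing 2 nodes for 2 slots can run in slots 0 and 1 without delaying it.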
Support Vector Machines Model of Computed Tomography for Assessing Lymph Node Metastasis in Esophageal Cancer with Neoadjuvant Chemotherapy.

PubMed

Wang, Zhi-Long; Zhou, Zhi-Guo; Chen, Ying; Li, Xiao-Ting; Sun, Ying-Shi

The aim of this study was to diagnose lymph node metastasis of esophageal cancer using a support vector machine model based on computed tomography. A total of 131 esophageal cancer patients with preoperative chemotherapy and radical surgery were included. Various indicators (tumor thickness, tumor length, tumor CT value, total number of lymph nodes, and long axis and short axis sizes of the largest lymph node) on CT images before and after neoadjuvant chemotherapy were recorded. A support vector machine model based on these CT indicators was built to predict lymph node metastasis. The support vector machine model diagnosed lymph node metastasis better than the preoperative short axis size of the largest lymph node on CT. The areas under the receiver operating characteristic curves were 0.887 and 0.705, respectively. The support vector machine model of CT images can help diagnose lymph node metastasis in esophageal cancer with preoperative chemotherapy.
Development and Implementation of Production Area of Agricultural Product Data Collection System Based on Embedded System

NASA Astrophysics Data System (ADS)

Xi, Lei; Guo, Wei; Che, Yinchao; Zhang, Hao; Wang, Qiang; Ma, Xinming

To solve problems in detecting the origin of agricultural products, this paper presents an embedded data terminal, applies a middleware approach, and provides a reusable long-range two-way data exchange module between business equipment and data acquisition systems. The system consists of data collection nodes and data center nodes. Data collection nodes are built around the embedded data terminal NetBoxII and consist of a data acquisition interface layer, a control information layer and a data exchange layer. They read data from different front-end acquisition devices and pack the data into TCP packets to exchange it with the data center nodes over the available physical link (GPRS / CDMA / Ethernet). A data center node consists of a data exchange layer, a data persistence layer, and a business interface layer, which persist the collected data and provide standardized data to business systems based on the mapping between collected data and business data. Relying on public communications networks, the system can establish an information flow between the origin certification site and the management center, can collect, store and process data in real time between the origin certification site and the databases of the certification organization, and can meet the needs of long-range detection of agricultural origin.
Studying an Eulerian Computer Model on Different High-performance Computer Platforms and Some Applications

NASA Astrophysics Data System (ADS)

Georgiev, K.; Zlatev, Z.

2010-11-01

The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The computational domain of the model covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If the DEM is to be applied using fine grids, its discretization leads to a huge computational problem. This implies that a model such as DEM can only be run on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, some comparison results from running this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.) are presented. The main idea in the parallel version of DEM is the domain partitioning approach. The effective use of the cache and hierarchical memories of modern computers, as well as the performance, speed-ups and efficiency achieved, are discussed. The parallel code of DEM, created using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are presented in brief.
Spiking network simulation code for petascale computers.

PubMed

Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

2014-01-01

Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.
Spiking network simulation code for petascale computers

PubMed Central

Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M.; Plesser, Hans E.; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

2014-01-01

Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today. PMID:25346682
Primary pulmonary spindle cell tumour (haemangiopericytoma) in a dog.

PubMed

Vignoli, M; Buchholz, J; Morandi, F; Laddaga, E; Brunetti, B; Rossi, F; Terragni, R; Sarli, G

2008-10-01

Haemangiopericytoma is a soft tissue sarcoma believed to originate from pericytes. These tumours are commonly located on the skin and subcutaneous tissue of dogs and are most commonly found on the limbs. To the authors' knowledge, primary lung haemangiopericytomas have not been previously described in dogs. This case report describes the diagnostic evaluation and treatment of a primary haemangiopericytoma of the lung in a 10-year-old male, neutered, Siberian husky dog. Staging of the tumour was performed using a computed tomography scan of the thorax and a computed tomography-guided fine-needle aspiration biopsy of the lesion. Treatment was a right caudal lobectomy from a right lateral approach. No regional lymph node changes were noted on computed tomography or intraoperative assessments. Histopathology confirmed a spindle cell tumour that stained positive for vimentin and negative for desmin and S-100.
Noguchi uses laptop computer in the Node 2 during Expedition 22

NASA Image and Video Library

2010-01-19

ISS022-E-030641 (19 Jan. 2010) --- Japan Aerospace Exploration Agency (JAXA) astronaut Soichi Noguchi, Expedition 22 flight engineer, uses a computer in the Harmony node of the International Space Station.
Nasopharyngeal Carcinoma with Cystic Cervical Metastasis Masquerading as Branchial Cleft Cyst: A Potential Pitfall in Diagnosis and Management.

PubMed

Sai-Guan, Lum; Min-Han, Kong; Kah-Wai, Ngan; Mohamad-Yunus, Mohd-Razif

2017-03-01

Most metastatic lymph nodes from head and neck malignancy are solid. Cystic nodes are found in 33% - 61% of carcinomas arising from Waldeyer's ring, of which only 1.8% - 8% originate from the nasopharynx. Some cystic cervical metastases were initially presumed to be branchial cleft cyst. This case report aims to highlight the unusual presentation of cystic cervical metastasis secondary to nasopharyngeal carcinoma in a young adult. The histopathology, radiological features and management strategy are discussed. A 36-year-old man presented with a solitary cystic cervical swelling, initially diagnosed as branchial cleft cyst. Fine needle aspiration yielded 18 ml of straw-coloured fluid. During cytological examination no atypical cells were observed. Computed tomography of the neck showed a heterogeneous mass with multiseptation medial to the sternocleidomastoid muscle. Histopathological examination of the mass, post excision, revealed a metastatic lymph node. A suspicious mucosal lesion at the nasopharynx was detected after repeated thorough head and neck examinations and the biopsy result confirmed undifferentiated nasopharyngeal carcinoma. Cystic cervical metastasis may occur in young patients under 40 years. The primary tumour may not be obvious during initial presentation because it mimics benign branchial cleft cyst clinically. Retrospective review of the computed tomography images revealed features that were not characteristic of simple branchial cleft cyst. The inadequacy of assessment and interpretation had led to the error in diagnosis and subsequent management. Metastatic head and neck lesion must be considered in a young adult with a cystic neck mass.
An S_N Algorithm for Modern Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Randal Scott

2016-08-29

LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
Multi-hop routing mechanism for reliable sensor computing.

PubMed

Chen, Jiann-Liang; Ma, Yi-Wei; Lai, Chia-Ping; Hu, Chia-Cheng; Huang, Yueh-Min

2009-01-01

Current research on routing in wireless sensor computing concentrates on increasing the service lifetime, enabling scalability for large numbers of sensors and supporting fault tolerance for battery exhaustion and broken nodes. A sensor node is naturally exposed to various sources of unreliable communication channels and node failures. Sensor nodes have many failure modes, and each failure degrades the network performance. This work develops a novel mechanism, called Reliable Routing Mechanism (RRM), based on a hybrid cluster-based routing protocol to specify the best reliable routing path for sensor computing. Table-driven intra-cluster routing and on-demand inter-cluster routing are combined by changing the relationship between clusters for sensor computing. Applying a reliable routing mechanism in sensor computing can improve routing reliability, maintain low packet loss, minimize management overhead and reduce energy consumption. Simulation results indicate that the reliability of the proposed RRM mechanism is around 25% higher than that of the Dynamic Source Routing (DSR) and ad hoc On-demand Distance Vector routing (AODV) mechanisms.
Multi-petascale highly efficient parallel supercomputer

DOEpatents

Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

2015-07-14

A Multi-Petascale Highly Efficient Parallel Supercomputer provides 100 petaOPS-scale computing at decreased cost, power and footprint, and allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.
A universal computer control system for motors

NASA Technical Reports Server (NTRS)

Szakaly, Zoltan F. (Inventor)

1991-01-01

A control system for a multi-motor system such as a space telerobot, having a remote computational node and a local computational node interconnected with one another by a high speed data link, is described. A Universal Computer Control System (UCCS) for the telerobot is located at each node. Each node is provided with a multibus computer system which is characterized by a plurality of processors with all processors being connected to a common bus, and including at least one command processor. The command processor communicates over the bus with a plurality of joint controller cards. A plurality of direct current torque motors, of the type used in telerobot joints and telerobot hand-held controllers, are connected to the controller cards and respond to digital control signals from the command processor. Essential motor operating parameters are sensed by analog sensing circuits and the sensed analog signals are converted to digital signals for storage at the controller cards where such signals can be read during an address read/write cycle of the command processor.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.

PubMed

Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

2011-09-07

Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed.
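The speed-up figure quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the "efficiency" it prints compares the cloud run against the local computer rather than against a single cloud node, so it should not be read as a strict parallel efficiency.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of the single-machine run time to the parallel run time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, nodes: int) -> float:
    """Speed-up divided by the number of nodes used."""
    return speedup(serial_seconds, parallel_seconds) / nodes


# Figures taken from the abstract: 2.58 h locally, 3.3 min on 100 cloud nodes.
local_seconds = 2.58 * 3600
cloud_seconds = 3.3 * 60

s = speedup(local_seconds, cloud_seconds)
e = efficiency(local_seconds, cloud_seconds, nodes=100)
print(f"speed-up = {s:.1f}x, speed-up per node = {e:.2f}")
```

Rounded to the nearest integer, the ratio matches the 47× reported in the abstract.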
Sentinel nodes identified by computed tomography-lymphography accurately stage the axilla in patients with breast cancer

PubMed Central

2013-01-01

Background Sentinel node biopsy often results in the identification and removal of multiple nodes as sentinel nodes, although most of these nodes could be non-sentinel nodes. This study investigated whether computed tomography-lymphography (CT-LG) can distinguish sentinel nodes from non-sentinel nodes and whether sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. Methods This study included 184 patients with breast cancer and clinically negative nodes. Contrast agent was injected interstitially. The location of sentinel nodes was marked on the skin surface using a CT laser light navigator system. Lymph nodes located just under the marks were first removed as sentinel nodes. Then, all dyed nodes or all hot nodes were removed. Results The mean number of sentinel nodes identified by CT-LG was significantly lower than that of dyed and/or hot nodes removed (1.1 vs 1.8, p <0.0001). Twenty-three (12.5%) patients had ≥2 sentinel nodes identified by CT-LG removed, whereas 94 (51.1%) of patients had ≥2 dyed and/or hot nodes removed (p <0.0001). Pathological evaluation demonstrated that 47 (25.5%) of 184 patients had metastasis to at least one node. All 47 patients demonstrated metastases to at least one of the sentinel nodes identified by CT-LG. Conclusions CT-LG can distinguish sentinel nodes from non-sentinel nodes, and sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. Successful identification of sentinel nodes using CT-LG may facilitate image-based diagnosis of metastasis, possibly leading to the omission of sentinel node biopsy. PMID:24321242

The predictive value of single-photon emission computed tomography/computed tomography for sentinel lymph node localization in head and neck cutaneous malignancy.

PubMed

Remenschneider, Aaron K; Dilger, Amanda E; Wang, Yingbing; Palmer, Edwin L; Scott, James A; Emerick, Kevin S

2015-04-01

Preoperative localization of sentinel lymph nodes in head and neck cutaneous malignancies can be aided by single-photon emission computed tomography/computed tomography (SPECT/CT); however, its true predictive value for identifying lymph nodes intraoperatively remains unquantified. This study aims to understand the sensitivity, specificity, and positive and negative predictive values of SPECT/CT in sentinel lymph node biopsy for cutaneous malignancies of the head and neck. Blinded retrospective imaging review with comparison to intraoperative gamma probe confirmed sentinel lymph nodes. A consecutive series of patients with a head and neck cutaneous malignancy underwent preoperative SPECT/CT followed by sentinel lymph node biopsy with a gamma probe. Two nuclear medicine physicians, blinded to clinical data, independently reviewed each SPECT/CT. Activity within radiographically defined nodal basins was recorded and compared to intraoperative gamma probe findings. Sensitivity, specificity, and negative and positive predictive values were calculated with subgroup stratification by primary tumor site. Ninety-two imaging reads were performed on 47 patients with cutaneous malignancy who underwent SPECT/CT followed by sentinel lymph node biopsy. Overall sensitivity was 73%, specificity 92%, positive predictive value 54%, and negative predictive value 96%. The predictive ability of SPECT/CT to identify the basin or an adjacent basin containing the single hottest node was 92%. SPECT/CT overestimated uptake by an average of one nodal basin. In the head and neck, SPECT/CT has higher reliability for primary lesions of the eyelid, scalp, and cheek. SPECT/CT has high sensitivity, specificity, and negative predictive value, but may overestimate relevant nodal basins in sentinel lymph node biopsy. © 2014 The American Laryngological, Rhinological and Otological Society, Inc.
OPEX: Optimized Eccentricity Computation in Graphs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henderson, Keith

2011-11-14

Real-world graphs have many properties of interest, but often these properties are expensive to compute. We focus on eccentricity, radius and diameter in this work. These properties are useful measures of the global connectivity patterns in a graph. Unfortunately, computing eccentricity for all nodes is O(n^2) for a graph with n nodes. We present OPEX, a novel combination of optimizations which improves computation time of these properties by orders of magnitude in real-world experiments on graphs of many different sizes. We run OPEX on graphs with up to millions of links. OPEX gives either exact results or bounded approximations, unlike its competitors, which give probabilistic approximations or sacrifice node-level information (eccentricity) to compute graph-level information (diameter).
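For orientation, the unoptimized baseline that makes these properties expensive can be sketched as one breadth-first search per node. This is not OPEX itself, whose optimizations the abstract does not detail; it is a minimal sketch assuming a connected, undirected, unweighted graph given as an adjacency dictionary (the function name and the sample graph are illustrative only). Its cost grows as n·(n + m) for n nodes and m links.

```python
from collections import deque


def eccentricities(adj):
    """Return {node: eccentricity} using one BFS per node.

    adj maps each node to an iterable of its neighbours; the graph is
    assumed to be connected, undirected and unweighted.
    """
    ecc = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Eccentricity is the distance to the farthest reachable node.
        ecc[source] = max(dist.values())
    return ecc


# A path graph 0 - 1 - 2 - 3 as a small example.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ecc = eccentricities(path)
radius = min(ecc.values())    # smallest eccentricity
diameter = max(ecc.values())  # largest eccentricity
print(ecc, radius, diameter)
```

Radius and diameter fall out as the minimum and maximum eccentricity, which is why node-level results are strictly more informative than graph-level ones.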
Effecting a broadcast with an allreduce operation on a parallel computer

DOEpatents

Almasi, Gheorghe; Archer, Charles J.; Ratterman, Joseph D.; Smith, Brian E.

2010-11-02

A parallel computer comprises a plurality of compute nodes organized into at least one operational group for collective parallel operations. Each compute node is assigned a unique rank and is coupled for data communications through a global combining network. One compute node is assigned to be a logical root. A send buffer and a receive buffer are configured. Each element of a contribution of the logical root in the send buffer is contributed. One or more zeros corresponding to a size of the element are injected. An allreduce operation with a bitwise OR using the element and the injected zeros is performed. The result for the allreduce operation is determined and stored in each receive buffer.
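The idea rests on the identity x OR 0 = x: if only the logical root contributes real data and every other contribution is zeros of the same size, a bitwise-OR allreduce leaves the root's data in every receive buffer. The sketch below simulates this in a single process, without any real messaging library; the function name and the reading that the non-root ranks supply the zeros are assumptions made for illustration.

```python
from functools import reduce
from operator import or_


def broadcast_via_allreduce(root_rank, root_data, num_ranks):
    """Simulate a broadcast built from a bitwise-OR allreduce.

    The root's send buffer holds the data; every other rank's send buffer
    holds zeros of the same length. Returns one receive buffer per rank.
    """
    send_buffers = [
        list(root_data) if rank == root_rank else [0] * len(root_data)
        for rank in range(num_ranks)
    ]
    # Element-wise OR across all ranks' contributions.
    combined = [reduce(or_, column) for column in zip(*send_buffers)]
    # An allreduce delivers the combined result to every rank.
    return [list(combined) for _ in range(num_ranks)]


receive_buffers = broadcast_via_allreduce(
    root_rank=2, root_data=[0x0F, 0xA5, 7], num_ranks=4
)
print(receive_buffers)
```

Every rank ends up with the root's three elements, regardless of which rank was chosen as root.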
Embedding global and collective in a torus network with message class map based tree path selection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Dong; Coteus, Paul W.; Eisley, Noel A.

Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.
INDIRECT COMPUTED TOMOGRAPHIC LYMPHOGRAPHY FOR ILIOSACRAL LYMPHATIC MAPPING IN A COHORT OF DOGS WITH ANAL SAC GLAND ADENOCARCINOMA: TECHNIQUE DESCRIPTION.

PubMed

Majeski, Stephanie A; Steffey, Michele A; Fuller, Mark; Hunt, Geraldine B; Mayhew, Philipp D; Pollard, Rachel E

2017-05-01

Sentinel lymph node mapping can help to direct surgical oncologic staging and metastatic disease detection in patients with complex lymphatic pathways. We hypothesized that indirect computed tomographic lymphography (ICTL) with a water-soluble iodinated contrast agent would successfully map lymphatic pathways of the iliosacral lymphatic center in dogs with anal sac gland carcinoma, providing a potential preoperative method for iliosacral sentinel lymph node identification in dogs. Thirteen adult dogs diagnosed with anal sac gland carcinoma were enrolled in this prospective, pilot study, and ICTL was performed via peritumoral contrast injection with serial caudal abdominal computed tomography scans for iliosacral sentinel lymph node identification. Technical and descriptive details for ICTL were recorded, including patient positioning, total contrast injection volume, timing of contrast visualization, and sentinel lymph nodes and lymphatic pathways identified. Indirect CT lymphography identified lymphatic pathways and sentinel lymph nodes in 12/13 cases (92%). Identified sentinel lymph nodes were ipsilateral to the anal sac gland carcinoma in 8/12 and contralateral to the anal sac gland carcinoma in 4/12 cases. Sacral, internal iliac, and medial iliac lymph nodes were identified as sentinel lymph nodes, and patterns were widely variable. Patient positioning and timing of imaging may impact successful sentinel lymph node identification. Positioning in supported sternal recumbency is recommended. Results indicate that ICTL may be a feasible technique for sentinel lymph node identification in dogs with anal sac gland carcinoma and offer preliminary data to drive further investigation of iliosacral lymphatic metastatic patterns using ICTL and sentinel lymph node biopsy. © 2017 American College of Veterinary Radiology.
Collectively loading programs in a multiple program multiple data environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
ALMA Correlator Real-Time Data Processor

NASA Astrophysics Data System (ADS)

Pisano, J.; Amestica, R.; Perez, J.

2005-10-01

The design of a real-time Linux application utilizing Real-Time Application Interface (RTAI) to process real-time data from the radio astronomy correlator for the Atacama Large Millimeter Array (ALMA) is described. The correlator is a custom-built digital signal processor which computes the cross-correlation function of two digitized signal streams. ALMA will have 64 antennas with 2080 signal streams each with a sample rate of 4 giga-samples per second. The correlator's aggregate data output will be 1 gigabyte per second. The software is defined by hard deadlines with high input and processing data rates, while requiring interfaces to non real-time external computers. The designed computer system - the Correlator Data Processor or CDP, consists of a cluster of 17 SMP computers, 16 of which are compute nodes plus a master controller node all running real-time Linux kernels. Each compute node uses an RTAI kernel module to interface to a 32-bit parallel interface which accepts raw data at 64 megabytes per second in 1 megabyte chunks every 16 milliseconds. These data are transferred to tasks running on multiple CPUs in hard real-time using RTAI's LXRT facility to perform quantization corrections, data windowing, FFTs, and phase corrections for a processing rate of approximately 1 GFLOPS. Highly accurate timing signals are distributed to all seventeen computer nodes in order to synchronize them to other time-dependent devices in the observatory array. RTAI kernel tasks interface to the timing signals providing sub-millisecond timing resolution. The CDP interfaces, via the master node, to other computer systems on an external intra-net for command and control, data storage, and further data (image) processing. The master node accesses these external systems utilizing ALMA Common Software (ACS), a CORBA-based client-server software infrastructure providing logging, monitoring, data delivery, and intra-computer function invocation. 
The software is being developed in tandem with the correlator hardware which presents software engineering challenges as the hardware evolves. The current status of this project and future goals are also presented.
Sequoia Messaging Rate Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Friedley, Andrew

2008-01-22

The purpose of this benchmark is to measure the maximal message rate of a single compute node. The first num_cores ranks are expected to reside on the 'core' compute node for which message rate is being tested. After that, the next num_nbors ranks are neighbors for the first core rank, the next set of num_nbors ranks are neighbors for the second core rank, and so on. For example, testing an 8-core node (num_cores = 8) with 4 neighbors (num_nbors = 4) requires 8 + 8 * 4 = 40 ranks. The first 8 of those 40 ranks are expected to be on the 'core' node being benchmarked, while the rest of the ranks are on separate nodes.
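The rank layout described above can be sketched in a few lines. The parameter names num_cores and num_nbors come from the abstract; the two helper functions are hypothetical and are not taken from the benchmark's source code.

```python
def total_ranks(num_cores: int, num_nbors: int) -> int:
    """Core ranks plus num_nbors neighbour ranks for each core rank."""
    return num_cores + num_cores * num_nbors


def neighbor_ranks(core_rank: int, num_cores: int, num_nbors: int) -> list:
    """Ranks acting as neighbours of the given core rank.

    Core ranks occupy 0..num_cores-1; neighbour blocks follow in order.
    """
    start = num_cores + core_rank * num_nbors
    return list(range(start, start + num_nbors))


print(total_ranks(8, 4))        # the 8-core, 4-neighbour example
print(neighbor_ranks(0, 8, 4))  # neighbours of the first core rank
print(neighbor_ranks(7, 8, 4))  # neighbours of the last core rank
```

With the example values from the abstract, the last neighbour block ends at rank 39, which is consistent with 40 ranks in total.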
Indoor A* Pathfinding Through an Octree Representation of a Point Cloud

NASA Astrophysics Data System (ADS)

Rodenberg, O. B. P. M.; Verbree, E.; Zlatanova, S.

2016-10-01

There is a growing demand for 3D indoor pathfinding applications. Researched in the field of robotics during the last decades of the 20th century, these methods focussed on 2D navigation. Nowadays we would like to have the ability to help people navigate inside buildings or send a drone inside a building when this is too dangerous for people. What these examples have in common is that an object with a certain geometry needs to find an optimal collision-free path between a start and goal point. This paper presents a new workflow for pathfinding through an octree representation of a point cloud. We applied the following steps: 1) the point cloud is processed so it fits best in an octree; 2) during the octree generation the interior empty nodes are filtered and further processed; 3) for each interior empty node the distance to the closest occupied node directly under it is computed; 4) a network graph is computed for all empty nodes; 5) the A* pathfinding algorithm is conducted. This workflow takes into account the connectivity for each node to all possible neighbours (face, edge and vertex and all sizes). Besides, a collision avoidance system is pre-processed in two steps: first, the clearance of each empty node is computed, and then the maximal crossing value between two empty neighbouring nodes is computed. The clearance is used to select interior empty nodes of appropriate size and the maximal crossing value is used to filter the network graph. Finally, both these datasets are used in A* pathfinding.
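Step 5 of the workflow above, A* over a network graph of empty nodes, can be illustrated with a generic sketch. It assumes each graph node has a 3D centre coordinate, uses straight-line distance both as edge cost and as heuristic, and omits the octree construction, clearance and maximal-crossing filtering described in the abstract; the function name and the tiny sample graph are hypothetical.

```python
import heapq
import math


def a_star(neighbors, coords, start, goal):
    """Return the shortest path from start to goal as a list of nodes.

    neighbors maps a node to its adjacent nodes; coords maps a node to
    an (x, y, z) centre point. Returns None if the goal is unreachable.
    """
    def h(node):
        # Straight-line distance never overestimates the remaining cost.
        return math.dist(coords[node], coords[goal])

    open_heap = [(h(start), start)]
    g = {start: 0.0}   # best known cost from start to each node
    came_from = {}
    closed = set()
    while open_heap:
        _, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if current in closed:
            continue
        closed.add(current)
        for nxt in neighbors[current]:
            tentative = g[current] + math.dist(coords[current], coords[nxt])
            if tentative < g.get(nxt, math.inf):
                g[nxt] = tentative
                came_from[nxt] = current
                heapq.heappush(open_heap, (tentative + h(nxt), nxt))
    return None


coords = {"A": (0, 0, 0), "B": (1, 0, 0), "C": (1, 1, 0),
          "D": (2, 1, 0), "E": (0, 2, 0)}
neighbors = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D", "E"],
             "D": ["C"], "E": ["A", "C"]}
path = a_star(neighbors, coords, "A", "D")
print(path)
```

In the workflow described in the abstract, the graph handed to such a routine would already have been filtered by clearance and maximal crossing value, so the search itself needs no collision checks.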
Large scale cardiac modeling on the Blue Gene supercomputer.

PubMed

Reumann, Matthias; Fitch, Blake G; Rayshubskiy, Aleksandr; Keller, David U; Weiss, Daniel L; Seemann, Gunnar; Dössel, Olaf; Pitman, Michael C; Rice, John J

2008-01-01

Multi-scale, multi-physical heart models have not yet been able to include a high degree of accuracy and resolution with respect to model detail and spatial resolution due to computational limitations of current systems. We propose a framework to compute large scale cardiac models. Decomposition of anatomical data in segments to be distributed on a parallel computer is carried out by optimal recursive bisection (ORB). The algorithm takes into account a computational load parameter which has to be adjusted according to the cell models used. The diffusion term is realized by the monodomain equations. The anatomical data-set was given by both ventricles of the Visible Female data-set in a 0.2 mm resolution. Heterogeneous anisotropy was included in the computation. Model weights as input for the decomposition and load balancing were set to (a) 1 for tissue and 0 for non-tissue elements; (b) 10 for tissue and 1 for non-tissue elements. Scaling results for 512, 1024, 2048, 4096 and 8192 computational nodes were obtained for 10 ms simulation time. The simulations were carried out on an IBM Blue Gene/L parallel computer. A 1 s simulation was then carried out on 2048 nodes for the optimal model load. Load balances did not differ significantly across computational nodes even if the number of data elements distributed to each node differed greatly. Since the ORB algorithm did not take into account computational load due to communication cycles, the speedup is close to optimal for the computation time but not optimal overall due to the communication overhead. However, the simulation times were reduced from 87 minutes on 512 nodes to 11 minutes on 8192 nodes. This work demonstrates that it is possible to run simulations of the presented detailed cardiac model within hours for the simulation of a heart beat.
B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.

PubMed

Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang

2016-03-01

Sequence alignment is the central process for sequence analysis, in which raw sequencing data are mapped to a reference genome. The large amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools. Consequently, sequence alignment becomes the bottleneck of sequence analysis. Intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. The Tianhe-2 is the world's fastest supercomputer, now equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Considering this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner: B-MIC. B-MIC contains three levels of parallelization: firstly, parallelization of data IO and read alignment by a three-stage parallel pipeline; secondly, parallelization enabled by MIC coprocessor technology; thirdly, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of those techniques using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on Intel MIC and it can achieve more than fivefold speedup over the original BWA while maintaining the alignment precision.
Exact and heuristic algorithms for Space Information Flow.

PubMed

Uwitonze, Alfred; Huang, Jiaqing; Ye, Yuanqing; Cheng, Wenqing; Li, Zongpeng

2018-01-01

Space Information Flow (SIF) is a promising new research area that studies network coding in geometric space, such as Euclidean space. The design of algorithms that compute the optimal SIF solutions remains one of the key open problems in SIF. This work proposes the first exact SIF algorithm and a heuristic SIF algorithm that compute min-cost multicast network coding for N (N ≥ 3) given terminal nodes in 2-D Euclidean space. Furthermore, we find that the Butterfly network in Euclidean space is the second example besides the Pentagram network where SIF is strictly better than the Euclidean Steiner minimal tree. The exact algorithm design is based on two key techniques: Delaunay triangulation and linear programming. The Delaunay triangulation technique helps to find practically good candidate relay nodes, after which a min-cost multicast linear programming model is solved over the terminal nodes and the candidate relay nodes, to compute the optimal multicast network topology, including the optimal relay nodes selected by linear programming from all the candidate relay nodes and the flow rates on the connection links. The heuristic algorithm design is also based on Delaunay triangulation and linear programming techniques. The exact algorithm can achieve the optimal SIF solution with an exponential computational complexity, while the heuristic algorithm can achieve a sub-optimal SIF solution with a polynomial computational complexity. We prove the correctness of the exact SIF algorithm. The simulation results show the effectiveness of the heuristic SIF algorithm.
Dynamic Extension of a Virtualized Cluster by using Cloud Resources

NASA Astrophysics Data System (ADS)

Oberst, Oliver; Hauth, Thomas; Kernert, David; Riedel, Stephan; Quast, Günter

2012-12-01

The specific requirements concerning the software environment within the HEP community constrain the choice of resource providers for the outsourcing of computing infrastructure. The use of virtualization in HPC clusters and in the context of cloud resources is therefore a subject of recent developments in scientific computing. The dynamic virtualization of worker nodes in common batch systems provided by ViBatch serves each user with a dynamically virtualized subset of worker nodes on a local cluster. Now it can be transparently extended by the use of common open source cloud interfaces like OpenNebula or Eucalyptus, launching a subset of the virtual worker nodes within the cloud. This paper demonstrates how a dynamically virtualized computing cluster is combined with cloud resources by attaching remotely started virtual worker nodes to the local batch system.
Methods, apparatus and system for selective duplication of subtasks

DOEpatents

Andrade Costa, Carlos H.; Cher, Chen-Yong; Park, Yoonho; Rosenburg, Bryan S.; Ryu, Kyung D.

2016-03-29

A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.
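The selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name, the dictionary inputs, and the `notify` callback (standing in for the messaging-library notification) are all hypothetical, and how failure likelihood is estimated is left outside the sketch.

```python
def select_subtasks_to_duplicate(failure_likelihood, placement, threshold, notify):
    """Pick the subtasks that run on at-risk nodes and report them.

    failure_likelihood: dict node -> estimated likelihood of failure (assumed input)
    placement:          dict subtask -> node the subtask executes on
    threshold:          plays the role of the "first prescribed threshold"
    notify:             callback standing in for notifying the messaging library
    """
    # Nodes whose likelihood of failure exceeds the threshold.
    at_risk = {node for node, p in failure_likelihood.items() if p > threshold}
    # Only subtasks placed on those nodes are duplicated.
    duplicated = sorted(task for task, node in placement.items() if node in at_risk)
    if duplicated:
        notify(duplicated)
    return duplicated


if __name__ == "__main__":
    health = {"n0": 0.02, "n1": 0.35, "n2": 0.10}
    tasks = {"t0": "n0", "t1": "n1", "t2": "n1", "t3": "n2"}
    select_subtasks_to_duplicate(health, tasks, 0.25, print)
```

In this toy run only the subtasks on `n1` are selected, so healthy nodes do not pay the cost of duplication.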
AF-DHNN: Fuzzy Clustering and Inference-Based Node Fault Diagnosis Method for Fire Detection

PubMed Central

Jin, Shan; Cui, Wen; Jin, Zhigang; Wang, Ying

2015-01-01

Wireless Sensor Networks (WSNs) have been utilized for node fault diagnosis in the fire detection field since the 1990s. However, the traditional methods have some problems, including complicated system structures, intensive computation needs, unsteady data detection and local minimum values. In this paper, a new diagnosis mechanism for WSN nodes is proposed, which is based on fuzzy theory and an Adaptive Fuzzy Discrete Hopfield Neural Network (AF-DHNN). First, the original status of each sensor over time is obtained with two features. One is the root mean square of the filtered signal (FRMS), the other is the normalized summation of the positive amplitudes of the difference spectrum between the measured signal and the healthy one (NSDS). Secondly, distributed fuzzy inference is introduced. The status of evidently abnormal nodes is pre-alarmed to save time. Thirdly, according to the dimensions of the diagnostic data, an adaptive diagnostic status system is established with a Fuzzy C-Means Algorithm (FCMA) and a Sorting and Classification Algorithm to reduce the complexity of the fault determination. Fourthly, a Discrete Hopfield Neural Network (DHNN) with iterations is improved with the optimization of the sensors’ detected status information and standard diagnostic levels, with which the associative memory is achieved, and the search efficiency is improved. The experimental results show that the AF-DHNN method can diagnose abnormal WSN node faults promptly and effectively, which improves the WSN reliability. PMID:26193280
Runtime optimization of an application executing on a parallel computer

DOEpatents

None

2014-11-25

Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
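The decision logic in this abstract reduces to a small predicate. The Python sketch below is illustrative only: `should_establish_tuning_session` and its arguments are hypothetical names, and the call sites seen by each compute node are assumed to have been gathered into one list already (on a real parallel computer that agreement would itself need communication).

```python
def should_establish_tuning_session(is_root_based, call_sites):
    """Return True if the collective should run inside a tuning session.

    is_root_based: whether the collective operation is root-based
    call_sites:    one call-site identifier per compute node (assumed to be
                   gathered beforehand; hypothetical representation)
    """
    if not is_root_based:
        # Collectives that are not root-based are always tuned.
        return True
    # Root-based collectives are tuned only if every node identified the
    # operation at the same call site.
    return len(set(call_sites)) == 1


if __name__ == "__main__":
    print(should_establish_tuning_session(False, ["a.c:10", "b.c:22"]))
    print(should_establish_tuning_session(True, ["a.c:10", "a.c:10"]))
    print(should_establish_tuning_session(True, ["a.c:10", "b.c:22"]))
```

In the last call the nodes disagree on the call site, so the collective would execute without a tuning session.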
Runtime optimization of an application executing on a parallel computer

DOEpatents

Faraj, Daniel A; Smith, Brian E

2014-11-18

Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
Runtime optimization of an application executing on a parallel computer

DOEpatents

Faraj, Daniel A.; Smith, Brian E.

2013-01-29

Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
A computer method for schedule processing and quick-time updating.

NASA Technical Reports Server (NTRS)

Mccoy, W. H.

1972-01-01

A schedule analysis program is presented which can be used to process any schedule with continuous flow and with no loops. Although generally thought of as a management tool, it has applicability to such extremes as music composition and computer program efficiency analysis. Other possibilities for its use include the determination of electrical power usage during some operation such as spacecraft checkout, and the determination of impact envelopes for the purpose of scheduling payloads in launch processing. At the core of the described computer method is an algorithm which computes the position of each activity bar on the output waterfall chart. The algorithm is basically a maximal-path computation which gives to each node in the schedule network the maximal path from the initial node to the given node.
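A maximal-path computation over a loop-free schedule network can be sketched in a few lines of Python. This is a generic longest-path pass in topological order, not the 1972 program itself; the activity names, durations, and the `earliest_starts` function are illustrative assumptions.

```python
from collections import defaultdict, deque


def earliest_starts(durations, precedes):
    """Longest-path (maximal-path) start time for every activity.

    durations: dict activity -> duration
    precedes:  list of (before, after) pairs; the network must have no loops
    """
    successors = defaultdict(list)
    indegree = {activity: 0 for activity in durations}
    for before, after in precedes:
        successors[before].append(after)
        indegree[after] += 1

    start = {activity: 0 for activity in durations}
    ready = deque(a for a, d in indegree.items() if d == 0)
    visited = 0
    while ready:
        activity = ready.popleft()
        visited += 1
        finish = start[activity] + durations[activity]
        for nxt in successors[activity]:
            # Keep the longest path seen so far into the successor.
            start[nxt] = max(start[nxt], finish)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if visited != len(durations):
        raise ValueError("schedule network contains a loop")
    return start


if __name__ == "__main__":
    durations = {"A": 3, "B": 2, "C": 4, "D": 1}
    precedes = [("A", "C"), ("B", "C"), ("C", "D")]
    print(earliest_starts(durations, precedes))
```

Each start time is the left edge of the activity's bar on a waterfall chart; the loop check mirrors the abstract's restriction to schedules with no loops.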
Robust Routing Protocol For Digital Messages

NASA Technical Reports Server (NTRS)

Marvit, Maclen

1994-01-01

Refinement of digital-message-routing protocol increases fault tolerance of polled networks. AbNET-3 is latest of generic AbNET protocols for transmission of messages among computing nodes. AbNET concept described in "Multiple-Ring Digital Communication Network" (NPO-18133). Specifically aimed at increasing fault tolerance of network in broadcast mode, in which one node broadcasts message to and receives responses from all other nodes. Communication in network of computers maintained even when links fail.

Requesting Different Nodes Types When Submitting Jobs on the Peregrine

Science.gov Websites

Requesting Different Nodes Types When Submitting Jobs on the Peregrine System | High-Performance Computing | NREL
User's guide for a general purpose dam-break flood simulation model (K-634)

USGS Publications Warehouse

Land, Larry F.

1981-01-01

An existing computer program for simulating dam-break floods for forecast purposes has been modified with an emphasis on general purpose applications. The original model was formulated, developed and documented by the National Weather Service. This model is based on the complete flow equations and uses a nonlinear implicit finite-difference numerical method. The first phase of the simulation routes a flood wave through the reservoir and computes an outflow hydrograph which is the sum of the flow through the dam's structures and the gradually developing breach. The second phase routes this outflow hydrograph through the stream which may be nonprismatic and have segments with subcritical or supercritical flow. The results are discharge and stage hydrographs at the dam as well as all of the computational nodes in the channel. From these hydrographs, peak discharge and stage profiles are tabulated. (USGS)
A study on the value of computer-assisted assessment for SPECT/CT-scans in sentinel lymph node diagnostics of penile cancer as well as clinical reliability and morbidity of this procedure.

PubMed

Lützen, Ulf; Naumann, Carsten Maik; Marx, Marlies; Zhao, Yi; Jüptner, Michael; Baumann, René; Papp, László; Zsótér, Norbert; Aksenov, Alexey; Jünemann, Klaus-Peter; Zuhayra, Maaz

2016-09-07

Because of the increasing importance of computer-assisted post-processing of image data in modern medical diagnostics, we studied the value of an algorithm for the assessment of single photon emission computed tomography/computed tomography (SPECT/CT) data, which has been used for the first time for lymph node staging in penile cancer with non-palpable inguinal lymph nodes. In the guidelines of the relevant international expert societies, sentinel lymph node biopsy (SLNB) is recommended as a diagnostic method of choice. The aim of this study is to evaluate the value of the aforementioned algorithm and, in the clinical context, the reliability and the associated morbidity of this procedure. Between 2008 and 2015, 25 patients with invasive penile cancer and inconspicuous inguinal lymph node status underwent SLNB after application of the radiotracer Tc-99m labelled nanocolloid. We prospectively recorded the reliability and the complication rate of the procedure. In addition, we evaluated the results of an algorithm for SPECT/CT data assessment of these patients. SLNB was carried out in 44 groins of 25 patients. In three patients, inguinal lymph node metastases were detected via SLNB. In one patient, bilateral lymph node recurrence of the groins occurred after negative SLNB. There was a false-negative rate of 4 % in relation to the number of patients (1/25) and 4.5 % in relation to the number of groins (2/44). Morbidity was 4 % in relation to the number of patients (1/25) and 2.3 % in relation to the number of groins (1/44). The results of computer-assisted assessment of SPECT/CT data for sentinel lymph node (SLN) diagnostics demonstrated high sensitivity of 88.8 % and specificity of 86.7 %. SLNB is a very reliable method, associated with low morbidity. Computer-assisted assessment of SPECT/CT data in SLN diagnostics shows high sensitivity and specificity.
While it cannot replace the assessment by medical experts, it can still provide substantial supplement and assistance.
Laboratory for Computer Science Progress Report 16, 1 July 1978 - 30 June 1979,

DTIC Science & Technology

1980-08-01

name strongly distinguishes the XLMS node from ordinary nameless semantic network nodes. The name of a node has two parts: the "genus", itself a node...and the "specializer", a node or an atomic symbol. The genus and specializer of a node are almost always semantically meaningful, though their...meaning is almost never supplied by XLMS, but rather by some system built on top of XLMS. The genus of a node almost always plays a crucial role in its
Service Migration from Cloud to Multi-tier Fog Nodes for Multimedia Dissemination with QoE Support

PubMed Central

Camargo, João; Rochol, Juergen; Gerla, Mario

2018-01-01

A wide range of multimedia services is expected to be offered for mobile users via various wireless access networks. Even the integration of Cloud Computing in such networks does not support an adequate Quality of Experience (QoE) in areas with high demands for multimedia content. Fog computing has been conceptualized to facilitate the deployment of new services that cloud computing cannot provide, particularly those demanding QoE guarantees. These services are provided using fog nodes located at the network edge, which are capable of virtualizing their functions/applications. Service migration from the cloud to fog nodes can be triggered by request patterns and timing issues. To the best of our knowledge, existing works on fog computing focus on architecture and fog node deployment issues. In this article, we describe the operational impacts and benefits associated with service migration from the cloud to multi-tier fog computing for video distribution with QoE support. In addition, we evaluate such service migration for video services. Finally, we present potential research challenges and trends. PMID:29364172
Service Migration from Cloud to Multi-tier Fog Nodes for Multimedia Dissemination with QoE Support.

PubMed

Rosário, Denis; Schimuneck, Matias; Camargo, João; Nobre, Jéferson; Both, Cristiano; Rochol, Juergen; Gerla, Mario

2018-01-24

A wide range of multimedia services is expected to be offered for mobile users via various wireless access networks. Even the integration of Cloud Computing in such networks does not support an adequate Quality of Experience (QoE) in areas with high demands for multimedia content. Fog computing has been conceptualized to facilitate the deployment of new services that cloud computing cannot provide, particularly those demanding QoE guarantees. These services are provided using fog nodes located at the network edge, which are capable of virtualizing their functions/applications. Service migration from the cloud to fog nodes can be triggered by request patterns and timing issues. To the best of our knowledge, existing works on fog computing focus on architecture and fog node deployment issues. In this article, we describe the operational impacts and benefits associated with service migration from the cloud to multi-tier fog computing for video distribution with QoE support. In addition, we evaluate such service migration for video services. Finally, we present potential research challenges and trends.
A model and nomogram to predict tumor site origin for squamous cell cancer confined to cervical lymph nodes.

PubMed

Ali, Arif N; Switchenko, Jeffrey M; Kim, Sungjin; Kowalski, Jeanne; El-Deiry, Mark W; Beitler, Jonathan J

2014-11-15

The current study was conducted to develop a multifactorial statistical model to predict the specific head and neck (H&N) tumor site origin in cases of squamous cell carcinoma confined to the cervical lymph nodes ("unknown primaries"). The Surveillance, Epidemiology, and End Results (SEER) database was analyzed for patients with an H&N tumor site who were diagnosed between 2004 and 2011. The SEER patients were identified according to their H&N primary tumor site and clinically positive cervical lymph node levels at the time of presentation. The SEER patient data set was randomly divided into 2 data sets for the purposes of internal split-sample validation. The effects of cervical lymph node levels, age, race, and sex on H&N primary tumor site were examined using univariate and multivariate analyses. Multivariate logistic regression models and an associated set of nomograms were developed based on relevant factors to provide probabilities of tumor site origin. Analysis of the SEER database identified 20,011 patients with H&N disease with both site-level and lymph node-level data. Sex, race, age, and lymph node levels were associated with primary H&N tumor site (nasopharynx, hypopharynx, oropharynx, and larynx) in the multivariate models. Internal validation techniques affirmed the accuracy of these models on separate data. The incorporation of epidemiologic and lymph node data into a predictive model has the potential to provide valuable guidance to clinicians in the treatment of patients with squamous cell carcinoma confined to the cervical lymph nodes. © 2014 The Authors. Cancer published by Wiley Periodicals, Inc. on behalf of American Cancer Society.
Recent Performance Results of VPIC on Trinity

NASA Astrophysics Data System (ADS)

Nystrom, W. D.; Bergen, B.; Bird, R. F.; Bowers, K. J.; Daughton, W. S.; Guo, F.; Le, A.; Li, H.; Nam, H.; Pang, X.; Stark, D. J.; Rust, W. N., III; Yin, L.; Albright, B. J.

2017-10-01

Trinity is a new DOE compute resource now in production at Los Alamos National Laboratory. Trinity has several new and unique features including two compute partitions, one with dual socket Intel Haswell Xeon compute nodes and one with Intel Knights Landing (KNL) Xeon Phi compute nodes, use of on package high bandwidth memory (HBM) for KNL nodes, ability to configure KNL nodes with respect to HBM model and on die network topology in a variety of operational modes at run time, and use of solid state storage via burst buffer technology to reduce time required to perform I/O. An effort is in progress to optimize VPIC on Trinity by taking advantage of these new architectural features. Results of work will be presented on performance of VPIC on Haswell and KNL partitions for single node runs and runs at scale. Results include use of burst buffers at scale to optimize I/O, comparison of strategies for using MPI and threads, performance benefits using HBM and effectiveness of using intrinsics for vectorization. Work performed under auspices of U.S. Dept. of Energy by Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by LANL LDRD program.
Accelerating Dust Storm Simulation by Balancing Task Allocation in Parallel Computing Environment

NASA Astrophysics Data System (ADS)

Gui, Z.; Yang, C.; XIA, J.; Huang, Q.; YU, M.

2013-12-01

Dust storms have serious negative impacts on the environment, human health, and assets. Continuing global climate change has increased the frequency and intensity of dust storms in the past decades. To better understand and predict the distribution, intensity and structure of dust storms, a series of dust storm models have been developed, such as the Dust Regional Atmospheric Model (DREAM), the NMM meteorological module (NMM-dust) and the Chinese Unified Atmospheric Chemistry Environment for Dust (CUACE/Dust). The development and application of these models have contributed significantly to both scientific research and our daily life. However, dust storm simulation is a data- and computing-intensive process. Normally, a simulation of a single dust storm event may take several hours or days to run, which seriously impacts the timeliness of prediction and potential applications. To speed up the process, high performance computing is widely adopted. By partitioning a large study area into small subdomains according to their geographic location and executing them on different computing nodes in parallel, the computing performance can be significantly improved. Since spatiotemporal correlations exist in the geophysical process of dust storm simulation, each subdomain allocated to a node needs to communicate with other geographically adjacent subdomains to exchange data. Inappropriate allocations may introduce imbalanced task loads and unnecessary communication among computing nodes. Therefore, the task allocation method is the key factor that may impact the feasibility of the parallelization. The allocation algorithm needs to carefully balance the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire system. This presentation introduces two algorithms for such allocation and compares them with an evenly distributed allocation method.
Specifically, 1) To obtain optimized solutions, a quadratic programming based modeling method is proposed. This algorithm performs well with a small number of computing tasks. However, its efficiency decreases significantly as the number of subdomains and computing nodes increases. 2) To compensate for the performance decrease on large-scale tasks, a K-Means clustering based algorithm is introduced. Rather than seeking optimized solutions, this method obtains relatively good feasible solutions within acceptable time. However, it may introduce imbalanced communication for nodes or node-isolated subdomains. This research shows that both algorithms have their own strengths and weaknesses for task allocation. A combination of the two algorithms is under study to obtain better performance. Keywords: Scheduling; Parallel Computing; Load Balance; Optimization; Cost Model
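The trade-off between compute load and inter-node communication can be made concrete with a toy cost model. The Python sketch below is not either of the two algorithms described in the abstract; it only scores a given allocation of a rectangular grid of subdomains. The unit costs (one unit of work per subdomain, one unit of communication per pair of adjacent subdomains on different nodes) and all function names are assumptions.

```python
from collections import Counter


def evaluate_allocation(rows, cols, owner):
    """Return (max_load, cut_edges) for a rows x cols grid of subdomains.

    owner maps (row, col) -> node id. max_load is the largest number of
    subdomains on one node; cut_edges counts 4-neighbour pairs of
    subdomains that sit on different nodes and so must exchange data.
    """
    load = Counter(owner[(r, c)] for r in range(rows) for c in range(cols))
    cut = 0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows and owner[(r, c)] != owner[(r + 1, c)]:
                cut += 1
            if c + 1 < cols and owner[(r, c)] != owner[(r, c + 1)]:
                cut += 1
    return max(load.values()), cut


def block_allocation(rows, cols, nodes):
    """Give each node a contiguous band of rows (geographically clustered)."""
    return {(r, c): r * nodes // rows for r in range(rows) for c in range(cols)}


def round_robin_allocation(rows, cols, nodes):
    """Deal subdomains out to nodes in row-major order, ignoring geography."""
    return {(r, c): (r * cols + c) % nodes for r in range(rows) for c in range(cols)}


if __name__ == "__main__":
    for name, alloc in (("block", block_allocation), ("round robin", round_robin_allocation)):
        print(name, evaluate_allocation(4, 4, alloc(4, 4, 2)))
```

On a 4 x 4 grid with two nodes, both allocations give each node eight subdomains, but the banded one cuts 4 neighbour pairs while the round-robin one cuts 12, which is why geographically aware allocation matters even when the load is perfectly even.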
A Non-Cut Cell Immersed Boundary Method for Use in Icing Simulations

NASA Technical Reports Server (NTRS)

Sarofeen, Christian M.; Noack, Ralph W.; Kreeger, Richard E.

2013-01-01

This paper describes a computational fluid dynamic method used for modelling changes in aircraft geometry due to icing. While an aircraft undergoes icing, the accumulated ice results in a geometric alteration of the aerodynamic surfaces. In computational simulations for icing, it is necessary that the corresponding geometric change is taken into consideration. The method used, herein, for the representation of the geometric change due to icing is a non-cut cell Immersed Boundary Method (IBM). Computational cells that are in a body fitted grid of a clean aerodynamic geometry that are inside a predicted ice formation are identified. An IBM is then used to change these cells from being active computational cells to having properties of viscous solid bodies. This method has been implemented in the NASA developed node centered, finite volume computational fluid dynamics code, FUN3D. The presented capability is tested for two-dimensional airfoils including a clean airfoil, an iced airfoil, and an airfoil in harmonic pitching motion about its quarter chord. For these simulations velocity contours, pressure distributions, coefficients of lift, coefficients of drag, and coefficients of pitching moment about the airfoil's quarter chord are computed and used for comparison against experimental results, a higher order panel method code with viscous effects, XFOIL, and the results from FUN3D's original solution process. The results of the IBM simulations show that the accuracy of the IBM compares satisfactorily with the experimental results, XFOIL results, and the results from FUN3D's original solution process.
Embedding Task-Based Neural Models into a Connectome-Based Model of the Cerebral Cortex.

PubMed

Ulloa, Antonio; Horwitz, Barry

2016-01-01

A number of recent efforts have used large-scale, biologically realistic, neural models to help understand the neural basis for the patterns of activity observed in both resting state and task-related functional neural imaging data. An example of the former is The Virtual Brain (TVB) software platform, which allows one to apply large-scale neural modeling in a whole brain framework. TVB provides a set of structural connectomes of the human cerebral cortex, a collection of neural processing units for each connectome node, and various forward models that can convert simulated neural activity into a variety of functional brain imaging signals. In this paper, we demonstrate how to embed a previously or newly constructed task-based large-scale neural model into the TVB platform. We tested our method on a previously constructed large-scale neural model (LSNM) of visual object processing that consisted of interconnected neural populations that represent primary and secondary visual, inferotemporal, and prefrontal cortex. Some neural elements in the original model were "non-task-specific" (NS) neurons that served as noise generators to "task-specific" neurons that processed shapes during a delayed match-to-sample (DMS) task. We replaced the NS neurons with an anatomical TVB connectome model of the cerebral cortex comprising 998 regions of interest interconnected by white matter fiber tract weights. We embedded our LSNM of visual object processing into corresponding nodes within the TVB connectome. Reciprocal connections between TVB nodes and our task-based modules were included in this framework. We ran visual object processing simulations and showed that the TVB simulator successfully replaced the noise generation originally provided by NS neurons; i.e., the DMS tasks performed with the hybrid LSNM/TVB simulator generated equivalent neural and fMRI activity to that of the original task-based models.
Additionally, we found partial agreement between the functional connectivities using the hybrid LSNM/TVB model and the original LSNM. Our framework thus presents a way to embed task-based neural models into the TVB platform, enabling a better comparison between empirical and computational data, which in turn can lead to a better understanding of how interacting neural populations give rise to human cognitive behaviors.
Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure

NASA Astrophysics Data System (ADS)

Wang, Henry; Ma, Yunzhi; Pratx, Guillem; Xing, Lei

2011-09-01

Monte Carlo (MC) methods are the gold standard for modeling photon and electron transport in a heterogeneous medium; however, their computational cost prohibits their routine use in the clinic. Cloud computing, wherein computing resources are allocated on-demand from a third party, is a new approach for high performance computing and is implemented to perform ultra-fast MC calculation in radiation therapy. We deployed the EGS5 MC package in a commercial cloud environment. Launched from a single local computer with Internet access, a Python script allocates a remote virtual cluster. A handshaking protocol designates master and worker nodes. The EGS5 binaries and the simulation data are initially loaded onto the master node. The simulation is then distributed among independent worker nodes via the message passing interface, and the results aggregated on the local computer for display and data analysis. The described approach is evaluated for pencil beams and broad beams of high-energy electrons and photons. The output of cloud-based MC simulation is identical to that produced by single-threaded implementation. For 1 million electrons, a simulation that takes 2.58 h on a local computer can be executed in 3.3 min on the cloud with 100 nodes, a 47× speed-up. Simulation time scales inversely with the number of parallel nodes. The parallelization overhead is also negligible for large simulations. Cloud computing represents one of the most important recent advances in supercomputing technology and provides a promising platform for substantially improved MC simulation. In addition to the significant speed up, cloud computing builds a layer of abstraction for high performance parallel computing, which may change the way dose calculations are performed and radiation treatment plans are completed. This work was presented in part at the 2010 Annual Meeting of the American Association of Physicists in Medicine (AAPM), Philadelphia, PA.
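The quoted speed-up can be checked with a short calculation. The Python sketch below uses only the two timings and the node count stated in the abstract; the function name is hypothetical, and the "per-node ratio" is simply the speed-up divided by the node count relative to the local computer, not a measured parallel efficiency (the cloud nodes and the local computer need not be equally fast).

```python
def speedup_and_per_node_ratio(serial_seconds, parallel_seconds, nodes):
    """Return (speed-up, speed-up divided by node count)."""
    speedup = serial_seconds / parallel_seconds
    return speedup, speedup / nodes


if __name__ == "__main__":
    local = 2.58 * 3600   # 2.58 h on the local computer, in seconds
    cloud = 3.3 * 60      # 3.3 min on the cloud with 100 nodes, in seconds
    s, ratio = speedup_and_per_node_ratio(local, cloud, 100)
    print(f"speed-up = {s:.1f}x, per-node ratio = {ratio:.2f}")
```

The ratio of 9288 s to 198 s is about 46.9, which matches the 47× figure in the abstract after rounding.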
Peregrine System | High-Performance Computing | NREL

Science.gov Websites

) and longer-term (/projects) storage. These file systems are mounted on all nodes. Peregrine has three -2670 Xeon processors and 64 GB of memory. In addition to mounting the /home, /nopt, /projects and # cores/node Memory/node Peak (DP) performance per node 88 Intel Xeon E5-2670 "Sandy Bridge" 8
Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by employing bandwidth shells at areas of overutilization

DOEpatents

Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

2010-04-27

A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a final destination. The default routing strategy is altered responsive to detection of overutilization of a particular path of one or more links, and at least some traffic is re-routed by distributing the traffic among multiple paths (which may include the default path). An alternative path may require a greater number of link traversals to reach the destination node.
Clock Agreement Among Parallel Supercomputer Nodes

DOE Data Explorer

Jones, Terry R.; Koenig, Gregory A.

2014-04-30

This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming application and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.
Global interrupt and barrier networks

DOEpatents

Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E; Heidelberger, Philip; Kopcsay, Gerard V.; Steinmacher-Burow, Burkhard D.; Takken, Todd E.

2008-10-28

A system and method for generating global asynchronous signals in a computing structure. Particularly, a global interrupt and barrier network is provided that implements logic for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and includes the physical interconnecting of the processing nodes for communicating the global interrupt and barrier signals to the elements via low-latency paths. The global asynchronous signals respectively initiate interrupt and barrier operations at the processing nodes at times selected for optimizing performance of the processing algorithms. In one embodiment, the global interrupt and barrier network is implemented in a scalable, massively parallel supercomputing device structure comprising a plurality of processing nodes interconnected by multiple independent networks, with each node including one or more processing elements for performing computation or communication activity as required when performing parallel algorithm operations. One of the multiple independent networks includes a global tree network for enabling high-speed global tree communications among global tree network nodes or sub-trees thereof. The global interrupt and barrier network may operate in parallel with the global tree network for providing global asynchronous sideband signals.
A spread willingness computing-based information dissemination model.

PubMed

Huang, Haojing; Cui, Zhiming; Zhang, Shukui

2014-01-01

This paper constructs a spread willingness computing-based information dissemination model for social networks. The model takes into account the impact of node degree and dissemination mechanism, combined with complex network theory and the dynamics of infectious diseases, and further establishes the dynamical evolution equations. These equations characterize how the different types of nodes evolve with time. The spread willingness computation contains three factors that affect a user's spread behavior: strength of the relationship between the nodes, views identity, and frequency of contact. Simulation results show that nodes of different degrees show the same trend in the network, and even if the degree of a node is very small, there is a likelihood of a large area of information dissemination. The weaker the relationship between nodes, the higher the probability of views selection and the higher the frequency of contact with information, so that information spreads rapidly and leads to a wide range of dissemination. As the dissemination probability and immune probability change, the speed of information dissemination changes accordingly. The studies meet social networking features and can help to master the behavior of users and to understand and analyze the characteristics of information dissemination in social networks.
A Spread Willingness Computing-Based Information Dissemination Model

PubMed Central

Cui, Zhiming; Zhang, Shukui

2014-01-01

This paper constructs a spread willingness computing-based information dissemination model for social networks. The model takes into account the impact of node degree and dissemination mechanism, combined with complex network theory and the dynamics of infectious diseases, and further establishes the dynamical evolution equations. These equations characterize how the different types of nodes evolve with time. The spread willingness computation contains three factors that affect a user's spread behavior: strength of the relationship between the nodes, views identity, and frequency of contact. Simulation results show that nodes of different degrees show the same trend in the network, and even if the degree of a node is very small, there is a likelihood of a large area of information dissemination. The weaker the relationship between nodes, the higher the probability of views selection and the higher the frequency of contact with information, so that information spreads rapidly and leads to a wide range of dissemination. As the dissemination probability and immune probability change, the speed of information dissemination changes accordingly. The studies meet social networking features and can help to master the behavior of users and to understand and analyze the characteristics of information dissemination in social networks. PMID:25110738
Shared address collectives using counter mechanisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blocksome, Michael; Dozsa, Gabor; Gooding, Thomas M

A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.
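The counter-based arrival signaling described in this abstract can be illustrated with a minimal sketch. It is not the patented mechanism: Python threads stand in for processes on different cores, a plain list stands in for the shared application buffer, and the names `SharedCounter` and `demo` are hypothetical.

```python
import threading


class SharedCounter:
    """Illustrative counter guarded by a condition variable (an assumption,
    not the mechanism claimed in the abstract)."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def add(self, n=1):
        # Increment the counter and wake every waiter so it can re-check.
        with self._cond:
            self._value += n
            self._cond.notify_all()
            return self._value

    def wait_until(self, target, timeout=None):
        # Block until the counter has reached `target`.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= target, timeout)


def demo(num_chunks=4, num_readers=3):
    buffer = [None] * num_chunks   # stands in for the shared application buffer
    arrived = SharedCounter()      # counts chunks written into the buffer
    seen = []
    seen_lock = threading.Lock()

    def reader():
        # Each reader operates on the buffer only after all chunks arrived.
        arrived.wait_until(num_chunks)
        with seen_lock:
            seen.append(list(buffer))

    readers = [threading.Thread(target=reader) for _ in range(num_readers)]
    for t in readers:
        t.start()
    for i in range(num_chunks):
        buffer[i] = i * 10         # "data received from the network"
        arrived.add()              # signal arrival of one chunk
    for t in readers:
        t.join()
    return seen
```

Every reader sees the fully populated buffer because it waits on the counter rather than polling the data itself.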
Branchial cleft-like cysts in Hashimoto's thyroiditis: A case report and literature review.

PubMed

Miyazaki, Masaya; Kiuchi, Shizuka; Fujioka, Yasunori

2016-05-01

We report an extremely rare case of branchial cleft-like cysts in Hashimoto's thyroiditis. The patient was a 77-year-old man with a growing mass in the anterior neck. Ultrasonography and computed tomography revealed a cystic lesion with septum in the left thyroid and multiple small cystic lesions in the right thyroid. Lymph node swelling of the cervical region, supraclavicular fossa and submandibular region was also observed. Left thyroidectomy and lymph node dissection were performed. Histologically, cysts were lined by stratified squamous epithelium and dense lymphoid tissue having conspicuous follicle formation surrounded the epithelial lining. Solid cell nest (SCN)-like aggregations were seen in the thyroid parenchyma adjacent to the cyst walls and a small number of thyroid follicles were observed in the fibrous wall. Immunohistochemically, it is suggested that both the cyst lining and SCN-like aggregations are originally from thyroid follicles. Although, the exact histogenesis of branchial cleft-like cysts remains unclear, there are probably two different processes for its development, one is of branchial cleft origin and the other is mere squamous metaplasia, while in our case the latter is suggested. Herein, we report our new case and update information about branchial cleft-like cysts that appears in the literature. © 2016 Japanese Society of Pathology and John Wiley & Sons Australia, Ltd.

Accurate evaluation of axillary sentinel lymph node metastasis using contrast-enhanced ultrasonography with Sonazoid in breast cancer: a preliminary clinical trial.

PubMed

Matsuzawa, Fumihiko; Omoto, Kiyoka; Einama, Takahiro; Abe, Hironori; Suzuki, Takashi; Hamaguchi, Jun; Kaga, Terumi; Sato, Mami; Oomura, Masako; Takata, Yumiko; Fujibe, Ayako; Takeda, Chie; Tamura, Etsuya; Taketomi, Akinobu; Kyuno, Kenichi

2015-01-01

Breast cancer is the most common type of cancer in women. The 5-year survival rate in patients with breast cancer ranges from 74 to 82 %. Sentinel lymph node biopsy has become an alternative to axillary lymph node dissection for nodal staging. We evaluated the detection of the sentinel lymph node and metastasis of the lymph node using contrast enhanced ultrasonography with Sonazoid. Between December 2013 and May 2014, 32 patients with operable breast cancer were enrolled in this study. We evaluated the detection of axillary sentinel lymph nodes and the evaluation of axillary lymph nodes metastasis using contrast enhanced computed tomography, color Doppler ultrasonography and contrast enhanced ultrasonography with Sonazoid. All the sentinel lymph nodes were identified, and the sentinel lymph nodes detected by contrast enhanced ultrasonography with Sonazoid corresponded with those detected by computed tomography lymphography and indigo carmine method. The detection of metastasis based on contrast enhanced computed tomography were sensitivity 20.0 %, specificity 88.2 %, PPV 60.0 %, NPV 55.6 %, accuracy 56.3 %. Based on color Doppler ultrasonography, the results were sensitivity 36.4 %, specificity 95.2 %, PPV 80.0 %, NPV 74.1 %, accuracy 75.0 %. Based on contrast enhanced ultrasonography with Sonazoid, the results were sensitivity 81.8 %, specificity 95.2 %, PPV 90.0 %, NPV 90.9 %, accuracy 90.6 %. The results suggested that contrast enhanced ultrasonography with Sonazoid was the most accurate among the evaluations of these modalities. In the future, we believe that our method would take the place of conventional sentinel lymph node biopsy for an axillary staging method.
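The reported percentages follow from a 2x2 confusion matrix. The sketch below shows the arithmetic. The counts (9 true positives, 1 false positive, 2 false negatives, 20 true negatives) are an assumption back-calculated from the Sonazoid percentages and the 32 enrolled patients; they are not stated in the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # detected among truly metastatic
        "specificity": tn / (tn + fp),   # cleared among truly non-metastatic
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Assumed counts (not given in the abstract), chosen to be consistent with
# the contrast-enhanced ultrasonography figures for 32 patients.
sonazoid = diagnostic_metrics(tp=9, fp=1, fn=2, tn=20)
for name, value in sonazoid.items():
    print(f"{name}: {value * 100:.1f} %")
```

With these assumed counts each measure agrees with the reported value to within rounding.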
The Mark III Hypercube-Ensemble Computers

NASA Technical Reports Server (NTRS)

Peterson, John C.; Tuazon, Jesus O.; Lieberman, Don; Pniel, Moshe

1988-01-01

Mark III Hypercube concept applied in development of series of increasingly powerful computers. Processor of each node of Mark III Hypercube ensemble is specialized computer containing three subprocessors and shared main memory. Solves problem quickly by simultaneously processing part of problem at each such node and passing combined results to host computer. Disciplines benefitting from speed and memory capacity include astrophysics, geophysics, chemistry, weather, high-energy physics, applied mechanics, image processing, oil exploration, aircraft design, and microcircuit design.
Solving a Hamiltonian Path Problem with a bacterial computer

PubMed Central

Baumgardner, Jordan; Acker, Karen; Adefuye, Oyinade; Crowley, Samuel Thomas; DeLoache, Will; Dickson, James O; Heard, Lane; Martens, Andrew T; Morton, Nickolaus; Ritter, Michelle; Shoecraft, Amber; Treece, Jessica; Unzicker, Matthew; Valencia, Amanda; Waters, Mike; Campbell, A Malcolm; Heyer, Laurie J; Poet, Jeffrey L; Eckdahl, Todd T

2009-01-01

Background: The Hamiltonian Path Problem asks whether there is a route in a directed graph from a beginning node to an ending node, visiting each node exactly once. The Hamiltonian Path Problem is NP complete, achieving surprising computational complexity with modest increases in size. This challenge has inspired researchers to broaden the definition of a computer. DNA computers have been developed that solve NP complete problems. Bacterial computers can be programmed by constructing genetic circuits to execute an algorithm that is responsive to the environment and whose result can be observed. Each bacterium can examine a solution to a mathematical problem and billions of them can explore billions of possible solutions. Bacterial computers can be automated, made responsive to selection, and reproduce themselves so that more processing capacity is applied to problems over time. Results: We programmed bacteria with a genetic circuit that enables them to evaluate all possible paths in a directed graph in order to find a Hamiltonian path. We encoded a three node directed graph as DNA segments that were autonomously shuffled randomly inside bacteria by a Hin/hixC recombination system we previously adapted from Salmonella typhimurium for use in Escherichia coli. We represented nodes in the graph as linked halves of two different genes encoding red or green fluorescent proteins. Bacterial populations displayed phenotypes that reflected random ordering of edges in the graph. Individual bacterial clones that found a Hamiltonian path reported their success by fluorescing both red and green, resulting in yellow colonies. We used DNA sequencing to verify that the yellow phenotype resulted from genotypes that represented Hamiltonian path solutions, demonstrating that our bacterial computer functioned as expected. Conclusion: We successfully designed, constructed, and tested a bacterial computer capable of finding a Hamiltonian path in a three node directed graph. This proof-of-concept experiment demonstrates that bacterial computing is a new way to address NP-complete problems using the inherent advantages of genetic systems. The results of our experiments also validate synthetic biology as a valuable approach to biological engineering. We designed and constructed basic parts, devices, and systems using synthetic biology principles of standardization and abstraction. PMID:19630940
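For comparison with the bacterial approach, the problem statement itself can be checked by brute force on a conventional computer. The sketch below enumerates every ordering of the intermediate nodes, which is only practical for tiny graphs. The three-node graph and its edge list are hypothetical and are not the graph encoded in the paper.

```python
from itertools import permutations


def hamiltonian_paths(nodes, edges, start, end):
    """Return every path from `start` to `end` that visits each node once.

    Assumes `start` and `end` are distinct members of `nodes`.
    """
    edge_set = set(edges)
    middle = [n for n in nodes if n not in (start, end)]
    found = []
    for order in permutations(middle):
        path = (start, *order, end)
        # A candidate is valid only if every consecutive pair is a directed edge.
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            found.append(path)
    return found


# Hypothetical three-node directed graph.
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C"), ("A", "C")]
print(hamiltonian_paths(nodes, edges, "A", "C"))  # [('A', 'B', 'C')]
```

The number of candidate orderings grows factorially with the node count, which is the scaling the abstract refers to when it calls the problem NP complete.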
Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

PubMed

Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

2016-04-01

Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes due to systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station, however, due to multiple invocations for a large number of subtasks the full task requires a significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods as well as a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
Performance of VPIC on Sequoia

NASA Astrophysics Data System (ADS)

Nystrom, William

2014-10-01

Sequoia is a major DOE computing resource which is characteristic of future resources in that it has many threads per compute node, 64, and the individual processor cores are simpler and less powerful than cores on previous processors like Intel's Sandy Bridge or AMD's Opteron. An effort is in progress to port VPIC to the Blue Gene Q architecture of Sequoia and evaluate its performance. Results of this work will be presented on single node performance of VPIC as well as multi-node scaling.
Single-node orbit analysis with radiation heat transfer only

NASA Technical Reports Server (NTRS)

Peoples, J. A.

1977-01-01

The steady-state temperature of a single node which dissipates energy by radiation only is discussed for a nontime varying thermal environment. Relationships are developed to illustrate how shields can be utilized to represent a louver system. A computer program is presented which can assess periodic temperature characteristics of a single node in a time varying thermal environment having energy dissipation by radiation only. The computer program performs thermal orbital analysis for five combinations of plate, shields, and louvers.
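The steady-state case can be sketched with the Stefan-Boltzmann law: when a node loses energy by radiation only, the absorbed power equals the emitted power, so T = (Q / (εσA))^(1/4). This is the textbook energy balance, not the report's program, and it assumes a single radiating surface viewing deep space with no shields or louvers. The function and parameter names are hypothetical.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4


def steady_state_temperature(absorbed_power_w, emissivity, area_m2):
    """Temperature (K) at which radiated power balances absorbed power.

    Assumes radiation to a 0 K sink from one surface of area `area_m2`.
    """
    if absorbed_power_w < 0 or emissivity <= 0 or area_m2 <= 0:
        raise ValueError("power must be >= 0; emissivity and area must be > 0")
    return (absorbed_power_w / (emissivity * STEFAN_BOLTZMANN * area_m2)) ** 0.25


# Example: 500 W absorbed by a 2 m^2 surface with emissivity 0.8.
print(round(steady_state_temperature(500.0, 0.8, 2.0), 1), "K")
```

Because temperature scales with the fourth root of power, doubling the absorbed power raises the steady-state temperature by only about 19 %.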
Fog computing job scheduling optimization based on bees swarm

NASA Astrophysics Data System (ADS)

Bitam, Salim; Zeadally, Sherali; Mellouk, Abdelhamid

2018-04-01

Fog computing is a new computing architecture, composed of a set of near-user edge devices called fog nodes, which collaborate together in order to perform computational services such as running applications, storing an important amount of data, and transmitting messages. Fog computing extends cloud computing by deploying digital resources at the premise of mobile users. In this new paradigm, management and operating functions, such as job scheduling aim at providing high-performance, cost-effective services requested by mobile users and executed by fog nodes. We propose a new bio-inspired optimization approach called Bees Life Algorithm (BLA) aimed at addressing the job scheduling problem in the fog computing environment. Our proposed approach is based on the optimized distribution of a set of tasks among all the fog computing nodes. The objective is to find an optimal tradeoff between CPU execution time and allocated memory required by fog computing services established by mobile users. Our empirical performance evaluation results demonstrate that the proposal outperforms the traditional particle swarm optimization and genetic algorithm in terms of CPU execution time and allocated memory.
ORA User’s Guide 2012

DTIC Science & Technology

2012-06-11

places, resources, knowledge sets or other common Node Classes*. 285 This example will use the Stargate dataset (SG-1). This dataset is included...create a new Meta-Network. Below is the NodeSet for Stargate with the original 16 node NodeSet. 376 From the main menu select, Actions > Add...measures by simply gauging their size visually and intuitively. First, visualize one of your networks. Below is the Stargate agent x event network to
N-cadherin locks left-right asymmetry by ending the leftward movement of Hensen's node cells.

PubMed

Mendes, Raquel V; Martins, Gabriel G; Cristovão, Ana M; Saúde, Leonor

2014-08-11

The stereotypic left-right (LR) asymmetric distribution of internal organs is due to an asymmetric molecular cascade in the lateral plate mesoderm (LPM) that is originated at the embryonic node. In chicken embryos, molecular asymmetries at Hensen's node are created by leftward cell movements that occur transiently. What terminates these movements, and, moreover, what is the impact of prolonging them on the LR asymmetry cascade? We show that leftward movements last longer when N-cadherin function is blocked and cease prematurely when N-cadherin is overexpressed on the right side of the node. The prolonged leftward movements lead to loss of asymmetric expression of fgf8 and nodal at the node region. This originates an abnormal expression of the asymmetric genes cer1 and snai1 in the LPM, resulting in mispositioned hearts. We conclude that N-cadherin stops the leftward cell movements and that this termination is an essential step in the establishment of LR asymmetry. Copyright © 2014 Elsevier Inc. All rights reserved.
Broadcasting Topology and Routing Information in Computer Networks

DTIC Science & Technology

1985-05-01

[Figure 1.2.1: Topology Problem Example] messages from node 2 before receiving the first DOWN message from node 3. Now assume that before...node to each of the link’s end nodes. [Figure 3.4.2: SPTA Port Distance Table Example] An example of these
Modeling node bandwidth limits and their effects on vector combining algorithms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Littlefield, R.J.

Each node in a message-passing multicomputer typically has several communication links. However, the maximum aggregate communication speed of a node is often less than the sum of its individual link speeds. Such computers are called node bandwidth limited (NBL). The NBL constraint is important when choosing algorithms because it can change the relative performance of different algorithms that accomplish the same task. This paper introduces a model of communication performance for NBL computers and uses the model to analyze the overall performance of three algorithms for vector combining (global sum) on the Intel Touchstone DELTA computer. Each of the three algorithms is found to be at least 33% faster than the other two for some combinations of machine size and vector length. The NBL constraint is shown to significantly affect the conditions under which each algorithm is fastest.
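The NBL constraint can be sketched as a cap on aggregate throughput. This is a simplified illustration, not the paper's model: it ignores latency and contention, and the function names and numbers are hypothetical.

```python
def effective_bandwidth(link_speeds, node_limit):
    """Aggregate rate a node can sustain when driving the given links.

    The node cannot exceed its own limit even if the links could carry more.
    """
    return min(sum(link_speeds), node_limit)


def transfer_time(message_bytes, link_speeds, node_limit):
    """Time to move a message split across all links (latency ignored)."""
    bandwidth = effective_bandwidth(link_speeds, node_limit)
    if bandwidth <= 0:
        raise ValueError("no usable bandwidth")
    return message_bytes / bandwidth


# Four links of 10 units each, but the node can only sustain 25 units total.
print(effective_bandwidth([10, 10, 10, 10], 25))   # 25
print(effective_bandwidth([10, 10], 25))           # 20
```

Under this cap, an algorithm that drives all four links at once gains little over one that drives three, which is one way the constraint can change which algorithm is fastest.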
Enhancing PC Cluster-Based Parallel Branch-and-Bound Algorithms for the Graph Coloring Problem

NASA Astrophysics Data System (ADS)

Taoka, Satoshi; Takafuji, Daisuke; Watanabe, Toshimasa

A branch-and-bound algorithm (BB for short) is the most general technique to deal with various combinatorial optimization problems. Even if it is used, computation time is likely to increase exponentially. So we consider its parallelization to reduce it. It has been reported that the computation time of a parallel BB heavily depends upon node-variable selection strategies. And, in case of a parallel BB, it is also necessary to prevent increase in communication time. So, it is important to pay attention to how many and what kind of nodes are to be transferred (called sending-node selection strategy). In this paper, for the graph coloring problem, we propose some sending-node selection strategies for a parallel BB algorithm by adopting MPI for parallelization and experimentally evaluate how these strategies affect computation time of a parallel BB on a PC cluster network.
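A sequential branch-and-bound for graph coloring can be sketched as follows. It is a minimal illustration of the branching and bounding steps only: it does not use MPI and does not implement the sending-node selection strategies the paper proposes. The function name is hypothetical.

```python
def chromatic_number(n, edges):
    """Smallest number of colors for an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    best = [n]          # trivial upper bound: one color per node
    colors = [-1] * n   # -1 means "not yet colored"

    def branch(v, used):
        # Bound: this subtree cannot beat the best coloring found so far.
        if used >= best[0]:
            return
        if v == n:
            best[0] = used
            return
        forbidden = {colors[u] for u in adj[v] if colors[u] != -1}
        # Branch: try each color already in use, plus one new color.
        for c in range(used + 1):
            if c not in forbidden:
                colors[v] = c
                branch(v + 1, max(used, c + 1))
                colors[v] = -1

    branch(0, 0)
    return best[0]


# A 5-cycle needs three colors; a 4-cycle needs two.
print(chromatic_number(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]))  # 3
print(chromatic_number(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))          # 2
```

In a parallel version, the subtrees created in the branching loop are the units of work that could be sent to other processes, which is where a sending-node selection strategy would apply.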
The Case for Modular Redundancy in Large-Scale High Performance Computing Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Engelmann, Christian; Ong, Hong Hoe; Scott, Stephen L

2009-01-01

Recent investigations into resilience of large-scale high-performance computing (HPC) systems showed a continuous trend of decreasing reliability and availability. Newly installed systems have a lower mean-time to failure (MTTF) and a higher mean-time to recover (MTTR) than their predecessors. Modular redundancy is being used in many mission critical systems today to provide for resilience, such as for aerospace and command & control systems. The primary argument against modular redundancy for resilience in HPC has always been that the capability of an HPC system, and respective return on investment, would be significantly reduced. We argue that modular redundancy can significantly increase compute node availability as it removes the impact of scale from single compute node MTTR. We further argue that single compute nodes can be much less reliable, and therefore less expensive, and still be highly available, if their MTTR/MTTF ratio is maintained.
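The closing argument about the MTTR/MTTF ratio can be illustrated with the standard steady-state availability formula, A = MTTF / (MTTF + MTTR) = 1 / (1 + MTTR/MTTF). The sketch below uses that textbook formula together with the usual independent-failure model for redundant modules; it is not the paper's own model, and the numbers are hypothetical.

```python
def availability(mttf, mttr):
    """Steady-state availability of a single node."""
    return mttf / (mttf + mttr)


def redundant_availability(single, replicas):
    """Availability when any one of `replicas` independent nodes suffices."""
    return 1.0 - (1.0 - single) ** replicas


# Same MTTR/MTTF ratio gives the same availability, even though the second
# node fails ten times as often.
reliable = availability(mttf=1000.0, mttr=10.0)
cheap = availability(mttf=100.0, mttr=1.0)
print(reliable, cheap)

# Redundancy under the independence assumption: two 90 %-available nodes.
print(redundant_availability(0.9, 2))
```

Since availability depends only on the ratio, a less reliable node stays equally available as long as its recovery time shrinks in proportion.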
Stereological Cell Morphometry In Right Atrium Myocardium Of Primates

NASA Astrophysics Data System (ADS)

Mandarim-De-Lacerda, Carlos A...; Hureau, Jacques

1986-07-01

The mechanism by which the cardiac impulse is propagated in normal hearts from its origin in the sinus node to the atrio-ventricular node has not been agreed on fully. We studied the "internodal posterior tract" through the crista terminalis by light microscopy and stereological morphometry. The hearts of 12 Papio cynocephalus were perfused, after sacrifice, with phosphate-buffered formol saline. The regions of the crista terminalis (CT), interatrial septum (IAS), atrioventricular bundle (AVB) and interventricular septum (IVS) were excised, embedded in paraplast, and sectioned (10 µm). The multipurpose test system M 42 was superimposed over the photomicrographs (1,890 test points, ESR = 2%) for the stereological computation. The quantitative results show that the cells from the CT were more closely related to IAS cells than to the other cells (IVS and AVB cells). These results do not provide morphological evidence establishing the specificity of the "internodal posterior tract". The cellular arrangement and anatomical variation in CT myocardium are very important.
Consistency mapping of 16 lymph node stations in gastric cancer by CT-based vessel-guided delineation of 255 patients.

PubMed

Xu, Shuhang; Feng, Lingling; Chen, Yongming; Sun, Ying; Lu, Yao; Huang, Shaomin; Fu, Yang; Zheng, Rongqin; Zhang, Yujing; Zhang, Rong

2017-06-20

In order to refine the location and metastasis-risk density of 16 lymph node stations of gastric cancer for neoadjuvant radiotherapy, we retrospectively reviewed the initial images and pathological reports of 255 gastric cancer patients with lymphatic metastasis. Metastatic lymph nodes identified in the initial computed tomography images were investigated by two radiologists with gastrointestinal specialty. A circle with a diameter of 5 mm was used to identify the central position of each metastatic lymph node, defined as the LNc (the central position of the lymph node). The LNc was drawn at the equivalent location on the reference images of a standard patient based on the relative distances to the same reference vessels and the gastric wall using a Monaco® version 5.0 workstation. The image manipulation software Medi-capture was programmed for image analysis to produce a contour and density atlas of 16 lymph node stations. Based on a total of 2846 LNcs contoured (31-599 per lymph node station), we created a density distribution map of 16 lymph node drainage stations of the stomach on computed tomography images, showing the detailed radiographic delineation of each lymph node station as well as high-risk areas for lymph node metastasis. Our mapping can serve as a template for the delineation of gastric lymph node stations when defining clinical target volume in pre-operative radiotherapy for gastric cancer.
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

PubMed Central

Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

2018-01-01

Abstract Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
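The core MCL iteration that HipMCL parallelizes alternates expansion (squaring the column-stochastic matrix) with inflation (raising entries to a power and renormalizing columns). The sketch below is a small, dense, single-process illustration in plain Python. It is not HipMCL's implementation, omits pruning and sparse storage, and its function names, parameter defaults, and cluster read-out threshold are assumptions.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: matrix squaring, i.e. two-step random walks."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    n = len(adjacency)
    # Add self-loops, as is conventional for MCL, then normalize.
    m = normalize_columns([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that retains mass is an attractor; its nonzero columns form a cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)


# Two disconnected triangles: flow can never cross between them.
triangles = [[0, 1, 1, 0, 0, 0],
             [1, 0, 1, 0, 0, 0],
             [1, 1, 0, 0, 0, 0],
             [0, 0, 0, 0, 1, 1],
             [0, 0, 0, 1, 0, 1],
             [0, 0, 0, 1, 1, 0]]
print(mcl(triangles))  # [[0, 1, 2], [3, 4, 5]]
```

The expansion step is a matrix product, which is where the running time and memory demands mentioned in the abstract concentrate for large networks.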
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE PAGES

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

2018-01-05

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William

1986-01-01

The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
Self-Organizing OFDMA System for Broadband Communication

NASA Technical Reports Server (NTRS)

Roy, Aloke (Inventor); Anandappan, Thanga (Inventor); Malve, Sharath Babu (Inventor)

2016-01-01

Systems and methods for a self-organizing OFDMA system for broadband communication are provided. In certain embodiments a communication node for a self organizing network comprises a communication interface configured to transmit data to and receive data from a plurality of nodes; and a processing unit configured to execute computer readable instructions. Further, computer readable instructions direct the processing unit to identify a sub-region within a cell, wherein the communication node is located in the sub-region; and transmit at least one data frame, wherein the data from the communication node is transmitted at a particular time and frequency as defined within the at least one data frame, where the time and frequency are associated with the sub-region.

Collective network for computer structures

DOEpatents

Blumrich, Matthias A; Coteus, Paul W; Chen, Dong; Gara, Alan; Giampapa, Mark E; Heidelberger, Philip; Hoenicke, Dirk; Takken, Todd E; Steinmacher-Burow, Burkhard D; Vranas, Pavlos M

2014-01-07

A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to the needs of a processing algorithm.
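A collective reduction over a tree can be sketched in software as pairwise combining, level by level. This illustrates the idea only; the patent describes a hardware network with router devices, and the function name and binary-tree shape here are assumptions.

```python
from operator import add


def tree_reduce(values, op):
    """Combine per-node values pairwise up a binary tree.

    Returns (result, rounds); `values` must be non-empty. The number of
    rounds grows with the logarithm of the node count.
    """
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    rounds = 0
    while len(level) > 1:
        # Each parent combines its two children; an unpaired node passes through.
        level = [op(level[i], level[i + 1]) if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds


# Global sum over eight nodes completes in three combining rounds.
print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], add))  # (36, 3)
```

The logarithmic round count is what makes tree-shaped collectives attractive compared with passing a running total from node to node in sequence.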
Prevention of Malicious Nodes Communication in MANETs by Using Authorized Tokens

NASA Astrophysics Data System (ADS)

Chandrakant, N.; Shenoy, P. Deepa; Venugopal, K. R.; Patnaik, L. M.

A rapid increase of wireless networks and mobile computing applications has changed the landscape of network security. A MANET is more susceptible to attacks than a wired network. As a result, attacks with malicious intent have been and will be devised to take advantage of these vulnerabilities and to cripple MANET operation. Hence we need to search for new architectures and mechanisms to protect wireless networks and mobile computing applications. In this paper, we examine the nodes that come within the vicinity of the base node and the members of the network, and communication is provided to genuine nodes only. The proposed algorithm is found to be effective for security in MANETs.
Collective network for computer structures

DOEpatents

Blumrich, Matthias A [Ridgefield, CT]; Coteus, Paul W [Yorktown Heights, NY]; Chen, Dong [Croton On Hudson, NY]; Gara, Alan [Mount Kisco, NY]; Giampapa, Mark E [Irvington, NY]; Heidelberger, Philip [Cortlandt Manor, NY]; Hoenicke, Dirk [Ossining, NY]; Takken, Todd E [Brewster, NY]; Steinmacher-Burow, Burkhard D [Wernau, DE]; Vranas, Pavlos M [Bedford Hills, NY]

2011-08-16

A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to the needs of a processing algorithm.
Technique for Calculating Solution Derivatives With Respect to Geometry Parameters in a CFD Code

NASA Technical Reports Server (NTRS)

Mathur, Sanjay

2011-01-01

A solution has been developed to the challenges of computation of derivatives with respect to geometry, which is not straightforward because these are not typically direct inputs to the computational fluid dynamics (CFD) solver. To overcome these issues, a procedure has been devised that can be used without having access to the mesh generator, while still being applicable to all types of meshes. The basic approach is inspired by the mesh motion algorithms used to deform the interior mesh nodes in a smooth manner when the surface nodes move, for example, in a fluid-structure interaction problem. The general idea is to model the mesh edges and nodes as constituting a spring-mass system. Changes to boundary node locations are propagated to interior nodes by allowing them to assume their new equilibrium positions, that is, ones where the forces on each node are in balance. The main advantage of the technique is that it is independent of the volumetric mesh generator, and can be applied to structured, unstructured, single- and multi-block meshes. It essentially reduces the problem down to defining the surface mesh node derivatives with respect to the geometry parameters of interest. For analytical geometries, this is quite straightforward. In the more general case, one would need to be able to interrogate the underlying parametric CAD (computer aided design) model and to evaluate the derivatives either analytically, or by a finite difference technique. Because the technique is based on a partial differential equation (PDE), it is applicable not only to forward mode problems (where derivatives of all the output quantities are computed with respect to a single input), but it could also be extended to the adjoint problem, either by using an analytical adjoint of the PDE or a discrete analog.
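The spring-mass idea in this abstract can be illustrated with a minimal one-dimensional sketch. This is not the author's implementation: the function name `relax_displacements`, the choice of stiffness as inverse edge length, and the Gauss-Seidel relaxation are all assumptions made for illustration. Interior nodes settle where the spring forces from their two neighbors balance, so a unit displacement of one boundary node yields the sensitivity of every interior node position to that boundary parameter.

```python
def relax_displacements(x, left_disp, right_disp, max_iters=10000, tol=1e-12):
    """Propagate boundary displacements to interior nodes of a 1D spring chain.

    x is a sorted list of node coordinates. Each edge is treated as a spring
    whose stiffness is the inverse of its length (an illustrative assumption).
    Returns the equilibrium displacement of every node.
    """
    n = len(x)
    stiffness = [1.0 / (x[i + 1] - x[i]) for i in range(n - 1)]
    disp = [0.0] * n
    disp[0], disp[-1] = left_disp, right_disp
    for _ in range(max_iters):
        largest_change = 0.0
        for i in range(1, n - 1):
            k_left, k_right = stiffness[i - 1], stiffness[i]
            # Force balance: k_left*(d[i-1]-d[i]) + k_right*(d[i+1]-d[i]) = 0
            new = (k_left * disp[i - 1] + k_right * disp[i + 1]) / (k_left + k_right)
            largest_change = max(largest_change, abs(new - disp[i]))
            disp[i] = new
        if largest_change < tol:
            break
    return disp


# Unit displacement of the right boundary: the result is d(x_i)/d(x_right).
nodes = [0.0, 0.1, 0.4, 1.0]
sensitivities = relax_displacements(nodes, 0.0, 1.0)
```

Because the system is linear, the displacements obtained for a unit boundary move are exactly the node-position derivatives with respect to that boundary coordinate; in this 1D chain they vary linearly with position.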
Monoclonal origin of peritoneal implants and lymph node deposits in serous borderline ovarian tumors (s-BOT) with high intratumoral homogeneity.

PubMed

Horn, Lars-Christian; Höhn, Anne K; Einenkel, Jens; Siebolts, Udo

2014-11-01

Molecular studies have shown that the most prevalent mutations in serous ovarian borderline tumors (s-BOT) are BRAF and/or KRAS alterations. About one third of s-BOT present with peritoneal implants and/or lymph node involvement. These extraovarian deposits may be monoclonal or polyclonal in origin. To test both hypotheses, mutational analyses using pyrosequencing for BRAF codon 600 and KRAS codon 12/13 and 61 of microdissected tissue were performed in 15 s-BOT and their invasive and noninvasive peritoneal implants. Two to 6 implants from different peritoneal sites were examined in 13 cases. Lymph node deposits were available for the analysis in 3 cases. Six s-BOT showed mutation in exon 2 codon 12 of the KRAS proto-oncogene. Five additional cases showed BRAF p.V600E mutation, representing an overall mutation rate of 73.3%. Multiple (2-6) peritoneal implants were analyzed after microdissection in 13 of 15 cases. All showed identical mutational results when compared with the ovarian site of the disease. All lymph node deposits, including those with multiple deposits in different nodes, showed identical results, suggesting high intratumoral mutational homogeneity. The evidence presented in this study and the majority of data reported in the literature support the hypothesis that s-BOT with their peritoneal implants and lymph node deposits show identical mutational status of BRAF and KRAS, suggesting a monoclonal rather than a polyclonal disease regarding both tested genetic loci. In addition, a high intratumoral genetic homogeneity can be suggested. In conclusion, the results of the present study support the monoclonal origin of s-BOT and their peritoneal implants and lymph node deposits.
Utility of Computed Tomography versus Abdominal Ultrasound Examination to Identify Iliosacral Lymphadenomegaly in Dogs with Apocrine Gland Adenocarcinoma of the Anal Sac.

PubMed

Palladino, S; Keyerleber, M A; King, R G; Burgess, K E

2016-11-01

Apocrine gland adenocarcinoma of the anal sac (AGAAS) is associated with high rates of iliosacral lymph node metastasis, which may influence treatment and prognosis. Magnetic resonance imaging (MRI) recently has been shown to be more sensitive than abdominal ultrasound examination (AUS) in affected patients. To compare the rate of detection of iliosacral lymphadenomegaly between AUS and computed tomography (CT) in dogs with AGAAS. Cohort A: A total of 30 presumed normal dogs. Cohort B: A total of 20 dogs with AGAAS that underwent AUS and CT. Using cohort A, mean normalized lymph node : aorta (LN : AO) ratios were established for medial iliac, internal iliac, and sacral lymph nodes. The CT images in cohort B then were reviewed retrospectively and considered enlarged if their LN : AO ratio measured 2 standard deviations above the mean normalized ratio for that particular node in cohort A. Classification and visibility of lymph nodes identified on AUS were compared to corresponding measurements obtained on CT. Computed tomography identified lymphadenomegaly in 13 of 20 AGAAS dogs. Of these 13 dogs, AUS correctly identified and detected all enlarged nodes in only 30.8%, and either misidentified or failed to detect additional enlarged nodes in the remaining dogs. Despite limitations in identifying enlargement in all affected lymph nodes, AUS identified at least 1 enlarged node in 100% of affected dogs. Abdominal ultrasound examination is an effective screening test for lymphadenomegaly in dogs with AGAAS, but CT should be considered in any patient in which an additional metastatic site would impact therapeutic planning. Copyright © 2016 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
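The enlargement criterion described above (a LN : AO ratio more than 2 standard deviations above the normal-cohort mean) can be sketched as follows. The function names and the sample ratios are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def enlargement_threshold(normal_ratios):
    """Mean normalized LN:AO ratio of the normal cohort plus 2 standard deviations."""
    return mean(normal_ratios) + 2 * stdev(normal_ratios)


def is_enlarged(ratio, normal_ratios):
    """Classify a node as enlarged if its LN:AO ratio exceeds the threshold."""
    return ratio > enlargement_threshold(normal_ratios)


# Hypothetical ratios for one lymph node group in presumed normal dogs.
normal_medial_iliac = [0.4, 0.5, 0.6]
threshold = enlargement_threshold(normal_medial_iliac)
```

A separate threshold would be computed per node group (medial iliac, internal iliac, sacral), since the abstract states the comparison is made against the mean for that particular node.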
SU-E-T-628: A Cloud Computing Based Multi-Objective Optimization Method for Inverse Treatment Planning.

PubMed

Na, Y; Suh, T; Xing, L

2012-06-01

Multi-objective (MO) plan optimization entails generation of an enormous number of IMRT or VMAT plans constituting the Pareto surface, which presents a computationally challenging task. The purpose of this work is to overcome the hurdle by developing an efficient MO method using an emerging cloud computing platform. As a backbone of cloud computing for optimizing inverse treatment planning, Amazon Elastic Compute Cloud with a master node (17.1 GB memory, 2 virtual cores, 420 GB instance storage, 64-bit platform) is used. The master node is able to seamlessly scale a number of working group instances, called workers, based on user-defined settings that account for MO functions in the clinical setting. Each worker solves the objective function with an efficient sparse decomposition method. The workers are automatically terminated when their tasks are finished. The optimized plans are archived to the master node to generate the Pareto solution set. Three clinical cases have been planned using the developed MO IMRT and VMAT planning tools to demonstrate the advantages of the proposed method. The target dose coverage and critical structure sparing of plans obtained using the cloud computing platform are identical to those obtained using a desktop PC (Intel Xeon® CPU 2.33GHz, 8GB memory). It is found that the MO planning speeds up the process of obtaining the Pareto set substantially for both types of plans. The speedup scales approximately linearly with the number of nodes used for computing. With the use of N nodes, the computational time is reduced according to the fitting model 0.2+2.3/N, with r^2>0.99, averaged over the cases, making real-time MO planning possible. A cloud computing infrastructure is developed for MO optimization. The algorithm substantially improves the speed of inverse plan optimization. The platform is valuable for both MO planning and future off- or on-line adaptive re-planning. © 2012 American Association of Physicists in Medicine.
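To make the fitted scaling concrete, the sketch below evaluates the expression 0.2+2.3/N quoted in the abstract. It assumes the expression gives computational time as a function of the node count N; the abstract does not state the units, so the values are treated as unitless, and the function names are illustrative only.

```python
def modeled_time(n_nodes):
    """Fitted computational time for n_nodes workers (units not given in the abstract)."""
    return 0.2 + 2.3 / n_nodes


def modeled_speedup(n_nodes):
    """Speedup relative to a single node under the same fitted model."""
    return modeled_time(1) / modeled_time(n_nodes)


for n in (1, 2, 4, 8, 16):
    print(n, round(modeled_time(n), 3), round(modeled_speedup(n), 2))
```

Under this reading, the constant term 0.2 is the part of the time that does not shrink with more nodes, so the modeled speedup is close to linear for small N and flattens as 2.3/N approaches the size of the constant term.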
A Hybrid MPI/OpenMP Approach for Parallel Groundwater Model Calibration on Multicore Computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan

2010-01-01

Groundwater model calibration is becoming increasingly computationally time intensive. We describe a hybrid MPI/OpenMP approach to exploit two levels of parallelism in software and hardware to reduce calibration time on multicore computers with minimal parallelization effort. At first, HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for a uranium transport model with over a hundred species involving nearly a hundred reactions, and a field scale coupled flow and transport model. In the first application, a single parallelizable loop is identified to consume over 97% of the total computational time. With a few lines of OpenMP compiler directives inserted into the code, the computational time reduces about ten times on a compute node with 16 cores. The performance is further improved by selectively parallelizing a few more loops. For the field scale application, parallelizable loops in 15 of the 174 subroutines in HGC5 are identified to take more than 99% of the execution time. By adding the preconditioned conjugate gradient solver and BICGSTAB, and using a coloring scheme to separate the elements, nodes, and boundary sides, the subroutines for finite element assembly, soil property update, and boundary condition application are parallelized, resulting in a speedup of about 10 on a 16-core compute node. The Levenberg-Marquardt (LM) algorithm is added into HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, compute nodes at the number of adjustable parameters (when the forward difference is used for Jacobian approximation), or twice that number (if the center difference is used), are used to reduce the calibration time from days and weeks to a few hours for the two applications. This approach can be extended to global optimization schemes and Monte Carlo analysis where thousands of compute nodes can be efficiently utilized.
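The node counts quoted above follow from how a finite-difference Jacobian is built: each adjustable parameter needs one perturbed forward run for a forward difference and two for a center difference, and those runs are independent of each other. The serial sketch below only illustrates that counting. `fd_jacobian` and `toy_model` are hypothetical names, the toy model stands in for a real forward simulation, and no MPI is used here.

```python
def fd_jacobian(model, params, step=1e-6, central=False):
    """Finite-difference Jacobian of model(params) -> list of outputs.

    Returns (jacobian, perturbed_runs), where perturbed_runs counts the
    independent perturbed forward runs (the baseline run is not counted).
    """
    n = len(params)
    baseline = None if central else model(params)
    columns, perturbed_runs = [], 0
    for j in range(n):
        up = list(params)
        up[j] += step
        f_up = model(up)
        perturbed_runs += 1
        if central:
            down = list(params)
            down[j] -= step
            f_down = model(down)
            perturbed_runs += 1
            col = [(u - d) / (2 * step) for u, d in zip(f_up, f_down)]
        else:
            col = [(u - b) / step for u, b in zip(f_up, baseline)]
        columns.append(col)
    # Transpose so that jacobian[i][j] = d output_i / d param_j.
    jacobian = [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]
    return jacobian, perturbed_runs


def toy_model(p):
    """Hypothetical stand-in for a forward model run."""
    return [p[0] ** 2 + p[1], 3.0 * p[1]]


jac_fwd, runs_fwd = fd_jacobian(toy_model, [2.0, 1.0])
jac_ctr, runs_ctr = fd_jacobian(toy_model, [2.0, 1.0], central=True)
```

With two parameters the forward difference needs 2 perturbed runs and the center difference 4, which matches the "number of adjustable parameters" and "twice that number" of compute nodes mentioned in the abstract if each run is assigned to its own node.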
A Survey on the Feasibility of Sound Classification on Wireless Sensor Nodes

PubMed Central

Salomons, Etto L.; Havinga, Paul J. M.

2015-01-01

Wireless sensor networks are suitable to gain context awareness for indoor environments. As sound waves form a rich source of context information, equipping the nodes with microphones can be of great benefit. The algorithms to extract features from sound waves are often highly computationally intensive. This can be problematic as wireless nodes are usually restricted in resources. In order to be able to make a proper decision about which features to use, we survey how sound is used in the literature for global sound classification, age and gender classification, emotion recognition, person verification and identification and indoor and outdoor environmental sound classification. The results of the surveyed algorithms are compared with respect to accuracy and computational load. The accuracies are taken from the surveyed papers; the computational loads are determined by benchmarking the algorithms on an actual sensor node. We conclude that for indoor context awareness, the low-cost algorithms for feature extraction perform equally well as the more computationally-intensive variants. As the feature extraction still requires a large amount of processing time, we present four possible strategies to deal with this problem. PMID:25822142
Distributed solar radiation fast dynamic measurement for PV cells

NASA Astrophysics Data System (ADS)

Wan, Xuefen; Yang, Yi; Cui, Jian; Du, Xingjing; Zheng, Tao; Sardar, Muhammad Sohail

2017-10-01

To study the operating characteristics of PV cells, attention must be given to the dynamic behavior of the solar radiation. The dynamic behaviors of annual, monthly, daily and hourly averages of solar radiation have been studied in detail, but faster dynamic behaviors of solar radiation need more research. Random fluctuations of solar radiation in the minute-long or second-long range, which lead to alternating radiation and frequently cool down and warm up the PV cell, decrease conversion efficiency. Fast dynamic processes of solar radiation are mainly related to the stochastic movement of clouds. Even in clear sky conditions, solar irradiation shows a certain degree of fast variation. To evaluate operating characteristics of PV cells under fast dynamic irradiation, a solar radiation measuring array (SRMA) based on large active area photodiodes, LoRa spread spectrum communication and a nanoWatt MCU is proposed. This cross photodiodes structure tracks the fast stochastic movement of clouds. To compensate for the response time of the pyranometer and reduce system cost, the terminal nodes with low-cost, fast-response large active area photodiodes are placed beside the positions of the tested PV cells. A central node, consisting of a pyranometer, a large active area photodiode, a wind detector and a host computer, is placed in the center of the central topologies coordinate to scale the temporal envelope of solar irradiation and obtain calibration information between the pyranometer and the large active area photodiodes. In our SRMA system, the terminal nodes are designed based on Microchip's nanoWatt XLP PIC16F1947. FDS-100 is adopted for the large active area photodiode in terminal nodes and host computer. The output current and voltage of each PV cell are monitored by I/V measurement. AS62-T27/SX1278 LoRa communication modules are used for communicating between terminal nodes and host computer.
Because the LoRa LPWAN (Low Power Wide Area Network) specification provides seamless interoperability among Smart Things without the need for complex local installations, configuring our SRMA system is very easy. LoRa also provides SRMA a means to overcome the short communication distance and weather-related signal propagation losses seen with ZigBee and WiFi. The host computer in the SRMA system uses the low power single-board PC EMB-3870 produced by NORCO. Wind direction sensor SM5386B and wind-force sensor SM5387B are connected to the host computer through an RS-485 bus for wind reference data collection. A Davis 6450 solar radiation sensor, a precision instrument that detects radiation at wavelengths of 300 to 1100 nanometers, allows the host computer to follow real-time solar radiation. A LoRa polling scheme is adopted for the communication between the host computer and terminal nodes in SRMA. An experimental SRMA has been established. This system was tested in Ganyu, Jiangsu province from May to August, 2016. In the test, the distances between the nodes and the host computer were between 100m and 1900m. In operation, the SRMA system showed high reliability. Terminal nodes could follow the instructions from the host computer and collect solar radiation data of distributed PV cells effectively. The host computer managed the SRMA and acquired reference parameters well. Communications between the host computer and terminal nodes were almost unaffected by the weather. In conclusion, the testing results show that SRMA could be a capable method for fast dynamic measurement of solar radiation and related PV cell operating characteristics.
Load balancing strategy and its lookup-table enhancement in deterministic space delay/disruption tolerant networks

NASA Astrophysics Data System (ADS)

Huang, Jinhui; Liu, Wenxiang; Su, Yingxue; Wang, Feixue

2018-02-01

Space networks, in which connectivity is deterministic and intermittent, can be modeled by delay/disruption tolerant networks. In space delay/disruption tolerant networks, a packet is usually transmitted from the source node to the destination node indirectly via a series of relay nodes. If any one of the nodes in the path becomes congested, the packet will be dropped due to buffer overflow. One of the main reasons behind congestion is the unbalanced network traffic distribution. We propose a load balancing strategy which takes the congestion status of both the local node and relay nodes into account. The congestion status, together with the end-to-end delay, is used in the routing selection. A lookup-table enhancement is also proposed. The off-line computation and the on-line adjustment are combined together to make a more precise estimate of the end-to-end delay while at the same time reducing the onboard computation. Simulation results show that the proposed strategy helps to distribute network traffic more evenly and therefore reduces the packet drop ratio. In addition, the average delay is also decreased in most cases. The lookup-table enhancement provides a compromise between the need for better communication performance and the desire for less onboard computation.
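The abstract does not give the actual routing metric, so the following is only a hypothetical sketch of how congestion status and end-to-end delay might be combined when choosing a relay. The `Candidate` fields, the `select_relay` function, the cost formula and the `weight` parameter are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    relay: str            # identifier of the next-hop relay node
    delay_s: float        # estimated end-to-end delay via this relay, in seconds
    relay_buffer: float   # relay buffer occupancy, 0.0 (empty) to 1.0 (full)


def select_relay(candidates, local_buffer, weight=1.0):
    """Pick the relay with the lowest congestion-penalized delay.

    Hypothetical cost: delay scaled up by the combined buffer occupancy of
    the local node and the relay. weight=0 reduces to pure minimum delay.
    """
    def cost(c):
        return c.delay_s * (1.0 + weight * (local_buffer + c.relay_buffer))

    return min(candidates, key=cost).relay


routes = [
    Candidate("relay-A", delay_s=10.0, relay_buffer=0.9),
    Candidate("relay-B", delay_s=12.0, relay_buffer=0.1),
]
balanced_choice = select_relay(routes, local_buffer=0.2)
delay_only_choice = select_relay(routes, local_buffer=0.2, weight=0.0)
```

In this toy case the congestion-aware choice avoids the nearly full relay even though its raw delay is lower, which is the kind of traffic spreading the abstract describes. The delay estimates themselves would come from the off-line lookup table with on-line adjustment, which is not modeled here.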
Accurate Analysis of the Change in Volume, Location, and Shape of Metastatic Cervical Lymph Nodes During Radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Takao, Seishin, E-mail: takao@mech-me.eng.hokudai.ac.jp; Tadano, Shigeru; Taguchi, Hiroshi

2011-11-01

Purpose: To establish a method for the accurate acquisition and analysis of the variations in tumor volume, location, and three-dimensional (3D) shape of tumors during radiotherapy in the era of image-guided radiotherapy. Methods and Materials: Finite element models of lymph nodes were developed based on computed tomography (CT) images taken before the start of treatment and every week during the treatment period. A surface geometry map with a volumetric scale was adopted and used for the analysis. Six metastatic cervical lymph nodes, 3.5 to 55.1 cm³ before treatment, in 6 patients with head and neck carcinomas were analyzed in this study. Three fiducial markers implanted in mouthpieces were used for the fusion of CT images. Changes in the location of the lymph nodes were measured on the basis of these fiducial markers. Results: The surface geometry maps showed convex regions in red and concave regions in blue to ensure that the characteristics of the 3D tumor geometries are simply understood visually. After the irradiation of 66 to 70 Gy in 2 Gy daily doses, the patterns of the colors had not changed significantly, and the maps before and during treatment were strongly correlated (average correlation coefficient was 0.808), suggesting that the tumors shrank uniformly, maintaining the original characteristics of the shapes in all 6 patients. The movement of the gravitational center of the lymph nodes during the treatment period was everywhere less than ±5 mm except in 1 patient, in whom the change reached nearly 10 mm. Conclusions: The surface geometry map was useful for an accurate evaluation of the changes in volume and 3D shapes of metastatic lymph nodes. The fusion of the initial and follow-up CT images based on fiducial markers enabled an analysis of changes in the location of the targets. Metastatic cervical lymph nodes in patients were suggested to decrease in size without significant changes in the 3D shape during radiotherapy.
The movements of the gravitational center of the lymph nodes were almost all less than ±5 mm.
RTOG GU Radiation Oncology Specialists Reach Consensus on Pelvic Lymph Node Volumes for High-Risk Prostate Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lawton, Colleen A.F.; Michalski, Jeff; El-Naqa, Issam

2009-06-01

Purpose: Radiation therapy to the pelvic lymph nodes in high-risk prostate cancer is required on several Radiation Therapy Oncology Group (RTOG) clinical trials. Based on a prior lymph node contouring project, we have shown significant disagreement in the definition of pelvic lymph node volumes among genitourinary radiation oncology specialists involved in developing and executing current RTOG trials. Materials and Methods: A consensus meeting was held on October 3, 2007, to reach agreement on pelvic lymph node volumes. Data were presented to address the lymph node drainage of the prostate. Extensive discussion ensued to develop clinical target volume (CTV) pelvic lymph node consensus. Results: Consensus was obtained resulting in computed tomography image-based pelvic lymph node CTVs. Based on this consensus, the pelvic lymph node volumes to be irradiated include: distal common iliac, presacral lymph nodes (S1-S3), external iliac lymph nodes, internal iliac lymph nodes, and obturator lymph nodes. Lymph node CTVs include the vessels (artery and vein) and a 7-mm radial margin being careful to 'carve out' bowel, bladder, bone, and muscle. Volumes begin at the L5/S1 interspace and end at the superior aspect of the pubic bone. Consensus on dose-volume histogram constraints for OARs was also attained. Conclusions: Consensus on pelvic lymph node CTVs for radiation therapy to address high-risk prostate cancer was attained and is available as web-based computed tomography images as well as a descriptive format through the RTOG. This will allow for uniformity in evaluating the benefit and risk of such treatment.
Distributed downhole drilling network

DOEpatents

Hall, David R.; Hall, Jr., H. Tracy; Fox, Joe; Pixton, David S.

2006-11-21

A high-speed downhole network providing real-time data from downhole components of a drill string includes a bottom-hole node interfacing to a bottom-hole assembly located proximate the bottom end of the drill string. A top-hole node is connected proximate the top end of the drill string. One or several intermediate nodes are located along the drill string between the bottom-hole node and the top-hole node. The intermediate nodes are configured to receive and transmit data packets transmitted between the bottom-hole node and the top-hole node. A communications link, integrated into the drill string, is used to operably connect the bottom-hole node, the intermediate nodes, and the top-hole node. In selected embodiments, a personal or other computer may be connected to the top-hole node, to analyze data received from the intermediate and bottom-hole nodes.
Efficient computation of kinship and identity coefficients on large pedigrees.

PubMed

Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral

2009-06-01

With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
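For context, the traditional recursive approach that the abstract uses as a baseline can be sketched for the ordinary two-individual kinship coefficient. This is the classic textbook recursion, not the Path-Counting Formula or the NodeCodes scheme proposed in the paper, and the pedigree below is a hypothetical example.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother); founders have no parents.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),
}


@lru_cache(maxsize=None)
def generation(x):
    """Depth of x in the pedigree; founders are generation 0."""
    parents = [p for p in PEDIGREE[x] if p is not None]
    return 0 if not parents else 1 + max(generation(p) for p in parents)


@lru_cache(maxsize=None)
def kinship(a, b):
    """Classic recursive kinship coefficient between individuals a and b."""
    if a is None or b is None:
        return 0.0
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    # Expand the individual from the later generation, which cannot be
    # an ancestor of the other one.
    if generation(a) < generation(b):
        a, b = b, a
    father, mother = PEDIGREE[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))


inbreeding_of_e = kinship("C", "D")  # inbreeding coefficient of E is the kinship of its parents
```

The memoization keeps this toy example fast, but the recursion still revisits ancestor pairs across the whole pedigree, which is the scalability problem the encoding-based schemes in the abstract aim to address.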
Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

PubMed

Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania

2015-01-01

This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.
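The row-wise idea can be illustrated with a small CPU-only sketch that computes the Smith-Waterman local alignment score one matrix row at a time, keeping only the previous row in memory. It is not the authors' MPI-CUDA implementation; the function name, the linear gap penalty and the scoring values are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b, computed row by row.

    Uses a linear gap penalty and stores only two rows of the scoring matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("ACGT", "ACGT")
```

Each call updates len(a) × len(b) cells, which is the quantity counted in the cell-updates-per-second figure quoted in the abstract. In a master/worker layout like the one described, each worker would run many such pairwise comparisons and return only the scores.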
Esophageal cancer associated with a sarcoid-like reaction and systemic sarcoidosis in lymph nodes: supportive findings of [18F]-fluorodeoxyglucose positron emission tomography-computed tomography during neoadjuvant therapy.

PubMed

Kishino, Takayoshi; Okano, Keiichi; Ando, Yasuhisa; Suto, Hironobu; Asano, Eisuke; Oshima, Minoru; Fujiwara, Masao; Usuki, Hisashi; Kobara, Hideki; Masaki, Tsutomu; Ibuki, Emi; Kushida, Yoshio; Haba, Reiji; Suzuki, Yasuyuki

2018-06-25

In patients with esophageal cancer, differentiation between lymph node metastasis and lymphadenopathies from sarcoidosis or sarcoid-like reactions of lymph nodes is clinically important. Herein, we report two esophageal cancer cases with lymph node involvement of sarcoid-like reaction or sarcoidosis. One patient received chemotherapy and the other chemoradiotherapy as initial treatments. In both cases, [18F]-fluorodeoxyglucose positron emission tomography-computed tomography (FDG-PET/CT) was performed before and after chemo(radio)therapy. After the treatment, FDG uptake was not detected in the primary tumor, but it was slightly reduced in the hilar and mediastinal lymph nodes in both cases. These non-identical responses to chemo(radio)therapy suggest the presence of sarcoid-like reaction of lymph nodes associated with squamous cell carcinoma of the esophagus. Curative surgical resection was performed as treatment. These FDG-PET/CT findings may be helpful to distinguish between metastasis and sarcoidosis-associated lymphadenopathy in esophageal cancer.
The Role of Energy Reservoirs in Distributed Computing: Manufacturing, Implementing, and Optimizing Energy Storage in Energy-Autonomous Sensor Nodes

NASA Astrophysics Data System (ADS)

Cowell, Martin Andrew

The world already hosts more internet connected devices than people, and that ratio is only increasing. These devices seamlessly integrate with people's lives to collect rich data and give immediate feedback about complex systems from business, health care, transportation, and security. Every aspect of global economies is integrating distributed computing into its industrial systems, and these systems benefit from rich datasets. Managing the power demands of these distributed computers will be paramount to ensure the continued operation of these networks, and is elegantly addressed by including local energy harvesting and storage on a per-node basis. By replacing non-rechargeable batteries with energy harvesting, wireless sensor nodes will increase their lifetimes by an order of magnitude. This work investigates the coupling of high power energy storage with energy harvesting technologies to power wireless sensor nodes, with sections covering device manufacturing, system integration, and mathematical modeling. First we consider the energy storage mechanism of supercapacitors and batteries, and identify favorable characteristics in both reservoir types. We then discuss experimental methods used to manufacture high power supercapacitors in our labs. We go on to detail the integration of our fabricated devices with collaborating labs to create functional sensor node demonstrations. With the practical knowledge gained through in-lab manufacturing and system integration, we build mathematical models to aid in device and system design. First, we model the mechanism of energy storage in porous graphene supercapacitors to aid in component architecture optimization. We then model the operation of entire sensor nodes for the purpose of optimally sizing the energy harvesting and energy reservoir components.
In consideration of deploying these sensor nodes in real-world environments, we model the operation of our energy harvesting and power management systems subject to spatially and temporally varying energy availability in order to understand sensor node reliability. Looking to the future, we see an opportunity for further research to implement machine learning algorithms to control the energy resources of distributed computing networks.
An evaluation of the state of time synchronization on leadership class supercomputers

DOE PAGES

Jones, Terry; Ostrouchov, George; Koenig, Gregory A.; ...

2017-10-09

We present a detailed examination of time agreement characteristics for nodes within extreme-scale parallel computers. Using a software tool we introduce in this paper, we quantify attributes of clock skew among nodes in three representative high-performance computers sited at three national laboratories. Our measurements detail the statistical properties of time agreement among nodes and how time agreement drifts over typical application execution durations. We discuss the implications of our measurements, why the current state of the field is inadequate, and propose strategies to address observed shortcomings.

Lymph Node Size on Computed Tomography Images Is a Predictive Indicator for Lymph Node Metastasis in Patients with Colorectal Neuroendocrine Tumors.

PubMed

Tanaka, Toshiaki; Nozawa, Hiroaki; Kawai, Kazushige; Hata, Keisuke; Kiyomatsu, Tomomichi; Nishikawa, Takeshi; Otani, Kensuke; Sasaki, Kazuhito; Murono, Koji; Watanabe, Toshiaki

2017-01-01

Colorectal neuroendocrine tumors (NET) are a rare manifestation of colorectal neoplasia, requiring radical dissection of the regional lymph nodes along with colorectal resection similar to that required for colorectal cancer. However, thus far, no reports have described the ability of computed tomography (CT) to predict lymph node involvement. In this study, we revealed the prediction rate of lymph node metastasis using contrast-enhanced CT. A total of 21 patients with colorectal NET undergoing colorectal resection were recruited from January 2010 to June 2016. We compared the CT findings between samples with or without pathologically proven lymph node metastasis, in each field (pericolic/perirectal and intermediate nodes). Within the pericolic/perirectal field, any lymph node larger than 5 mm in the CT images was a predictive indicator of lymph node metastasis with a sensitivity, specificity, and area under ROC curve (AUC) of 66.7%, 87.5%, and 0.844, respectively. Within the intermediate field, any visible lymph node on the CT was a predictive indicator of lymph node metastasis with a sensitivity, specificity, and AUC of 100%, 76.4%, and 0.890, respectively. In addition, when we observed lymph nodes larger than 3 mm on the CT images, the sensitivity and specificity were 100% and 82.4%, respectively, with an AUC of 0.8971. CT images provide predictive information for lymph node metastasis with a high rate of accuracy. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
An Energy-Aware Hybrid ARQ Scheme with Multi-ACKs for Data Sensing Wireless Sensor Networks.

PubMed

Zhang, Jinhuan; Long, Jun

2017-06-12

Wireless sensor networks (WSNs) are one of the important supporting technologies of edge computing. In WSNs, reliable communications are essential for most applications due to the unreliability of wireless links. In addition, network lifetime is also an important performance metric and needs to be considered in many WSN studies. In this paper, an energy-aware hybrid Automatic Repeat-reQuest (ARQ) scheme is proposed to ensure energy efficiency while guaranteeing network transmission reliability. In the scheme, the source node sends data packets continuously with the correct window size and does not need to wait for the acknowledgement (ACK) confirmation for each data packet. When the destination receives K data packets, it returns multiple copies of one ACK for confirmation to avoid ACK packet loss. The energy consumption of each node in a flat circle network applying the proposed scheme is statistically analyzed, and the cases in which it is more energy efficient than the original scheme are discussed. Moreover, how to select parameters of the scheme is addressed to extend the network lifetime under the constraint of the network reliability. In addition, the energy efficiency of the proposed schemes is evaluated. Simulation results are presented to demonstrate that node energy consumption is reduced and the network lifetime is prolonged.
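The idea of returning several copies of one ACK can be illustrated with a small calculation. The sketch below is not the authors' analysis: it assumes each ACK copy is lost independently with the same probability, and the function names and the example loss rate are hypothetical.

```python
def ack_delivery_probability(ack_loss_rate: float, copies: int) -> float:
    """Probability that at least one of `copies` ACK copies arrives.

    Assumption (not from the abstract): copies are lost independently,
    each with probability `ack_loss_rate`.
    """
    if not 0.0 <= ack_loss_rate <= 1.0:
        raise ValueError("ack_loss_rate must be in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - ack_loss_rate ** copies


def min_ack_copies(ack_loss_rate: float, target: float) -> int:
    """Smallest number of ACK copies whose delivery probability reaches `target`."""
    if not 0.0 <= ack_loss_rate < 1.0:
        raise ValueError("ack_loss_rate must be in [0, 1)")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    copies = 1
    while ack_delivery_probability(ack_loss_rate, copies) < target:
        copies += 1
    return copies


# Hypothetical link: 20% of ACK packets are lost, 99% confirmation wanted.
copies_needed = min_ack_copies(0.2, 0.99)
```

Each extra copy costs transmission energy, which is the trade-off the abstract describes between reliability and network lifetime.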
Mathematical Models of Cardiac Pacemaking Function

NASA Astrophysics Data System (ADS)

Li, Pan; Lines, Glenn T.; Maleckar, Mary M.; Tveito, Aslak

2013-10-01

Over the past half century, there has been intense and fruitful interaction between experimental and computational investigations of cardiac function. This interaction has, for example, led to deep understanding of cardiac excitation-contraction coupling; how it works, as well as how it fails. However, many lines of inquiry remain unresolved, among them the initiation of each heartbeat. The sinoatrial node, a cluster of specialized pacemaking cells in the right atrium of the heart, spontaneously generates an electro-chemical wave that spreads through the atria and through the cardiac conduction system to the ventricles, initiating the contraction of cardiac muscle essential for pumping blood to the body. Despite the fundamental importance of this primary pacemaker, this process is still not fully understood, and ionic mechanisms underlying cardiac pacemaking function are currently under heated debate. Several mathematical models of sinoatrial node cell membrane electrophysiology have been constructed based on different experimental data sets and hypotheses. As could be expected, these differing models offer diverse predictions about cardiac pacemaking activities. This paper aims to present the current state of debate over the origins of the pacemaking function of the sinoatrial node. Here, we will specifically review the state-of-the-art of cardiac pacemaker modeling, with a special emphasis on current discrepancies, limitations, and future challenges.
Parallel definition of tear film maps on distributed-memory clusters for the support of dry eye diagnosis.

PubMed

González-Domínguez, Jorge; Remeseiro, Beatriz; Martín, María J

2017-02-01

The analysis of the interference patterns on the tear film lipid layer is a useful clinical test to diagnose dry eye syndrome. This task can be automated with a high degree of accuracy by means of the use of tear film maps. However, the time required by the existing applications to generate them prevents a wider acceptance of this method by medical experts. Multithreading has been previously successfully employed by the authors to accelerate the tear film map definition on multicore single-node machines. In this work, we propose a hybrid message-passing and multithreading parallel approach that further accelerates the generation of tear film maps by exploiting the computational capabilities of distributed-memory systems such as multicore clusters and supercomputers. The algorithm for drawing tear film maps is parallelized using Message Passing Interface (MPI) for inter-node communications and the multithreading support available in the C++11 standard for intra-node parallelization. The original algorithm is modified to reduce the communications and increase the scalability. The hybrid method has been tested on 32 nodes of an Intel cluster (with two 12-core Haswell 2680v3 processors per node) using 50 representative images. Results show that maximum runtime is reduced from almost two minutes using the previous only-multithreaded approach to less than ten seconds using the hybrid method. The hybrid MPI/multithreaded implementation can be used by medical experts to obtain tear film maps in only a few seconds, which will significantly accelerate and facilitate the diagnosis of the dry eye syndrome. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Research in Wireless Networks and Communications

DTIC Science & Technology

2008-05-01

TESTBED SETUP AND INITIAL MULTI-HOP EXPERIENCE As a proof of concept, we assembled a testbed platform of nodes based on 400MHz AMD Geode single-board...experiments on a testbed network consisting of 400MHz AMD Geode single-board computers made by Thecus Inc. We equipped each of these nodes with two...ground nodes were placed on a line, with about 3 feet of separation between adjacent nodes. The nodes were powered by 400MHz AMD Geode single-board
WinHPC System Configuration | High-Performance Computing | NREL

Science.gov Websites

CPUs with 48GB of memory. Node 04 has dual Intel Xeon E5530 CPUs with 24GB of memory. Nodes 05-20 have dual AMD Opteron 2374 HE CPUs with 16GB of memory. Nodes 21-30 have been decommissioned. Nodes 31-35 have dual Intel Xeon X5675 CPUs with 48GB of memory. Nodes 36-37 have dual Intel Xeon E5-2680 CPUs with
Live imaging and genetic analysis of mouse notochord formation reveals regional morphogenetic mechanisms.

PubMed

Yamanaka, Yojiro; Tamplin, Owen J; Beckers, Anja; Gossler, Achim; Rossant, Janet

2007-12-01

The node and notochord have been extensively studied as signaling centers in the vertebrate embryo. The morphogenesis of these tissues, particularly in mouse, is not well understood. Using time-lapse live imaging and cell lineage tracking, we show the notochord has distinct morphogenetic origins along the anterior-posterior axis. The anterior head process notochord arises independently of the node by condensation of dispersed cells. The trunk notochord is derived from the node and forms by convergent extension. The tail notochord forms by node-derived progenitors that actively migrate toward the posterior. We also reveal distinct genetic regulation within these different regions. We show that Foxa2 compensates for and genetically interacts with Noto in the trunk notochord, and that Noto has an evolutionarily conserved role in regulating axial versus paraxial cell fate. Therefore, we propose three distinct regions within the mouse notochord, each with unique morphogenetic origins.
Job Management Requirements for NAS Parallel Systems and Clusters

NASA Technical Reports Server (NTRS)

Saphir, William; Tanner, Leigh Ann; Traversat, Bernard

1995-01-01

A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.
Preventing messaging queue deadlocks in a DMA environment

DOEpatents

Blocksome, Michael A; Chen, Dong; Gooding, Thomas; Heidelberger, Philip; Parker, Jeff

2014-01-14

Embodiments of the invention may be used to manage message queues in a parallel computing environment to prevent message queue deadlock. A direct memory access controller of a compute node may determine when a messaging queue is full. In response, the DMA may generate an interrupt. An interrupt handler may stop the DMA and swap all descriptors from the full messaging queue into a larger queue (or enlarge the original queue). The interrupt handler then restarts the DMA. Alternatively, the interrupt handler stops the DMA, allocates a memory block to hold queue data, and then moves descriptors from the full messaging queue into the allocated memory block. The interrupt handler then restarts the DMA. During a normal messaging advance cycle, a messaging manager attempts to inject the descriptors in the memory block into other messaging queues until the descriptors have all been processed.
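The first variant described above (stop the DMA, swap descriptors into a larger queue, restart) can be sketched in software as follows. This is only an illustrative model, not the patented implementation: the class names, the doubling growth policy, and the use of a Python deque in place of hardware descriptor queues are all assumptions.

```python
from collections import deque


class MessagingQueue:
    """Bounded descriptor queue (hypothetical stand-in for a DMA injection queue)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.descriptors = deque()

    def is_full(self) -> bool:
        return len(self.descriptors) >= self.capacity


class DmaEngine:
    """Toy DMA model that raises an 'interrupt' when its queue is full."""

    def __init__(self, queue: MessagingQueue):
        self.queue = queue
        self.running = True
        self.interrupts = 0

    def inject(self, descriptor) -> None:
        if self.queue.is_full():
            self.interrupts += 1          # the DMA signals the full condition
            self._handle_queue_full()     # interrupt handler runs
        self.queue.descriptors.append(descriptor)

    def _handle_queue_full(self) -> None:
        self.running = False              # stop the DMA while the queue is swapped
        # Assumed policy: the replacement queue has twice the capacity.
        larger = MessagingQueue(self.queue.capacity * 2)
        while self.queue.descriptors:     # move descriptors, preserving order
            larger.descriptors.append(self.queue.descriptors.popleft())
        self.queue = larger
        self.running = True               # restart the DMA


engine = DmaEngine(MessagingQueue(capacity=2))
for descriptor_id in range(5):
    engine.inject(descriptor_id)
```

Because the producer never blocks on a full queue, the circular wait that would cause a deadlock cannot form in this model.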
Mediastinal lymph node detection and station mapping on chest CT using spatial priors and random forest

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Jiamin; Hoffman, Joanne; Zhao, Jocelyn

2016-07-15

Purpose: To develop an automated system for mediastinal lymph node detection and station mapping for chest CT. Methods: The contextual organs, trachea, lungs, and spine are first automatically identified to locate the region of interest (ROI) (mediastinum). The authors employ shape features derived from Hessian analysis, local object scale, and circular transformation that are computed per voxel in the ROI. Eight more anatomical structures are simultaneously segmented by multiatlas label fusion. Spatial priors are defined as the relative multidimensional distance vectors corresponding to each structure. Intensity, shape, and spatial prior features are integrated and parsed by a random forest classifier for lymph node detection. The detected candidates are then segmented by the following curve evolution process. Texture features are computed on the segmented lymph nodes and a support vector machine committee is used for final classification. For lymph node station labeling, based on the segmentation results of the above anatomical structures, the textual definitions of mediastinal lymph node map according to the International Association for the Study of Lung Cancer are converted into patient-specific color-coded CT image, where the lymph node station can be automatically assigned for each detected node. Results: The chest CT volumes from 70 patients with 316 enlarged mediastinal lymph nodes are used for validation. For lymph node detection, their system achieves 88% sensitivity at eight false positives per patient. For lymph node station labeling, 84.5% of lymph nodes are correctly assigned to their stations. Conclusions: Multiple-channel shape, intensity, and spatial prior features aggregated by a random forest classifier improve mediastinal lymph node detection on chest CT. Using the location information of segmented anatomic structures from the multiatlas formulation enables accurate identification of lymph node stations.
Effects of maximum node degree on computer virus spreading in scale-free networks

NASA Astrophysics Data System (ADS)

Bamaarouf, O.; Ould Baba, A.; Lamzabi, S.; Rachadi, A.; Ez-Zahraouy, H.

2017-10-01

The increasing use of Internet networks favors the spread of viruses. In this paper, we studied the spread of viruses in scale-free networks with different topologies based on the Susceptible-Infected-External (SIE) model. It is found that the network structure influences the virus spreading. We have also shown that nodes of high degree are more susceptible to infection than others. Furthermore, we have determined a critical maximum value of node degree (Kc), below which the network is more resistant and the computer virus cannot spread through the whole network. The influence of network size is also studied. We found that a network of small size is more effective in reducing the proportion of infected nodes.
Optimized scalable network switch

DOEpatents

Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

2007-12-04

In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process in which downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, is used to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.
Optimized scalable network switch

DOEpatents

Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.

2010-02-23

In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process in which downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, is used to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.
Fault-Tolerant Local-Area Network

NASA Technical Reports Server (NTRS)

Morales, Sergio; Friedman, Gary L.

1988-01-01

Local-area network (LAN) for computers prevents single-point failure from interrupting communication between nodes of network. Includes two complete cables, LAN 1 and LAN 2. Microprocessor-based slave switches link cables to network-node devices as work stations, print servers, and file servers. Slave switches respond to commands from master switch, connecting nodes to two cable networks or disconnecting them so they are completely isolated. System monitor and control computer (SMC) acts as gateway, allowing nodes on either cable to communicate with each other and ensuring that LAN 1 and LAN 2 are fully used when functioning properly. Network monitors and controls itself, automatically routes traffic for efficient use of resources, and isolates and corrects its own faults, with potential dramatic reduction in time out of service.
Comparative Effects of Computer-Based Concept Maps, Refutational Texts, and Expository Texts on Science Learning

ERIC Educational Resources Information Center

Adesope, Olusola O.; Cavagnetto, Andy; Hunsu, Nathaniel J.; Anguiano, Carlos; Lloyd, Joshua

2017-01-01

This study used a between-subjects experimental design to examine the effects of three different computer-based instructional strategies (concept map, refutation text, and expository scientific text) on science learning. Concept maps are node-link diagrams that show concepts as nodes and relationships among the concepts as labeled links.…
High Performance Active Database Management on a Shared-Nothing Parallel Processor

DTIC Science & Technology

1998-05-01

either stored or virtual. A stored node is like a materialized view. It actually contains the specified tuples. A virtual node is like a real view...
A site oriented supercomputer for theoretical physics: The Fermilab Advanced Computer Program Multi Array Processor System (ACPMAPS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nash, T.; Atac, R.; Cook, A.

1989-03-06

The ACPMAPS multiprocessor is a highly cost effective, local memory parallel computer with a hypercube or compound hypercube architecture. Communication requires the attention of only the two communicating nodes. The design is aimed at floating point intensive, grid like problems, particularly those with extreme computing requirements. The processing nodes of the system are single board array processors, each with a peak power of 20 Mflops, supported by 8 Mbytes of data and 2 Mbytes of instruction memory. The system currently being assembled has a peak power of 5 Gflops. The nodes are based on the Weitek XL Chip set. The system delivers performance at approximately $300/Mflop. 8 refs., 4 figs.
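The per-node and system figures quoted above imply a node count and aggregate memory that the abstract does not state directly. The short calculation below derives them; it assumes the 5 Gflops system peak is simply the sum of identical 20 Mflops nodes and that 1 Gflops equals 1000 Mflops, and the variable names are illustrative.

```python
# Figures taken from the abstract.
NODE_PEAK_MFLOPS = 20        # peak per single-board array processor
SYSTEM_PEAK_MFLOPS = 5_000   # 5 Gflops, assuming 1 Gflops = 1000 Mflops
NODE_DATA_MBYTES = 8
NODE_INSTRUCTION_MBYTES = 2

# Assumption: the system peak is the sum of identical node peaks.
implied_nodes = SYSTEM_PEAK_MFLOPS // NODE_PEAK_MFLOPS

aggregate_data_mbytes = implied_nodes * NODE_DATA_MBYTES
aggregate_instruction_mbytes = implied_nodes * NODE_INSTRUCTION_MBYTES

print(f"implied nodes: {implied_nodes}")
print(f"aggregate data memory: {aggregate_data_mbytes} Mbytes")
print(f"aggregate instruction memory: {aggregate_instruction_mbytes} Mbytes")
```

Under these assumptions the machine being assembled would have 250 nodes with roughly 2 Gbytes of distributed data memory.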
Robust scalable stabilisability conditions for large-scale heterogeneous multi-agent systems with uncertain nonlinear interactions: towards a distributed computing architecture

NASA Astrophysics Data System (ADS)

Manfredi, Sabato

2016-06-01

Large-scale dynamic systems are becoming highly pervasive, with applications ranging from systems biology and environment monitoring to sensor networks and power systems. They are characterised by high dimensionality, complexity, and uncertainty in the node dynamics/interactions, and they require increasingly computationally demanding methods for their analysis and control design as the network size and node system/interaction complexity increase. Therefore, it is a challenging problem to find scalable computational methods for distributed control design of large-scale networks. In this paper, we investigate the robust distributed stabilisation problem of large-scale nonlinear multi-agent systems (briefly MASs) composed of non-identical (heterogeneous) linear dynamical systems coupled by uncertain nonlinear time-varying interconnections. By employing Lyapunov stability theory and linear matrix inequality (LMI) technique, new conditions are given for the distributed control design of large-scale MASs that can be easily solved by the toolbox of MATLAB. The stabilisability of each node dynamic is a sufficient assumption to design a global stabilising distributed control. The proposed approach improves some of the existing LMI-based results on MAS by both overcoming their computational limits and extending the applicative scenario to large-scale nonlinear heterogeneous MASs. Additionally, the proposed LMI conditions are further reduced in terms of computational requirement in the case of weakly heterogeneous MASs, which is a common scenario in real applications where the network nodes and links are affected by parameter uncertainties. 
One of the main advantages of the proposed approach is that it allows a move from a centralised towards a distributed computing architecture, so that the expensive computation workload spent to solve LMIs may be shared among processors located at the networked nodes, thus increasing the scalability of the approach with respect to the network size. Finally, a numerical example shows the applicability of the proposed method and its advantage in terms of computational complexity when compared with the existing approaches.
Clustering in complex directed networks

NASA Astrophysics Data System (ADS)

Fagiolo, Giorgio

2007-08-01

Many empirical networks display an inherent tendency to cluster, i.e., to form circles of connected nodes. This feature is typically measured by the clustering coefficient (CC). The CC, originally introduced for binary, undirected graphs, has been recently generalized to weighted, undirected networks. Here we extend the CC to the case of (binary and weighted) directed networks and we compute its expected value for random graphs. We distinguish between CCs that count all directed triangles in the graph (independently of the direction of their edges) and CCs that only consider particular types of directed triangles (e.g., cycles). The main concepts are illustrated by employing empirical data on world-trade flows.
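For the binary directed case, the variant that counts all directed triangles is commonly written as C_i = [(A + A^T)^3]_ii / (2 [d_i (d_i - 1) - 2 b_i]), where A is the adjacency matrix, d_i is the total (in plus out) degree of node i, and b_i = (A^2)_ii is its number of bilateral links. The sketch below implements that expression in plain Python; it assumes a 0/1 adjacency matrix with no self-loops, and the function names are illustrative rather than taken from the paper.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def directed_clustering(adj):
    """Clustering coefficient per node, counting all directed triangles.

    `adj[i][j]` is 1 if there is an edge i -> j, else 0 (no self-loops assumed).
    """
    n = len(adj)
    # Symmetrised matrix A + A^T; its cube counts triangles in any direction.
    sym = [[adj[i][j] + adj[j][i] for j in range(n)] for i in range(n)]
    sym_cubed = matmul(matmul(sym, sym), sym)
    adj_squared = matmul(adj, adj)

    coefficients = []
    for i in range(n):
        out_degree = sum(adj[i])
        in_degree = sum(adj[j][i] for j in range(n))
        total_degree = out_degree + in_degree
        bilateral = adj_squared[i][i]  # neighbours linked in both directions
        denominator = 2 * (total_degree * (total_degree - 1) - 2 * bilateral)
        coefficients.append(sym_cubed[i][i] / denominator if denominator > 0 else 0.0)
    return coefficients


# A directed 3-cycle forms one of the two possible triangles per node.
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
# A fully bidirectional triangle forms every possible directed triangle.
complete = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
```

The two small graphs bracket the range of the measure: the pure cycle gives 0.5 for every node and the fully bilateral triangle gives 1.0.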
Community detection using preference networks

NASA Astrophysics Data System (ADS)

Tasgin, Mursel; Bingol, Haluk O.

2018-04-01

Community detection is the task of identifying clusters or groups of nodes in a network where nodes within the same group are more connected with each other than with nodes in different groups. It has practical uses in identifying similar functions or roles of nodes in many biological, social and computer networks. With the availability of very large networks in recent years, performance and scalability of community detection algorithms become crucial, i.e. if the time complexity of an algorithm is high, it cannot run on large networks. In this paper, we propose a new community detection algorithm, which has a local approach and is able to run on large networks. It uses a simple and effective method: given a network, the algorithm constructs a preference network of nodes where each node has a single outgoing edge pointing to the node it prefers to share a community with. In such a preference network, each connected component is a community. Selection of the preferred node is performed using similarity-based metrics of nodes. We use two alternatives for this purpose, both of which can be calculated in the 1-neighborhood of nodes: the number of common neighbors of the selector node and its neighbors, and the spread capability of neighbors around the selector node, which is calculated by the gossip algorithm of Lind et al. Our algorithm is tested on both computer-generated LFR networks and real-life networks with ground-truth community structure. It can identify communities accurately and quickly. It is local, scalable and suitable for distributed execution on large networks.
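The preference-network idea with the common-neighbors metric can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: ties are broken by choosing the smallest node id, the gossip-based metric is omitted, and the function name and example graph are hypothetical.

```python
def preference_communities(adjacency):
    """Group nodes by connected components of a preference network.

    `adjacency` maps each node to the set of its neighbours (undirected graph).
    Each node prefers the neighbour with which it shares the most neighbours.
    """
    # Step 1: each node selects one preferred neighbour.
    preferred = {}
    for node, neighbours in adjacency.items():
        if neighbours:
            # sorted() plus max() keeps the smallest id among ties (assumed rule).
            preferred[node] = max(
                sorted(neighbours),
                key=lambda nb: len(neighbours & adjacency[nb]),
            )

    # Step 2: connected components of the preference edges, via union-find.
    parent = {node: node for node in adjacency}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for node, target in preferred.items():
        parent[find(node)] = find(target)

    groups = {}
    for node in adjacency:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Two triangles joined by the single bridge edge 2-3.
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
communities = preference_communities(graph)
```

Every decision uses only a node's own neighborhood, which is what makes the approach local and suitable for distributed execution.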

Multi-agent grid system Agent-GRID with dynamic load balancing of cluster nodes

NASA Astrophysics Data System (ADS)

Satymbekov, M. N.; Pak, I. T.; Naizabayeva, L.; Nurzhanov, Ch. A.

2017-12-01

This work presents a system designed for automated load balancing, which analyses the load of compute nodes and subsequently migrates virtual machines from loaded nodes to less loaded ones. This system increases the performance of cluster nodes and helps in the timely processing of data. A grid system balances the work of cluster nodes; the relevance of the system lies in the use of multi-agent balancing for the solution of such problems.
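A minimal sketch of load-driven migration is given below. It is a generic greedy heuristic written for illustration, not the Agent-GRID algorithm: the node names, the numeric VM loads, and the rule of moving the smallest suitable virtual machine from the most loaded node to the least loaded one are all assumptions.

```python
def balance(nodes):
    """Greedily migrate VMs from the most loaded node to the least loaded one.

    `nodes` maps a node name to a list of VM loads. A VM is moved only if its
    load is smaller than the gap between the two nodes, so every migration
    strictly narrows that gap and the loop terminates.
    Returns the list of migrations as (vm_load, source, destination).
    """
    migrations = []
    while True:
        loads = {name: sum(vms) for name, vms in nodes.items()}
        source = max(loads, key=loads.get)
        destination = min(loads, key=loads.get)
        gap = loads[source] - loads[destination]
        candidates = [vm for vm in nodes[source] if 0 < vm < gap]
        if not candidates:
            return migrations
        vm = min(candidates)              # move the smallest suitable VM
        nodes[source].remove(vm)
        nodes[destination].append(vm)
        migrations.append((vm, source, destination))


# Hypothetical cluster: one overloaded node, one light node, one idle node.
cluster = {"n1": [4, 3, 2], "n2": [1], "n3": []}
moves = balance(cluster)
final_loads = {name: sum(vms) for name, vms in cluster.items()}
```

In a multi-agent setting each node's agent would report its own load and negotiate migrations, whereas this sketch uses a single global view for brevity.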
Performance of VPIC on Trinity

NASA Astrophysics Data System (ADS)

Nystrom, W. D.; Bergen, B.; Bird, R. F.; Bowers, K. J.; Daughton, W. S.; Guo, F.; Li, H.; Nam, H. A.; Pang, X.; Rust, W. N., III; Wohlbier, J.; Yin, L.; Albright, B. J.

2016-10-01

Trinity is a new major DOE computing resource which is going through final acceptance testing at Los Alamos National Laboratory. Trinity has several new and unique architectural features including two compute partitions, one with dual socket Intel Haswell Xeon compute nodes and one with Intel Knights Landing (KNL) Xeon Phi compute nodes. Additional unique features include use of on package high bandwidth memory (HBM) for the KNL nodes, the ability to configure the KNL nodes with respect to HBM model and on die network topology in a variety of operational modes at run time, and use of solid state storage via burst buffer technology to reduce time required to perform I/O. An effort is in progress to port and optimize VPIC to Trinity and evaluate its performance. Because VPIC was recently released as Open Source, it is being used as part of acceptance testing for Trinity and is participating in the Trinity Open Science Program which has resulted in excellent collaboration activities with both Cray and Intel. Results of this work will be presented on performance of VPIC on both Haswell and KNL partitions for both single node runs and runs at scale. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
Synthesis of natural flows at selected sites in the upper Missouri River basin, Montana, 1928-89

USGS Publications Warehouse

Cary, L.E.; Parrett, Charles

1996-01-01

Natural monthly streamflows were synthesized for the years 1928-89 for 43 sites in the upper Missouri River Basin upstream from Fort Peck Lake in Montana. The sites are represented as nodes in a streamflow accounting model being developed by the Bureau of Reclamation. Recorded and historical flows at most sites have been affected by human activities including reservoir storage, diversions for irrigation, and municipal use. Natural flows at the sites were synthesized by eliminating the effects of these activities. Recorded data at some sites do not include the entire study period. The missing flows at these sites were estimated using a statistical procedure. The methods of synthesis varied, depending on upstream activities and information available. Recorded flows were transferred to nodes that did not have streamflow-gaging stations from the nearest station with a sufficient length of record. The flows at one node were computed as the sum of flows from three upstream tributaries. Monthly changes in reservoir storage were computed from monthend contents. The changes in storage were corrected for the effects of evaporation and precipitation using pan-evaporation and precipitation data from climate stations. Irrigation depletions and consumptive use by the three largest municipalities were computed. Synthesized natural flow at most nodes was computed by adding algebraically the upstream depletions and changes in reservoir storage to recorded or historical flow at the nodes.
GATE Monte Carlo simulation in a cloud computing environment

NASA Astrophysics Data System (ADS)

Rowedder, Blake Austin

The GEANT4-based GATE is a unique and powerful Monte Carlo (MC) platform, which provides a single code library allowing the simulation of specific medical physics applications, e.g. PET, SPECT, CT, radiotherapy, and hadron therapy. However, this rigorous yet flexible platform is used only sparingly in the clinic due to its lengthy calculation time. By accessing the powerful computational resources of a cloud computing environment, GATE's runtime can be significantly reduced to clinically feasible levels without the sizable investment of a local high performance cluster. This study investigated reliable and efficient execution of GATE MC simulations using a commercial cloud computing service. Amazon's Elastic Compute Cloud was used to launch several nodes equipped with GATE. Job data was initially broken up on the local computer, then uploaded to the worker nodes on the cloud. The results were automatically downloaded and aggregated on the local computer for display and analysis. Five simulations were repeated for every cluster size between 1 and 20 nodes. Ultimately, increasing cluster size resulted in a decrease in calculation time that could be expressed with an inverse power model. Comparing the benchmark results to the published values and error margins indicated that the simulation results were not affected by the cluster size and thus that the integrity of a calculation is preserved in a cloud computing environment. The runtime of a 53 minute long simulation was decreased to 3.11 minutes when run on a 20-node cluster. The ability to improve the speed of simulation suggests that fast MC simulations are viable for imaging and radiotherapy applications. With high power computing continuing to fall in price and grow in accessibility, implementing Monte Carlo techniques with cloud computing for clinical applications will continue to become more attractive.
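The two runtimes quoted above are enough for a rough scaling check. The sketch below computes the speedup and parallel efficiency and then estimates the exponent of an inverse power model t(n) = t(1) * n^(-b) from those two points only. This is an illustration, not the fit reported in the study, which used measurements at every cluster size; the function names are hypothetical.

```python
import math


def speedup(t_single: float, t_parallel: float) -> float:
    """Ratio of single-node runtime to parallel runtime."""
    return t_single / t_parallel


def inverse_power_exponent(t_single: float, t_parallel: float, nodes: int) -> float:
    """Exponent b in t(n) = t(1) * n**(-b), from one parallel measurement."""
    return math.log(t_single / t_parallel) / math.log(nodes)


T_SINGLE_MIN = 53.0     # runtime on one node, from the abstract
T_CLUSTER_MIN = 3.11    # runtime on the 20-node cluster, from the abstract
NODES = 20

s = speedup(T_SINGLE_MIN, T_CLUSTER_MIN)
efficiency = s / NODES
b = inverse_power_exponent(T_SINGLE_MIN, T_CLUSTER_MIN, NODES)

print(f"speedup: {s:.2f}x, efficiency: {efficiency:.1%}, exponent b: {b:.3f}")
```

An exponent of exactly 1 would mean perfect linear scaling, so a value slightly below 1 is consistent with the modest overhead of splitting, uploading, and aggregating the jobs.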
Clinical significance of the pattern of lymph node metastasis depending on the location of gastric cancer.

PubMed

Han, Ki Bin; Jang, You Jin; Kim, Jong Han; Park, Sung Soo; Park, Seong Heum; Kim, Seung Joo; Mok, Young Jae; Kim, Chong Suk

2011-06-01

When performing a laparoscopic assisted gastrectomy, a function-preserving gastrectomy is performed depending on the location of the primary gastric cancer. This study examined the incidence of lymph node metastasis by the lymph node station number by tumor location to determine the optimal extent of the lymph node dissection. The subjects consisted of 1,510 patients diagnosed with gastric cancer who underwent a gastrectomy between 1996 and 2005. The patients were divided into three groups: upper, middle and lower third, depending on the location of the primary tumor. The lymph node metastasis patterns were analyzed in the total and early gastric cancer patients. In all patients, lymph node station numbers 1, 2, 3, 7, 10 and 11 metastases were dominant in the cancer originating in the upper third, whereas station numbers 4, 5, 6 and 8 were dominant in the lower third. In early gastric cancer patients, the station number of lymph nodes with a metastasis did not show a significant difference in stage pT1a disease. On the other hand, a metastasis in lymph node station number 6 was dominant in stage pT1b disease that originated in the lower third of the stomach. When performing a laparoscopic-assisted gastrectomy for early gastric cancer, a limited lymphadenectomy is considered adequate during a function-preserving gastrectomy in mucosal (T1a) cancer. On the other hand, for submucosal (T1b) cancer, a number 6 node dissection should be performed when performing a pylorus preserving gastrectomy.
Texture Analysis and Synthesis of Malignant and Benign Mediastinal Lymph Nodes in Patients with Lung Cancer on Computed Tomography

NASA Astrophysics Data System (ADS)

Pham, Tuan D.; Watanabe, Yuzuru; Higuchi, Mitsunori; Suzuki, Hiroyuki

2017-02-01

Texture analysis of computed tomography (CT) imaging has been found useful to distinguish subtle differences, which are invisible to human eyes, between malignant and benign tissues in cancer patients. This study implemented two complementary methods of texture analysis, known as the gray-level co-occurrence matrix (GLCM) and the experimental semivariogram (SV) with an aim to improve the predictive value of evaluating mediastinal lymph nodes in lung cancer. The GLCM was explored with the use of a rich set of its derived features, whereas the SV feature was extracted on real and synthesized CT samples of benign and malignant lymph nodes. A distinct advantage of the computer methodology presented herein is the alleviation of the need for an automated precise segmentation of the lymph nodes. Using the logistic regression model, a sensitivity of 75%, specificity of 90%, and area under curve of 0.89 were obtained in the test population. A tenfold cross-validation of 70% accuracy of classifying between benign and malignant lymph nodes was obtained using the support vector machines as a pattern classifier. These results are higher than those recently reported in literature with similar studies.
An operating system for future aerospace vehicle computer systems

NASA Technical Reports Server (NTRS)

Foudriat, E. C.; Berman, W. J.; Will, R. W.; Bynum, W. L.

1984-01-01

The requirements for future aerospace vehicle computer operating systems are examined in this paper. The computer architecture is assumed to be distributed with a local area network connecting the nodes. Each node is assumed to provide a specific functionality. The network provides for communication so that the overall tasks of the vehicle are accomplished. The O/S structure is based upon the concept of objects. The mechanisms for integrating node-unique objects with node-common objects, in order to implement both the autonomy of and the cooperation between nodes, are developed. The requirements for time critical performance and reliability and recovery are discussed. Time critical performance impacts all parts of the distributed operating system; e.g., its structure, the functional design of its objects, the language structure, etc. Throughout the paper the tradeoffs - concurrency, language structure, object recovery, binding, file structure, communication protocol, programmer freedom, etc. - are considered to arrive at a feasible, maximum performance design. Reliability of the network system is considered. A parallel multipath bus structure is proposed for the control of delivery time for time critical messages. The architecture also supports immediate recovery for the time critical message system after a communication failure.
Wide-area-distributed storage system for a multimedia database

NASA Astrophysics Data System (ADS)

Ueno, Masahiro; Kinoshita, Shigechika; Kuriki, Makato; Murata, Setsuko; Iwatsu, Shigetaro

1998-12-01

We have developed a wide-area-distribution storage system for multimedia databases, which minimizes the possibility of simultaneous failure of multiple disks in the event of a major disaster. It features a RAID system, whose member disks are spatially distributed over a wide area. Each node has a device, which includes the controller of the RAID and the controller of the member disks controlled by other nodes. The devices in the node are connected to a computer, using fiber optic cables and communicate using fiber-channel technology. Any computer at a node can utilize multiple devices connected by optical fibers as a single 'virtual disk.' The advantage of this system structure is that devices and fiber optic cables are shared by the computers. In this report, we first describe our proposed system and the prototype used for testing. We then discuss its performance, i.e., how read and write throughputs are affected by data-access delay, the RAID level, and queuing.
Percolation Centrality: Quantifying Graph-Theoretic Impact of Nodes during Percolation in Networks

PubMed Central

Piraveenan, Mahendra; Prokopenko, Mikhail; Hossain, Liaquat

2013-01-01

A number of centrality measures are available to determine the relative importance of a node in a complex network, and betweenness is prominent among them. However, the existing centrality measures are not adequate in network percolation scenarios (such as during infection transmission in a social network of individuals, spreading of computer viruses on computer networks, or transmission of disease over a network of towns) because they do not account for the changing percolation states of individual nodes. We propose a new measure, percolation centrality, that quantifies relative impact of nodes based on their topological connectivity, as well as their percolation states. The measure can be extended to include random walk based definitions, and its computational complexity is shown to be of the same order as that of betweenness centrality. We demonstrate the usage of percolation centrality by applying it to a canonical network as well as simulated and real world scale-free and random networks. PMID:23349699
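A small worked example may make the measure more concrete. The sketch below assumes the commonly cited form of the definition, PC(v) = 1/(N-2) * sum over ordered pairs s, r (both different from v) of [sigma_sr(v) / sigma_sr] * [x_s / (sum_i x_i - x_v)], where sigma counts shortest paths and x is a node's percolation state in [0, 1]. The abstract does not state the formula, so it should be checked against the paper. The graph is assumed undirected and unweighted, and all names are illustrative.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u is a predecessor of w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def percolation_centrality(adj, state):
    """Percolation centrality of every node (assumed formula, needs >= 3 nodes)."""
    nodes = list(adj)
    n = len(nodes)
    info = {s: bfs_counts(adj, s) for s in nodes}
    total = sum(state[v] for v in nodes)
    result = {}
    for v in nodes:
        denom = total - state[v]
        acc = 0.0
        dist_v, sigma_v = info[v]
        if denom > 0:
            for s in nodes:
                if s == v or state[s] == 0:
                    continue
                dist_s, sigma_s = info[s]
                if v not in dist_s:
                    continue
                for r in nodes:
                    if r in (s, v) or r not in dist_s or r not in dist_v:
                        continue
                    # v lies on a shortest s-r path iff the distances add up
                    if dist_s[v] + dist_v[r] == dist_s[r]:
                        frac = sigma_s[v] * sigma_v[r] / sigma_s[r]
                        acc += frac * state[s] / denom
        result[v] = acc / (n - 2)
    return result


# Path graph a - b - c - d in which only node a is percolated.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = percolation_centrality(graph, {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0})
print(scores)
```

Because each node needs one breadth-first search, the cost is of the same order as a straightforward betweenness computation, in line with the complexity claim in the abstract. With equal states on all nodes the assumed formula reduces to normalized betweenness.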
Template based parallel checkpointing in a massively parallel computer system

DOEpatents

Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

2009-01-13

A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
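The template comparison described above can be sketched in a few lines. This is a simplified illustration, not the patented method: it compares fixed-offset blocks using SHA-256 checksums, whereas rsync proper uses rolling checksums, and it assumes the node's data has the same length as the template. The block size and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to read


def block_checksums(data, block_size=BLOCK_SIZE):
    """Checksum of each fixed-size block of the template checkpoint."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def delta_against_template(node_data, template_sums, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks whose checksum differs."""
    delta = {}
    for idx, start in enumerate(range(0, len(node_data), block_size)):
        block = node_data[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if idx >= len(template_sums) or template_sums[idx] != digest:
            delta[idx] = block
    return delta


def restore(template, delta, block_size=BLOCK_SIZE):
    """Rebuild a node's state from the template plus its stored delta blocks."""
    blocks = [template[i:i + block_size]
              for i in range(0, len(template), block_size)]
    for idx, block in delta.items():
        blocks[idx] = block
    return b"".join(blocks)


template = b"AAAABBBBCCCCDDDD"
node_state = b"AAAAXXXXCCCCDDDD"   # differs from the template in block 1 only
delta = delta_against_template(node_state, block_checksums(template))
print(delta)
```

Only the changed block has to be transmitted and stored, which is where the reduction in checkpoint data comes from; compressing the delta blocks would shrink it further.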
Mobile clusters of single board computers: an option for providing resources to student projects and researchers.

PubMed

Baun, Christian

2016-01-01

Clusters usually consist of servers, workstations or personal computers as nodes. But especially for academic purposes like student projects or scientific projects, the cost for purchase and operation can be a challenge. Single board computers cannot compete with the performance or energy-efficiency of higher-value systems, but they are an option to build inexpensive cluster systems. Because of the compact design and modest energy consumption, it is possible to build clusters of single board computers in a way that they are mobile and can be easily transported by the users. This paper describes the construction of such a cluster, useful applications and the performance of the single nodes. Furthermore, the clusters' performance and energy-efficiency is analyzed by executing the High Performance Linpack benchmark with a different number of nodes and different proportion of the systems total main memory utilized.
Scheduling based on a dynamic resource connection

NASA Astrophysics Data System (ADS)

Nagiyev, A. E.; Botygin, I. A.; Shersntneva, A. I.; Konyaev, P. A.

2017-02-01

The practical use of distributed computing systems is associated with many problems, including organizing effective interaction between the agents located at the nodes of the system, configuring each node to perform a certain task, distributing the available information and computational resources effectively, and controlling the multithreading that implements the logic of solving research problems. The article describes a method of computational load balancing in distributed automatic systems, focused on multi-agent and multi-threaded data processing. A scheme is proposed for controlling the processing of requests from terminal devices, providing effective dynamic scaling of computing power under peak load. The results of model experiments with the developed load scheduling algorithm are presented. These results show the effectiveness of the algorithm even with a significant expansion in the number of connected nodes and scaling of the distributed computing system architecture.
Masked Proportional Routing

NASA Technical Reports Server (NTRS)

Wolpert, David H. (Inventor)

2003-01-01

Distributed approach for determining a path connecting adjacent network nodes, for probabilistically or deterministically transporting an entity, with entity characteristic mu, from a source node to a destination node. Each node i is directly connected to an arbitrary number J(mu) of nodes, labeled or numbered j = j1, j2, ..., jJ(mu). In a deterministic version, a J(mu)-component baseline proportion vector p(i;mu) is associated with node i. A J(mu)-component applied proportion vector p*(i;mu) is determined from p(i;mu) to preclude an entity visiting a node more than once. Third and fourth J(mu)-component vectors, with components iteratively determined by Target(i;n(mu);mu)_j = alpha(mu)·Target(i;n(mu)-1;mu)_j + beta(mu)·p*(i;mu)_j and Actual(i;n(mu);mu)_j = alpha(mu)·Actual(i;n(mu)-1;mu)_j + beta(mu)·Sent(i;j'(mu);n(mu)-1;mu)_j, are computed, where n(mu) is an entity sequence index and alpha(mu) and beta(mu) are selected numbers. In one embodiment, at each node i, the node j = j'(mu) with the largest vector component difference, Target(i;n(mu);mu)_j' - Actual(i;n(mu);mu)_j', is chosen for the next link for entity transport, except in special gap circumstances, where the same link is optionally used for transporting consecutively arriving entities. The network nodes may be computer-controlled routers that switch collections of packets, frames, cells or other information units. Alternatively, the nodes may be waypoints for movement of physical items in a network or for transformation of a physical item. The nodes may be states of an entity undergoing state transitions, where allowed transitions are specified by the network and/or the destination node.
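The deterministic link choice at a single node can be sketched as follows. The sketch assumes that both vectors start at zero, that the Sent vector is a one-hot indicator of the link used by the previous entity, and that the special gap circumstances are ignored; none of these details is given in the abstract. Function names are illustrative.

```python
def choose_next_link(target, actual):
    """Index of the neighbor with the largest Target - Actual difference."""
    return max(range(len(target)), key=lambda j: target[j] - actual[j])


def route_entities(p_star, alpha, beta, count):
    """Route `count` entities at one node and return the chosen link indices."""
    k = len(p_star)
    target = [0.0] * k
    actual = [0.0] * k
    sent_prev = [0.0] * k          # assumed one-hot record of the previous send
    choices = []
    for _ in range(count):
        # Iterative updates of the Target and Actual vectors
        target = [alpha * t + beta * p for t, p in zip(target, p_star)]
        actual = [alpha * a + beta * s for a, s in zip(actual, sent_prev)]
        j = choose_next_link(target, actual)
        choices.append(j)
        sent_prev = [1.0 if idx == j else 0.0 for idx in range(k)]
    return choices


# Applied proportions of 1/2, 1/4 and 1/4 across three outgoing links.
links = route_entities([0.5, 0.25, 0.25], alpha=1.0, beta=1.0, count=8)
print(links)
```

With alpha = beta = 1 the vectors act as running totals, so picking the largest deficit keeps the number of entities sent on each link close to the applied proportions. Values of alpha below 1 would discount older traffic.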
Send-side matching of data communications messages

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-07-01

Send-side matching of data communications messages includes a plurality of compute nodes organized for collective operations, including: issuing by a receiving node to source nodes a receive message that specifies receipt of a single message to be sent from any source node, the receive message including message matching information, a specification of a hardware-level mutual exclusion device, and an identification of a receive buffer; matching by two or more of the source nodes the receive message with pending send messages in the two or more source nodes; operating by one of the source nodes having a matching send message the mutual exclusion device, excluding messages from other source nodes with matching send messages and identifying to the receiving node the source node operating the mutual exclusion device; and sending to the receiving node from the source node operating the mutual exclusion device a matched pending message.
Detection of lymph node metastases in pediatric and adolescent/young adult sarcoma: Sentinel lymph node biopsy versus fludeoxyglucose positron emission tomography imaging-A prospective trial.

PubMed

Wagner, Lars M; Kremer, Nathalie; Gelfand, Michael J; Sharp, Susan E; Turpin, Brian K; Nagarajan, Rajaram; Tiao, Gregory M; Pressey, Joseph G; Yin, Julie; Dasgupta, Roshni

2017-01-01

Lymph node metastases are an important cause of treatment failure for pediatric and adolescent/young adult (AYA) sarcoma patients. Nodal sampling is recommended for certain sarcoma subtypes that have a predilection for lymphatic spread. Sentinel lymph node biopsy (SLNB) may improve the diagnostic yield of nodal sampling, particularly when single-photon emission computed tomography/computed tomography (SPECT-CT) is used to facilitate anatomic localization. Functional imaging with positron emission tomography/computed tomography (PET-CT) is increasingly used for sarcoma staging and is a less invasive alternative to SLNB. To assess the utility of these 2 staging methods, this study prospectively compared SLNB plus SPECT-CT with PET-CT for the identification of nodal metastases in pediatric and AYA patients. Twenty-eight pediatric and AYA sarcoma patients underwent SLNB with SPECT-CT. The histological findings of the excised lymph nodes were then correlated with preoperative PET-CT imaging. A median of 2.4 sentinel nodes were sampled per patient. No wound infections or chronic lymphedema occurred. SLNB identified tumors in 7 of the 28 patients (25%), including 3 patients who had normal PET-CT imaging of the nodal basin. In contrast, PET-CT demonstrated hypermetabolic regional nodes in 14 patients, and this resulted in a positive predictive value of only 29%. The sensitivity and specificity of PET-CT for detecting histologically confirmed nodal metastases were only 57% and 52%, respectively. SLNB can safely guide the rational selection of nodes for biopsy in pediatric and AYA sarcoma patients and can identify therapy-changing nodal disease not appreciated with PET-CT. Cancer 2017;155-160. © 2016 American Cancer Society.
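The PET-CT figures in this abstract can be reproduced from the patient counts it reports. The sketch below assumes that SLNB histology is the reference standard and that every SLNB-positive patient with abnormal PET-CT is among the 14 PET-CT-positive patients; the 2x2 table is reconstructed from the abstract, not taken from the paper.

```python
# Counts stated in the abstract
TOTAL_PATIENTS = 28
SLNB_POSITIVE = 7            # histologically confirmed nodal metastases
SLNB_POS_PET_NORMAL = 3      # confirmed disease missed by PET-CT
PET_POSITIVE = 14            # hypermetabolic regional nodes on PET-CT

# Reconstructed 2x2 table (assumption: SLNB histology is the reference)
true_pos = SLNB_POSITIVE - SLNB_POS_PET_NORMAL
false_neg = SLNB_POS_PET_NORMAL
false_pos = PET_POSITIVE - true_pos
true_neg = TOTAL_PATIENTS - true_pos - false_neg - false_pos

ppv = true_pos / (true_pos + false_pos)
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)

print(f"PPV={ppv:.0%}, sensitivity={sensitivity:.0%}, specificity={specificity:.0%}")
```

Under these assumptions the reconstructed table (4 true positives, 10 false positives, 3 false negatives, 11 true negatives) matches the reported 29%, 57% and 52%.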
Potential advantage of studying the lymphatic drainage by sentinel node technique and SPECT-CT image fusion for pelvic irradiation of prostate cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krengli, Marco; Ballare, Andrea; Cannillo, Barbara

2006-11-15

Purpose: This study aims to investigate the in vivo drainage of lymphatic spread by using the sentinel node (SN) technique and single-photon emission computed tomography (SPECT)-computed tomography (CT) image fusion, and to analyze the impact of such information on conformal pelvic irradiation. Methods and Materials: Twenty-three prostate cancer patients, candidates for radical prostatectomy already included in a trial studying the SN technique, were enrolled. CT and SPECT images were obtained after intraprostate injection of 115 MBq of 99mTc-nanocolloid, allowing identification of SN and other pelvic lymph nodes. Target and nontarget structures, including lymph nodes identified by SPECT, were drawn on SPECT-CT fusion images. A three-dimensional conformal treatment plan was performed for each patient. Results: Single-photon emission computed tomography lymph nodal uptake was detected in 20 of 23 cases (87%). The SN was inside the pelvic clinical target volume (CTV2) in 16 of 20 cases (80%) and received no less than the prescribed dose in 17 of 20 cases (85%). The most frequent locations of SN outside the CTV2 were the common iliac and presacral lymph nodes. Sixteen of the 32 other lymph nodes (50%) identified by SPECT were found outside the CTV2. Overall, the SN and other intrapelvic lymph nodes identified by SPECT were not included in the CTV2 in 5 of 20 (25%) patients. Conclusions: The study of lymphatic drainage can contribute to a better knowledge of the in vivo potential pattern of lymph node metastasis in prostate cancer and can lead to a modification of treatment volume with consequent optimization of pelvic irradiation.
Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images

NASA Astrophysics Data System (ADS)

Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina

2012-03-01

The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in-slope (WIS), wash-out-slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA analysis was done to select features that differ significantly between node positive and node negative women. Using the computed kinetic statistic features a leave-one-out SVM classifier was learned that performs with AUC=0.77 under the ROC curve, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER), (AUCs of 0.61 and 0.57 respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
GATE Monte Carlo simulation of dose distribution using MapReduce in a cloud computing environment.

PubMed

Liu, Yangchuan; Tang, Yuguo; Gao, Xin

2017-12-01

The GATE Monte Carlo simulation platform has good application prospects of treatment planning and quality assurance. However, accurate dose calculation using GATE is time consuming. The purpose of this study is to implement a novel cloud computing method for accurate GATE Monte Carlo simulation of dose distribution using MapReduce. An Amazon Machine Image installed with Hadoop and GATE is created to set up Hadoop clusters on Amazon Elastic Compute Cloud (EC2). Macros, the input files for GATE, are split into a number of self-contained sub-macros. Through Hadoop Streaming, the sub-macros are executed by GATE in Map tasks and the sub-results are aggregated into final outputs in Reduce tasks. As an evaluation, GATE simulations were performed in a cubical water phantom for X-ray photons of 6 and 18 MeV. The parallel simulation on the cloud computing platform is as accurate as the single-threaded simulation on a local server and the simulation correctness is not affected by the failure of some worker nodes. The cloud-based simulation time is approximately inversely proportional to the number of worker nodes. For the simulation of 10 million photons on a cluster with 64 worker nodes, time decreases of 41× and 32× were achieved compared to the single worker node case and the single-threaded case, respectively. The test of Hadoop's fault tolerance showed that the simulation correctness was not affected by the failure of some worker nodes. The results verify that the proposed method provides a feasible cloud computing solution for GATE.
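The split, map and reduce steps described above can be outlined without Hadoop or GATE. In the sketch below, `map_task` is a hypothetical stand-in that returns a made-up per-voxel dose list instead of running GATE on a sub-macro, and the reduce step assumes that sub-results combine by element-wise summation. It shows only the data flow, not the authors' implementation.

```python
def split_primaries(total, parts):
    """Divide the total number of primaries as evenly as possible."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def map_task(task_id, primaries):
    """Stand-in for one GATE run on a sub-macro (hypothetical dose model).

    A real sub-macro would also need its own random seed, e.g. derived
    from task_id, so that the sub-simulations are independent.
    """
    return [primaries * weight for weight in (3, 2, 1)]


def reduce_task(sub_results):
    """Aggregate sub-results by summing the dose in each voxel."""
    return [sum(values) for values in zip(*sub_results)]


shares = split_primaries(10_000_000, 64)      # 10 million photons, 64 workers
merged = reduce_task([map_task(i, n) for i, n in enumerate(shares)])
print(shares[0], merged)
```

Because each sub-macro is self-contained, a failed Map task can simply be rerun, which is consistent with the fault tolerance reported in the abstract. The quoted 41× speedup on 64 worker nodes corresponds to a parallel efficiency of about 64%.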
Scalable cloud without dedicated storage

NASA Astrophysics Data System (ADS)

Batkovich, D. V.; Kompaniets, M. V.; Zarochentsev, A. K.

2015-05-01

We present a prototype of a scalable computing cloud. It is intended to be deployed on the basis of a cluster without the separate dedicated storage. The dedicated storage is replaced by the distributed software storage. In addition, all cluster nodes are used both as computing nodes and as storage nodes. This solution increases utilization of the cluster resources as well as improves fault tolerance and performance of the distributed storage. Another advantage of this solution is high scalability with a relatively low initial and maintenance cost. The solution is built on the basis of the open source components like OpenStack, CEPH, etc.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-01-01

This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture™ in conjunction with x86 Opteron™ processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
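For reference, the textbook non-preconditioned Conjugate Gradient iteration for a symmetric positive definite system A x = b is sketched below in plain Python, whose floats are double precision. This is the standard algorithm, not the Cell or FPGA implementation from the report, and the small test matrix is purely illustrative.

```python
def mat_vec(a, x):
    """Dense matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def conjugate_gradient(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for symmetric positive definite a, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)                    # residual b - a x for the zero initial guess
    p = list(r)                    # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:    # residual norm small enough: converged
            break
        ap = mat_vec(a, p)
        alpha = rs_old / dot(p, ap)                        # step length
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones
        p = [r_i + (rs_new / rs_old) * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return x


solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

Each iteration is dominated by one matrix-vector product and a few vector updates, which are the kernels an accelerator implementation would typically offload.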

Non-preconditioned conjugate gradient on cell and FPGA-based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-03-10

This work presents a detailed implementation of a double precision, Non-Preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture™ in conjunction with x86 Opteron™ processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

NASA Technical Reports Server (NTRS)

Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

1997-01-01

The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages is identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft simulations achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Junghyun; Gangwon, Jo; Jaehoon, Jung

Applications written solely in OpenCL or CUDA cannot execute on a cluster as a whole. Most previous approaches that extend these programming models to clusters are based on a common idea: designating a centralized host node and coordinating the other nodes with the host for computation. However, the centralized host node is a serious performance bottleneck when the number of nodes is large. In this paper, we propose a scalable and distributed OpenCL framework called SnuCL-D for large-scale clusters. SnuCL-D's remote device virtualization provides an OpenCL application with an illusion that all compute devices in a cluster are confined in a single node. To reduce the amount of control-message and data communication between nodes, SnuCL-D replicates the OpenCL host program execution and data in each node. We also propose a new OpenCL host API function and a queueing optimization technique that significantly reduce the overhead incurred by the previous centralized approaches. To show the effectiveness of SnuCL-D, we evaluate SnuCL-D with a microbenchmark and eleven benchmark applications on a large-scale CPU cluster and a medium-scale GPU cluster.
Pseudo-random dynamic address configuration (PRDAC) algorithm for mobile ad hoc networks

NASA Astrophysics Data System (ADS)

Wu, Shaochuan; Tan, Xuezhi

2007-11-01

By analyzing existing address configuration algorithms, this paper proposes a new pseudo-random dynamic address configuration (PRDAC) algorithm for mobile ad hoc networks. In PRDAC, the first node, which initializes the network, randomly chooses a nonlinear shift register that can generate an m-sequence. When another node joins the network, the initial node acts as an IP address configuration server: it computes an IP address according to this nonlinear shift register, allocates the address, and tells the new node the generator polynomial of the shift register. In this way, when a further node joins the network, any node that has already obtained an IP address can act as a server and allocate an address to the new node. PRDAC can also efficiently avoid IP conflicts and handle network partition and merging, as prophet address (PA) allocation and the dynamic configuration and distribution protocol (DCDP) do. Furthermore, PRDAC has lower algorithmic complexity, lower computational complexity and more sufficient assumptions than PA. In addition, PRDAC radically avoids address conflicts and maximizes the utilization rate of IP addresses. Analysis and simulation results show that PRDAC has rapid convergence and low overhead and is immune to topological structure.
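The idea of deriving addresses from a shift-register sequence can be illustrated with a toy example. The sketch below swaps in a 4-bit linear feedback shift register, the standard way to produce an m-sequence, rather than the nonlinear register named in the abstract, whose construction is not described there. The tap positions, the register width and the 10.0.0. prefix are all assumptions made for illustration.

```python
WIDTH = 4          # register width in bits (toy size)
TAPS = (0, 1)      # assumed feedback taps: XOR of the two lowest bits


def lfsr_next(state, width=WIDTH, taps=TAPS):
    """Advance the shift register by one step and return the new state."""
    bit = 0
    for tap in taps:
        bit ^= (state >> tap) & 1
    return (state >> 1) | (bit << (width - 1))


def address_sequence(seed, count, prefix="10.0.0."):
    """Host addresses generated from successive register states."""
    addresses = []
    state = seed
    for _ in range(count):
        addresses.append(prefix + str(state))
        state = lfsr_next(state)
    return addresses


all_addresses = address_sequence(seed=1, count=15)
print(all_addresses)
```

This register visits all 15 nonzero states before repeating, so no address is handed out twice within one period. Any node that knows the feedback rule and the most recent state can compute the next address, which matches the abstract's point that every configured node can act as a server.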
DCS-Neural-Network Program for Aircraft Control and Testing

NASA Technical Reports Server (NTRS)

Jorgensen, Charles C.

2006-01-01

A computer program implements a dynamic-cell-structure (DCS) artificial neural network that can perform such tasks as learning selected aerodynamic characteristics of an airplane from wind-tunnel test data and computing real-time stability and control derivatives of the airplane for use in feedback linearized control. A DCS neural network is one of several types of neural networks that can incorporate additional nodes in order to rapidly learn increasingly complex relationships between inputs and outputs. In the DCS neural network implemented by the present program, the insertion of nodes is based on accumulated error. A competitive Hebbian learning rule (a supervised-learning rule in which connection weights are adjusted to minimize differences between actual and desired outputs for training examples) is used. A Kohonen-style learning rule (derived from a relatively simple training algorithm, implements a Delaunay triangulation layout of neurons) is used to adjust node positions during training. Neighborhood topology determines which nodes are used to estimate new values. The network learns, starting with two nodes, and adds new nodes sequentially in locations chosen to maximize reductions in global error. At any given time during learning, the error becomes homogeneously distributed over all nodes.
Design and Training of Limited-Interconnect Architectures

DTIC Science & Technology

1991-07-16

and signal processing. Neuromorphic (brain like) models, allow an alternative for achieving real-time operation for such tasks, while having a...compact and robust architecture. Neuromorphic models consist of interconnections of simple computational nodes. In this approach, each node computes a...operational performance. II. Research Objectives The research objectives were: 1. Development of on-chip local training rules specifically designed for
A hybrid parallel architecture for electrostatic interactions in the simulation of dissipative particle dynamics

NASA Astrophysics Data System (ADS)

Yang, Sheng-Chun; Lu, Zhong-Yuan; Qian, Hu-Jun; Wang, Yong-Lei; Han, Jie-Ping

2017-11-01

In this work, we upgraded the electrostatic interaction method of CU-ENUF (Yang, et al., 2016) which first applied CUNFFT (nonequispaced Fourier transforms based on CUDA) to the reciprocal-space electrostatic computation and made the computation of electrostatic interaction done thoroughly in GPU. The upgraded edition of CU-ENUF runs concurrently in a hybrid parallel way that enables the computation parallelizing on multiple computer nodes firstly, then further on the installed GPU in each computer. By this parallel strategy, the size of simulation system will be never restricted to the throughput of a single CPU or GPU. The most critical technical problem is how to parallelize a CUNFFT in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills. Furthermore, the upgraded method is capable of computing electrostatic interactions for both the atomistic molecular dynamics (MD) and the dissipative particle dynamics (DPD). Finally, the benchmarks conducted for validation and performance indicate that the upgraded method is able to not only present a good precision when setting suitable parameters, but also give an efficient way to compute electrostatic interactions for huge simulation systems. Program Files doi:http://dx.doi.org/10.17632/zncf24fhpv.1 Licensing provisions: GNU General Public License 3 (GPL) Programming language: C, C++, and CUDA C Supplementary material: The program is designed for effective electrostatic interactions of large-scale simulation systems, which runs on particular computers equipped with NVIDIA GPUs. It has been tested on (a) single computer node with Intel(R) Core(TM) i7-3770@ 3.40 GHz (CPU) and GTX 980 Ti (GPU), and (b) MPI parallel computer nodes with the same configurations. 
Nature of problem: In molecular dynamics simulation, the electrostatic interaction is the most time-consuming computation because of its long-range nature and slow convergence in simulation space, and it takes up most of the total simulation time. Although the GPU-based parallel method CU-ENUF (Yang et al., 2016) achieved a qualitative leap over previous methods for computing electrostatic interactions, its capability is limited by the throughput of a single GPU for super-scale simulation systems. Therefore, an effective method is needed to handle the calculation of electrostatic interactions efficiently for super-scale simulation systems. Solution method: We constructed a hybrid parallel architecture in which CPU and GPU are combined to accelerate the electrostatic computation. First, the simulation system is divided into many subtasks via the domain-decomposition method. Then MPI (Message Passing Interface) is used to implement the CPU-parallel computation, with each computer node corresponding to a particular subtask, and each subtask on a computer node is in turn executed in parallel on the GPU. In this hybrid parallel method, the most critical technical problem is how to parallelize CUNFFT (nonequispaced fast Fourier transform based on CUDA) within the parallel strategy, which is addressed through a careful study of the underlying principles and several algorithmic techniques. Restrictions: HP-ENUF is mainly oriented to super-scale system simulations, where its performance advantage is most evident. However, for a small simulation system containing fewer than 10^6 particles, the multi-node mode has no apparent efficiency advantage over the single-node mode, and may even be less efficient because of the network delay among computer nodes. References: (1) S.-C. Yang, H.-J. Qian, Z.-Y. Lu, Appl. Comput. Harmon. Anal. 2016, http://dx.doi.org/10.1016/j.acha.2016.04.009. (2) S.-C. Yang, Y.-L. Wang, G.-S. Jiao, H.-J. Qian, Z.-Y. Lu, J. Comput. Chem. 37 (2016) 378. (3) S.-C. Yang, Y.-L. Zhu, H.-J. Qian, Z.-Y. Lu, Appl. Chem. Res. Chin. Univ., 2017, http://dx.doi.org/10.1007/s40242-016-6354-5. (4) Y.-L. Zhu, H. Liu, Z.-W. Li, H.-J. Qian, G. Milano, Z.-Y. Lu, J. Comput. Chem. 34 (2013) 2197.
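The domain-decomposition step named under "Solution method" can be pictured with a minimal sketch. It assumes a one-dimensional slab decomposition along x, which the abstract does not specify, and the names `owner_rank` and `decompose` are hypothetical. No MPI or CUDA calls are made; each bucket simply stands for the subtask that one compute node would pass to its GPU.

```python
from collections import defaultdict


def owner_rank(x, box_length, n_ranks):
    """Map an x coordinate in [0, box_length) to the rank that owns its slab."""
    slab = int(x / box_length * n_ranks)
    return min(slab, n_ranks - 1)  # guard against rounding at the upper edge


def decompose(positions, box_length, n_ranks):
    """Group particle indices by owning rank (slab decomposition is an assumption)."""
    buckets = defaultdict(list)
    for index, (x, _y, _z) in enumerate(positions):
        wrapped = x % box_length  # periodic wrap into the simulation box
        buckets[owner_rank(wrapped, box_length, n_ranks)].append(index)
    return dict(buckets)


if __name__ == "__main__":
    # Hypothetical particle coordinates in a box of side 10.
    particles = [(0.5, 1.0, 1.0), (2.5, 0.2, 3.1), (7.9, 4.0, 0.3), (9.99, 5.0, 5.0)]
    print(decompose(particles, box_length=10.0, n_ranks=4))
```

A production code would more likely split the box in three dimensions and exchange boundary data between neighbouring ranks; the slab version is used here only because it is the shortest way to show the particle-to-node mapping.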
Network support for system initiated checkpoints

DOEpatents

Chen, Dong; Heidelberger, Philip

2013-01-29

A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generate selective control signals to perform checkpointing of system related data in the presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.
Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

PubMed

Dashti, Ali; Komarov, Ivan; D'Souza, Roshan M

2013-01-01

This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible k-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
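To make "brute-force exact k-NNG" concrete, the following is a minimal single-process sketch in plain Python. It is not the GPU implementation described in the abstract; the function name `knn_graph` and the sample points are hypothetical. It only shows the exhaustive all-pairs distance computation that the paper distributes across nodes and GPUs.

```python
import heapq
import math


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest other points."""
    graph = []
    for i, p in enumerate(points):
        # Exhaustive search: every other point is a candidate, so the
        # total cost is quadratic in the number of points.
        candidates = (
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        nearest = heapq.nsmallest(k, candidates)  # ties broken by index
        graph.append([j for _distance, j in nearest])
    return graph


if __name__ == "__main__":
    sample = [(0, 0), (1, 0), (0, 3), (5, 5)]
    print(knn_graph(sample, k=2))
```

Because each row of the graph depends only on one query point and a read-only copy of the data, the outer loop is the natural unit to split between cluster nodes, GPUs, and threads.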
Parallel scalability of Hartree-Fock calculations

NASA Astrophysics Data System (ADS)

Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.

2015-03-01

Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
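The phrase "static partitioning of work followed by a work stealing phase" can be illustrated with a small event-driven simulation. This is a sketch under stated assumptions, not the authors' scheduler: task costs are hypothetical numbers, the static partition is contiguous blocks, and an idle worker steals one task from the tail of the longest remaining queue.

```python
from collections import deque


def static_partition(task_costs, n_workers):
    """Split tasks into contiguous blocks, one block per worker."""
    block = -(-len(task_costs) // n_workers)  # ceiling division
    return [deque(task_costs[w * block:(w + 1) * block]) for w in range(n_workers)]


def makespan(task_costs, n_workers, steal=True):
    """Simulated finish time of the slowest worker."""
    queues = static_partition(task_costs, n_workers)
    clocks = [0.0] * n_workers
    active = list(range(n_workers))
    while active:
        w = min(active, key=lambda i: clocks[i])  # next worker to become free
        if queues[w]:
            clocks[w] += queues[w].popleft()      # run a task from its own block
            continue
        victim = max(range(n_workers), key=lambda i: len(queues[i]))
        if steal and queues[victim]:
            clocks[w] += queues[victim].pop()     # steal from the victim's tail
        else:
            active.remove(w)                      # nothing left for this worker
    return max(clocks)


if __name__ == "__main__":
    costs = [8, 8, 8, 8, 1, 1, 1, 1]  # deliberately unbalanced
    print(makespan(costs, 2, steal=False), makespan(costs, 2, steal=True))
```

With the unbalanced costs above, the static partition alone leaves one worker idle for most of the run, while the stealing phase lets it pick up part of the heavy block.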
Using LAMMPS Software on the Peregrine System | High-Performance Computing

Science.gov Websites

-l walltime=4:00:00 # WALLTIME
#PBS -l nodes=2:ppn=16 # Number of nodes and processes per node
#PBS
module purge
module load impi-intel/2017.0.5 mkl/2017.0.5 lammps/11Aug17
mpirun -np 32 lmp -in lmp.in -l
Use of Networked Collaborative Concept Mapping To Measure Team Processes and Team Outcomes.

ERIC Educational Resources Information Center

Chung, Gregory K. W. K.; O'Neil, Harold F., Jr.; Herl, Howard E.; Dennis, Robert A.

The feasibility of using a computer-based networked collaborative concept mapping system to measure teamwork skills was studied. A concept map is a node-link-node representation of content, where the nodes represent concepts and links represent relationships between connected concepts. Teamwork processes were examined for a group concept mapping…
Development of response models for the Earth Radiation Budget Experiment (ERBE) sensors. Part 1: Dynamic models and computer simulations for the ERBE nonscanner, scanner and solar monitor sensors

NASA Technical Reports Server (NTRS)

Halyo, Nesim; Choi, Sang H.; Chrisman, Dan A., Jr.; Samms, Richard W.

1987-01-01

Dynamic models and computer simulations were developed for the radiometric sensors utilized in the Earth Radiation Budget Experiment (ERBE). The models were developed to understand performance, improve measurement accuracy by updating model parameters and provide the constants needed for the count conversion algorithms. Model simulations were compared with the sensor's actual responses demonstrated in the ground and inflight calibrations. The models consider thermal and radiative exchange effects, surface specularity, spectral dependence of a filter, radiative interactions among an enclosure's nodes, partial specular and diffuse enclosure surface characteristics and steady-state and transient sensor responses. Relatively few sensor nodes were chosen for the models since there is an accuracy tradeoff between increasing the number of nodes and approximating parameters such as the sensor's size, material properties, geometry, and enclosure surface characteristics. Given that the temperature gradients within a node and between nodes are small enough, approximating with only a few nodes does not jeopardize the accuracy required to perform the parameter estimates and error analyses.
Ground Node

DTIC Science & Technology

2009-09-30

Node deployment. Original plans were to deploy directly to Fort Jefferson on Dry Tortugas (near Key West, FL). Current plans are to initially deploy...to the USCG Station on Ismoralda Key for training operations; then deploy at a to-be-determined date to Fort Jefferson on Dry Tortugas. During FY09...Dry Tortugas. NRL expects to deliver the Ground Node to Ismoralda Key in October 2009. FY09 continued the third year of providing Ground
Wakata in Node 2

NASA Image and Video Library

2009-06-30

ISS020-E-016151 (30 June 2009) --- Japan Aerospace Exploration Agency (JAXA) astronaut Koichi Wakata, Expedition 20 flight engineer, enters data in a computer in the Harmony node of the International Space Station.
Semi-automatic central-chest lymph-node definition from 3D MDCT images

NASA Astrophysics Data System (ADS)

Lu, Kongkuo; Higgins, William E.

2010-03-01

Central-chest lymph nodes play a vital role in lung-cancer staging. The three-dimensional (3D) definition of lymph nodes from multidetector computed-tomography (MDCT) images, however, remains an open problem. This is because of the limitations in the MDCT imaging of soft-tissue structures and the complicated phenomena that influence the appearance of a lymph node in an MDCT image. In the past, we have made significant efforts toward developing (1) live-wire-based segmentation methods for defining 2D and 3D chest structures and (2) a computer-based system for automatic definition and interactive visualization of the Mountain central-chest lymph-node stations. Based on these works, we propose new single-click and single-section live-wire methods for segmenting central-chest lymph nodes. The single-click live wire only requires the user to select an object pixel on one 2D MDCT section and is designed for typical lymph nodes. The single-section live wire requires the user to process one selected 2D section using standard 2D live wire, but it is more robust. We applied these methods to the segmentation of 20 lymph nodes from two human MDCT chest scans (10 per scan) drawn from our ground-truth database. The single-click live wire segmented 75% of the selected nodes successfully and reproducibly, while the success rate for the single-section live wire was 85%. We are able to segment the remaining nodes, using our previously derived (but more interaction intense) 2D live-wire method incorporated in our lymph-node analysis system. Both proposed methods are reliable and applicable to a wide range of pulmonary lymph nodes.
Scalable asynchronous execution of cellular automata

NASA Astrophysics Data System (ADS)

Folino, Gianluigi; Giordano, Andrea; Mastroianni, Carlo

2016-10-01

The performance and scalability of cellular automata, when executed on parallel/distributed machines, are limited by the necessity of synchronizing all the nodes at each time step, i.e., a node can execute only after the execution of the previous step at all the other nodes. However, these synchronization requirements can be relaxed: a node can execute one step after synchronizing only with the adjacent nodes. In this fashion, different nodes can execute different time steps. This can be notably advantageous in many novel and increasingly popular applications of cellular automata, such as smart city applications, simulation of natural phenomena, etc., in which the execution times can be different and variable, due to the heterogeneity of machines and/or data and/or executed functions. Indeed, a longer execution time at a node does not slow down the execution at all the other nodes but only at the neighboring nodes. This is particularly advantageous when the nodes that act as bottlenecks vary during the application execution. The goal of the paper is to analyze the benefits that can be achieved with the described asynchronous implementation of cellular automata, when compared to the classical all-to-all synchronization pattern. The performance and scalability have been evaluated through a Petri net model, as this model is very useful to represent the synchronization barrier among nodes. We examined the usual case in which the territory is partitioned into a number of regions, and the computation associated with a region is assigned to a computing node. We considered both the cases of mono-dimensional and two-dimensional partitioning. The results show that the advantage obtained through the asynchronous execution, when compared to the all-to-all synchronous approach, is notable, and it can be as large as 90% in terms of speedup.
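The relaxed synchronization rule (a node may run step t+1 once its adjacent nodes have finished step t) can be sketched as follows. This is an illustrative toy, not the Petri net model used in the paper. It assumes one cell per node on a ring, elementary rule 90 as the update function, and a seeded random scheduler standing in for uneven node speeds; all function names are hypothetical.

```python
import random


def rule(left, center, right):
    """Elementary CA rule 90: the new state is the XOR of the two neighbours."""
    return left ^ right


def run_synchronous(cells, steps):
    """Reference run with a global barrier after every step."""
    n = len(cells)
    for _ in range(steps):
        cells = [rule(cells[i - 1], cells[i], cells[(i + 1) % n]) for i in range(n)]
    return cells


def run_asynchronous(cells, steps, seed=0):
    """Each node advances as soon as both neighbours have finished its current step."""
    n = len(cells)
    rng = random.Random(seed)
    history = [[c] for c in cells]  # history[i][t] is the state of node i at step t
    max_skew = 0
    while any(len(h) <= steps for h in history):
        i = rng.randrange(n)        # arbitrary order models heterogeneous speeds
        t = len(history[i]) - 1
        if t >= steps:
            continue                # this node has already finished
        left, right = history[i - 1], history[(i + 1) % n]
        if len(left) > t and len(right) > t:  # synchronize with neighbours only
            history[i].append(rule(left[t], history[i][t], right[t]))
        lengths = [len(h) for h in history]
        max_skew = max(max_skew, max(lengths) - min(lengths))
    return [h[steps] for h in history], max_skew


if __name__ == "__main__":
    start = [0, 1, 0, 0, 1, 1, 0, 1]
    final, skew = run_asynchronous(start, steps=6, seed=1)
    print(final == run_synchronous(start, 6), skew)
```

The sketch keeps the full per-node history for clarity. Since adjacent nodes can differ by at most one step, an implementation would only need to buffer the most recent states.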
Embedding Task-Based Neural Models into a Connectome-Based Model of the Cerebral Cortex

PubMed Central

Ulloa, Antonio; Horwitz, Barry

2016-01-01

A number of recent efforts have used large-scale, biologically realistic, neural models to help understand the neural basis for the patterns of activity observed in both resting state and task-related functional neural imaging data. An example of the former is The Virtual Brain (TVB) software platform, which allows one to apply large-scale neural modeling in a whole brain framework. TVB provides a set of structural connectomes of the human cerebral cortex, a collection of neural processing units for each connectome node, and various forward models that can convert simulated neural activity into a variety of functional brain imaging signals. In this paper, we demonstrate how to embed a previously or newly constructed task-based large-scale neural model into the TVB platform. We tested our method on a previously constructed large-scale neural model (LSNM) of visual object processing that consisted of interconnected neural populations that represent primary and secondary visual, inferotemporal, and prefrontal cortex. Some neural elements in the original model were “non-task-specific” (NS) neurons that served as noise generators to “task-specific” neurons that processed shapes during a delayed match-to-sample (DMS) task. We replaced the NS neurons with an anatomical TVB connectome model of the cerebral cortex comprising 998 regions of interest interconnected by white matter fiber tract weights. We embedded our LSNM of visual object processing into corresponding nodes within the TVB connectome. Reciprocal connections between TVB nodes and our task-based modules were included in this framework. We ran visual object processing simulations and showed that the TVB simulator successfully replaced the noise generation originally provided by NS neurons; i.e., the DMS tasks performed with the hybrid LSNM/TVB simulator generated equivalent neural and fMRI activity to that of the original task-based models. Additionally, we found partial agreement between the functional connectivities using the hybrid LSNM/TVB model and the original LSNM. Our framework thus presents a way to embed task-based neural models into the TVB platform, enabling a better comparison between empirical and computational data, which in turn can lead to a better understanding of how interacting neural populations give rise to human cognitive behaviors. PMID:27536235
The Added Value of a Single-photon Emission Computed Tomography-Computed Tomography in Sentinel Lymph Node Mapping in Patients with Breast Cancer and Malignant Melanoma.

PubMed

Bennie, George; Vorster, Mariza; Buscombe, John; Sathekge, Mike

2015-01-01

Single-photon emission computed tomography-computed tomography (SPECT-CT) allows for physiological and anatomical co-registration in sentinel lymph node (SLN) mapping and offers additional benefits over conventional planar imaging. However, the clinical relevance of these reported benefits, when considering the added costs and radiation burden, remains somewhat uncertain. This study aimed to evaluate the possible added value of SPECT-CT and intra-operative gamma-probe use over planar imaging alone in the South African setting. Eighty patients with breast cancer or malignant melanoma underwent both planar and SPECT-CT imaging for SLN mapping. We assessed and compared the number of nodes detected on each study, false positive and negative findings, and changes in surgical approach and/or patient management. In all cases where a sentinel node was identified, SPECT-CT was more accurate anatomically. There was a significant change in surgical approach in 30 cases - breast cancer (n = 13; P 0.001) and malignant melanoma (n = 17; P 0.0002). In 4 cases a node not identified on planar imaging was seen on SPECT-CT. In 16 cases additional echelon nodes were identified. False positives were excluded by SPECT-CT in 12 cases. The addition of SPECT-CT and use of intra-operative gamma-probe to planar imaging offers important benefits in patients who present with breast cancer and melanoma. These benefits include increased nodal detection, elimination of false positives and negatives and improved anatomical localization that ultimately aids and expedites surgical management. This has been demonstrated previously in the context of industrialized countries and has now also been confirmed in the setting of an emerging-market nation.
Solving global shallow water equations on heterogeneous supercomputers

PubMed Central

Fu, Haohuan; Gan, Lin; Yang, Chao; Xue, Wei; Wang, Lanning; Wang, Xinliang; Huang, Xiaomeng; Yang, Guangwen

2017-01-01

The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, inclusion of more component models, more complicated physics schemes, and larger ensembles. As the recent improvements in computing power mostly come from the increasing number of nodes in a system and the integration of heterogeneous accelerators, how to scale the computing problems onto more nodes and various kinds of accelerators has become a challenge for the model development. This paper describes our efforts on developing a highly scalable framework for performing global atmospheric modeling on heterogeneous supercomputers equipped with various accelerators, such as GPU (Graphic Processing Unit), MIC (Many Integrated Core), and FPGA (Field Programmable Gate Arrays) cards. We propose a generalized partition scheme of the problem domain, so as to keep a balanced utilization of both CPU resources and accelerator resources. With optimizations on both computing and memory access patterns, we manage to achieve around 8 to 20 times speedup when comparing one hybrid GPU or MIC node with one CPU node with 12 cores. Using customized FPGA-based data-flow engines, we see the potential to gain another 5 to 8 times improvement on performance. On heterogeneous supercomputers, such as Tianhe-1A and Tianhe-2, our framework is capable of achieving ideally linear scaling efficiency, and sustained double-precision performances of 581 Tflops on Tianhe-1A (using 3750 nodes) and 3.74 Pflops on Tianhe-2 (using 8644 nodes). Our study also provides an evaluation on the programming paradigm of various accelerator architectures (GPU, MIC, FPGA) for performing global atmospheric simulation, to form a picture about both the potential performance benefits and the programming efforts involved. PMID:28282428

Space physics analysis network node directory (The Yellow Pages): Fourth edition

NASA Technical Reports Server (NTRS)

Peters, David J.; Sisson, Patricia L.; Green, James L.; Thomas, Valerie L.

1989-01-01

The Space Physics Analysis Network (SPAN) is a component of the global DECnet Internet, which has over 17,000 host computers. The growth of SPAN from its implementation in 1981 to its present size of well over 2,500 registered SPAN host computers has created a need for users to acquire timely information about the network through a central source. The SPAN Network Information Center (SPAN-NIC), an online facility managed by the National Space Science Data Center (NSSDC), was developed to meet this need for SPAN-wide information. The remote node descriptive information in this document is not currently contained in the SPAN-NIC database, but will be incorporated in the near future. Access to this information is also available to non-DECnet users over a variety of networks such as Telenet, the NASA Packet Switched System (NPSS), and the TCP/IP Internet. This publication serves as the Yellow Pages for SPAN node information. The document also provides key information concerning other computer networks connected to SPAN, nodes associated with each SPAN routing center, science discipline nodes, contacts for primary SPAN nodes, and SPAN reference information. A section on DECnet Internetworking discusses SPAN connections with other wide-area DECnet networks (many with thousands of nodes each). Another section lists node names and their disciplines, countries, and institutions in the SPAN Network Information Center Online Data Base System. All remote sites connected to US-SPAN and European-SPAN (E-SPAN) are indexed. Also provided is information on the SPAN tail circuits, i.e., those remote nodes connected directly to a SPAN routing center, which is the local point of contact for resolving SPAN-related problems. Reference material is included for those who wish to know more about SPAN. Because of the rapid growth of SPAN, the SPAN Yellow Pages is reissued periodically.
Endobronchial ultrasound-guided transbronchial needle aspiration in the diagnosis of non-lymph node thoracic lesions

PubMed Central

Yang, Huizhen; Zhao, Heng; Garfield, David H.; Teng, Jiajun; Han, Baohui; Sun, Jiayuan

2013-01-01

AIMS: Endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) has shown excellent diagnostic capabilities for mediastinal and hilar lymphadenopathy. However, its value in thoracic non-lymph node lesions is less clear. This study was designed to assess the value of EBUS-TBNA in distinguishing malignant from benign thoracic non-lymph node lesions. METHODS: From October 2009 to August 2011, 552 patients underwent EBUS-TBNA under local anesthesia and with conscious sedation. We retrospectively reviewed 81 of these patients who had tracheobronchial wall-adjacent intrapulmonary or isolated mediastinal non-lymph node lesions. On-site cytological evaluation was not used. Immunohistochemistry (IHC) was performed to distinguish the origin or type of malignancy when necessary. RESULTS: EBUS-TBNA was performed in 68 tracheobronchial wall-adjacent intrapulmonary and 13 isolated mediastinal non-lymph node lesions. Of the 81 patients, 77 (95.1%, 60 malignancies and 17 benignancies) were diagnosed through EBUS-TBNA, including 57 primary lung cancers, 2 mediastinal tumors, 1 pulmonary metastatic adenocarcinoma, 7 inflammation, 5 tuberculosis, 3 mediastinal cysts, 1 esophageal schwannoma, and 1 focal fibrosis. There were four false-negative cases (4.9%). Of the 60 malignancies, there were 9 (15.0%) which originally had no definite histologic origin or type. Thus, IHC was performed, with 7 (77.8%) being subsequently confirmed. Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of EBUS-TBNA in distinguishing malignant from benign lesions were 93.4% (60/64), 100% (17/17), 100% (60/60), 81.0% (17/21), and 95.1% (77/81), respectively. CONCLUSION: EBUS-TBNA is a safe procedure with a high sensitivity for distinguishing malignant from benign thoracic non-lymph node lesions within the reach of EBUS-TBNA, with IHC usually providing a more definitive diagnosis. PMID:23439919
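The five summary statistics in the RESULTS follow from the four counts given in the parentheses (60 true positives, 4 false negatives, 17 true negatives, 0 false positives). The sketch below recomputes them with the standard definitions; the function name `diagnostic_metrics` is hypothetical. Evaluated this way, 60/64 comes to 93.75%, slightly different from the 93.4% printed above, while the other four figures match after rounding.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic-accuracy ratios from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected malignancies / all malignancies
        "specificity": tn / (tn + fp),  # correct benign calls / all benign lesions
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Counts taken from the fractions quoted in the abstract.
    for name, value in diagnostic_metrics(tp=60, fp=0, tn=17, fn=4).items():
        print(f"{name}: {100 * value:.2f}%")
```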
ISS Node-1 and PMA-1 rotated in KSC's SSPF

NASA Technical Reports Server (NTRS)

1997-01-01

The International Space Station's Node 1 and Pressurized Mating Adapter-1 (PMA-1) are rotated by workers in KSC's Space Station Processing Facility. The node is rotated to provide access to different areas of the flight element for processing. Here, the node is rotated to provide access for the installation of heat pipe radiators and a flight computer. The node is scheduled to launch into space on STS-88, slated for a July 9 liftoff at 1:11 p.m. from KSC's Launch Pad 39B.
Role of computed tomography of abdomen in difficult to diagnose typhoid fever: a case series.

PubMed

Hafeez, Wajid; Rajalakshmi, S; Sripriya, S; Madhu Bashini, M

2018-04-01

Background and Aim Diagnosis of typhoid is challenging when blood cultures fail to isolate Salmonella species. We report our experience with interpreting computed tomography (CT) abdomen findings in a case series of typhoid fever. Methods The case series consisted of patients who had a CT abdomen done as part of their investigations and a final diagnosis of typhoid fever. The CT films were reviewed and findings evaluated for distinctive features. Results During 2011-2017, 11 patients met the inclusion criteria. Indication for CT was pyrexia of unknown origin in the majority of patients. Review of CT films revealed mesenteric lymphadenopathy (100%), terminal ileum thickening (85%), hepatosplenomegaly (45%), retroperitoneal lymphadenopathy (18%) and ascites (9%). Conclusions Enhancing discrete mesenteric lymphadenopathy and terminal ileum thickening are non-specific findings noted in typhoid fever. Absence of matted necrotic nodes and peritoneal thickening rule out tuberculosis and raise suspicion of typhoid fever in endemic regions.
Increasing available FIFO space to prevent messaging queue deadlocks in a DMA environment

DOEpatents

Blocksome, Michael A [Rochester, MN; Chen, Dong [Croton On Hudson, NY; Gooding, Thomas [Rochester, MN; Heidelberger, Philip [Cortlandt Manor, NY; Parker, Jeff [Rochester, MN

2012-02-07

Embodiments of the invention may be used to manage message queues in a parallel computing environment to prevent message queue deadlock. A direct memory access controller of a compute node may determine when a messaging queue is full. In response, the DMA may generate an interrupt. An interrupt handler may stop the DMA and swap all descriptors from the full messaging queue into a larger queue (or enlarge the original queue). The interrupt handler then restarts the DMA. Alternatively, the interrupt handler stops the DMA, allocates a memory block to hold queue data, and then moves descriptors from the full messaging queue into the allocated memory block. The interrupt handler then restarts the DMA. During a normal messaging advance cycle, a messaging manager attempts to inject the descriptors in the memory block into other messaging queues until the descriptors have all been processed.
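The first embodiment (stop the DMA, move the descriptors into a larger queue, restart) can be sketched in a few lines. This is a software analogy under loud assumptions: `DmaQueue`, `handle_queue_full`, and `inject` are hypothetical names, the "interrupt" is an ordinary function call, and the doubling growth factor is an arbitrary choice not stated in the abstract.

```python
from collections import deque


class DmaQueue:
    """Toy stand-in for a fixed-capacity messaging FIFO (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.descriptors = deque()
        self.running = True

    def is_full(self):
        return len(self.descriptors) >= self.capacity


def handle_queue_full(queue, growth_factor=2):
    """Interrupt-handler sketch: stop the DMA, swap into a larger queue, restart."""
    queue.running = False                      # stop the DMA
    larger = deque(queue.descriptors)          # move descriptors, preserving order
    queue.descriptors = larger
    queue.capacity *= growth_factor            # the new queue has more room
    queue.running = True                       # restart the DMA


def inject(queue, descriptor):
    """Add a descriptor, enlarging the queue first if it is full."""
    if queue.is_full():
        handle_queue_full(queue)               # stands in for the hardware interrupt
    queue.descriptors.append(descriptor)


if __name__ == "__main__":
    q = DmaQueue(capacity=2)
    for d in range(5):
        inject(q, d)
    print(q.capacity, list(q.descriptors), q.running)
```

The alternative embodiment in the abstract, where overflow descriptors are parked in a separate memory block and re-injected during the normal advance cycle, would replace the capacity change with a second holding list.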
Simulation of Detecting Damage in Composite Stiffened Panel Using Lamb Waves

NASA Technical Reports Server (NTRS)

Wang, John T.; Ross, Richard W.; Huang, Guo L.; Yuan, Fuh G.

2013-01-01

Lamb wave damage detection in a composite stiffened panel is simulated by performing explicit transient dynamic finite element analyses and using signal imaging techniques. This virtual test process does not need to use real structures, actuators/sensors, or laboratory equipment. Quasi-isotropic laminates are used for the stiffened panels. Two types of damage are studied. One type is damage in the skin bay and the other type is a debond between the stiffener flange and the skin. Innovative approaches for identifying the damage location and imaging the damage were developed. The damage location is identified by finding the intersection of the damage locus and the path of the time reversal wave packet re-emitted from the sensor nodes. The damage locus is a circle that envelops the potential damage locations. Its center is at the actuator location and its radius is computed by multiplying the group velocity by the time of flight to damage. To create a damage image for estimating the size of damage, a group of nodes in the neighborhood of the damage location is identified for applying an image condition. The image condition, computed at a finite element node, is the zero-lag cross-correlation (ZLCC) of the time-reversed incident wave signal and the time reversal wave signal from the sensor nodes. This damage imaging process is computationally efficient since only the ZLCC values of a small number of nodes in the neighborhood of the identified damage location are computed instead of those of the full model.
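The two quantities defined in the abstract, the damage-locus radius and the zero-lag cross-correlation, reduce to short formulas. The sketch below assumes discretely sampled signals of equal length and uses made-up numbers for the group velocity, time of flight, and signal samples; the function names are hypothetical.

```python
def damage_locus_radius(group_velocity, time_of_flight):
    """Radius of the circle centred on the actuator: velocity times time of flight."""
    return group_velocity * time_of_flight


def zero_lag_cross_correlation(signal_a, signal_b):
    """ZLCC of two equally sampled signals: the sum of sample-wise products."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signals must have the same number of samples")
    return sum(a * b for a, b in zip(signal_a, signal_b))


if __name__ == "__main__":
    # Hypothetical values: 5000 m/s group velocity, 40 microseconds to the damage.
    print(damage_locus_radius(5000.0, 40e-6))
    incident_reversed = [0.0, 1.0, -1.0, 0.5]   # time-reversed incident wave at a node
    from_sensors = [0.0, 2.0, -2.0, 1.0]        # time reversal wave at the same node
    print(zero_lag_cross_correlation(incident_reversed, from_sensors))
```

Evaluating the ZLCC only at nodes near the identified location, as the abstract describes, means this sum is computed for a handful of nodes rather than for every node in the finite element model.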
Running Interactive Jobs on Peregrine | High-Performance Computing | NREL

Science.gov Websites

The qsub -I command is used to start an interactive session on one or more compute nodes. When . You will see a message such as qsub : waiting for job 12090.admin1 to start When it has, you'll see a exports your environment variables to the interactive job. Type exit when finished using the node. Like
Using the Parallel Computing Toolbox with MATLAB on the Peregrine System |

Science.gov Websites

parallel pool took %g seconds.\n', toc)
% "single program multiple data"
spmd
  fprintf('Worker %d says Hello World!\n', labindex)
end
delete(gcp); % close the parallel pool
exit
To run the script on a compute node, create the file helloWorld.sub:
#!/bin/bash
#PBS -l walltime=05:00
#PBS -l nodes=1
#PBS -N
Need to improve SWMM's subsurface flow routing algorithm for green infrastructure modeling

EPA Science Inventory

SWMM can simulate various subsurface flows, including groundwater (GW) release from a subcatchment to a node, percolation out of storage units and low impact development (LID) controls, and rainfall derived inflow and infiltration (RDII) at a node. Originally, the subsurface flow...
Transient thermal analysis of fluid systems

NASA Technical Reports Server (NTRS)

Chandler, G. D.; Trust, R. D.

1977-01-01

Computer program performs transient thermal analysis of any 2-node to 200-node thermal network that transports heat by fluid flow convection. Program can be modified to add conduction along tubes and radiation.
Sentinel lymph node mapping in melanoma with technetium-99m dextran.

PubMed

Neubauer, S; Mena, I; Iglesis, R; Schwartz, R; Acevedo, J C; Leon, A; Gomez, L

2001-06-01

The aim of this work is to evaluate the capability of Tc99m-Dextran as a lymphoscintigraphic agent in the detection of the sentinel node in skin lesions. Forty-one patients with melanomas (39) and Merkel cell tumors (2) had perilesional intradermal injection of Tc99m-Dextran 2 hours before surgery. Serial gamma camera images and a handheld gamma probe were used to direct sentinel node biopsy. In 39/41 patients, lymph channels and 52 sentinel nodes (one to three sentinel nodes/patient) could be visualized. In one patient, with a dorsal melanoma, no lymph channels or lymph nodes could be demonstrated on the images and only minimal radioactivity was found in the regional nodes with the probe. Another patient with a facial lesion failed to demonstrate lymph channels or nodes. No adverse reactions were observed. Tc99m-Dextran provided good definition of lymph channels and sentinel node localization, without the risks related to the use of potentially hazardous labeled materials of biological origin.
Implementing asynchronous collective operations in a multi-node processing system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Dong; Eisley, Noel A.; Heidelberger, Philip

A method, system, and computer program product are disclosed for implementing an asynchronous collective operation in a multi-node data processing system. In one embodiment, the method comprises sending data to a plurality of nodes in the data processing system, broadcasting a remote get to the plurality of nodes, and using this remote get to implement asynchronous collective operations on the data by the plurality of nodes. In one embodiment, each of the nodes performs only one task in the asynchronous operations, and each node sets up a base address table with an entry for a base address of a memory buffer associated with said each node. In another embodiment, each of the nodes performs a plurality of tasks in said collective operations, and each task of each node sets up a base address table with an entry for a base address of a memory buffer associated with the task.
Delphian node metastasis in head and neck cancers--oracle or myth?

PubMed

Iyer, N Gopalakrishna; Shaha, Ashok R; Ferlito, Alfio; Thomas Robbins, K; Medina, Jesus E; Silver, Carl E; Rinaldo, Alessandra; Takes, Robert P; Suárez, Carlos; Rodrigo, Juan P; Bradley, Patrick J; Werner, Jochen A

2010-09-15

Delphian node (DN) refers to the pre-laryngeal or pre-cricoid nodal tissue often identified during laryngeal or thyroid surgery. The original nomenclature is based on the assumption that metastasis to this node was predictive of aggressive disease and poor outcome for patients. In this article, we review the existing literature on the topic to determine the significance of DN metastasis in laryngeal, hypopharyngeal and thyroid cancers. (c) 2010 Wiley-Liss, Inc.
Bifurcation phenomena in cylindrical convection

NASA Astrophysics Data System (ADS)

Tuckerman, Laurette; Boronska, K.; Bordja, L.; Martin-Witkowski, L.; Navarro, M. C.

2008-11-01

We present two bifurcation scenarios occurring in Rayleigh-Benard convection in a small-aspect-ratio cylinder. In water (Pr=6.7) with R/H=2, Hof et al. (1999) observed five convective patterns at Ra=14200. We have computed 14 stable and unstable steady branches, as well as novel time-dependent branches. The resulting complicated bifurcation diagram can be partitioned according to azimuthal symmetry. For example, three-roll and dipole states arise from an m=1 bifurcation, four-roll and ``pizza'' branches from m=2, and the ``mercedes'' state from an m=3 bifurcation after successive saddle-node bifurcations via ``marigold'', ``mitsubishi'' and ``cloverleaf'' states. The diagram represents a compromise between the physical tendency towards parallel rolls and the mathematical requirement that primary bifurcations be towards trigonometric states. Our second investigation explores the effect of exact counter-rotation of the upper and lower bounding disks on axisymmetric flows with Pr=1 and R/H=1. The convection threshold increases and, for sufficiently high rotation, the instability becomes oscillatory. Limit cycles originating at the Hopf bifurcation are annihilated when their period becomes infinite at saddle-node-on-periodic-orbit (SNOPER) bifurcations.
Disruption Tolerant Networking Flight Validation Experiment on NASA's EPOXI Mission

NASA Technical Reports Server (NTRS)

Wyatt, Jay; Burleigh, Scott; Jones, Ross; Torgerson, Leigh; Wissler, Steve

2009-01-01

In October and November of 2008, the Jet Propulsion Laboratory installed and tested essential elements of Delay/Disruption Tolerant Networking (DTN) technology on the Deep Impact spacecraft. This experiment, called Deep Impact Network Experiment (DINET), was performed in close cooperation with the EPOXI project which has responsibility for the spacecraft. During DINET some 300 images were transmitted from the JPL nodes to the spacecraft. Then they were automatically forwarded from the spacecraft back to the JPL nodes, exercising DTN's bundle origination, transmission, acquisition, dynamic route computation, congestion control, prioritization, custody transfer, and automatic retransmission procedures, both on the spacecraft and on the ground, over a period of 27 days. All transmitted bundles were successfully received, without corruption. The DINET experiment demonstrated DTN readiness for operational use in space missions. This activity was part of a larger NASA space DTN development program to mature DTN to flight readiness for a wide variety of mission types by the end of 2011. This paper describes the DTN protocols, the flight demo implementation, validation metrics which were created for the experiment, and validation results.
Elastic extension of a local analysis facility on external clouds for the LHC experiments

NASA Astrophysics Data System (ADS)

Ciaschini, V.; Codispoti, G.; Rinaldi, L.; Aiftimiei, D. C.; Bonacorsi, D.; Calligola, P.; Dal Pra, S.; De Girolamo, D.; Di Maria, R.; Grandi, C.; Michelotto, D.; Panella, M.; Taneja, S.; Semeria, F.

2017-10-01

The computing infrastructures serving the LHC experiments have been designed to cope at most with the average amount of data recorded. The usage peaks, as already observed in Run-I, may however originate large backlogs, thus delaying the completion of the data reconstruction and ultimately the data availability for physics analysis. In order to cope with the production peaks, the LHC experiments are exploring the opportunity to access Cloud resources provided by external partners or commercial providers. In this work we present the proof of concept of the elastic extension of a local analysis facility, specifically the Bologna Tier-3 Grid site, for the LHC experiments hosted at the site, on an external OpenStack infrastructure. We focus on the Cloud Bursting of the Grid site using DynFarm, a newly designed tool that allows the dynamic registration of new worker nodes to LSF. In this approach, the dynamically added worker nodes instantiated on an OpenStack infrastructure are transparently accessed by the LHC Grid tools and at the same time they serve as an extension of the farm for the local usage.
Optimal one-way and roundtrip journeys design by mixed-integer programming

NASA Astrophysics Data System (ADS)

Ribeiro, Isabel M.; Vale, Cecília

2017-12-01

The introduction of multimodal/intermodal networks in transportation problems, especially when considering roundtrips, adds complexity to the models. This article presents two models for the optimization of intermodal trips as a contribution to the integration of transport modes in networks. The first model is devoted to one-way trips while the second one is dedicated to roundtrips. The original contribution of this research to transportation is mainly the consideration of roundtrips in the optimization process of intermodal transport, especially because the transport mode between two nodes on the return trip should be the same as the one on the outward trip if both nodes are visited on the return trip, which is a valuable aspect for transport companies. The mathematical formulations of both models lead to mixed binary linear programs, which is not a common approach for this type of problem. In this article, in addition to the model description, computational experience is included to highlight the importance and efficiency of the proposed models, which may provide a valuable tool for transport managers.
Integration and validation of a data grid software

NASA Astrophysics Data System (ADS)

Carenton-Madiec, Nicolas; Berger, Katharina; Cofino, Antonio

2014-05-01

The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) is a software infrastructure for the management, dissemination, and analysis of model output and observational data. The ESGF grid is composed of several types of nodes which have different roles. About 40 data nodes host model outputs and datasets using THREDDS catalogs. About 25 compute nodes offer remote visualization and analysis tools. About 15 index nodes crawl data node catalogs and implement faceted and federated search in a web interface. About 15 identity provider nodes manage accounts, authentication and authorization. Here we will present an actual-size test federation spread across different institutes in different countries and a Python test suite that were started in December 2013. The first objective of the test suite is to provide a simple tool that helps to test and validate a single data node and its closest index, compute and identity provider peers. The next objective will be to run this test suite on every data node of the federation and therefore test and validate every single node of the whole federation. The suite already uses the nosetests, requests, myproxy-logon, subprocess, selenium and fabric Python libraries in order to test web front ends, back ends and security services. The goal of this project is to improve the quality of deliverables in the context of a small developer team. Developers are widely spread around the world, working collaboratively and without hierarchy. This kind of working organization highlighted the need for a federated integration test and validation process.
Simple, efficient allocation of modelling runs on heterogeneous clusters with MPI

USGS Publications Warehouse

Donato, David I.

2017-01-01

In scientific modelling and computation, the choice of an appropriate method for allocating tasks for parallel processing depends on the computational setting and on the nature of the computation. The allocation of independent but similar computational tasks, such as modelling runs or Monte Carlo trials, among the nodes of a heterogeneous computational cluster is a special case that has not been specifically evaluated previously. A simulation study shows that a method of on-demand (that is, worker-initiated) pulling from a bag of tasks in this case leads to reliably short makespans for computational jobs despite heterogeneity both within and between cluster nodes. A simple reference implementation in the C programming language with the Message Passing Interface (MPI) is provided.
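The on-demand pulling described in the abstract above can be illustrated with a small discrete-event simulation. This is only a sketch, not the paper's C/MPI reference implementation: the function names, the per-worker speed factors, and the round-robin baseline are assumptions introduced here for illustration.

```python
import heapq

def simulate_pull(task_costs, worker_speeds):
    """Simulate worker-initiated pulling from a shared bag of tasks.

    Each worker takes the next task as soon as it becomes idle.
    Returns (makespan, tasks_done_per_worker).
    Hypothetical model: run time = task cost / worker speed.
    """
    # Heap of (time_when_idle, worker_index); every worker is idle at t=0.
    idle = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(idle)
    done = [0] * len(worker_speeds)
    makespan = 0.0
    for cost in task_costs:
        t, w = heapq.heappop(idle)           # earliest idle worker pulls next
        t_end = t + cost / worker_speeds[w]
        done[w] += 1
        makespan = max(makespan, t_end)
        heapq.heappush(idle, (t_end, w))
    return makespan, done

def simulate_static(task_costs, worker_speeds):
    """Baseline for comparison: tasks are dealt out round-robin in advance."""
    busy = [0.0] * len(worker_speeds)
    for i, cost in enumerate(task_costs):
        w = i % len(worker_speeds)
        busy[w] += cost / worker_speeds[w]
    return max(busy)

if __name__ == "__main__":
    tasks = [1.0] * 12            # twelve equal modelling runs
    speeds = [1.0, 2.0, 4.0]      # heterogeneous nodes
    print(simulate_pull(tasks, speeds))
    print(simulate_static(tasks, speeds))
```

In this toy setting the faster workers end up pulling more tasks, so no worker sits idle while a slow one holds a long queue; a static split cannot adapt in the same way.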
Computational lymphatic node models in pediatric and adult hybrid phantoms for radiation dosimetry

NASA Astrophysics Data System (ADS)

Lee, Choonsik; Lamart, Stephanie; Moroz, Brian E.

2013-03-01

We developed models of lymphatic nodes for six pediatric and two adult hybrid computational phantoms to calculate the lymphatic node dose estimates from external and internal radiation exposures. We derived the number of lymphatic nodes from the recommendations in International Commission on Radiological Protection (ICRP) Publications 23 and 89 at 16 cluster locations for the lymphatic nodes: extrathoracic, cervical, thoracic (upper and lower), breast (left and right), mesentery (left and right), axillary (left and right), cubital (left and right), inguinal (left and right) and popliteal (left and right), for different ages (newborn, 1-, 5-, 10-, 15-year-old and adult). We modeled each lymphatic node within the voxel format of the hybrid phantoms by assuming that all nodes have identical size derived from published data, except at narrow cluster sites. The lymph nodes were generated by the following algorithm: (1) selection of the lymph node site among the 16 cluster sites; (2) random sampling of the location of the lymph node within a spherical space centered at the chosen cluster site; (3) creation of the sphere or ovoid of tissue representing the node based on lymphatic node characteristics defined in ICRP Publications 23 and 89. We created lymph nodes until the pre-defined number of lymphatic nodes at the selected cluster site was reached. This algorithm was applied to pediatric (newborn, 1-, 5- and 10-year-old male, and 15-year-old males) and adult male and female ICRP-compliant hybrid phantoms after voxelization. To assess the performance of our models for internal dosimetry, we calculated dose conversion coefficients, called S values, for selected organs and tissues with Iodine-131 distributed in six lymphatic node cluster sites using MCNPX2.6, a well validated Monte Carlo radiation transport code. 
Our analysis of the calculations indicates that the S values were significantly affected by the location of the lymph node clusters and that the values increased for smaller phantoms due to the shorter inter-organ distances compared to the bigger phantoms. By testing sensitivity of S values to random sampling and voxel resolution, we confirmed that the lymph node model is reasonably stable and consistent for different random samplings and voxel resolutions.
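Steps (1) and (2) of the node-generation algorithm above, together with the stopping rule, can be sketched as follows. This is a minimal illustration under stated assumptions: the site names, centers, radii and node counts are hypothetical placeholders rather than ICRP values, rejection sampling is one possible way to sample uniformly in a sphere, and step (3), the creation of the voxelized sphere or ovoid, is not modeled.

```python
import random

def sample_in_sphere(center, radius, rng):
    """Draw one point uniformly inside a sphere by rejection sampling."""
    while True:
        dx, dy, dz = (rng.uniform(-radius, radius) for _ in range(3))
        if dx * dx + dy * dy + dz * dz <= radius * radius:
            return (center[0] + dx, center[1] + dy, center[2] + dz)

def place_nodes(cluster_sites, seed=0):
    """Place node centers at every cluster site.

    cluster_sites maps a site name to (center, radius, n_nodes).
    Sampling at a site stops once its pre-defined node count is reached.
    """
    rng = random.Random(seed)
    placed = {}
    for name, (center, radius, n_nodes) in cluster_sites.items():
        placed[name] = [sample_in_sphere(center, radius, rng)
                        for _ in range(n_nodes)]
    return placed

if __name__ == "__main__":
    # Hypothetical geometry in arbitrary units, for illustration only.
    sites = {
        "axillary_left": ((10.0, 0.0, 5.0), 2.0, 8),
        "cervical": ((0.0, 0.0, 20.0), 1.5, 5),
    }
    for name, points in place_nodes(sites, seed=1).items():
        print(name, len(points))
```

Fixing the seed makes a placement reproducible, and varying it gives a simple way to repeat the placement with different random samplings.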

THEORY OF SOLAR MERIDIONAL CIRCULATION AT HIGH LATITUDES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dikpati, Mausumi; Gilman, Peter A., E-mail: dikpati@ucar.edu, E-mail: gilman@ucar.edu

2012-02-10

We build a hydrodynamic model for computing and understanding the Sun's large-scale high-latitude flows, including Coriolis forces, turbulent diffusion of momentum, and gyroscopic pumping. Side boundaries of the spherical 'polar cap', our computational domain, are located at latitudes $\geq 60^\circ$. Implementing observed low-latitude flows as side boundary conditions, we solve the flow equations for a Cartesian analog of the polar cap. The key parameter that determines whether there are nodes in the high-latitude meridional flow is $\epsilon = 2\Omega n \pi H^2/\nu$, where $\Omega$ is the interior rotation rate, $n$ is the radial wavenumber of the meridional flow, $H$ is the depth of the convection zone, and $\nu$ is the turbulent viscosity. The smaller the $\epsilon$ (larger turbulent viscosity), the fewer the number of nodes in high latitudes. For all latitudes within the polar cap, we find three nodes for $\nu = 10^{12}$ cm$^2$ s$^{-1}$, two for $10^{13}$, and one or none for $10^{15}$ or higher. For $\nu$ near $10^{14}$ our model exhibits 'node merging': as the meridional flow speed is increased, two nodes cancel each other, leaving no nodes. On the other hand, for fixed flow speed at the boundary, as $\nu$ is increased the poleward-most node migrates to the pole and disappears, ultimately for high enough $\nu$ leaving no nodes. These results suggest that primary poleward surface meridional flow can extend from $60^\circ$ to the pole either by node merging or by node migration and disappearance.
Ryazanskiy and Nyberg in Node 1

NASA Image and Video Library

2013-10-02

ISS037-E-005750 (2 Oct. 2013) --- NASA astronaut Karen Nyberg and Russian cosmonaut Sergey Ryazanskiy, both Expedition 37 flight engineers, look at a computer monitor in the Unity node of the International Space Station.
Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes

DOEpatents

Jones, Terry R.; Watson, Pythagoras C.; Tuel, William; Brenner, Larry; Caffrey, Patrick; Fier, Jeffrey

2010-10-05

In a parallel computing environment comprising a network of SMP nodes each having at least one processor, a parallel-aware co-scheduling method and system for improving the performance and scalability of a dedicated parallel job having synchronizing collective operations. The method and system uses a global co-scheduler and an operating system kernel dispatcher adapted to coordinate interfering system and daemon activities on a node and across nodes to promote intra-node and inter-node overlap of said interfering system and daemon activities as well as intra-node and inter-node overlap of said synchronizing collective operations. In this manner, the impact of random short-lived interruptions, such as timer-decrement processing and periodic daemon activity, on synchronizing collective operations is minimized on large processor-count SPMD bulk-synchronous programming styles.
Probabilistic divergence time estimation without branch lengths: dating the origins of dinosaurs, avian flight and crown birds.

PubMed

Lloyd, G T; Bapst, D W; Friedman, M; Davis, K E

2016-11-01

Branch lengths, measured in character changes, are an essential requirement of clock-based divergence estimation, regardless of whether the fossil calibrations used represent nodes or tips. However, a separate set of divergence time approaches are typically used to date palaeontological trees, which may lack such branch lengths. Among these methods, sophisticated probabilistic approaches have recently emerged, in contrast with simpler algorithms relying on minimum node ages. Here, using a novel phylogenetic hypothesis for Mesozoic dinosaurs, we apply two such approaches to estimate divergence times for: (i) Dinosauria, (ii) Avialae (the earliest birds) and (iii) Neornithes (crown birds). We find: (i) the plausibility of a Permian origin for dinosaurs to be dependent on whether Nyasasaurus is the oldest dinosaur, (ii) a Middle to Late Jurassic origin of avian flight regardless of whether Archaeopteryx or Aurornis is considered the first bird and (iii) a Late Cretaceous origin for Neornithes that is broadly congruent with other node- and tip-dating estimates. Demonstrating the feasibility of probabilistic time-scaling further opens up divergence estimation to the rich histories of extinct biodiversity in the fossil record, even in the absence of detailed character data. © 2016 The Authors.
Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers.

PubMed

Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito

2016-11-15

A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Some improvements over the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been made: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.
Method for nonlinear optimization for gas tagging and other systems

DOEpatents

Chen, Ting; Gross, Kenny C.; Wegerich, Stephan

1998-01-01

A method and system for providing nuclear fuel rods with a configuration of isotopic gas tags. The method includes selecting a true location of a first gas tag node, selecting initial locations for the remaining n-1 nodes using target gas tag compositions, generating a set of random gene pools with L nodes, applying a Hopfield network for computing an energy, or cost, for each of the L gene pools and using selected constraints to establish minimum energy states to identify optimal gas tag nodes, with each energy compared to a convergence threshold, and then, upon identifying the gas tag node, continuing this procedure to establish the next gas tag node until all remaining n nodes have been established.
Method for nonlinear optimization for gas tagging and other systems

DOEpatents

Chen, T.; Gross, K.C.; Wegerich, S.

1998-01-06

A method and system are disclosed for providing nuclear fuel rods with a configuration of isotopic gas tags. The method includes selecting a true location of a first gas tag node, selecting initial locations for the remaining n-1 nodes using target gas tag compositions, generating a set of random gene pools with L nodes, applying a Hopfield network for computing an energy, or cost, for each of the L gene pools and using selected constraints to establish minimum energy states to identify optimal gas tag nodes, with each energy compared to a convergence threshold, and then, upon identifying the gas tag node, continuing this procedure to establish the next gas tag node until all remaining n nodes have been established. 6 figs.
Fast Katz and Commuters: Efficient Estimation of Social Relatedness in Large Networks

NASA Astrophysics Data System (ADS)

Esfandiar, Pooya; Bonchi, Francesco; Gleich, David F.; Greif, Chen; Lakshmanan, Laks V. S.; On, Byung-Won

Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing: the score for a single pair of nodes, and the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
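For orientation, the Katz score between nodes i and j is the weighted walk count $\sum_{k \ge 1} \alpha^k (A^k)_{ij}$, where A is the adjacency matrix and $\alpha$ is small enough for the series to converge. The sketch below evaluates a truncated version of this series from a single source node and then ranks the top-k targets. It is a plain reference computation for small graphs, not the Lanczos/quadrature bounds or the localized top-k algorithm described in the abstract; the function names and the truncation length are assumptions made here.

```python
def katz_from_source(adj, source, alpha, n_terms=50):
    """Truncated Katz scores from one source node.

    adj is a list of neighbor lists; returns a list whose j-th entry
    approximates sum_{k=1..n_terms} alpha**k * (A**k)[source][j].
    """
    n = len(adj)
    walk = [0.0] * n          # alpha**k * (number of k-step walks to each node)
    walk[source] = 1.0
    scores = [0.0] * n
    for _ in range(n_terms):
        nxt = [0.0] * n
        for i, w in enumerate(walk):
            if w:
                for j in adj[i]:
                    nxt[j] += alpha * w
        walk = nxt
        for j in range(n):
            scores[j] += walk[j]
    return scores

def top_k(scores, source, k):
    """Indices of the k best-scoring nodes, excluding the source itself."""
    others = [j for j in range(len(scores)) if j != source]
    return sorted(others, key=lambda j: -scores[j])[:k]

if __name__ == "__main__":
    path = [[1], [0, 2], [1]]     # path graph 0 - 1 - 2
    s = katz_from_source(path, source=0, alpha=0.1)
    print(s, top_k(s, 0, 1))
```

On the three-node path graph the closed form $(I - \alpha A)^{-1} - I$ gives $0.1/0.98$ for the adjacent pair and $0.01/0.98$ for the end-to-end pair, which the truncated series reproduces to rounding error.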
Fast katz and commuters : efficient estimation of social relatedness in large networks.

DOE Office of Scientific and Technical Information (OSTI.GOV)

On, Byung-Won; Lakshmanan, Laks V. S.; Greif, Chen

Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing: the score for a single pair of nodes, and the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
Proactive Fault Tolerance Using Preemptive Migration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Engelmann, Christian; Vallee, Geoffroy R; Naughton, III, Thomas J

2009-01-01

Proactive fault tolerance (FT) in high-performance computing is a concept that prevents compute node failures from impacting running parallel applications by preemptively migrating application parts away from nodes that are about to fail. This paper provides a foundation for proactive FT by defining its architecture and classifying implementation options. This paper further relates prior work to the presented architecture and classification, and discusses the challenges ahead for needed supporting technologies.
ORA User’s Guide 2013

DTIC Science & Technology

2013-06-03

and a C++ computational backend . The most current version of ORA (3.0.8.5) software is available on the casos website: http://casos.cs.cmu.edu...optimizing a network’s design structure. ORA uses a Java interface for ease of use, and a C++ computational backend . The most current version of ORA...Eigenvector Centrality : Node most connected to other highly connected nodes. Assists in identifying those who can mobilize others Entity Class
Method and system for dynamic probabilistic risk assessment

NASA Technical Reports Server (NTRS)

Dugan, Joanne Bechta (Inventor); Xu, Hong (Inventor)

2013-01-01

The DEFT methodology, system and computer readable medium extend the applicability of the PRA (Probabilistic Risk Assessment) methodology to computer-based systems, by allowing DFT (Dynamic Fault Tree) nodes as pivot nodes in the Event Tree (ET) model. DEFT includes a mathematical model and solution algorithm, and supports all common PRA analysis functions and cutsets. Additional capabilities enabled by the DFT include modularization, phased mission analysis, sequence dependencies, and imperfect coverage.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

NASA Astrophysics Data System (ADS)

Amooie, M. A.; Moortgat, J.

2017-12-01

We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a fast quad-core 1.2GHz ARMv8 64-bit processor, 1GB of RAM, and a 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between nodes. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and feasible learning platform for challenging engineering and scientific problems.
GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code

NASA Astrophysics Data System (ADS)

Huhn, William Paul; Lange, Björn; Yu, Victor; Blum, Volker; Lee, Seyong; Yoon, Mina

Density-functional theory has been well established as the dominant quantum-mechanical computational method in the materials community. Large accurate simulations become very challenging on small to mid-scale computers and require high-performance compute platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks/node) and shows an overall runtime speedup of 1.4x due to utilization of the K20X Tesla GPUs on each Titan node, with the charge density update showing a speedup of 2x. Further acceleration opportunities will be discussed. Work supported by the LDRD Program of ORNL managed by UT-Battelle, LLC, for the U.S. DOE and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
LEGION: Lightweight Expandable Group of Independently Operating Nodes

NASA Technical Reports Server (NTRS)

Burl, Michael C.

2012-01-01

LEGION is a lightweight C-language software library that enables distributed asynchronous data processing with a loosely coupled set of compute nodes. Loosely coupled means that a node can offer itself in service to a larger task at any time and can withdraw itself from service at any time, provided it is not actively engaged in an assignment. The main program, i.e., the one attempting to solve the larger task, does not need to know up front which nodes will be available, how many nodes will be available, or at what times the nodes will be available, which is normally the case in a "volunteer computing" framework. The LEGION software accomplishes its goals by providing message-based, inter-process communication similar to MPI (message passing interface), but without the tight coupling requirements. The software is lightweight and easy to install as it is written in standard C with no exotic library dependencies. LEGION has been demonstrated in a challenging planetary science application in which a machine learning system is used in closed-loop fashion to efficiently explore the input parameter space of a complex numerical simulation. The machine learning system decides which jobs to run through the simulator; then, through LEGION calls, the system farms those jobs out to a collection of compute nodes, retrieves the job results as they become available, and updates a predictive model of how the simulator maps inputs to outputs. The machine learning system decides which new set of jobs would be most informative to run given the results so far; this basic loop is repeated until sufficient insight into the physical system modeled by the simulator is obtained.
De Winne in Node 2

NASA Image and Video Library

2009-10-05

ISS020-E-045314 (5 Oct. 2009) --- European Space Agency astronaut Frank De Winne, Expedition 20 flight engineer and Expedition 21 commander, uses a communication system near a computer in the Harmony node of the International Space Station.
Cell boundary fault detection system

DOEpatents

Archer, Charles Jens [Rochester, MN; Pinnow, Kurt Walter [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian Edward [Rochester, MN

2009-05-05

A method determines a nodal fault along the boundary, or face, of a computing cell. Nodes on adjacent cell boundaries communicate with each other, and the communications are analyzed to determine if a node or connection is faulty.
Parmitano in Node 2 module

NASA Image and Video Library

2013-10-04

ISS037-E-006528 (4 Oct. 2013) --- European Space Agency astronaut Luca Parmitano, Expedition 37 flight engineer, holds a light fixture as he enters data into a computer in the Harmony node of the International Space Station.
Lambda network having 2.sup.m-1 nodes in each of m stages with each node coupled to four other nodes for bidirectional routing of data packets between nodes

DOEpatents

Napolitano, Jr., Leonard M.

1995-01-01

The Lambda network is a single stage, packet-switched interprocessor communication network for a distributed memory, parallel processor computer. Its design arises from the desired network characteristics of minimizing mean and maximum packet transfer time, local routing, expandability, deadlock avoidance, and fault tolerance. The network is based on fixed degree nodes and has mean and maximum packet transfer distances where n is the number of processors. The routing method is detailed, as are methods for expandability, deadlock avoidance, and fault tolerance.
External ultrasonography of the neck does not add diagnostic value to integrated positron emission tomography-computed tomography (PET-CT) scanning in the diagnosis of cervical lymph node metastases in patients with esophageal carcinoma.

PubMed

Blom, R L G M; Vliegen, R F A; Schreurs, W M J; Belgers, H J; Stohr, I; Oostenbrug, L E; Sosef, M N

2012-08-01

One of the objectives of preoperative imaging in esophageal cancer patients is the detection of cervical lymph node metastases. Traditionally, external ultrasonography of the neck has been combined with computed tomography (CT) in order to improve the detection of cervical metastases. In general, integrated positron emission tomography-computed tomography (PET-CT) has been shown to be superior to CT or PET regarding staging and therefore may limit the role of external ultrasonography of the neck. The objective of this study was to determine the additional value of external ultrasonography of the neck to PET-CT. This study included all patients referred to our center for treatment of esophageal carcinoma. Diagnostic staging was performed to determine the treatment plan. Cervical lymph nodes were evaluated by external ultrasonography of the neck and PET-CT. In case of suspect lymph nodes on external ultrasonography or PET-CT, fine needle aspiration (FNA) was performed. Between 2008 and 2010, 170 out of 195 referred patients underwent both external ultrasonography of the neck and PET-CT. Of all patients, 84% were diagnosed with a tumor at or below the distal esophagus. In 140 of 170 patients, the cervical region was not suspect; no FNA was performed. Seven out of 170 patients had suspect nodes on both PET-CT and external ultrasonography. Five out of seven patients had cytologically confirmed malignant lymph nodes, one of seven had benign nodes, and in one patient FNA was not performed; exclusion from esophagectomy was based on intra-abdominal metastases. In one out of 170 patients, PET-CT showed suspect nodes combined with a negative external ultrasonography; cytology of these nodes was benign. Twenty-two out of 170 patients had a negative PET-CT with suspect nodes on external ultrasonography. In 18 of 22 patients, cervical lymph nodes were cytologically confirmed benign; in four patients, FNA was not possible or inconclusive. 
At a median postoperative follow-up of 15 months, only 1% of patients developed cervical lymph node metastases. This study shows no additional value of external ultrasonography to a negative PET-CT. According to our results, it can be omitted in the primary workup. However, suspect lymph nodes on PET-CT should be confirmed by FNA to exclude false positives if it would change treatment plan. © 2011 Copyright the Authors. Journal compilation © 2011, Wiley Periodicals, Inc. and the International Society for Diseases of the Esophagus.

LWT Based Sensor Node Signal Processing in Vehicle Surveillance Distributed Sensor Network

NASA Astrophysics Data System (ADS)

Cha, Daehyun; Hwang, Chansik

Previous research on vehicle surveillance in distributed sensor networks focused on overcoming the power limitations and communication bandwidth constraints of sensor nodes. In spite of these constraints, a vehicle surveillance sensor node must perform signal compression, feature extraction, target localization, noise cancellation and collaborative signal processing with low computation and communication energy dissipation. In this paper, we introduce an algorithm for light-weight wireless sensor node signal processing based on lifting-scheme wavelet analysis feature extraction in a distributed sensor network.
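To show why a lifting-scheme wavelet transform suits a low-power sensor node, the sketch below implements one level of the Haar wavelet in lifting form (split, predict, update). The abstract does not say which wavelet the authors use, so the Haar choice, the function names and the sample signal are assumptions for illustration only.

```python
def haar_lifting_forward(x):
    """One level of the Haar wavelet transform in lifting form.

    x must have even length. Returns (approx, detail).
    """
    even, odd = x[0::2], x[1::2]                          # split
    detail = [o - e for e, o in zip(even, odd)]           # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]    # update: pairwise mean
    return approx, detail

def haar_lifting_inverse(approx, detail):
    """Undo haar_lifting_forward by reversing the lifting steps."""
    even = [s - d / 2 for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0.0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x

if __name__ == "__main__":
    signal = [2.0, 4.0, 6.0, 6.0]
    a, d = haar_lifting_forward(signal)
    print(a, d)
    print(haar_lifting_inverse(a, d))
```

Each lifting step needs only one addition or subtraction per sample and can be computed in place, which keeps both arithmetic and memory use low. The detail coefficients, or their energy per level, are one plausible kind of compact feature to transmit instead of raw samples.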
Self-adaptive trust based ABR protocol for MANETs using Q-learning.

PubMed

Kumar, Anitha Vijaya; Jeyapal, Akilandeswari

2014-01-01

Mobile ad hoc networks (MANETs) are a collection of mobile nodes with a dynamic topology. MANETs work under scalable conditions for many applications and pose different security challenges. Due to the nomadic nature of nodes, detecting misbehaviour is a complex problem. Nodes also share routing information among the neighbours in order to find the route to the destination. This requires nodes to trust each other. Thus we can state that trust is a key concept in secure routing mechanisms. A number of cryptographic protection techniques based on trust have been proposed. Q-learning is a recently used technique to achieve adaptive trust in MANETs. In comparison to other machine learning computational intelligence techniques, Q-learning achieves optimal results. Our work focuses on computing a score using Q-learning to weigh the trust of a particular node over the associativity based routing (ABR) protocol. Thus a secure and stable route is calculated as a weighted average of the trust value of the nodes in the route, and associativity ticks ensure the stability of the route. Simulation results show that the Q-learning based trust ABR protocol improves packet delivery ratio by 27% and reduces the route selection time by 40% over the ABR protocol without trust calculation.
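The core of any Q-learning based trust score is the standard tabular update $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$. The sketch below applies it to next-hop selection. The mapping of states to nodes, actions to next hops, and the reward of 1.0 for a forwarded packet are hypothetical choices made here; the abstract does not specify the authors' state, action or reward design.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new value.

    q is a nested dict: q[state][action] -> value. Unseen entries count as 0.
    """
    best_next = max(q.get(next_state, {}).values(), default=0.0)
    old = q.setdefault(state, {}).get(action, 0.0)
    q[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q[state][action]

if __name__ == "__main__":
    q = {}
    # Node "A" forwards through neighbour "B"; the packet is delivered twice.
    print(q_update(q, "A", "B", reward=1.0, next_state="B"))
    print(q_update(q, "A", "B", reward=1.0, next_state="B"))
```

Repeated positive rewards move the value for a well-behaved neighbour towards the reward, while negative rewards for dropped packets would lower it, which is the sense in which the learned value can serve as an adaptive trust score.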
Self-Adaptive Trust Based ABR Protocol for MANETs Using Q-Learning

PubMed Central

Jeyapal, Akilandeswari

2014-01-01

Mobile ad hoc networks (MANETs) are a collection of mobile nodes with a dynamic topology. MANETs work under scalable conditions for many applications and pose different security challenges. Due to the nomadic nature of nodes, detecting misbehaviour is a complex problem. Nodes also share routing information among the neighbours in order to find the route to the destination. This requires nodes to trust each other. Thus we can state that trust is a key concept in secure routing mechanisms. A number of cryptographic protection techniques based on trust have been proposed. Q-learning is a recently used technique to achieve adaptive trust in MANETs. In comparison to other machine learning computational intelligence techniques, Q-learning achieves optimal results. Our work focuses on computing a score using Q-learning to weigh the trust of a particular node over the associativity based routing (ABR) protocol. Thus a secure and stable route is calculated as a weighted average of the trust value of the nodes in the route, and associativity ticks ensure the stability of the route. Simulation results show that the Q-learning based trust ABR protocol improves packet delivery ratio by 27% and reduces the route selection time by 40% over the ABR protocol without trust calculation. PMID:25254243
Analysis of complex network performance and heuristic node removal strategies

NASA Astrophysics Data System (ADS)

Jahanpour, Ehsan; Chen, Xin

2013-12-01

Removing important nodes from complex networks is a great challenge in fighting against criminal organizations and preventing disease outbreaks. Six network performance metrics, including four new metrics, are applied to quantify networks' diffusion speed, diffusion scale, homogeneity, and diameter. In order to efficiently identify nodes whose removal maximally destroys a network, i.e., minimizes network performance, ten structured heuristic node removal strategies are designed using different node centrality metrics including degree, betweenness, reciprocal closeness, complement-derived closeness, and eigenvector centrality. These strategies are applied to remove nodes from the September 11, 2001 hijackers' network, and their performance is compared to that of a random strategy, which removes randomly selected nodes, and the locally optimal solution (LOS), which removes nodes to minimize network performance at each step. The computational complexity of the 11 strategies and LOS is also analyzed. Results show that the node removal strategies using degree and betweenness centralities are more efficient than other strategies.
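A degree-based removal strategy of the kind described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the size of the largest connected component stands in for the paper's six performance metrics, degrees are recomputed after every removal, and the function names are hypothetical.

```python
from collections import deque


def largest_component(adj):
    """Size of the largest connected component of an undirected graph."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search from `start`
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def remove_node(adj, node):
    """Return a copy of the graph without `node` and its incident edges."""
    return {u: {v for v in nbrs if v != node}
            for u, nbrs in adj.items() if u != node}


def degree_removal(adj, k):
    """Remove k nodes, each time the one with the highest current degree."""
    removed = []
    for _ in range(k):
        if not adj:
            break
        # sorted() makes tie-breaking deterministic
        target = max(sorted(adj), key=lambda u: len(adj[u]))
        removed.append(target)
        adj = remove_node(adj, target)
    return removed, largest_component(adj)
```

Swapping the `key` function for a betweenness or closeness score would give the other centrality-based strategies the abstract lists.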
Dental application of novel finite element analysis software for three-dimensional finite element modeling of a dentulous mandible from its computed tomography images.

PubMed

Nakamura, Keiko; Tajima, Kiyoshi; Chen, Ker-Kong; Nagamatsu, Yuki; Kakigawa, Hiroshi; Masumi, Shin-ich

2013-12-01

This study focused on the application of novel finite-element analysis software for constructing a finite-element model from the computed tomography data of a human dentulous mandible. The finite-element model is necessary for evaluating the mechanical response of the alveolar part of the mandible, resulting from occlusal force applied to the teeth during biting. Commercially available patient-specific general computed tomography-based finite-element analysis software was solely applied to the finite-element analysis for the extraction of computed tomography data. The mandibular bone with teeth was extracted from the original images. Both the enamel and the dentin were extracted after image processing, and the periodontal ligament was created from the segmented dentin. The constructed finite-element model was reasonably accurate using a total of 234,644 nodes and 1,268,784 tetrahedral and 40,665 shell elements. The elastic moduli of the heterogeneous mandibular bone were determined from the bone density data of the computed tomography images. The results suggested that the software applied in this study is both useful and powerful for creating a more accurate three-dimensional finite-element model of a dentulous mandible from the computed tomography data without the need for any other software.
Wireless Sensor Node for Autonomous Monitoring and Alerts in Remote Environments

NASA Technical Reports Server (NTRS)

Panangadan, Anand V. (Inventor); Monacos, Steve P. (Inventor)

2015-01-01

A method, apparatus, system, and computer program product provide personal alert and tracking capabilities using one or more nodes. Each node includes radio transceiver chips operating at different frequency ranges, a power amplifier, sensors, a display, and embedded software. The chips enable the node to operate as either a mobile sensor node or a relay base station node while providing a long distance relay link between nodes. The power amplifier enables a line-of-sight communication between the one or more nodes. The sensors provide a GPS signal, temperature, and accelerometer information (used to trigger an alert condition). The embedded software captures and processes the sensor information, provides a multi-hop packet routing protocol to relay the sensor information to and receive alert information from a command center, and displays the alert information on the display.
Cyber Contingency Analysis version 1.x

DOE Office of Scientific and Technical Information (OSTI.GOV)

A contingency-analysis-based approach for quantifying and examining the resiliency of a cyber system with respect to confidentiality, integrity, and availability. A graph representing an organization's cyber system and related resources is used for the availability contingency analysis. The mission-critical paths associated with an organization are used to determine the consequences of a potential contingency. A node (or combination of nodes) is removed from the graph to analyze a particular contingency. The values of all mission-critical paths that are disrupted by that contingency are used to quantify its severity. A total severity score can be calculated based on the complete list of all these contingencies. A simple n-1 analysis can be done in which only one node is removed at a time for the analysis. We can also compute an n-k analysis, where k is the number of nodes to simultaneously remove for analysis. A contingency risk score can also be computed, which takes the probability of the contingencies into account. In addition to availability, we can also quantify confidentiality and integrity scores for the system. These treat user accounts as potential contingencies. The number (and type) of files that an account can read is used to compute the confidentiality score. The number (and type) of files that an account can write to is used to compute the integrity score. As with availability analysis, we can use this information to compute total severity scores with regard to confidentiality and integrity. We can also take probability into account to compute associated risk scores.
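The availability analysis described above can be sketched as follows. This is a hedged illustration, not the released software: it assumes a mission-critical path is disrupted whenever any of its nodes is removed, that each path carries a single numeric value, and that severity is the plain sum of disrupted values. All function names and data shapes are hypothetical.

```python
from itertools import combinations


def contingency_severity(paths, removed):
    """Sum the values of mission-critical paths that contain a removed node.

    `paths` is a list of (nodes, value) pairs (assumed data shape).
    """
    removed = set(removed)
    return sum(value for nodes, value in paths if removed & set(nodes))


def n_minus_k(nodes, paths, k=1):
    """Severity of every contingency that removes exactly k nodes."""
    return {combo: contingency_severity(paths, combo)
            for combo in combinations(sorted(nodes), k)}


def total_risk(severities, probability):
    """Probability-weighted total over all contingencies.

    Contingencies missing from `probability` are treated as probability 0.
    """
    return sum(sev * probability.get(combo, 0.0)
               for combo, sev in severities.items())
```

Calling `n_minus_k(nodes, paths, k=1)` gives the single-node analysis, and larger `k` gives the simultaneous-removal case; the confidentiality and integrity scores would need a separate model of user accounts and file permissions, which this sketch omits.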
FDG-PET/CT in the evaluation of anal carcinoma

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cotter, Shane E.; Medical Scientist Training Program, Washington University School of Medicine, St. Louis, MO; Grigsby, Perry W.

2006-07-01

Purpose: Surgical staging and treatment of anal carcinoma have been replaced by noninvasive staging studies and combined modality therapy. In this study, we compare computed tomography (CT) and physical examination to [¹⁸F]-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography (FDG-PET/CT) in the staging of carcinoma of the anal canal, with special emphasis on determination of spread to inguinal lymph nodes. Methods and Materials: Between July 2003 and July 2005, 41 consecutive patients with biopsy-proved anal carcinoma underwent a complete staging evaluation including physical examination, CT, and 2-FDG-PET/CT. Patients ranged in age from 30 to 89 years. Nine men were HIV-positive. Treatment was with the standard Nigro regimen. Results: [¹⁸F]-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography (FDG-PET/CT) detected 91% of nonexcised primary tumors, whereas CT visualized 59%. FDG-PET/CT detected abnormal uptake in pelvic nodes of 5 patients with normal pelvic CT scans. FDG-PET/CT detected abnormal nodes in 20% of groins that were normal by CT, and in 23% without abnormality on physical examination. Furthermore, 17% of groins negative by both CT and physical examination showed abnormal uptake on FDG-PET/CT. HIV-positive patients had an increased frequency of PET-positive lymph nodes. Conclusion: [¹⁸F]-fluoro-2-deoxy-D-glucose-positron emission tomography/computed tomography detects the primary tumor more often than CT. FDG-PET/CT detects substantially more abnormal inguinal lymph nodes than are identified by standard clinical staging with CT and physical examination.
Structure and Dynamics of Zr6O8 Metal-Organic Framework Node Surfaces Probed with Ethanol Dehydration as a Catalytic Test Reaction.

PubMed

Yang, Dong; Ortuño, Manuel A; Bernales, Varinia; Cramer, Christopher J; Gagliardi, Laura; Gates, Bruce C

2018-03-14

Some metal-organic frameworks (MOFs) incorporate nodes that are metal oxide clusters such as Zr6O8. Vacancies on the node surfaces, accidental or by design, act as catalytic sites. Here, we report elucidation of the chemistry of Zr6O8 nodes in the MOFs UiO-66 and UiO-67 having used infrared and nuclear magnetic resonance spectroscopies to determine the ligands on the node surfaces originating from the solvents and modifiers used in the syntheses and having elucidated the catalytic properties of the nodes for ethanol dehydration, which takes place selectively to make diethyl ether but not ethylene at 473-523 K. Density functional theory calculations show that the key to the selective catalysis is the breaking of node-linker bonds (or the accidental adjacency of open/defect sites) that allows catalytically fruitful bonding of the reactant ethanol to neighboring sites on the nodes, facilitating the bimolecular ether formation through an SN2 mechanism.
Histological differential diagnosis between lymph node toxoplasmosis and other benign lymph node hyperplasias.

PubMed

Miettinen, M

1981-03-01

The material from 667 lymph nodes, originally suspected of toxoplasmosis, was histologically re-examined to evaluate criteria for diagnosis and differential diagnosis. The results showed that at least 80% of benign lymph node enlargements containing small groups of epithelioid cells were associated with high titres of Toxoplasma antibodies. Furthermore, 85–95% of the lymph nodes in association with high Toxoplasma antibodies showed the typical histological appearances of toxoplasmosis. The histological diagnosis of toxoplasmosis is thus both fairly specific and sensitive. Other lymph node lesions with small groups of epithelioid cells must be considered in the differential diagnosis. Sarcoidosis and tuberculosis usually have a predominance of distinct large epithelioid cell granulomata. Lymph nodes with sinus histiocytosis showing the formation of small groups of epithelioid cells do not demonstrate prominent hyperplasia, include sparse germinal centres, and were not associated with toxoplasmosis. Lymph nodes with disturbed general structure and small groups of epithelioid cells must be carefully assessed because of the significant possibility of malignancy.
Understanding Aprun Use Patterns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Hwa-Chun Wendy

2009-05-06

On the Cray XT, aprun is the command to launch an application to a set of compute nodes reserved through the Application Level Placement Scheduler (ALPS). At the National Energy Research Scientific Computing Center (NERSC), interactive aprun is disabled. That is, invocations of aprun have to go through the batch system. Batch scripts can and often do contain several apruns which either use subsets of the reserved nodes in parallel, or use all reserved nodes in consecutive apruns. In order to better understand how NERSC users run on the XT, it is necessary to associate aprun information with jobs. It is surprisingly more challenging than it sounds. In this paper, we describe those challenges and how we solved them to produce daily per-job reports for completed apruns. We also describe additional uses of the data, e.g. adjusting charging policy accordingly or associating node failures with jobs/users, and plans for enhancements.
Distributed Synchronization Technique for OFDMA-Based Wireless Mesh Networks Using a Bio-Inspired Algorithm

PubMed Central

Kim, Mi Jeong; Maeng, Sung Joon; Cho, Yong Soo

2015-01-01

In this paper, a distributed synchronization technique based on a bio-inspired algorithm is proposed for an orthogonal frequency division multiple access (OFDMA)-based wireless mesh network (WMN) with a time difference of arrival. The proposed time- and frequency-synchronization technique uses only the signals received from the neighbor nodes, by considering the effect of the propagation delay between the nodes. It achieves a fast synchronization with a relatively low computational complexity because it is operated in a distributed manner, not requiring any feedback channel for the compensation of the propagation delays. In addition, a self-organization scheme that can be effectively used to construct 1-hop neighbor nodes is proposed for an OFDMA-based WMN with a large number of nodes. The performance of the proposed technique is evaluated with regard to the convergence property and synchronization success probability using a computer simulation. PMID:26225974
Distributed Synchronization Technique for OFDMA-Based Wireless Mesh Networks Using a Bio-Inspired Algorithm.

PubMed

Kim, Mi Jeong; Maeng, Sung Joon; Cho, Yong Soo

2015-07-28

In this paper, a distributed synchronization technique based on a bio-inspired algorithm is proposed for an orthogonal frequency division multiple access (OFDMA)-based wireless mesh network (WMN) with a time difference of arrival. The proposed time- and frequency-synchronization technique uses only the signals received from the neighbor nodes, by considering the effect of the propagation delay between the nodes. It achieves a fast synchronization with a relatively low computational complexity because it is operated in a distributed manner, not requiring any feedback channel for the compensation of the propagation delays. In addition, a self-organization scheme that can be effectively used to construct 1-hop neighbor nodes is proposed for an OFDMA-based WMN with a large number of nodes. The performance of the proposed technique is evaluated with regard to the convergence property and synchronization success probability using a computer simulation.
Multinode reconfigurable pipeline computer

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M. (Inventor); Littman, Michael G. (Inventor)

1989-01-01

A multinode parallel-processing computer is made up of a plurality of interconnected, large capacity nodes each including a reconfigurable pipeline of functional units such as Integer Arithmetic Logic Processors, Floating Point Arithmetic Processors, Special Purpose Processors, etc. The reconfigurable pipeline of each node is connected to a multiplane memory by a Memory-ALU switch NETwork (MASNET). The reconfigurable pipeline includes three (3) basic substructures formed from functional units which have been found to be sufficient to perform the bulk of all calculations. The MASNET controls the flow of signals from the memory planes to the reconfigurable pipeline and vice versa. The nodes are connectable together by an internode data router (hyperspace router) so as to form a hypercube configuration. The capability of the nodes to conditionally configure the pipeline at each tick of the clock, without requiring a pipeline flush, permits many powerful algorithms to be implemented directly.
Portable multi-node LQCD Monte Carlo simulations using OpenACC

NASA Astrophysics Data System (ADS)

Bonati, Claudio; Calore, Enrico; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Sanfilippo, Francesco; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele

This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization on multiple computing nodes using OpenACC to manage parallelism within the node, and OpenMPI to manage parallelism among the nodes. We first discuss the available strategies for maximizing performance, then describe selected relevant details of the code, and finally measure the level of performance and scaling performance that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly high level of performance for this application, but also compares with results measured on other processors.
Method and apparatus of parallel computing with simultaneously operating stream prefetching and list prefetching engines

DOEpatents

Boyle, Peter A.; Christ, Norman H.; Gara, Alan; Mawhinney, Robert D.; Ohmacht, Martin; Sugavanam, Krishnan

2012-12-11

A prefetch system improves the performance of a parallel computing system. The parallel computing system includes a plurality of computing nodes. A computing node includes at least one processor and at least one memory device. The prefetch system includes at least one stream prefetch engine and at least one list prefetch engine. The prefetch system operates those engines simultaneously. After the at least one processor issues a command, the prefetch system passes the command to a stream prefetch engine and a list prefetch engine. The prefetch system operates the stream prefetch engine and the list prefetch engine to prefetch data to be needed in subsequent clock cycles in the processor in response to the passed command.
Integrating Grid Services into the Cray XT4 Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

NERSC; Cholia, Shreyas; Lin, Hwa-Chun Wendy

2009-05-01

The 38,640-core Cray XT4 "Franklin" system at the National Energy Research Scientific Computing Center (NERSC) is a massively parallel resource available to Department of Energy researchers that also provides on-demand grid computing to the Open Science Grid. The integration of grid services on Franklin presented various challenges, including fundamental differences between the interactive and compute nodes, a stripped down compute-node operating system without dynamic library support, a shared-root environment and idiosyncratic application launching. In our work, we describe how we resolved these challenges on a running, general-purpose production system to provide on-demand compute, storage, accounting and monitoring services through generic grid interfaces that mask the underlying system-specific details for the end user.
Three-Dimensional Computer Model of the Right Atrium Including the Sinoatrial and Atrioventricular Nodes Predicts Classical Nodal Behaviours

PubMed Central

Li, Jue; Inada, Shin; Schneider, Jurgen E.; Zhang, Henggui; Dobrzynski, Halina; Boyett, Mark R.

2014-01-01

The aim of the study was to develop a three-dimensional (3D) anatomically-detailed model of the rabbit right atrium containing the sinoatrial and atrioventricular nodes to study the electrophysiology of the nodes. A model was generated based on 3D images of a rabbit heart (atria and part of ventricles), obtained using high-resolution magnetic resonance imaging. Segmentation was carried out semi-manually. A 3D right atrium array model (∼3.16 million elements), including eighteen objects, was constructed. For description of cellular electrophysiology, the Rogers-modified FitzHugh-Nagumo model was further modified to allow control of the major characteristics of the action potential with relatively low computational resource requirements. Model parameters were chosen to simulate the action potentials in the sinoatrial node, atrial muscle, inferior nodal extension and penetrating bundle. The block zone was simulated as passive tissue. The sinoatrial node, crista terminalis, main branch and roof bundle were considered as anisotropic. We have simulated normal and abnormal electrophysiology of the two nodes. In accordance with experimental findings: (i) during sinus rhythm, conduction occurs down the interatrial septum and into the atrioventricular node via the fast pathway (conduction down the crista terminalis and into the atrioventricular node via the slow pathway is slower); (ii) during atrial fibrillation, the sinoatrial node is protected from overdrive by its long refractory period; and (iii) during atrial fibrillation, the atrioventricular node reduces the frequency of action potentials reaching the ventricles. The model is able to simulate ventricular echo beats. In summary, a 3D anatomical model of the right atrium containing the cardiac conduction system is able to simulate a wide range of classical nodal behaviours. PMID:25380074
Active Nodal Task Seeking for High-Performance, Ultra-Dependable Computing

DTIC Science & Technology

1994-07-01

implementation. Figure 1 shows a hardware organization of ANTS: stand-alone computing nodes interconnected by buses. 2.1 Run Time Partitioning The...nodes in 14 respond to changing loads [27] or system reconfiguration [26]. Existing techniques are all source-initiated or server-initiated [27]. 5.1...short-running task segments. The task segments must be short-running in order that processors will become available often enough to satisfy changing
Storing files in a parallel computing system based on user or application specification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faibish, Sorin; Bent, John M.; Nick, Jeffrey M.

2016-03-29

Techniques are provided for storing files in a parallel computing system based on a user-specification. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a specification from the distributed application indicating how the plurality of files should be stored; and storing one or more of the plurality of files in one or more storage nodes of a multi-tier storage system based on the specification. The plurality of files comprise a plurality of complete files and/or a plurality of sub-files. The specification can optionally be processed by a daemon executing on one or more nodes in a multi-tier storage system. The specification indicates how the plurality of files should be stored, for example, identifying one or more storage nodes where the plurality of files should be stored.

Gap structure in Fe-based superconductors with accidental nodes: The role of hybridization

NASA Astrophysics Data System (ADS)

Hinojosa, Alberto; Chubukov, Andrey V.

2015-06-01

We study the effects of hybridization between the two electron pockets in Fe-based superconductors with s-wave gap with accidental nodes. We argue that hybridization reconstructs the Fermi surfaces and also induces an additional interpocket pairing component. We analyze how these two effects modify the gap structure by tracing the position of the nodal points of the energy dispersions in the superconducting state. We find three possible outcomes. In the first, the nodes simply shift their positions in the Brillouin zone; in the second, the nodes merge and disappear, in which case the gap function has either equal or opposite signs on the electron pockets; in the third, a new set of nodal points emerges, doubling the original number of nodes.
Deterministic quantum state transfer and remote entanglement using microwave photons.

PubMed

Kurpiers, P; Magnard, P; Walter, T; Royer, B; Pechal, M; Heinsoo, J; Salathé, Y; Akin, A; Storz, S; Besse, J-C; Gasparinetti, S; Blais, A; Wallraff, A

2018-06-01

Sharing information coherently between nodes of a quantum network is fundamental to distributed quantum information processing. In this scheme, the computation is divided into subroutines and performed on several smaller quantum registers that are connected by classical and quantum channels 1 . A direct quantum channel, which connects nodes deterministically rather than probabilistically, achieves larger entanglement rates between nodes and is advantageous for distributed fault-tolerant quantum computation 2 . Here we implement deterministic state-transfer and entanglement protocols between two superconducting qubits fabricated on separate chips. Superconducting circuits 3 constitute a universal quantum node 4 that is capable of sending, receiving, storing and processing quantum information 5-8 . Our implementation is based on an all-microwave cavity-assisted Raman process 9 , which entangles or transfers the qubit state of a transmon-type artificial atom 10 with a time-symmetric itinerant single photon. We transfer qubit states by absorbing these itinerant photons at the receiving node, with a probability of 98.1 ± 0.1 per cent, achieving a transfer-process fidelity of 80.02 ± 0.07 per cent for a protocol duration of only 180 nanoseconds. We also prepare remote entanglement on demand with a fidelity as high as 78.9 ± 0.1 per cent at a rate of 50 kilohertz. Our results are in excellent agreement with numerical simulations based on a master-equation description of the system. This deterministic protocol has the potential to be used for quantum computing distributed across different nodes of a cryogenic network.
SU-E-T-314: The Application of Cloud Computing in Pencil Beam Scanning Proton Therapy Monte Carlo Simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Z; Gao, M

Purpose: Monte Carlo simulation plays an important role for the proton Pencil Beam Scanning (PBS) technique. However, MC simulation demands high computing power and is limited to a few large proton centers that can afford a computer cluster. We study the feasibility of utilizing cloud computing in the MC simulation of PBS beams. Methods: A GATE/GEANT4 based MC simulation software was installed on a commercial cloud computing virtual machine (64-bit Linux, Amazon EC2). Single spot Integral Depth Dose (IDD) curves and in-air transverse profiles were used to tune the source parameters to simulate an IBA machine. With the use of StarCluster software developed at MIT, a Linux cluster with 2–100 nodes can be conveniently launched in the cloud. A proton PBS plan was then exported to the cloud where the MC simulation was run. Results: The simulated PBS plan has a field size of 10×10 cm², 20 cm range, 10 cm modulation, and contains over 10,000 beam spots. EC2 instance type m1.medium was selected considering the CPU/memory requirement and 40 instances were used to form a Linux cluster. To minimize cost, the master node was created as an on-demand instance and the worker nodes were created as spot instances. The hourly cost for the 40-node cluster was $0.63 and the projected cost for a 100-node cluster was $1.41. Ten million events were simulated to plot PDD and profile, with each job containing 500k events. The simulation completed within 1 hour and an overall statistical uncertainty of < 2% was achieved. Good agreement between MC simulation and measurement was observed. Conclusion: Cloud computing is a cost-effective and easy to maintain platform to run proton PBS MC simulation. When proton MC packages such as GATE and TOPAS are combined with cloud computing, it will greatly facilitate PBS MC studies, especially for newly established proton centers or individual researchers.
Contact Graph Routing

NASA Technical Reports Server (NTRS)

Burleigh, Scott C.

2011-01-01

Contact Graph Routing (CGR) is a dynamic routing system that computes routes through a time-varying topology of scheduled communication contacts in a network based on the DTN (Delay-Tolerant Networking) architecture. It is designed to enable dynamic selection of data transmission routes in a space network based on DTN. This dynamic responsiveness in route computation should be significantly more effective and less expensive than static routing, increasing total data return while at the same time reducing mission operations cost and risk. The basic strategy of CGR is to take advantage of the fact that, since flight mission communication operations are planned in detail, the communication routes between any pair of bundle agents in a population of nodes that have all been informed of one another's plans can be inferred from those plans rather than discovered via dialogue (which is impractical over long one-way-light-time space links). Messages that convey this planning information are used to construct contact graphs (time-varying models of network connectivity) from which CGR automatically computes efficient routes for bundles. Automatic route selection increases the flexibility and resilience of the space network, simplifying cross-support and reducing mission management costs. Note that there are no routing tables in Contact Graph Routing. The best route for a bundle destined for a given node may routinely be different from the best route for a different bundle destined for the same node, depending on bundle priority, bundle expiration time, and changes in the current lengths of transmission queues for neighboring nodes; routes must be computed individually for each bundle, from the Bundle Protocol agent's current network connectivity model for the bundle's destination node (the contact graph). Clearly this places a premium on optimizing the implementation of the route computation algorithm. The scalability of CGR to very large networks remains a research topic.
The information carried by CGR contact plan messages is useful not only for dynamic route computation, but also for the implementation of rate control, congestion forecasting, transmission episode initiation and termination, timeout interval computation, and retransmission timer suspension and resumption.
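The idea of inferring routes from a contact plan can be illustrated with an earliest-arrival search over scheduled contacts. This Python sketch is a simplification and not the CGR algorithm itself: it ignores bundle priority, expiration time, contact volume, and queue lengths, and the tuple layout `(sender, receiver, start, end, delay)` and the function name are assumptions.

```python
import heapq


def earliest_arrival(contacts, source, dest, t0=0.0):
    """Earliest time a bundle created at `source` at time t0 can reach `dest`.

    `contacts` is a list of (sender, receiver, start, end, delay) tuples,
    where delay is the one-way light time of that contact (assumed layout).
    Returns None when no sequence of contacts reaches `dest`.
    """
    best = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == dest:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale queue entry
        for sender, receiver, start, end, delay in contacts:
            if sender != u:
                continue
            depart = max(t, start)  # wait for the contact to open
            if depart >= end:
                continue  # contact already closed
            arrive = depart + delay
            if arrive < best.get(receiver, float("inf")):
                best[receiver] = arrive
                heapq.heappush(heap, (arrive, receiver))
    return None
```

Because the answer depends on the bundle's creation time `t0`, two bundles for the same destination can get different routes, which matches the abstract's point that routes are computed per bundle rather than looked up in a routing table.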
Central FPGA-based destination and load control in the LHCb MHz event readout

NASA Astrophysics Data System (ADS)

Jacobsson, R.

2012-10-01

The readout strategy of the LHCb experiment is based on complete event readout at 1 MHz. A set of 320 sub-detector readout boards transmit event fragments at a total rate of 24.6 MHz at a bandwidth usage of up to 70 GB/s over a commercial switching network based on Gigabit Ethernet to a distributed event building and high-level trigger processing farm with 1470 individual multi-core computer nodes. In the original specifications, the readout was based on a pure push protocol. This paper describes the proposal, implementation, and experience of a non-conventional mixture of a push and a pull protocol, akin to credit-based flow control. An FPGA-based central master module, partly operating at the LHC bunch clock frequency of 40.08 MHz and partly at a double clock speed, is in charge of the entire trigger and readout control from the front-end electronics up to the high-level trigger farm. One FPGA is dedicated to controlling the event fragment packing in the readout boards, assigning the farm node destination for each event, and controlling the farm load based on an asynchronous pull mechanism from each farm node. This dynamic readout scheme relies on generic event requests and the concept of node credit, allowing load control and trigger rate regulation as a function of the global farm load. It also allows the vital task of fast central monitoring and automatic in-flight recovery of failing nodes while keeping dead-time and event loss to a minimum. This paper demonstrates the strength and suitability of implementing this real-time task for a very large distributed system in an FPGA where no random delays are introduced, and where extreme reliability and accurate event accounting are fundamental requirements. It was in use during the entire commissioning phase of LHCb and has been in faultless operation during the first two years of physics luminosity data taking.
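The credit-based mixture of push and pull described in this abstract can be modelled in software as a queue of outstanding event requests. The Python sketch below is a toy model under stated assumptions, not the FPGA logic: the class name `CreditScheduler`, its methods, and the first-come-first-served credit order are all hypothetical.

```python
from collections import deque


class CreditScheduler:
    """Toy model of credit-based destination assignment (hypothetical API)."""

    def __init__(self):
        # One queue entry per outstanding event request (credit).
        self.credits = deque()

    def request(self, node, n=1):
        """A farm node asynchronously asks for n more events (pull side)."""
        for _ in range(n):
            self.credits.append(node)

    def assign(self):
        """Pick the destination for the next event (push side).

        Returns None when no node holds a credit, in which case the caller
        would throttle the trigger rate.
        """
        if not self.credits:
            return None
        return self.credits.popleft()

    def drop_node(self, node):
        """Discard the credits of a failing node so it receives no events."""
        self.credits = deque(c for c in self.credits if c != node)
```

The number of queued credits gives a direct measure of spare farm capacity, which is one way to read the abstract's statement that the trigger rate is regulated as a function of the global farm load.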
Picoradio: Communication/Computation Piconodes for Sensor Networks

DTIC Science & Technology

2003-01-02

diagram of PicoNode III, or Quark node. It is made from two custom chips, Strange RF and Charm digital processor, and is complemented by a set of...the chipset comprising of Strange (analog OOK transceiver) and Charm (digital processor) chips. 44 Figure 33: System block diagram of the Quark node...19 2.B PICONODE II - TWO-CHIP PICONODE IMPLEMENTATION ......................................... 21 2.B.1 Baseband processor (BBP
Announcing Supercomputer Summit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wells, Jack; Bland, Buddy; Nichols, Jeff

Summit is the next leap in leadership-class computing systems for open science. With Summit we will be able to address, with greater complexity and higher fidelity, questions concerning who we are, our place on earth, and in our universe. Summit will deliver more than five times the computational performance of Titan’s 18,688 nodes, using only approximately 3,400 nodes when it arrives in 2017. Like Titan, Summit will have a hybrid architecture, and each node will contain multiple IBM POWER9 CPUs and NVIDIA Volta GPUs all connected together with NVIDIA’s high-speed NVLink. Each node will have over half a terabyte of coherent memory (high bandwidth memory + DDR4) addressable by all CPUs and GPUs plus 800GB of non-volatile RAM that can be used as a burst buffer or as extended memory. To provide a high rate of I/O throughput, the nodes will be connected in a non-blocking fat-tree using a dual-rail Mellanox EDR InfiniBand interconnect. Upon completion, Summit will allow researchers in all fields of science unprecedented access to solving some of the world’s most pressing challenges.
Understanding the I/O Performance Gap Between Cori KNL and Haswell

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Jialin; Koziol, Quincey; Tang, Houjun

2017-05-01

The Cori system at NERSC has two compute partitions with different CPU architectures: a 2,004 node Haswell partition and a 9,688 node KNL partition, which ranked as the 5th most powerful and fastest supercomputer on the November 2016 Top 500 list. The compute partitions share a common storage configuration, and understanding the IO performance gap between them is important, impacting not only NERSC/LBNL users and other national labs, but also the relevant hardware vendors and software developers. In this paper, we have analyzed performance of single core and single node IO comprehensively on the Haswell and KNL partitions, and have discovered the major bottlenecks, which include CPU frequencies and memory copy performance. We have also extended our performance tests to multi-node IO and revealed the IO cost difference caused by network latency, buffer size, and communication cost. Overall, we have developed a strong understanding of the IO gap between Haswell and KNL nodes, and the lessons learned from this exploration will guide us in designing optimal IO solutions in the many-core era.
Exploiting on-node heterogeneity for in-situ analytics of climate simulations via a functional partitioning framework

NASA Astrophysics Data System (ADS)

Sapra, Karan; Gupta, Saurabh; Atchley, Scott; Anantharaj, Valentine; Miller, Ross; Vazhkudai, Sudharshan

2016-04-01

Efficient resource utilization is critical for improved end-to-end computing and workflow of scientific applications. Heterogeneous node architectures, such as the GPU-enabled Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), present us with further challenges. In many HPC applications on Titan, the accelerators are the primary compute engines while the CPUs orchestrate the offloading of work onto the accelerators and the movement of the output back to main memory. On the other hand, in applications that do not exploit GPUs, CPU usage is dominant while the GPUs idle. We utilized the Heterogeneous Functional Partitioning (HFP) runtime framework, which can optimize usage of resources on a compute node to expedite an application's end-to-end workflow. This approach is different from existing techniques for in-situ analyses in that it provides a framework for on-the-fly analysis on-node by dynamically exploiting under-utilized resources therein. We have implemented in the Community Earth System Model (CESM) a new concurrent diagnostic processing capability enabled by the HFP framework. Various single variate statistics, such as means and distributions, are computed in-situ by launching HFP tasks on the GPU via the node local HFP daemon. Since our current configuration of CESM does not use GPU resources heavily, we can move these tasks to the GPU using the HFP framework. Each rank running the atmospheric model in CESM pushes the variables of interest via HFP function calls to the HFP daemon. This node local daemon is responsible for receiving the data from the main program and launching the designated analytics tasks on the GPU. We have implemented these analytics tasks in C and use OpenACC directives to enable GPU acceleration. This methodology is also advantageous while executing GPU-enabled configurations of CESM when the CPUs will be idle during portions of the runtime. 
In our implementation results, we demonstrate that it is more efficient to use the HFP framework to offload the tasks to GPUs instead of doing it in the main application. We observe increased resource utilization and overall productivity in this approach by using the HFP framework for end-to-end workflow.
Cell boundary fault detection system

DOEpatents

Archer, Charles Jens [Rochester, MN]; Pinnow, Kurt Walter [Rochester, MN]; Ratterman, Joseph D [Rochester, MN]; Smith, Brian Edward [Rochester, MN]

2011-04-19

An apparatus and program product determine a nodal fault along the boundary, or face, of a computing cell. Nodes on adjacent cell boundaries communicate with each other, and the communications are analyzed to determine if a node or connection is faulty.
Expedition 40 crew in Node 2 after German - U.S. soccer game

NASA Image and Video Library

2014-06-26

European Space Agency astronaut Alexander Gerst, Expedition 40 flight engineer, and NASA astronaut Steve Swanson, commander, gather around a computer in the Unity node of the International Space Station after the German-USA soccer match.
Application of neoadjuvant chemotherapy in occult breast cancer

PubMed Central

Yang, Haisong; Li, Ling; Zhang, Mengmeng; Zhang, Shiyong; Xu, Shu; Ma, Xiaoxia

2017-01-01

Abstract Rationale: Although rare, occult breast cancer (OBC) originates from breast tissue. Its primary lesions cannot be identified by clinical examination or imaging; therefore, the diagnosis, treatment, and prognosis remain controversial. Patient concerns: This study comprised 5 female OBC patients who were admitted to the Affiliated Hospital of Guizhou Medical University for painless axillary lumps. Diagnoses: 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography/computed tomography (PET/CT) indicated metastasis in the ipsilateral axillary lymph nodes. No clear breast primary lesions were identified; other organs were also excluded as the primary site. Pathological biopsy confirmed axillary lymph node metastasis of adenocarcinoma. Immunohistochemical staining of the tumor to identify the source revealed that estrogen receptors (ERs) and progesterone receptors (PgRs) were positive in 2 cases, ER was positive and PgR was negative in 1 case, and both were negative in 2 cases. Human epidermal growth factor receptor 2 was negative in all cases. All patients were diagnosed with OBC. Interventions: All patients underwent neoadjuvant chemotherapy (NAC). One patient did not undergo follow-up therapy. The other 4 underwent total mastectomy plus axillary lymph node dissection followed by radiotherapy. Two patients also underwent endocrine therapy. Outcomes: Patients were followed up for 9.0 to 72.0 months. Four achieved pathological complete response. One patient experienced metastasis to the ipsilateral supraclavicular lymph nodes 2.0 years later, which was cleared after additional treatment. The other patients were tumor free. Lessons: Here, we are reporting 5 cases of OBC treated with NAC that were evaluated by 18F-FDG PET/CT scans. This study suggests that NAC might lead to a positive outcome. PMID:28984771
Disconnection syndromes of basal ganglia, thalamus, and cerebrocerebellar systems.

PubMed

Schmahmann, Jeremy D; Pandya, Deepak N

2008-09-01

Disconnection syndromes were originally conceptualized as a disruption of communication between different cerebral cortical areas. Two developments mandate a re-evaluation of this notion. First, we present a synopsis of our anatomical studies in monkey elucidating principles of organization of cerebral cortex. Efferent fibers emanate from every cortical area, and are directed with topographic precision via association fibers to ipsilateral cortical areas, commissural fibers to contralateral cerebral regions, striatal fibers to basal ganglia, and projection subcortical bundles to thalamus, brainstem and/or pontocerebellar system. We note that cortical areas can be defined by their patterns of subcortical and cortical connections. Second, we consider motor, cognitive and neuropsychiatric disorders in patients with lesions restricted to basal ganglia, thalamus, or cerebellum, and recognize that these lesions mimic deficits resulting from cortical lesions, with qualitative differences between the manifestations of lesions in functionally related areas of cortical and subcortical nodes. We consider these findings on the basis of anatomical observations from tract tracing studies in monkey, viewing them as disconnection syndromes reflecting loss of the contribution of subcortical nodes to the distributed neural circuits. We introduce a new theoretical framework for the distributed neural circuits, based on general, and specific, principles of anatomical organization, and on the architecture of the nodes that comprise these systems. We propose that neural architecture determines function, i.e., each architectonically distinct cortical and subcortical area contributes a unique transform, or computation, to information processing; anatomically precise and segregated connections between nodes define behavior; and association fiber tracts that link cerebral cortical areas with each other enable the cross-modal integration required for evolved complex behaviors. 
This model enables the formulation and testing of future hypotheses in investigations using evolving magnetic resonance imaging techniques in humans, and in clinical studies in patients with cortical and subcortical lesions.
Efficient algorithms for dilated mappings of binary trees

NASA Technical Reports Server (NTRS)

Iqbal, M. Ashraf

1990-01-01

The problem addressed is to find a 1-1 mapping of the vertices of a binary tree onto those of a target binary tree such that the son of a node in the first binary tree is mapped onto a descendant of the image of that node in the second binary tree. There are two natural measures of the cost of this mapping. The first is the dilation cost, i.e., the maximum distance in the target binary tree between the images of vertices that are adjacent in the original tree. The other measure, expansion cost, is defined as the number of extra nodes/edges to be added to the target binary tree in order to ensure a 1-1 mapping. An efficient algorithm to find a mapping of one binary tree onto another is described. It is shown that it is possible to minimize one cost of mapping at the expense of the other. This problem arises when designing pipelined arithmetic logic units (ALU) for special purpose computers. The pipeline is composed of ALU chips connected in the form of a binary tree. The operands to the pipeline can be supplied to the leaf nodes of the binary tree, which then process and pass the results up to their parents. The final result is available at the root. As each new application may require a distinct nesting of operations, it is useful to be able to find a good mapping of a new binary tree onto an existing ALU tree. Another problem arises if every distinct required binary tree is known beforehand. Here it is useful to hardwire the pipeline in the form of a minimal supertree that contains all required binary trees.
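The dilation cost defined in this abstract can be checked for a given mapping with a few lines of code. The sketch below is illustrative only: it evaluates a mapping rather than finding one, so it is not the report's algorithm, and the data layout (an edge list for the original tree, a child-to-parent dictionary for the target tree) and the function names are assumptions.

```python
def ancestors(parent, v):
    """Return the path from v up to the root of the tree, v first."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def dilation_cost(source_edges, target_parent, mapping):
    """Maximum target-tree distance between images of adjacent source vertices.

    source_edges  : list of (parent, son) pairs of the original binary tree
    target_parent : dict mapping each target vertex to its parent (root -> None)
    mapping       : dict mapping each source vertex to a target vertex
    """
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("mapping is not 1-1")
    cost = 0
    for p, s in source_edges:
        path = ancestors(target_parent, mapping[s])
        if mapping[p] not in path[1:]:
            raise ValueError(f"image of {s!r} is not a descendant of the image of {p!r}")
        # The image of the son lies below the image of the parent, so the
        # distance between them is the number of edges climbed on the path.
        cost = max(cost, path.index(mapping[p]))
    return cost


# Target: complete binary tree on vertices 1..7, where vertex i has parent i // 2.
target = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
edges = [("r", "x"), ("r", "y")]
tight = dilation_cost(edges, target, {"r": 1, "x": 2, "y": 3})
loose = dilation_cost(edges, target, {"r": 1, "x": 4, "y": 5})
```

Here `tight` maps each son directly below its parent, while `loose` places both sons two levels down and so doubles the dilation.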
Algorithm implementation on the Navier-Stokes computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krist, S.E.; Zang, T.A.

1987-03-01

The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Algorithm implementation on the Navier-Stokes computer

NASA Technical Reports Server (NTRS)

Krist, Steven E.; Zang, Thomas A.

1987-01-01

The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Advanced flight computer. Special study

NASA Technical Reports Server (NTRS)

Coo, Dennis

1995-01-01

This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets the AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.
The Japan Lung Cancer Society–Japanese Society for Radiation Oncology consensus-based computed tomographic atlas for defining regional lymph node stations in radiotherapy for lung cancer

PubMed Central

Itazawa, Tomoko; Tamaki, Yukihisa; Komiyama, Takafumi; Nishimura, Yasumasa; Nakayama, Yuko; Ito, Hiroyuki; Ohde, Yasuhisa; Kusumoto, Masahiko; Sakai, Shuji; Suzuki, Kenji; Watanabe, Hirokazu; Asamura, Hisao

2017-01-01

The purpose of this study was to develop a consensus-based computed tomographic (CT) atlas that defines lymph node stations in radiotherapy for lung cancer based on the lymph node map of the International Association for the Study of Lung Cancer (IASLC). A project group in the Japanese Radiation Oncology Study Group (JROSG) initially prepared a draft of the atlas in which lymph node Stations 1–11 were illustrated on axial CT images. Subsequently, a joint committee of the Japan Lung Cancer Society (JLCS) and the Japanese Society for Radiation Oncology (JASTRO) was formulated to revise this draft. The committee consisted of four radiation oncologists, four thoracic surgeons and three thoracic radiologists. The draft prepared by the JROSG project group was intensively reviewed and discussed at four meetings of the committee over several months. Finally, we proposed definitions for the regional lymph node stations and the consensus-based CT atlas. This atlas was approved by the Board of Directors of JLCS and JASTRO. This resulted in the first official CT atlas for defining regional lymph node stations in radiotherapy for lung cancer authorized by the JLCS and JASTRO. In conclusion, the JLCS–JASTRO consensus-based CT atlas, which conforms to the IASLC lymph node map, was established. PMID:27609192
The local lymph node assay (LLNA).

PubMed

Rovida, Costanza; Ryan, Cindy; Cinelli, Serena; Basketter, David; Dearman, Rebecca; Kimber, Ian

2012-02-01

The murine local lymph node assay (LLNA) is a widely accepted method for assessing the skin sensitization potential of chemicals. Compared with other in vivo methods in guinea pig, the LLNA offers important advantages with respect to animal welfare, including a requirement for reduced animal numbers as well as reduced pain and trauma. In addition to hazard identification, the LLNA is used for determining the relative skin sensitizing potency of contact allergens as a pivotal contribution to the risk assessment process. The LLNA is the only in vivo method that has been subjected to a formal validation process. The original LLNA protocol is based on measurement of the proliferative activity of draining lymph node cells (LNC), as determined by incorporation of radiolabeled thymidine. Several variants to the original LLNA have been developed to eliminate the use of radioactive materials. One such alternative is considered here: the LLNA:BrdU-ELISA method, which uses 5-bromo-2-deoxyuridine (BrdU) in place of radiolabeled thymidine to measure LNC proliferation in draining nodes. © 2012 by John Wiley & Sons, Inc.
Chapter 28: Theory SkyNode

NASA Astrophysics Data System (ADS)

Wagner, R.; Norman, M. L.

Here we present a working example of a Basic SkyNode serving theoretical data. The data is taken from the Simulated Cluster Archive (SCA), a set of simulated X-ray clusters, where each cluster was computed using four different physics models. The LCA Theory SkyNode (LCATheory) tables contain columns of the integrated physical properties of the clusters at various redshifts. The ease of setting up a Theory SkyNode is an important result, because it represents a clear way to present theory data to the Virtual Observatory. Also, our Theory SkyNode provides a prototype for additional simulated object catalogs, which will be created from other simulations by our group, and hopefully others.

Lambda network having 2^(m-1) nodes in each of m stages with each node coupled to four other nodes for bidirectional routing of data packets between nodes

DOEpatents

Napolitano, L.M. Jr.

1995-11-28

The Lambda network is a single stage, packet-switched interprocessor communication network for a distributed memory, parallel processor computer. Its design arises from the desired network characteristics of minimizing mean and maximum packet transfer time, local routing, expandability, deadlock avoidance, and fault tolerance. The network is based on fixed degree nodes and has mean and maximum packet transfer distances where n is the number of processors. The routing method is detailed, as are methods for expandability, deadlock avoidance, and fault tolerance. 14 figs.
Method and computer product to increase accuracy of time-based software verification for sensor networks

DOEpatents

Foo Kune, Denis [Saint Paul, MN]; Mahadevan, Karthikeyan [Mountain View, CA]

2011-01-25

A recursive verification protocol that reduces the time variance due to network delays, by placing the subject node at most one hop from the verifier node, provides an efficient way to test wireless sensor nodes. Since the software signatures are time based, recursive testing will give a much cleaner signal for positive verification of the software running on any one node in the sensor network. In this protocol, the main verifier checks its neighbor, which in turn checks its own neighbor, and the process continues until all nodes have been verified. This ensures minimum time delays for the software verification. Should a node fail the test, the software verification downstream is halted until an alternative path (one not including the failed node) is found. Utilizing techniques well known in the art, having a node tested twice, or not at all, can be avoided.
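The hop-by-hop order of checks described in this abstract can be sketched as a graph traversal. This is a minimal illustration under stated assumptions, not the patented method: the time-based signature test is replaced by a caller-supplied predicate `passes_check`, and the function name, the adjacency-dictionary layout, and the breadth-first order are all hypothetical choices.

```python
from collections import deque


def verify_network(neighbors, verifier, passes_check):
    """Propagate verification outward one hop at a time from a trusted verifier.

    neighbors    : dict mapping each node to a list of its one-hop neighbors
    verifier     : the trusted main verifier node
    passes_check : callable (checker, subject) -> bool, a stand-in for the
                   time-based software signature test (assumption)
    Returns (verified, failed, unreached) as sets of nodes.
    """
    verified = {verifier}
    failed = set()
    queue = deque([verifier])
    while queue:
        checker = queue.popleft()
        for node in neighbors[checker]:
            if node in verified or node in failed:
                continue  # each node is tested at most once
            if passes_check(checker, node):
                verified.add(node)
                queue.append(node)  # a verified node checks its own neighbors next
            else:
                failed.add(node)    # a failed node never checks anyone downstream
    unreached = set(neighbors) - verified - failed
    return verified, failed, unreached


# Node B runs bad software; C can still be reached through D.
links = {
    "V": ["A", "D"],
    "A": ["V", "B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["V", "C"],
}
result = verify_network(links, "V", lambda checker, node: node != "B")
```

Because every check is made by a direct neighbor, the subject is always one hop from its checker, and a node behind a failed node is still verified whenever some path of passing nodes leads to it.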
Optimization-based channel constrained data aggregation routing algorithms in multi-radio wireless sensor networks.

PubMed

Yen, Hong-Hsu

2009-01-01

In wireless sensor networks, data aggregation routing could reduce the number of data transmissions so as to achieve energy efficient transmission. However, data aggregation introduces data retransmission that is caused by co-channel interference from neighboring sensor nodes. This kind of co-channel interference could result in extra energy consumption and significant latency from retransmission. This will jeopardize the benefits of data aggregation. One possible solution to circumvent data retransmission caused by co-channel interference is to assign different channels to every sensor node that is within each other's interference range on the data aggregation tree. By associating each radio with a different channel, a sensor node could receive data from all the children nodes on the data aggregation tree simultaneously. This could reduce the latency from the data source nodes back to the sink so as to meet the user's delay QoS. Since the number of radios on each sensor node and the number of non-overlapping channels are all limited resources in wireless sensor networks, a challenging question here is to minimize the total transmission cost under limited number of non-overlapping channels in multi-radio wireless sensor networks. This channel constrained data aggregation routing problem in multi-radio wireless sensor networks is an NP-hard problem. I first model this problem as a mixed integer and linear programming problem where the objective is to minimize the total transmission subject to the data aggregation routing, channel and radio resources constraints. The solution approach is based on the Lagrangean relaxation technique to relax some constraints into the objective function and then to derive a set of independent subproblems. By optimally solving these subproblems, it can not only calculate the lower bound of the original primal problem but also provide useful information to get the primal feasible solutions. 
By incorporating these Lagrangean multipliers as the link arc weight, the optimization-based heuristics are proposed to get energy-efficient data aggregation tree with better resource (channel and radio) utilization. From the computational experiments, the proposed optimization-based approach is superior to existing heuristics under all tested cases.
Therapy of vulvar carcinoma.

PubMed

Haberthür, F; Almendral, A C; Ritter, B

1993-01-01

83 vulvar carcinoma patients were originally treated in the period between 1970 and 1990. 82 patients presented with squamous cell carcinoma. 70% of the patients were in Stage I or II. It was originally possible to operate on 74 of the 83 patients. A simple or partial vulvectomy was applied 17 times. A bilateral inguinal lymph node excision additionally took place in 6 cases. 51 patients were subjected to radical vulvectomy with inguinofemoral lymph node excision. In 13 cases, pelvic lymph node extirpation was also performed. A posterior pelvic exenteration was performed in 6 cases presenting extensive carcinoma involvement of the vulva. In the remaining 9 patients, either it was not possible to operate, or a nonradical operation could be performed. The primary morbidity, consisting of wound healing disturbances and infections, amounted to 50% in our group. We observed lymphedema in 47% of the cases, although it was clinically important in only 10%. We did not have any primary surgical mortality. The 5-year survival rate was 82% in our patients without inguinofemoral lymph node involvement and only 40% in lymph node metastatic cases. The absolute 5-year cure rate was 66%, or 69% corrected. To be able to give increased preference to less invasive methods an improved prevention and clarification procedure for physicians and patients is necessary.
Elastic Extension of a CMS Computing Centre Resources on External Clouds

NASA Astrophysics Data System (ADS)

Codispoti, G.; Di Maria, R.; Aiftimiei, C.; Bonacorsi, D.; Calligola, P.; Ciaschini, V.; Costantini, A.; Dal Pra, S.; DeGirolamo, D.; Grandi, C.; Michelotto, D.; Panella, M.; Peco, G.; Sapunenko, V.; Sgaravatto, M.; Taneja, S.; Zizzi, G.

2016-10-01

After the successful LHC data taking in Run-I and in view of the future runs, the LHC experiments are facing new challenges in the design and operation of the computing facilities. The computing infrastructure for Run-II is dimensioned to cope at most with the average amount of data recorded. The usage peaks, as already observed in Run-I, may, however, give rise to large backlogs, thus delaying the completion of the data reconstruction and ultimately the data availability for physics analysis. In order to cope with the production peaks, CMS, along the lines followed by other LHC experiments, is exploring the opportunity to access Cloud resources provided by external partners or commercial providers. Specific use cases have already been explored and successfully exploited during Long Shutdown 1 (LS1) and the first part of Run 2. In this work we present the proof of concept of the elastic extension of a CMS site, specifically the Bologna Tier-3, on an external OpenStack infrastructure. We focus on the “Cloud Bursting” of a CMS Grid site using a newly designed LSF configuration that allows the dynamic registration of new worker nodes to LSF. In this approach, the dynamically added worker nodes instantiated on the OpenStack infrastructure are transparently accessed by the LHC Grid tools and at the same time they serve as an extension of the farm for local usage. The amount of resources allocated can thus be elastically modeled to cope with the needs of the CMS experiment and local users. Moreover, direct access to and integration of OpenStack resources with the CMS workload management system is explored. In this paper we present this approach, report on the performance of the on-demand allocated resources, and discuss the lessons learned and the next steps.
Computing an upper bound on contact stress with surrogate duality

NASA Astrophysics Data System (ADS)

Xuan, Zhaocheng; Papadopoulos, Panayiotis

2016-07-01

We present a method for computing an upper bound on the contact stress of elastic bodies. The continuum model of elastic bodies with contact is first modeled as a constrained optimization problem by using finite elements. An explicit formulation of the total contact force, a fraction function with the numerator as a linear function and the denominator as a quadratic convex function, is derived with only the normalized nodal contact forces as the constrained variables in a standard simplex. Then two bounds are obtained for the sum of the nodal contact forces. The first is an explicit formulation of matrices of the finite element model, derived by maximizing the fraction function under the constraint that the sum of the normalized nodal contact forces is one. The second bound is solved by first maximizing the fraction function subject to the standard simplex and then using Dinkelbach's algorithm for fractional programming to find the maximum—since the fraction function is pseudo concave in a neighborhood of the solution. These two bounds are solved with the problem dimensions being only the number of contact nodes or node pairs, which are much smaller than the dimension for the original problem, namely, the number of degrees of freedom. Next, a scheme for constructing an upper bound on the contact stress is proposed that uses the bounds on the sum of the nodal contact forces obtained on a fine finite element mesh and the nodal contact forces obtained on a coarse finite element mesh, which are problems that can be solved at a lower computational cost. Finally, the proposed method is verified through some examples concerning both frictionless and frictional contact to demonstrate the method's feasibility, efficiency, and robustness.
Overlapping Community Detection based on Network Decomposition

NASA Astrophysics Data System (ADS)

Ding, Zhuanlian; Zhang, Xingyi; Sun, Dengdi; Luo, Bin

2016-04-01

Community detection in complex networks has become a vital step in understanding the structure and dynamics of networks in various fields. However, traditional node clustering and the more recently proposed link clustering methods have inherent drawbacks in discovering overlapping communities. Node clustering is inadequate to capture the pervasive overlaps, while link clustering is often criticized for its high computational cost and ambiguous definition of communities. Overlapping community detection therefore remains a formidable challenge. In this work, we propose a new overlapping community detection algorithm based on network decomposition, called NDOCD. Specifically, NDOCD iteratively splits the network by removing all links in derived link communities, which are identified by utilizing a node clustering technique. The network decomposition reduces the computation time, and the elimination of noise links improves the quality of the obtained communities. In addition, we employ a node clustering technique rather than a link similarity measure to discover link communities; thus NDOCD avoids an ambiguous definition of community and becomes less time-consuming. We test our approach on both synthetic and real-world networks. Results demonstrate the superior performance of our approach in both computation time and accuracy compared to state-of-the-art algorithms.
NESTOR: A Computer-Based Medical Diagnostic Aid That Integrates Causal and Probabilistic Knowledge.

DTIC Science & Technology

1984-11-01

individual conditional probabilities between one cause node and its effect node, but less common to know a joint conditional probability between a... Author: Gregory F. Cooper. Contract or grant number: ONR N00014-81-K-0004. Performing organization: Department of Computer Science, Stanford University, Stanford, CA 94305, USA...
Peregrine Job Queues and Scheduling Policies | High-Performance Computing |

Science.gov Websites

batch batch-h long bigmem data-transfer feature Max wall time 1 hour 4 hours 2 days 2 days 10 days 10 # nodes per job 2 8 288 576 120 46 1 # of 24 core 64 GB Haswell nodes 2 8 0 1228 0 0 0 haswell # of 24core 32 GB nodes 2 16 576 0 126 0 0 24core # of 16core 32 GB nodes 2 8 195 0 162 0 5 16core, # of 24core
An evaluation of fossil tip-dating versus node-age calibrations in tetraodontiform fishes (Teleostei: Percomorphaceae).

PubMed

Arcila, Dahiana; Alexander Pyron, R; Tyler, James C; Ortí, Guillermo; Betancur-R, Ricardo

2015-01-01

Time-calibrated phylogenies based on molecular data provide a framework for comparative studies. Calibration methods to combine fossil information with molecular phylogenies are, however, under active development, often generating disagreement about the best way to incorporate paleontological data into these analyses. This study provides an empirical comparison of the most widely used approach based on node-dating priors for relaxed clocks implemented in the programs BEAST and MrBayes, with two recently proposed improvements: one using a new fossilized birth-death process model for node dating (implemented in the program DPPDiv), and the other using a total-evidence or tip-dating method (implemented in MrBayes and BEAST). These methods are applied herein to tetraodontiform fishes, a diverse group of living and extinct taxa that features one of the most extensive fossil records among teleosts. Previous estimates of time-calibrated phylogenies of tetraodontiforms using node-dating methods reported disparate estimates for their age of origin, ranging from the late Jurassic to the early Paleocene (ca. 150-59Ma). We analyzed a comprehensive dataset with 16 loci and 210 morphological characters, including 131 taxa (95 extant and 36 fossil species) representing all families of fossil and extant tetraodontiforms, under different molecular clock calibration approaches. Results from node-dating methods produced consistently younger ages than the tip-dating approaches. The older ages inferred by tip dating imply an unlikely early-late Jurassic (ca. 185-119Ma) origin for this order and the existence of extended ghost lineages in their fossil record. Node-based methods, by contrast, produce time estimates that are more consistent with the stratigraphic record, suggesting a late Cretaceous (ca. 86-96Ma) origin. 
We show that the precision of clade age estimates using tip dating increases with the number of fossils analyzed and with the proximity of fossil taxa to the node under assessment. This study suggests that current implementations of tip dating may overestimate ages of divergence in calibrated phylogenies. It also provides a comprehensive phylogenetic framework for tetraodontiform systematics and future comparative studies. Copyright © 2014 Elsevier Inc. All rights reserved.
Transient Solid Dynamics Simulations on the Sandia/Intel Teraflop Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Attaway, S.; Brown, K.; Gardner, D.

1997-12-31

Transient solid dynamics simulations are among the most widely used engineering calculations. Industrial applications include vehicle crashworthiness studies, metal forging, and powder compaction prior to sintering. These calculations are also critical to defense applications including safety studies and weapons simulations. The practical importance of these calculations and their computational intensiveness make them natural candidates for parallelization. This has proved to be difficult, and existing implementations fail to scale to more than a few dozen processors. In this paper we describe our parallelization of PRONTO, Sandia's transient solid dynamics code, via a novel algorithmic approach that utilizes multiple decompositions for different key segments of the computations, including the material contact calculation. This latter calculation is notoriously difficult to perform well in parallel, because it involves dynamically changing geometry, global searches for elements in contact, and unstructured communications among the compute nodes. Our approach scales to at least 3600 compute nodes of the Sandia/Intel Teraflop computer (the largest set of nodes to which we have had access to date) on problems involving millions of finite elements. On this machine we can simulate models using more than ten million elements in a few tenths of a second per timestep, and solve problems more than 3000 times faster than a single processor Cray Jedi.
An accurate, compact and computationally efficient representation of orbitals for quantum Monte Carlo calculations

NASA Astrophysics Data System (ADS)

Luo, Ye; Esler, Kenneth; Kent, Paul; Shulenburger, Luke

Quantum Monte Carlo (QMC) calculations of giant molecules and of surface and defect properties of solids have recently become feasible due to drastically expanding computational resources. However, with the most computationally efficient basis set, B-splines, these calculations are severely restricted by the memory capacity of compute nodes. The B-spline coefficients are shared on a node but not distributed among nodes, to ensure fast evaluation. A hybrid representation which incorporates atomic orbitals near the ions and B-spline ones in the interstitial regions offers a more accurate and less memory demanding description of the orbitals because they are naturally more atomic-like near ions and much smoother in between, thus allowing coarser B-spline grids. We will demonstrate the advantage of hybrid representation over pure B-spline and Gaussian basis sets and also show significant speed-ups, for example in computing the non-local pseudopotentials, with our new scheme. Moreover, we discuss a new algorithm for atomic orbital initialization, a task which used to require an extra workflow step taking a few days. With this work, the highly efficient hybrid representation paves the way to simulating large and even inhomogeneous systems using QMC. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Computational Materials Sciences Program.
Limitations of PET/CT in the Detection of Occult N1 Metastasis in Clinical Stage I(T1-2aN0) Non-Small Cell Lung Cancer for Staging Prior to Stereotactic Body Radiotherapy.

PubMed

Akthar, Adil S; Ferguson, Mark K; Koshy, Matthew; Vigneswaran, Wickii T; Malik, Renuka

2017-02-01

Patients receiving stereotactic body radiotherapy for stage I non-small cell lung cancer are typically staged clinically with positron emission tomography-computed tomography. Currently, limited data exist for the detection of occult hilar/peribronchial (N1) disease. We hypothesize that positron emission tomography-computed tomography underestimates spread of cancer to N1 lymph nodes and that future stereotactic body radiotherapy patients may benefit from increased pathologic evaluation of N1 nodal stations in addition to N2 nodes. A retrospective study was performed of all patients with clinical stage I (T1-2aN0) non-small cell lung cancer (American Joint Committee on Cancer, 7th edition) by positron emission tomography-computed tomography at our institution from 2003 to 2011, with subsequent surgical resection and lymph node staging. Findings on positron emission tomography-computed tomography were compared to pathologic nodal involvement to determine the negative predictive value of positron emission tomography-computed tomography for the detection of N1 nodal disease. An analysis was conducted to identify predictors of occult spread. A total of 105 patients with clinical stage I non-small cell lung cancer were included in this study, of which 8 (7.6%) patients were found to have occult N1 metastasis on pathologic review yielding a negative predictive value for N1 disease of 92.4%. No patients had occult mediastinal nodes. The negative predictive value for positron emission tomography-computed tomography in patients with clinical stage T1 versus T2 tumors was 72 (96%) of 75 versus 25 (83%) of 30, respectively ( P = .03), and for peripheral versus central tumor location was 77 (98%) of 78 versus 20 (74%) of 27, respectively ( P = .0001). The negative predictive values for peripheral T1 and T2 tumors were 98% and 100%, respectively; while for central T1 and T2 tumors, the rates were 85% and 64%, respectively. 
Occult lymph node involvement was not associated with primary tumor maximum standard uptake value, histology, grade, or interval between positron emission tomography-computed tomography and surgery. Our results support pathologic assessment of N1 lymph nodes in patients with stage I non-small cell lung cancer considered for stereotactic body radiotherapy, with the greatest benefit in patients with central and T2 tumors. Diagnostic evaluation with endoscopic bronchial ultrasound should be considered in the evaluation of stereotactic body radiotherapy candidates.
Limitations of PET/CT in the Detection of Occult N1 Metastasis in Clinical Stage I(T1-2aN0) Non-Small Cell Lung Cancer for Staging Prior to Stereotactic Body Radiotherapy

PubMed Central

Akthar, Adil S.; Ferguson, Mark K.; Koshy, Matthew; Vigneswaran, Wickii T.

2016-01-01

Purpose/Objectives: Patients receiving stereotactic body radiotherapy for stage I non-small cell lung cancer are typically staged clinically with positron emission tomography–computed tomography. Currently, limited data exist for the detection of occult hilar/peribronchial (N1) disease. We hypothesize that positron emission tomography–computed tomography underestimates spread of cancer to N1 lymph nodes and that future stereotactic body radiotherapy patients may benefit from increased pathologic evaluation of N1 nodal stations in addition to N2 nodes. Materials/Methods: A retrospective study was performed of all patients with clinical stage I (T1-2aN0) non-small cell lung cancer (American Joint Committee on Cancer, 7th edition) by positron emission tomography–computed tomography at our institution from 2003 to 2011, with subsequent surgical resection and lymph node staging. Findings on positron emission tomography–computed tomography were compared to pathologic nodal involvement to determine the negative predictive value of positron emission tomography–computed tomography for the detection of N1 nodal disease. An analysis was conducted to identify predictors of occult spread. Results: A total of 105 patients with clinical stage I non-small cell lung cancer were included in this study, of which 8 (7.6%) patients were found to have occult N1 metastasis on pathologic review yielding a negative predictive value for N1 disease of 92.4%. No patients had occult mediastinal nodes. The negative predictive value for positron emission tomography–computed tomography in patients with clinical stage T1 versus T2 tumors was 72 (96%) of 75 versus 25 (83%) of 30, respectively (P = .03), and for peripheral versus central tumor location was 77 (98%) of 78 versus 20 (74%) of 27, respectively (P = .0001). The negative predictive values for peripheral T1 and T2 tumors were 98% and 100%, respectively; while for central T1 and T2 tumors, the rates were 85% and 64%, respectively. 
Occult lymph node involvement was not associated with primary tumor maximum standard uptake value, histology, grade, or interval between positron emission tomography–computed tomography and surgery. Conclusion: Our results support pathologic assessment of N1 lymph nodes in patients with stage I non-small cell lung cancer considered for stereotactic body radiotherapy, with the greatest benefit in patients with central and T2 tumors. Diagnostic evaluation with endoscopic bronchial ultrasound should be considered in the evaluation of stereotactic body radiotherapy candidates. PMID:26792491
Computer Three-Dimensional Reconstruction of the Atrioventricular Node

PubMed Central

Li, Jue; Greener, Ian D.; Inada, Shin; Nikolski, Vladimir P.; Yamamoto, Mitsuru; Hancox, Jules C.; Zhang, Henggui; Billeter, Rudi; Efimov, Igor R.; Dobrzynski, Halina; Boyett, Mark R.

2009-01-01

Because of its complexity, the atrioventricular node (AVN) remains one of the least understood regions of the heart. The aim of the study was to construct a detailed anatomic model of the AVN and relate it to AVN function. The electric activity of a rabbit AVN preparation was imaged using voltage-dependent dye. The preparation was then fixed and sectioned. Sixty-five sections at 60- to 340-μm intervals were stained for histology and immunolabeled for neurofilament (marker of nodal tissue) and connexin43 (gap junction protein). This revealed multiple structures within and around the AVN, including transitional tissue, inferior nodal extension, penetrating bundle, His bundle, atrial and ventricular muscle, central fibrous body, tendon of Todaro, and valves. A 3D anatomically detailed mathematical model (≈13 million element array) of the AVN and surrounding atrium and ventricle, incorporating all cell types, was constructed. Comparison of the model with electric activity recorded in experiments suggests that the inferior nodal extension forms the slow pathway, whereas the transitional tissue forms the fast pathway into the AVN. In addition, it suggests the pacemaker activity of the atrioventricular junction originates in the inferior nodal extension. Computer simulation of the propagation of the action potential through the anatomic model shows how, because of the complex structure of the AVN, reentry (slow-fast and fast-slow) can occur. In summary, a mathematical model of the anatomy of the AVN has been generated that allows AVN conduction to be explored. PMID:18309098
Architectures for Cognitive Systems

DTIC Science & Technology

2010-02-01

highly modular many-node chip was designed which addressed power efficiency to the maximum extent possible. Each node contains an Asynchronous Field...optimization to perform complex cognitive computing operations. This project focused on the design of the core and integration across a four node chip. A...follow on project will focus on creating a 3 dimensional stack of chips that is enabled by the low power usage. The chip incorporates structures to
Performance of an MPI-only semiconductor device simulator on a quad socket/quad core InfiniBand platform.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shadid, John Nicolas; Lin, Paul Tinphone

2009-01-01

This preliminary study considers the scaling and performance of a finite element (FE) semiconductor device simulator on a capacity cluster with 272 compute nodes based on a homogeneous multicore node architecture utilizing 16 cores. The inter-node communication backbone for this Tri-Lab Linux Capacity Cluster (TLCC) machine is comprised of an InfiniBand interconnect. The nonuniform memory access (NUMA) nodes consist of 2.2 GHz quad socket/quad core AMD Opteron processors. The performance results for this study are obtained with a FE semiconductor device simulation code (Charon) that is based on a fully-coupled Newton-Krylov solver with domain decomposition and multilevel preconditioners. Scaling and multicore performance results are presented for large-scale problems of 100+ million unknowns on up to 4096 cores. A parallel scaling comparison is also presented with the Cray XT3/4 Red Storm capability platform. The results indicate that an MPI-only programming model for utilizing the multicore nodes is reasonably efficient on all 16 cores per compute node. However, the results also indicated that the multilevel preconditioner, which is critical for large-scale capability type simulations, scales better on the Red Storm machine than the TLCC machine.
KENNEDY SPACE CENTER, FLA. - Astronaut Tim Kopra aids in Intravehicular Activity (IVA) constraints testing on the Italian-built Node 2, a future element of the International Space Station. The second of three Station connecting modules, the Node 2 attaches to the end of the U.S. Lab and provides attach locations for several other elements. Kopra is currently assigned technical duties in the Space Station Branch of the Astronaut Office, where his primary focus involves the testing of crew interfaces for two future ISS modules as well as the implementation of support computers and operational Local Area Network on ISS. Node 2 is scheduled to launch on mission STS-120, Station assembly flight 10A.

NASA Image and Video Library

2004-02-03

KENNEDY SPACE CENTER, FLA. - In the Space Station Processing Facility, workers check over the Italian-built Node 2, a future element of the International Space Station. The second of three Station connecting modules, the Node 2 attaches to the end of the U.S. Lab and provides attach locations for several other elements. Kopra is currently assigned technical duties in the Space Station Branch of the Astronaut Office, where his primary focus involves the testing of crew interfaces for two future ISS modules as well as the implementation of support computers and operational Local Area Network on ISS. Node 2 is scheduled to launch on mission STS-120, Station assembly flight 10A.

NASA Image and Video Library

2004-02-03

Using Computer-extracted Image Phenotypes from Tumors on Breast MRI to Predict Breast Cancer Pathologic Stage

PubMed Central

Burnside, Elizabeth S.; Drukker, Karen; Li, Hui; Bonaccio, Ermelinda; Zuley, Margarita; Ganott, Marie; Net, Jose M.; Sutton, Elizabeth; Brandt, Kathleen R.; Whitman, Gary; Conzen, Suzanne; Lan, Li; Ji, Yuan; Zhu, Yitan; Jaffe, Carl; Huang, Erich; Freymann, John; Kirby, Justin; Morris, Elizabeth; Giger, Maryellen

2015-01-01

Background: To demonstrate that computer-extracted image phenotypes (CEIPs) of biopsy-proven breast cancer on MRI can accurately predict pathologic stage. Methods: We used a dataset of de-identified breast MRIs organized by the National Cancer Institute in The Cancer Imaging Archive. We analyzed 91 biopsy-proven breast cancer cases with pathologic stage (stage I = 22; stage II = 58; stage III = 11) and surgically proven nodal status (negative nodes = 46, ≥ 1 positive node = 44, no nodes examined = 1). We characterized tumors by (a) radiologist measured size, and (b) CEIP. We built models combining two CEIPs to predict tumor pathologic stage and lymph node involvement, and evaluated them in leave-one-out cross-validation with area under the ROC curve (AUC) as figure of merit. Results: Tumor size was the most powerful predictor of pathologic stage but CEIPs capturing biologic behavior also emerged as predictive (e.g. stage I+II vs. III demonstrated AUC = 0.83). No size measure was successful in the prediction of positive lymph nodes but adding a CEIP describing tumor “homogeneity” significantly improved this discrimination (AUC = 0.62, p = .003) over chance. Conclusions: Our results indicate that MRI phenotypes show promise for predicting breast cancer pathologic stage and lymph node status. PMID:26619259

Enhanced Contact Graph Routing (ECGR) MACHETE Simulation Model

NASA Technical Reports Server (NTRS)

Segui, John S.; Jennings, Esther H.; Clare, Loren P.

2013-01-01

Contact Graph Routing (CGR) for Delay/Disruption Tolerant Networking (DTN) space-based networks makes use of the predictable nature of node contacts to make real-time routing decisions given unpredictable traffic patterns. The contact graph will have been disseminated to all nodes before the start of route computation. CGR was designed for space-based networking environments where future contact plans are known or are independently computable (e.g., using known orbital dynamics). For each data item (known as a bundle in DTN), a node independently performs route selection by examining possible paths to the destination. Route computation could conceivably run thousands of times a second, so computational load is important. This work refers to the simulation software model of Enhanced Contact Graph Routing (ECGR) for DTN Bundle Protocol in JPL's MACHETE simulation tool. The simulation model was used for performance analysis of CGR and led to several performance enhancements. The simulation model was used to demonstrate the improvements of ECGR over CGR as well as other routing methods in space network scenarios. ECGR moved to using earliest arrival time because it is a global monotonically increasing metric that guarantees the safety properties needed for the solution's correctness since route re-computation occurs at each node to accommodate unpredicted changes (e.g., traffic pattern, link quality). Furthermore, using earliest arrival time enabled the use of the standard Dijkstra algorithm for path selection. The Dijkstra algorithm for path selection has a well-known inexpensive computational cost. These enhancements have been integrated into the open source CGR implementation. The ECGR model is also useful for route metric experimentation and comparisons with other DTN routing protocols particularly when combined with MACHETE's space networking models and Delay Tolerant Link State Routing (DTLSR) model.
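The role of earliest arrival time as a Dijkstra metric can be illustrated with a minimal sketch. It is not the ECGR or MACHETE implementation: the contact tuple layout `(sender, receiver, start, end, delay)`, the function name `earliest_arrival`, and the sample contact plan are all assumptions made for illustration. The sketch relies only on the property stated in the abstract, namely that arrival time never decreases along a path, which is what lets a standard priority-queue Dijkstra search apply.

```python
import heapq
from collections import defaultdict


def earliest_arrival(contacts, source, dest, t0=0.0):
    """Earliest-arrival Dijkstra over a contact plan (illustrative only).

    contacts: iterable of (sender, receiver, start, end, delay) tuples,
    a hypothetical layout. A bundle may leave `sender` at any time in
    [start, end) and reaches `receiver` `delay` time units later.
    Returns (arrival_time, path), or (None, []) if dest is unreachable.
    """
    outgoing = defaultdict(list)
    for sender, receiver, start, end, delay in contacts:
        outgoing[sender].append((receiver, start, end, delay))

    best = {source: t0}          # earliest known arrival time per node
    prev = {}                    # predecessor on the best path
    heap = [(t0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue             # stale heap entry
        if node == dest:
            break
        for receiver, start, end, delay in outgoing[node]:
            depart = max(t, start)   # wait until the contact opens
            if depart >= end:
                continue             # contact already closed
            arrival = depart + delay
            if arrival < best.get(receiver, float("inf")):
                best[receiver] = arrival
                prev[receiver] = node
                heapq.heappush(heap, (arrival, receiver))

    if dest not in best:
        return None, []
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[dest], path[::-1]


# Hypothetical contact plan: the route through C arrives before the one through B.
plan = [
    ("A", "B", 0, 10, 1),
    ("B", "D", 20, 30, 1),
    ("A", "C", 5, 15, 2),
    ("C", "D", 8, 12, 1),
]
arrival, route = earliest_arrival(plan, "A", "D")
```

In this plan the bundle reaches B first but must wait for the late B-to-D contact, so the search settles on the route through C. Contact capacity, queueing, and bundle expiry are deliberately left out of the sketch.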
Cloud object store for checkpoints of high performance computing applications using decoupling middleware

DOEpatents

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-04-19

Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
A Family of Algorithms for Computing Consensus about Node State from Network Data

PubMed Central

Brush, Eleanor R.; Krakauer, David C.; Flack, Jessica C.

2013-01-01

Biological and social networks are composed of heterogeneous nodes that contribute differentially to network structure and function. A number of algorithms have been developed to measure this variation. These algorithms have proven useful for applications that require assigning scores to individual nodes–from ranking websites to determining critical species in ecosystems–yet the mechanistic basis for why they produce good rankings remains poorly understood. We show that a unifying property of these algorithms is that they quantify consensus in the network about a node's state or capacity to perform a function. The algorithms capture consensus by either taking into account the number of a target node's direct connections, and, when the edges are weighted, the uniformity of its weighted in-degree distribution (breadth), or by measuring net flow into a target node (depth). Using data from communication, social, and biological networks we find that how an algorithm measures consensus–through breadth or depth–impacts its ability to correctly score nodes. We also observe variation in sensitivity to source biases in interaction/adjacency matrices: errors arising from systematic error at the node level or direct manipulation of network connectivity by nodes. Our results indicate that the breadth algorithms, which are derived from information theory, correctly score nodes (assessed using independent data) and are robust to errors. However, in cases where nodes “form opinions” about other nodes using indirect information, like reputation, depth algorithms, like Eigenvector Centrality, are required. One caveat is that Eigenvector Centrality is not robust to error unless the network is transitive or assortative. In these cases the network structure allows the depth algorithms to effectively capture breadth as well as depth. Finally, we discuss the algorithms' cognitive and computational demands. 
This is an important consideration in systems in which individuals use the collective opinions of others to make decisions. PMID:23874167
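The breadth and depth notions can be made concrete with a small sketch. It does not reproduce the algorithms evaluated in the paper. As assumptions for illustration, breadth is taken to be a node's in-degree together with the Shannon entropy of its incoming weights, and depth is taken to be eigenvector centrality computed by a shifted power iteration; the function names and the three-node matrix are hypothetical.

```python
import math


def breadth_scores(w):
    """Per-node (in_degree, entropy of incoming weights), with w[i][j] = weight of i -> j."""
    n = len(w)
    scores = []
    for j in range(n):
        incoming = [w[i][j] for i in range(n) if w[i][j] > 0]
        total = sum(incoming)
        # Entropy is highest when incoming weight is spread uniformly over sources.
        entropy = -sum((x / total) * math.log2(x / total) for x in incoming)
        scores.append((len(incoming), entropy))
    return scores


def eigenvector_centrality(w, iters=500, tol=1e-12):
    """Power iteration on (I + W^T); the shift avoids oscillation on periodic graphs."""
    n = len(w)
    x = [1.0 / n] * n
    for _ in range(iters):
        nxt = [x[j] + sum(w[i][j] * x[i] for i in range(n)) for j in range(n)]
        norm = sum(nxt)
        nxt = [v / norm for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


# Hypothetical network: nodes 0 and 1 point at node 2 with unequal weights,
# and node 2 points back at both of them.
w = [
    [0, 0, 3],
    [0, 0, 1],
    [1, 1, 0],
]
breadth = breadth_scores(w)
depth = eigenvector_centrality(w)
```

Node 2 has the highest in-degree, but its incoming weight is unevenly split (3 versus 1), so its entropy falls below the 1 bit that two equal sources would give. The depth score instead reflects how much weight flows in from nodes that are themselves well scored.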
Preoperative 18F-FDG-PET/CT imaging and sentinel node biopsy in the detection of regional lymph node metastases in malignant melanoma.

PubMed

Singh, Baljinder; Ezziddin, Samer; Palmedo, Holger; Reinhardt, Michael; Strunk, Holger; Tüting, Thomas; Biersack, Hans-Jürgen; Ahmadzadehfar, Hojjat

2008-10-01

The objective of this study was to evaluate the role of preoperative 18F-fluorodeoxyglucose-positron emission tomography/computed tomography scanning, preoperative lymphoscintigraphy (LS), and sentinel lymph node biopsy in patients with malignant melanoma. Fifty-two patients (36 men, 16 women; mean age 55.0 ± 13.0 years; median age 61 years; range 17-76 years) with malignant melanoma were selected. According to the latest version of the American Joint Committee on Cancer staging system, the disease in the study patients was initially classified as either stage I or II. The other primary tumor characteristics were Breslow depth (mean 2.87 mm, median 2 mm, range 1-12.0 mm) and Clark levels III-V. None of the study patients had clinical or radiological evidence of regional lymph node metastatic disease. At least one sentinel node was identified in all patients. Preoperative LS detected a total of 111 sentinel lymph nodes (average 2.13 sentinel lymph nodes per patient) and demonstrated a single nodal draining basin in 38 (73%) patients and multiple (2-3 draining basins) in the remaining 14 (27%) patients. Fourteen out of the 52 patients (27%) had at least one involved sentinel node. Positron emission tomography was true positive in two patients with a sentinel node greater than 1 cm and false positive in two other patients. In this study, the detection of sentinel lymph node by LS and gamma probe had a sensitivity of 100%. In contrast, 18F-FDG-PET imaging demonstrated very low sensitivity (14.3%; 95% CI, 2.5 to 44%) and positive predictive value (50%; 95% CI, 9 to 90%) for localizing the subclinical nodal metastases. The specificity, negative predictive value, and diagnostic accuracy were 94.7, 75, and 73%, respectively. Preoperative fluorodeoxyglucose-positron emission tomography/computed tomography imaging is not able to substitute LS/sentinel lymph node biopsy in patients at stage I or II.
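The reported PET figures can be checked with a short worked example. The abstract gives 52 patients, 14 with an involved sentinel node, 2 true positives, and 2 false positives. The counts of 12 false negatives (14 - 2) and 36 true negatives (38 - 2) are inferred here and are not stated in the source, and the helper name `diagnostic_metrics` is hypothetical. Under those assumptions the standard confusion-matrix formulas reproduce the reported percentages, and the 75% figure matches a negative predictive value, TN / (TN + FN).

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix measures for a binary diagnostic test."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# tp and fp come from the abstract; fn and tn are inferred from the totals.
pet = diagnostic_metrics(tp=2, fp=2, fn=12, tn=36)
for name, value in pet.items():
    print(f"{name}: {100 * value:.1f}%")
```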
Best bang for your buck: GPU nodes for GROMACS biomolecular simulations

PubMed Central

Páll, Szilárd; Fechner, Martin; Esztermann, Ansgar; de Groot, Bert L.; Grubmüller, Helmut

2015-01-01

The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well‐exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)‐based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off‐loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance‐to‐price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer‐class GPUs this improvement equally reflects in the performance‐to‐price ratio. Although memory issues in consumer‐class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost‐efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well‐balanced ratio of CPU and consumer‐class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. PMID:26238484
Best bang for your buck: GPU nodes for GROMACS biomolecular simulations.

PubMed

Kutzner, Carsten; Páll, Szilárd; Fechner, Martin; Esztermann, Ansgar; de Groot, Bert L; Grubmüller, Helmut

2015-10-05

The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well-exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)-based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off-loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance-to-price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer-class GPUs this improvement equally reflects in the performance-to-price ratio. Although memory issues in consumer-class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost-efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well-balanced ratio of CPU and consumer-class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Understanding the influence of all nodes in a network

PubMed Central

Lawyer, Glenn

2015-01-01

Centrality measures such as the degree, k-shell, or eigenvalue centrality can identify a network's most influential nodes, but are rarely usefully accurate in quantifying the spreading power of the vast majority of nodes which are not highly influential. The spreading power of all network nodes is better explained by considering, from a continuous-time epidemiological perspective, the distribution of the force of infection each node generates. The resulting metric, the expected force, accurately quantifies node spreading power under all primary epidemiological models across a wide range of archetypical human contact networks. When node power is low, influence is a function of neighbor degree. As power increases, a node's own degree becomes more important. The strength of this relationship is modulated by network structure, being more pronounced in narrow, dense networks typical of social networking and weakening in broader, looser association networks such as the Internet. The expected force can be computed independently for individual nodes, making it applicable for networks whose adjacency matrix is dynamic, not well specified, or overwhelmingly large. PMID:25727453
Performance Evaluation of AODV with Blackhole Attack

NASA Astrophysics Data System (ADS)

Dara, Karuna

2010-11-01

A Mobile Ad Hoc Network (MANET) is a temporary network set up by wireless mobile computers moving arbitrarily in places that have no network infrastructure. These nodes maintain connectivity in a decentralized manner. Since the nodes communicate with each other, they cooperate by forwarding data packets to other nodes in the network. Thus the nodes find a path to the destination node using routing protocols. However, due to security vulnerabilities of the routing protocols, mobile ad hoc networks are unprotected against attacks by malicious nodes. One of these attacks is the Black Hole Attack, which targets network integrity by absorbing all data packets in the network. Since the data packets do not reach the destination node on account of this attack, data loss will occur. In this paper, we simulated the black hole attack in various mobile ad hoc network scenarios using the AODV routing protocol of MANET and have tried to find the effect of increasing the number of nodes together with the number of malicious nodes.
An MPI-based MoSST core dynamics model

NASA Astrophysics Data System (ADS)

Jiang, Weiyuan; Kuang, Weijia

2008-09-01

Distributed systems are among the main cost-effective and expandable platforms for high-end scientific computing. Therefore scalable numerical models are important for effective use of such systems. In this paper, we present an MPI-based numerical core dynamics model for simulation of geodynamo and planetary dynamos, and for simulation of core-mantle interactions. The model is developed based on MPI libraries. Two algorithms are used for node-node communication: a "master-slave" architecture and a "divide-and-conquer" architecture. The former is easy to implement but not scalable in communication. The latter is scalable in both computation and communication. The model scalability is tested on Linux PC clusters with up to 128 nodes. This model is also benchmarked with a published numerical dynamo model solution.
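The scalability difference between the two communication patterns can be sketched with a toy model of a global sum. This is not the MoSST code and uses no MPI calls. It assumes, for illustration only, that the "master-slave" pattern has one rank receive a message from every other rank in turn, and that the "divide-and-conquer" pattern combines partial sums pairwise in a binary tree. The function names are hypothetical, and a "round" counts sequential communication steps.

```python
def central_collect(values):
    """One rank receives each other rank's value in turn: P - 1 sequential rounds."""
    total = values[0]
    rounds = 0
    for v in values[1:]:
        total += v       # one receive per round at the central rank
        rounds += 1
    return total, rounds


def tree_reduce(values):
    """Pairwise tree reduction: about log2(P) rounds, pairs exchange in parallel."""
    vals = list(values)
    n = len(vals)
    rounds = 0
    stride = 1
    while stride < n:
        # In each round, rank i absorbs the partial sum held by rank i + stride.
        for i in range(0, n - stride, 2 * stride):
            vals[i] += vals[i + stride]
        stride *= 2
        rounds += 1
    return vals[0], rounds


# 128 ranks, matching the largest cluster size mentioned in the abstract.
data = list(range(128))
central_total, central_rounds = central_collect(data)
tree_total, tree_rounds = tree_reduce(data)
```

Both patterns return the same sum, but the central pattern needs 127 sequential receives at one rank while the tree needs 7 rounds, which is one way to read the abstract's statement that only the second architecture scales in communication.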
Perigastric lymph node metastasis from papillary thyroid carcinoma in a patient with early gastric cancer: the first case report.

PubMed

Jeong, Gui-Ae; Kim, Hyung-Chul; Kim, Hee-Kyung; Cho, Gyu-Seok

2014-09-01

Distant metastasis from papillary thyroid carcinoma (PTC), particularly from papillary thyroid microcarcinoma, is rare. We present a case of perigastric lymph node metastasis from PTC in a patient with early gastric cancer and breast cancer. During post-surgical follow-up for breast cancer, a 56-year-old woman was diagnosed incidentally with early gastric cancer and synchronous left thyroid cancer. Therefore, laparoscopic distal gastrectomy with lymph node dissection and left thyroidectomy were performed. On the basis of the pathologic findings of the surgical specimens, the patient was diagnosed with papillary thyroid microcarcinoma with perigastric lymph node metastasis and early gastric cancer with mucosal invasion. Finally, on the basis of immunohistochemical staining with galectin-3, the diagnosis of perigastric lymph node metastasis from PTC was made. When a patient has multiple primary malignancies with lymph node metastasis, careful pathologic examination of the surgical specimen is necessary; immunohistochemical staining may be helpful in determining the primary origin of lymph node metastasis.
Risk Factors for Predicting Occult Lymph Node Metastasis in Patients with Clinical Stage I Non-small Cell Lung Cancer Staged by Integrated Fluorodeoxyglucose Positron Emission Tomography/Computed Tomography.

PubMed

Kaseda, Kaoru; Asakura, Keisuke; Kazama, Akio; Ozawa, Yukihiko

2016-12-01

Lymph nodes in patients with non-small cell lung cancer (NSCLC) are often staged using integrated 18F-fluorodeoxyglucose positron emission tomography/computed tomography (FDG-PET/CT). However, this modality has limited ability to detect micrometastases. We aimed to define risk factors for occult lymph node metastasis in patients with clinical stage I NSCLC diagnosed by preoperative integrated FDG-PET/CT. We retrospectively reviewed the records of 246 patients diagnosed with clinical stage I NSCLC based on integrated FDG-PET/CT between April 2007 and May 2015. All patients were treated by complete surgical resection. The prevalence of occult lymph node metastasis in patients with clinical stage I NSCLC was analysed according to clinicopathological factors. Risk factors for occult lymph node metastasis were defined using univariate and multivariate analyses. Occult lymph node metastasis was detected in 31 patients (12.6 %). Univariate analysis revealed CEA (P = 0.04), SUV max of the primary tumour (P = 0.031), adenocarcinoma (P = 0.023), tumour size (P = 0.002) and pleural invasion (P = 0.046) as significant predictors of occult lymph node metastasis. Multivariate analysis selected SUV max of the primary tumour (P = 0.049), adenocarcinoma (P = 0.003) and tumour size (P = 0.019) as independent predictors of occult lymph node metastasis. The SUV max of the primary tumour, adenocarcinoma and tumour size were risk factors for occult lymph node metastasis in patients with NSCLC diagnosed as clinical stage I by preoperative integrated FDG-PET/CT. These findings would be helpful in selecting candidates for mediastinoscopy or endobronchial ultrasound-guided transbronchial needle aspiration.
Deep Space Networking Experiments on the EPOXI Spacecraft

NASA Technical Reports Server (NTRS)

Jones, Ross M.

2011-01-01

NASA's Space Communications & Navigation Program within the Space Operations Directorate is operating a program to develop and deploy Disruption Tolerant Networking [DTN] technology for a wide variety of mission types by the end of 2011. DTN is an enabling element of the Interplanetary Internet where terrestrial networking protocols are generally unsuitable because they rely on timely and continuous end-to-end delivery of data and acknowledgments. In fall of 2008 and 2009 and 2011 the Jet Propulsion Laboratory installed and tested essential elements of DTN technology on the Deep Impact spacecraft. These experiments, called Deep Impact Network Experiment (DINET 1), were performed in close cooperation with the EPOXI project, which has responsibility for the spacecraft. The DINET 1 software was installed on the backup software partition on the backup flight computer for DINET 1. For DINET 1, the spacecraft was at a distance of about 15 million miles (24 million kilometers) from Earth. During DINET 1, 300 images were transmitted from the JPL nodes to the spacecraft. Then, they were automatically forwarded from the spacecraft back to the JPL nodes, exercising DTN's bundle origination, transmission, acquisition, dynamic route computation, congestion control, prioritization, custody transfer, and automatic retransmission procedures, both on the spacecraft and on the ground, over a period of 27 days. The first DINET 1 experiment successfully validated many of the essential elements of the DTN protocols. DINET 2: 1) demonstrated additional DTN functionality, 2) automated certain tasks that were manually implemented in DINET 1, and 3) installed the ION software on nodes outside of JPL. DINET 3 plans to: 1) upgrade the LTP convergence-layer adapter to conform to the international LTP CL specification, 2) add convergence-layer "stewardship" procedures, and 3) add the BSP security elements [PIB & PCB].
This paper describes the planning and execution of the flight experiment and the validation results.
Hoshide in Node 2

NASA Image and Video Library

2012-10-14

ISS033-E-013091 (14 Oct. 2012) --- Japan Aerospace Exploration Agency astronaut Aki Hoshide, Expedition 33 flight engineer, holds a computer attached to a stand in the Harmony node of the International Space Station. A signed poster of SpaceX personnel floats freely at upper left.
Hoshide in Node 2

NASA Image and Video Library

2012-10-14

ISS033-E-013092 (14 Oct. 2012) --- Japan Aerospace Exploration Agency astronaut Aki Hoshide, Expedition 33 flight engineer, holds a computer attached to a stand in the Harmony node of the International Space Station. A signed poster of SpaceX personnel floats freely at upper left.
Brain Performance versus Phase Transitions

NASA Astrophysics Data System (ADS)

Torres, Joaquín J.; Marro, J.

2015-07-01

We illustrate here how a well-founded study of the brain may originate in assuming analogies with phase-transition phenomena. By analyzing the extent to which a weak signal endures in noisy environments, we identify the underlying mechanisms, which results in a description of how the excitability associated with (non-equilibrium) phase changes and criticality optimizes the processing of the signal. Our setting is a network of integrate-and-fire nodes in which connections are heterogeneous, with rapidly time-varying intensities mimicking fatigue and potentiation. Emergence then becomes quite robust against wiring topology modification (we considered networks ranging from fully connected to the Homo sapiens connectome), showing the essential role of synaptic flickering in computations. We also suggest how to experimentally disclose significant changes during actual brain operation.
Predicted extracapsular invasion of hilar lymph node metastasis by fusion positron emission tomography/computed tomography in patients with lung cancer

PubMed Central

MAKINO, TAKASHI; HATA, YOSHINOBU; OTSUKA, HAJIME; KOEZUKA, SATOSHI; ISOBE, KAZUTOSHI; TOCHIGI, NOBUMI; SHIRAGA, NOBUYUKI; SHIBUYA, KAZUTOSHI; HOMMA, SAKAE; IYODA, AKIRA

2015-01-01

Intraoperative detection of hilar lymph node metastasis, particularly with extracapsular invasion, may affect the surgical procedure in patients with lung cancer, as the preoperative estimation of hilar lymph node metastasis is unsatisfactory. The aim of this study was to investigate whether fusion positron emission tomography/computed tomography (PET/CT) is able to predict extracapsular invasion of hilar lymph node metastasis. Between April, 2007 and April, 2013, 509 patients with primary lung cancer underwent surgical resection at our institution, among whom 28 patients exhibiting hilar lymph node metastasis (at stations 10 and 11) were enrolled in this study. A maximum lymph node standardized uptake value of >2.5 in PET scans was interpreted as positive. A total of 17 patients had positive preoperative PET/CT findings in their hilar lymph nodes, while the remaining 11 had negative findings. With regard to extracapsular nodal invasion, the PET/CT findings (P=0.0005) and the histological findings (squamous cell carcinoma, P=0.05) were found to be significant predictors in the univariate analysis. In the multivariate analysis, the PET/CT findings were the only independent predictor (P=0.0004). The requirement for extensive pulmonary resection (sleeve lobectomy, bilobectomy or pneumonectomy) was significantly more frequent in the patient group with positive compared with the group with negative PET/CT findings (76 vs. 9%, respectively, P=0.01). Therefore, the PET/CT findings in the hilar lymph nodes were useful for the prediction of extracapsular invasion and, consequently, for the estimation of possible extensive pulmonary resection. PMID:26623046
Predicted extracapsular invasion of hilar lymph node metastasis by fusion positron emission tomography/computed tomography in patients with lung cancer.

PubMed

Makino, Takashi; Hata, Yoshinobu; Otsuka, Hajime; Koezuka, Satoshi; Isobe, Kazutoshi; Tochigi, Nobumi; Shiraga, Nobuyuki; Shibuya, Kazutoshi; Homma, Sakae; Iyoda, Akira

2015-09-01

Intraoperative detection of hilar lymph node metastasis, particularly with extracapsular invasion, may affect the surgical procedure in patients with lung cancer, as the preoperative estimation of hilar lymph node metastasis is unsatisfactory. The aim of this study was to investigate whether fusion positron emission tomography/computed tomography (PET/CT) is able to predict extracapsular invasion of hilar lymph node metastasis. Between April, 2007 and April, 2013, 509 patients with primary lung cancer underwent surgical resection at our institution, among whom 28 patients exhibiting hilar lymph node metastasis (at stations 10 and 11) were enrolled in this study. A maximum lymph node standardized uptake value of >2.5 in PET scans was interpreted as positive. A total of 17 patients had positive preoperative PET/CT findings in their hilar lymph nodes, while the remaining 11 had negative findings. With regard to extracapsular nodal invasion, the PET/CT findings (P=0.0005) and the histological findings (squamous cell carcinoma, P=0.05) were found to be significant predictors in the univariate analysis. In the multivariate analysis, the PET/CT findings were the only independent predictor (P=0.0004). The requirement for extensive pulmonary resection (sleeve lobectomy, bilobectomy or pneumonectomy) was significantly more frequent in the patient group with positive compared with the group with negative PET/CT findings (76 vs. 9%, respectively, P=0.01). Therefore, the PET/CT findings in the hilar lymph nodes were useful for the prediction of extracapsular invasion and, consequently, for the estimation of possible extensive pulmonary resection.
Mathematical Analysis of Vehicle Delivery Scale of Bike-Sharing Rental Nodes

NASA Astrophysics Data System (ADS)

Zhai, Y.; Liu, J.; Liu, L.

2018-04-01

To address the lack of a scientific and reasonable basis for judging vehicle delivery scale and the insufficient optimization of scheduling decisions, this paper, based on features of bike-sharing usage, analyses the applicability of a discrete-time, discrete-state Markov chain and proves that the chain is irreducible, aperiodic and positive recurrent. Based on this analysis, the paper concludes that the limit-state (steady-state) probability of the bike-sharing Markov chain exists, is unique, and is independent of the initial probability distribution. The paper then analyses the difficulty of estimating the transition probability matrix parameters and of solving the system of linear equations in the traditional solution algorithm for the bike-sharing Markov chain. To improve feasibility, the paper proposes a "virtual two-node vehicle scale solution" algorithm, which treats all the nodes other than the node to be solved as a single virtual node, and gives the transition probability matrix, the steady-state system of linear equations, and the computational methods for the steady-state scale, steady-state arrival time and scheduling decision of the node to be solved. Finally, the paper evaluates the rationality and accuracy of the steady-state probability of the proposed algorithm by comparing it with the traditional algorithm. By solving the steady-state scale of the nodes one by one, the proposed algorithm is shown to be highly feasible because it lowers the computational difficulty and reduces the amount of statistics required, which will help bike-sharing companies optimize the scale and scheduling of nodes.
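The two-node idea can be made concrete with a minimal sketch of a two-state Markov chain, where state 0 is the node being solved and state 1 is the virtual node that stands for all other nodes. The parameter names `p_out` and `p_in`, the fleet size, and the function names are illustrative assumptions, not quantities taken from the paper.

```python
def two_node_steady_state(p_out, p_in):
    """Steady-state distribution of a two-state chain.

    p_out: one-step probability that a bike at the node moves to the virtual node.
    p_in:  one-step probability that a bike at the virtual node moves to the node.
    """
    if not (0 < p_out <= 1 and 0 < p_in <= 1) or p_out + p_in >= 2:
        # Both rates must be positive (irreducible) and not both 1 (aperiodic).
        raise ValueError("chain must be irreducible and aperiodic")
    total = p_out + p_in
    return p_in / total, p_out / total


def iterate(dist, p_out, p_in, steps=200):
    """Apply the transition matrix repeatedly to an initial distribution."""
    a, b = dist
    for _ in range(steps):
        a, b = a * (1 - p_out) + b * p_in, a * p_out + b * (1 - p_in)
    return a, b


if __name__ == "__main__":
    pi = two_node_steady_state(0.3, 0.1)
    print(pi)
    # Two different starting points converge to the same limit.
    print(iterate((1.0, 0.0), 0.3, 0.1))
    print(iterate((0.0, 1.0), 0.3, 0.1))
    fleet_size = 1000  # hypothetical total number of bikes
    print(fleet_size * pi[0])  # expected bikes at the node in steady state
```

The closed form follows from solving the balance equation pi_0 * p_out = pi_1 * p_in together with pi_0 + pi_1 = 1, which is why only two transition parameters need to be estimated per node in this simplified picture.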
NASA's Planetary Data System: Support for the Delivery of Derived Data Sets at the Atmospheres Node

NASA Astrophysics Data System (ADS)

Chanover, Nancy J.; Beebe, Reta; Neakrase, Lynn; Huber, Lyle; Rees, Shannon; Hornung, Danae

2015-11-01

NASA’s Planetary Data System is charged with archiving electronic data products from NASA planetary missions that are sponsored by NASA’s Science Mission Directorate. This archive, currently organized by science disciplines, uses standards for describing and storing data that are designed to enable future scientists who are unfamiliar with the original experiments to analyze the data, and to do this using a variety of computer platforms, with no additional support. These standards address the data structure, description contents, and media design. The new requirement in the NASA ROSES-2015 Research Announcement to include a Data Management Plan will result in an increase in the number of derived data sets that are being delivered to the PDS. These data sets may come from the Planetary Data Archiving, Restoration and Tools (PDART) program, other Data Analysis Programs (DAPs) or be volunteered by individuals who are publishing the results of their analysis. In response to this increase, the PDS Atmospheres Node is developing a set of guidelines and user tools to make the process of archiving these derived data products more efficient. Here we provide a description of Atmospheres Node resources, including a letter of support for the proposal stage, a communication schedule for the planned archive effort, product label samples and templates in extensible markup language (XML), documentation templates, and validation tools necessary for producing a PDS4-compliant derived data bundle(s) efficiently and accurately.
The network level reproduction number for infectious diseases with both vertical and horizontal transmission.

PubMed

Xue, Ling; Scoglio, Caterina

2013-05-01

A wide range of infectious diseases are both vertically and horizontally transmitted. Such diseases are spatially transmitted via multiple species in heterogeneous environments, typically described by complex meta-population models. The reproduction number, R0, is a critical metric predicting whether the disease can invade the meta-population system. This paper presents the reproduction number for a generic disease vertically and horizontally transmitted among multiple species in heterogeneous networks, where nodes are locations, and links reflect outgoing or incoming movement flows. The metapopulation model for vertically and horizontally transmitted diseases is gradually formulated from two-species, two-node network models. We derived an explicit expression of R0, which is the spectral radius of a matrix reduced in size with respect to the original next generation matrix. The reproduction number is shown to be a function of vertical and horizontal transmission parameters, and the lower bound is the reproduction number for horizontal transmission. As an application, the reproduction number and its bounds are derived for the Rift Valley fever zoonosis, where livestock, mosquitoes, and humans are the involved species. By computing the reproduction number for different scenarios through numerical simulations, we found the reproduction number is affected by livestock movement rates only when parameters are heterogeneous across nodes. To summarize, our study contributes the reproduction number for vertically and horizontally transmitted diseases in heterogeneous networks. This explicit expression is easily adaptable to specific infectious diseases, affording insights into disease evolution. Copyright © 2013 Elsevier Inc. All rights reserved.
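Since R0 is defined here as the spectral radius of a (reduced) next generation matrix, a small numerical sketch may help. The code below estimates the spectral radius of a nonnegative matrix by power iteration. The 2x2 matrix `K` is an invented example, not a matrix from the paper, and the function name is hypothetical.

```python
def spectral_radius(K, iters=500):
    """Estimate the dominant eigenvalue of a nonnegative square matrix.

    Uses power iteration with max-norm scaling. This converges when the
    matrix has a single dominant eigenvalue (for example, when all entries
    are positive).
    """
    n = len(K)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(abs(x) for x in w)
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]
    return rho


if __name__ == "__main__":
    # Hypothetical next generation matrix: entry (i, j) is the expected number
    # of new infections in group i caused by one infected individual in group j.
    K = [[0.5, 1.0],
         [0.3, 0.2]]
    r0 = spectral_radius(K)
    print(r0, "invasion possible" if r0 > 1 else "disease dies out")
```

For this illustrative matrix the eigenvalues solve x^2 - 0.7x - 0.2 = 0, so the spectral radius is (0.7 + sqrt(1.29)) / 2, which is below 1.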

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ali, Amjad Majid; Albert, Don; Andersson, Par

SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small computer clusters. As a cluster resource manager, SLURM has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.
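The three functions can be pictured with a toy model. The sketch below is not SLURM and does not use any SLURM API. The class and method names are hypothetical, and the scheduling policy (strict first-in, first-out with exclusive node allocation) is an assumption chosen for brevity.

```python
from collections import deque


class ToyResourceManager:
    """Toy model of allocation, job tracking, and a pending queue."""

    def __init__(self, node_names):
        self.free = set(node_names)   # nodes not allocated to any job
        self.running = {}             # job id -> set of allocated nodes
        self.pending = deque()        # FIFO queue of (job id, nodes requested)

    def submit(self, job_id, n_nodes):
        """Queue a request, then start whatever can run."""
        self.pending.append((job_id, n_nodes))
        self._schedule()

    def finish(self, job_id):
        """Release a job's nodes, then start whatever can run."""
        self.free |= self.running.pop(job_id)
        self._schedule()

    def _schedule(self):
        # Start jobs strictly in queue order while enough nodes are free.
        # A request larger than the whole cluster would block the queue here.
        while self.pending and self.pending[0][1] <= len(self.free):
            job_id, n_nodes = self.pending.popleft()
            self.running[job_id] = {self.free.pop() for _ in range(n_nodes)}


if __name__ == "__main__":
    rm = ToyResourceManager(["n1", "n2", "n3", "n4"])
    rm.submit("a", 3)   # starts immediately on three nodes
    rm.submit("b", 2)   # waits: only one node is free
    print(sorted(rm.running), list(rm.pending))
    rm.finish("a")      # frees three nodes, so "b" can start
    print(sorted(rm.running), list(rm.pending))
```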
Generic Divide and Conquer Internet-Based Computing

NASA Technical Reports Server (NTRS)

Radenski, Atanas; Follen, Gregory J. (Technical Monitor)

2001-01-01

The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
Extended field intensity modulated radiation therapy with concomitant boost for lymph node-positive cervical cancer: analysis of regional control and recurrence patterns in the positron emission tomography/computed tomography era.

PubMed

Vargo, John A; Kim, Hayeon; Choi, Serah; Sukumvanich, Paniti; Olawaiye, Alexander B; Kelley, Joseph L; Edwards, Robert P; Comerci, John T; Beriwal, Sushil

2014-12-01

Positron emission tomography/computed tomography (PET/CT) is commonly used for nodal staging in locally advanced cervical cancer; however, the false negative rate for para-aortic disease is 20% to 25% in PET-positive pelvic nodal disease. Unless surgically staged, pelvis-only treatment may undertreat para-aortic disease. We have treated patients with PET-positive nodes with extended field intensity modulated radiation therapy (IMRT) to address the para-aortic region prophylactically with concomitant boost to involved nodes. The purpose of this study was to assess regional control rates and recurrence patterns. Sixty-one patients with cervical cancer (stage IB1-IVA) diagnosed from 2003 to 2012 with PET-avid pelvic nodes treated with extended field IMRT (45 Gy in 25 fractions with concomitant boost to involved nodes to a median of 55 Gy in 25 fractions) with concurrent cisplatin and brachytherapy were retrospectively analyzed. The nodal location was pelvis-only in 41 patients (67%) and pelvis + para-aortic in 20 patients (33%). There were a total of 179 nodes, with a median number of positive nodes of 2 (range, 1-16 nodes) per patient and a median nodal size of 1.8 cm (range, 0.7-4.5 cm). Response was assessed by PET/CT at 12 to 16 weeks. Complete clinical and imaging response at the first follow-up visit was seen in 77% of patients. At a mean follow-up time of 29 months (range, 3-116 months), 8 patients experienced recurrence. The sites of persistent/recurrent disease were as follows: cervix 10 (16.3%), regional nodes 3 (4.9%), and distant 14 (23%). The rate of para-aortic failure in patients with pelvic-only nodes was 2.5%. There were no significant differences in recurrence patterns by the number/location of nodes, largest node size, or maximum node standardized uptake value. The rate of late grade 3+ adverse events was 4%. Extended field IMRT was well tolerated and resulted in low regional recurrence in node-positive cervical cancer.
The dose of 55 Gy in 25 fractions was effective in eradicating disease in involved nodes, with acceptable late adverse events. Distant metastasis is the predominant mode of failure, and the OUTBACK trial may challenge the presented paradigms. Copyright © 2014 Elsevier Inc. All rights reserved.
11C-Choline-Pet Guided Stereotactic Body Radiation Therapy for Lymph Node Metastases in Oligometastatic Prostate Cancer.

PubMed

Franzese, Ciro; Lopci, Egesta; Di Brina, Lucia; D'Agostino, Giuseppe Roberto; Navarria, Pierina; Mancosu, Pietro; Tomatis, Stefano; Chiti, Arturo; Scorsetti, Marta

2017-10-21

The aim is to evaluate the outcome of 11C-Choline-PET guided SBRT on lymph node metastases. Patients with 1-4 lymph node metastases detected by 11C-choline-PET were treated with SBRT. Toxicity, control of treated metastases, and progression-free survival were computed. In twenty-six patients, 38 lymph node metastases were irradiated. No grade ≥ 2 toxicity was observed. Median PSA nadir after RT was 1.02 ng/mL. Post-treatment 11C-Choline-PET showed metabolic complete response in 17 metastases (44.7%) and partial response in 9 metastases (38%). SBRT is effective and safe for lymph node metastases. PET is important in identification of gross tumor and evaluation of the response.
In-situ trainable intrusion detection system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Symons, Christopher T.; Beaver, Justin M.; Gillen, Rob

A computer implemented method detects intrusions using a computer by analyzing network traffic. The method includes a semi-supervised learning module connected to a network node. The learning module uses labeled and unlabeled data to train a semi-supervised machine learning sensor. The method records events that include a feature set made up of unauthorized intrusions and benign computer requests. The method identifies at least some of the benign computer requests that occur during the recording of the events while treating the remainder of the data as unlabeled. The method trains the semi-supervised learning module at the network node in-situ, such that the semi-supervised learning modules may identify malicious traffic without relying on specific rules, signatures, or anomaly detection.
A Secure Scheme for Distributed Consensus Estimation against Data Falsification in Heterogeneous Wireless Sensor Networks.

PubMed

Mi, Shichao; Han, Hui; Chen, Cailian; Yan, Jian; Guan, Xinping

2016-02-19

Heterogeneous wireless sensor networks (HWSNs) can achieve more tasks and prolong the network lifetime. However, they are vulnerable to attacks from the environment or malicious nodes. This paper is concerned with the issues of a consensus secure scheme in HWSNs consisting of two types of sensor nodes. Sensor nodes (SNs) have more computation power, while relay nodes (RNs) with low power can only transmit information for sensor nodes. To address the security issues of distributed estimation in HWSNs, we apply the heterogeneity of responsibilities between the two types of sensors and then propose a parameter adjusted-based consensus scheme (PACS) to mitigate the effect of the malicious node. Finally, the convergence property is proven to be guaranteed, and the simulation results validate the effectiveness and efficiency of PACS.
A Hybrid Scheme for Fine-Grained Search and Access Authorization in Fog Computing Environment

PubMed Central

Xiao, Min; Zhou, Jing; Liu, Xuejiao; Jiang, Mingda

2017-01-01

In the fog computing environment, the encrypted sensitive data may be transferred to multiple fog nodes on the edge of a network for low latency; thus, fog nodes need to implement a search over encrypted data as a cloud server. Since the fog nodes tend to provide service for IoT applications often running on resource-constrained end devices, it is necessary to design lightweight solutions. At present, there is little research on this issue. In this paper, we propose a fine-grained owner-forced data search and access authorization scheme spanning user-fog-cloud for resource constrained end users. Compared to existing schemes only supporting either index encryption with search ability or data encryption with fine-grained access control ability, the proposed hybrid scheme supports both abilities simultaneously, and index ciphertext and data ciphertext are constructed based on a single ciphertext-policy attribute based encryption (CP-ABE) primitive and share the same key pair, thus the data access efficiency is significantly improved and the cost of key management is greatly reduced. Moreover, in the proposed scheme, the resource constrained end devices are allowed to rapidly assemble ciphertexts online and securely outsource most of decryption task to fog nodes, and mediated encryption mechanism is also adopted to achieve instantaneous user revocation instead of re-encrypting ciphertexts with many copies in many fog nodes. The security and the performance analysis show that our scheme is suitable for a fog computing environment. PMID:28629131
Calibrated birth-death phylogenetic time-tree priors for bayesian inference.

PubMed

Heled, Joseph; Drummond, Alexei J

2015-05-01

Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
A Hybrid Scheme for Fine-Grained Search and Access Authorization in Fog Computing Environment.

PubMed

Xiao, Min; Zhou, Jing; Liu, Xuejiao; Jiang, Mingda

2017-06-17

In the fog computing environment, the encrypted sensitive data may be transferred to multiple fog nodes on the edge of a network for low latency; thus, fog nodes need to implement a search over encrypted data as a cloud server. Since the fog nodes tend to provide service for IoT applications often running on resource-constrained end devices, it is necessary to design lightweight solutions. At present, there is little research on this issue. In this paper, we propose a fine-grained owner-forced data search and access authorization scheme spanning user-fog-cloud for resource constrained end users. Compared to existing schemes only supporting either index encryption with search ability or data encryption with fine-grained access control ability, the proposed hybrid scheme supports both abilities simultaneously, and index ciphertext and data ciphertext are constructed based on a single ciphertext-policy attribute based encryption (CP-ABE) primitive and share the same key pair, thus the data access efficiency is significantly improved and the cost of key management is greatly reduced. Moreover, in the proposed scheme, the resource constrained end devices are allowed to rapidly assemble ciphertexts online and securely outsource most of decryption task to fog nodes, and mediated encryption mechanism is also adopted to achieve instantaneous user revocation instead of re-encrypting ciphertexts with many copies in many fog nodes. The security and the performance analysis show that our scheme is suitable for a fog computing environment.
The Father Christmas worm

NASA Technical Reports Server (NTRS)

Green, James L.; Sisson, Patricia L.

1989-01-01

Given here is an overview analysis of the Father Christmas Worm, a computer worm that was released onto the DECnet Internet three days before Christmas 1988. The purpose behind the worm was to send an electronic mail message to all users on the computer system running the worm. The message was a Christmas greeting and was signed 'Father Christmas'. From the investigation, it was determined that the worm was released from a computer (node number 20597::) at a university in Switzerland. The worm was designed to travel quickly. Estimates are that it was copied to over 6,000 computer nodes. However, it was believed to have executed on only a fraction of those computers. Within ten minutes after it was released, the worm was detected at the Space Physics Analysis Network (SPAN), NASA's largest space and Earth science network. Once the source program was captured, a procedural cure, using the existing functionality of the computer operating systems, was quickly devised and distributed. A combination of existing computer security measures, the quick and accurate procedures devised to stop copies of the worm from executing, and the network itself, were used to rapidly provide the cure. These were the main reasons why the worm executed on such a small percentage of nodes. This overview of the analysis of the events concerning the worm is based on an investigation made by the SPAN Security Team and provides some insight into future security measures that will be taken to handle computer worms and viruses that may hit similar networks.
The Radio Frequency Health Node Wireless Sensor System

NASA Technical Reports Server (NTRS)

Valencia, J. Emilio; Stanley, Priscilla C.; Mackey, Paul J.

2009-01-01

The Radio Frequency Health Node (RFHN) wireless sensor system differs from other wireless sensor systems in ways originally intended to enhance utility as an instrumentation system for a spacecraft. The RFHN can also be adapted to use in terrestrial applications in which there are requirements for operational flexibility and integrability into higher-level instrumentation and data acquisition systems. As shown in the figure, the heart of the system is the RFHN, which is a unit that passes commands and data between (1) one or more commercially available wireless sensor units (optionally, also including wired sensor units) and (2) command and data interfaces with a local control computer that may be part of the spacecraft or other engineering system in which the wireless sensor system is installed. In turn, the local control computer can be in radio or wire communication with a remote control computer that may be part of a higher-level system. The remote control computer, acting via the local control computer and the RFHN, can not only monitor readout data from the sensor units but can also remotely configure (program or reprogram) the RFHN and the sensor units during operation. In a spacecraft application, the RFHN and the sensor units can also be configured more nearly directly, prior to launch, via a serial interface that includes an umbilical cable between the spacecraft and ground support equipment. In either case, the RFHN wireless sensor system has the flexibility to be configured, as required, with different numbers and types of sensors for different applications. The RFHN can be used to effect realtime transfer of data from, and commands to, the wireless sensor units. It can also store data for later retrieval by an external computer. The RFHN communicates with the wireless sensor units via a radio transceiver module.
The modular design of the RFHN makes it possible to add radio transceiver modules as needed to accommodate additional sets of wireless sensor units. The RFHN includes a core module that performs generic computer functions, including management of power and input, output, processing, and storage of data. In a typical application, the processing capabilities in the RFHN are utilized to perform preprocessing, trending, and fusion of sensor data. The core module also serves as the unit through which the remote control computer configures the sensor units and the rest of the RFHN.
Peregrine Queue Changes | High-Performance Computing | NREL

Science.gov Websites

that the best path is to disable the large queue and move the nodes from the "large" queue to jobs that request a large number of nodes. The large queue was disabled during the October System time
SINDA, Systems Improved Numerical Differencing Analyzer

NASA Technical Reports Server (NTRS)

Fink, L. C.; Pan, H. M. Y.; Ishimoto, T.

1972-01-01

A computer program has been written to analyze a group of 100-node areas and then provide for summation of any number of 100-node areas to obtain a temperature profile. SINDA program options offer the user a variety of methods for solution of thermal analog models presented in network format.
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.

1994-01-01

The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
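As a rough illustration of the static domain decomposition mentioned in this abstract, the sketch below splits a range of grid points into contiguous blocks, one per processing node. The function name `block_partition`, the one-dimensional layout, and the reading of "256 K" as 256 x 1024 are assumptions made for illustration; the abstract does not say how the authors actually partitioned the domain.

```python
def block_partition(n_points, n_nodes):
    """Split n_points grid indices into n_nodes contiguous (start, stop) ranges.

    Hypothetical helper: the first (n_points % n_nodes) nodes receive one
    extra point so that block sizes differ by at most one.
    """
    base, extra = divmod(n_points, n_nodes)
    ranges = []
    start = 0
    for rank in range(n_nodes):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Assumed reading of the abstract's figures: 256 K grid points on 512 nodes.
parts = block_partition(256 * 1024, 512)
points_per_node = parts[0][1] - parts[0][0]
```

Because the partition is fixed before the solve begins, each node in an SPMD program can compute its own range from its rank alone, without communicating with the others.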
A simple node and conductor data generator for SINDA

NASA Technical Reports Server (NTRS)

Gottula, Ronald R.

1992-01-01

This paper presents a simple, automated method to generate NODE and CONDUCTOR DATA for thermal math models. The method uses personal computer spreadsheets to create SINDA inputs. It was developed in order to make SINDA modeling less time-consuming and serves as an alternative to graphical methods. Anyone having some experience using a personal computer can easily implement this process. The user develops spreadsheets to automatically calculate capacitances and conductances based on material properties and dimensional data. The necessary node and conductor information is then taken from the spreadsheets and automatically arranged into the proper format, ready for insertion directly into the SINDA model. This technique provides a number of benefits to the SINDA user such as a reduction in the number of hand calculations, and an ability to very quickly generate a parametric set of NODE and CONDUCTOR DATA blocks. It also provides advantages over graphical thermal modeling systems by retaining the analyst's complete visibility into the thermal network, and by permitting user comments anywhere within the DATA blocks.
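The spreadsheet calculation described in this abstract can be approximated in a few lines of code. The sketch below assumes the standard lumped-parameter relations, capacitance C = density x specific heat x volume and conduction conductance G = k x A / L, applied to a uniform bar divided into equal segments. The names `Material` and `bar_network`, the material values, and the tuple layout are hypothetical; this is not the paper's spreadsheet and it does not reproduce actual SINDA input syntax.

```python
from dataclasses import dataclass


@dataclass
class Material:
    density: float        # kg/m^3
    specific_heat: float  # J/(kg K)
    conductivity: float   # W/(m K)


def capacitance(mat, volume):
    """Lumped thermal capacitance of a node, in J/K."""
    return mat.density * mat.specific_heat * volume


def conductance(mat, area, length):
    """Linear conduction conductance between two node centers, in W/K."""
    return mat.conductivity * area / length


def bar_network(mat, n_nodes, length, area, t_init=20.0):
    """Build node and conductor records for a uniform bar (illustrative layout).

    Nodes are (node_id, initial_temperature, capacitance).
    Conductors are (conductor_id, node_a, node_b, conductance).
    """
    dx = length / n_nodes
    nodes = [(i + 1, t_init, capacitance(mat, area * dx))
             for i in range(n_nodes)]
    conductors = [(i + 1, i + 1, i + 2, conductance(mat, area, dx))
                  for i in range(n_nodes - 1)]
    return nodes, conductors


# Round, made-up property values chosen only to keep the arithmetic readable.
sample = Material(density=2700.0, specific_heat=900.0, conductivity=200.0)
nodes, conductors = bar_network(sample, n_nodes=4, length=1.0, area=0.01)
```

Changing `n_nodes` or the material and rerunning gives a new parametric set of records, which is the kind of quick regeneration the abstract attributes to the spreadsheet approach.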
The Japan Lung Cancer Society-Japanese Society for Radiation Oncology consensus-based computed tomographic atlas for defining regional lymph node stations in radiotherapy for lung cancer.

PubMed

Itazawa, Tomoko; Tamaki, Yukihisa; Komiyama, Takafumi; Nishimura, Yasumasa; Nakayama, Yuko; Ito, Hiroyuki; Ohde, Yasuhisa; Kusumoto, Masahiko; Sakai, Shuji; Suzuki, Kenji; Watanabe, Hirokazu; Asamura, Hisao

2017-01-01

The purpose of this study was to develop a consensus-based computed tomographic (CT) atlas that defines lymph node stations in radiotherapy for lung cancer based on the lymph node map of the International Association for the Study of Lung Cancer (IASLC). A project group in the Japanese Radiation Oncology Study Group (JROSG) initially prepared a draft of the atlas in which lymph node Stations 1-11 were illustrated on axial CT images. Subsequently, a joint committee of the Japan Lung Cancer Society (JLCS) and the Japanese Society for Radiation Oncology (JASTRO) was formulated to revise this draft. The committee consisted of four radiation oncologists, four thoracic surgeons and three thoracic radiologists. The draft prepared by the JROSG project group was intensively reviewed and discussed at four meetings of the committee over several months. Finally, we proposed definitions for the regional lymph node stations and the consensus-based CT atlas. This atlas was approved by the Board of Directors of JLCS and JASTRO. This resulted in the first official CT atlas for defining regional lymph node stations in radiotherapy for lung cancer authorized by the JLCS and JASTRO. In conclusion, the JLCS-JASTRO consensus-based CT atlas, which conforms to the IASLC lymph node map, was established. © The Author 2016. Published by Oxford University Press on behalf of The Japan Radiation Research Society and Japanese Society for Radiation Oncology.
Solar Array Panels With Dust-Removal Capability

NASA Technical Reports Server (NTRS)

Dawson, Stephen; Mardesich, Nick; Spence, Brian; White, Steve

2004-01-01

It has been proposed to incorporate piezoelectric vibrational actuators into the structural supports of solar photovoltaic panels, for the purpose of occasionally inducing vibrations in the panels in order to loosen accumulated dust. Provided that the panels were tilted, the loosened dust would slide off under its own weight. Originally aimed at preventing obscuration of photovoltaic cells by dust accumulating in the Martian environment, the proposal may also offer an option for the design of solar photovoltaic panels for unattended operation at remote locations on Earth. The figure depicts a typical lightweight solar photovoltaic panel comprising a backside grid of structural spars that support a thin face sheet that, in turn, supports an array of photovoltaic cells on the front side. The backside structure includes node points where several spars intersect. According to the proposal, piezoelectric buzzers would be attached to the node points. The process of designing the panel would be an iterative one that would include computational simulation of the vibrations by use of finite-element analysis to guide the selection of the vibrational frequency of the actuators and the cross sections of the spars to maximize the agitation of dust.
Intelligent self-organization methods for wireless ad hoc sensor networks based on limited resources

NASA Astrophysics Data System (ADS)

Hortos, William S.

2006-05-01

A wireless ad hoc sensor network (WSN) is a configuration for area surveillance that affords rapid, flexible deployment in arbitrary threat environments. There is no infrastructure support and sensor nodes communicate with each other only when they are in transmission range. To a greater degree than the terminals found in mobile ad hoc networks (MANETs) for communications, sensor nodes are resource-constrained, with limited computational processing, bandwidth, memory, and power, and are typically unattended once in operation. Consequently, the level of information exchange among nodes needed to support any complex adaptive algorithms that establish network connectivity and optimize throughput not only depletes those limited resources and creates high overhead in narrowband communications, but also increases network vulnerability to eavesdropping by malicious nodes. Cooperation among nodes, critical to the mission of sensor networks, can thus be disrupted by the inappropriate choice of the method for self-organization. Recently published contributions to the self-configuration of ad hoc sensor networks, e.g., self-organizing mapping and swarm intelligence techniques, have been based on the adaptive control of the cross-layer interactions found in MANET protocols to achieve one or more performance objectives: connectivity, intrusion resistance, power control, throughput, and delay. However, few studies have examined the performance of these algorithms when implemented with the limited resources of WSNs. In this paper, self-organization algorithms for the initiation, operation and maintenance of a network topology from a collection of wireless sensor nodes are proposed that improve the performance metrics significant to WSNs. The intelligent algorithm approach emphasizes low computational complexity, energy efficiency and robust adaptation to change, allowing distributed implementation with the actual limited resources of the cooperative nodes of the network.
Extensions of the algorithms from flat topologies to two-tier hierarchies of sensor nodes are presented. Results from a few simulations of the proposed algorithms are compared to the published results of other approaches to sensor network self-organization in common scenarios. The estimated network lifetime and extent under static resource allocations are computed.
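The abstract's starting constraint, that sensor nodes can communicate only when they are within transmission range, can be made concrete with a small connectivity check. The sketch below is not the paper's self-organization algorithm. It only builds the range-limited neighbor graph that any such algorithm would start from and tests whether that graph is connected. The function names, the two-dimensional coordinates, and the single shared range `tx_range` are assumptions for illustration.

```python
import math
from collections import deque


def neighbors_in_range(positions, tx_range):
    """Map each node index to the set of nodes within tx_range of it."""
    adj = {i: set() for i in range(len(positions))}
    for i, (xi, yi) in enumerate(positions):
        for j in range(i + 1, len(positions)):
            xj, yj = positions[j]
            if math.hypot(xi - xj, yi - yj) <= tx_range:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def is_connected(adj):
    """Breadth-first search from node 0; True if every node is reachable."""
    if not adj:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(adj)


# Hypothetical layout: three nodes close together and one outlier.
layout = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
short_range = neighbors_in_range(layout, tx_range=1.5)
long_range = neighbors_in_range(layout, tx_range=8.0)
```

With the shorter range the outlier has no neighbors and the network is partitioned; raising the range restores connectivity at the cost of more transmit power, which is the kind of trade-off the abstract's resource constraints imply.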
The explicit computation of integration algorithms and first integrals for ordinary differential equations with polynomials coefficients using trees

NASA Technical Reports Server (NTRS)

Crouch, P. E.; Grossman, Robert

1992-01-01

This note is concerned with the explicit symbolic computation of expressions involving differential operators and their actions on functions. The derivation of specialized numerical algorithms, the explicit symbolic computation of integrals of motion, and the explicit computation of normal forms for nonlinear systems all require such computations. More precisely, if R = k(x_1, ..., x_N), where k = R or C, F denotes a differential operator with coefficients from R, and g ∈ R, we describe data structures and algorithms for efficiently computing g. The basic idea is to impose a multiplicative structure on the vector space whose basis is the set of finite rooted trees with nodes labeled with the coefficients of the differential operators. Cancellation of two trees with r + 1 nodes translates into cancellation of O(N^r) expressions involving the coefficient functions and their derivatives.
Integration of communications and tracking data processing simulation for space station

NASA Technical Reports Server (NTRS)

Lacovara, Robert C.

1987-01-01

A simplified model of the communications network for the Communications and Tracking Data Processing System (CTDP) was developed. It was simulated by use of programs running on several on-site computers. These programs communicate with one another by means of both local area networks and direct serial connections. The domain of the model and its simulation is from Orbital Replaceable Unit (ORU) interface to Data Management Systems (DMS). The simulation was designed to allow status queries from remote entities across the DMS networks to be propagated through the model to several simulated ORUs. The ORU response is then propagated back to the remote entity that originated the request. Response times at the various levels were investigated in a multi-tasking, multi-user operating system environment. Results indicate that the effective bandwidth of the system may be too low to support expected data volume requirements under conventional operating systems. Instead, some form of embedded process control program may be required on the node computers.
