XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Mid-year report FY17 Q2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Pugmire, David; Rogers, David
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY17.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Pugmire, David; Rogers, David
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem. Mid-year report FY16 Q2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY15 Q4.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
EPA uses high-end scientific computing, geospatial services and remote sensing/imagery analysis to support EPA's mission. The Center for Environmental Computing (CEC) assists the Agency's program offices and regions to meet staff needs in these areas.
Scholarly literature and the press: scientific impact and social perception of physics computing
NASA Astrophysics Data System (ADS)
Pia, M. G.; Basaglia, T.; Bell, Z. W.; Dressendorfer, P. V.
2014-06-01
The broad coverage of the search for the Higgs boson in the mainstream media is a relative novelty for high energy physics (HEP) research, whose achievements have traditionally been limited to scholarly literature. This paper illustrates the results of a scientometric analysis of HEP computing in scientific literature, institutional media and the press, and a comparative overview of similar metrics concerning representative particle physics measurements. The picture emerging from these scientometric data documents the relationship between the scientific impact and the social perception of HEP physics research versus that of HEP computing. The results of this analysis suggest that improved communication of the scientific and social role of HEP computing via press releases from the major HEP laboratories would be beneficial to the high energy physics community.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geveci, Berk; Maynard, Robert
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. The XVis project brought together collaborators from predominant DOE projects for visualization on accelerators and combining their respectivemore » features into a new visualization toolkit called VTK-m.« less
NASA Astrophysics Data System (ADS)
Khodachenko, Maxim; Miller, Steven; Stoeckler, Robert; Topf, Florian
2010-05-01
Computational modeling and observational data analysis are two major aspects of the modern scientific research. Both appear nowadays under extensive development and application. Many of the scientific goals of planetary space missions require robust models of planetary objects and environments as well as efficient data analysis algorithms, to predict conditions for mission planning and to interpret the experimental data. Europe has great strength in these areas, but it is insufficiently coordinated; individual groups, models, techniques and algorithms need to be coupled and integrated. Existing level of scientific cooperation and the technical capabilities for operative communication, allow considerable progress in the development of a distributed international Research Infrastructure (RI) which is based on the existing in Europe computational modelling and data analysis centers, providing the scientific community with dedicated services in the fields of their computational and data analysis expertise. These services will appear as a product of the collaborative communication and joint research efforts of the numerical and data analysis experts together with planetary scientists. The major goal of the EUROPLANET-RI / EMDAF is to make computational models and data analysis algorithms associated with particular national RIs and teams, as well as their outputs, more readily available to their potential user community and more tailored to scientific user requirements, without compromising front-line specialized research on model and data analysis algorithms development and software implementation. This objective will be met through four keys subdivisions/tasks of EMAF: 1) an Interactive Catalogue of Planetary Models; 2) a Distributed Planetary Modelling Laboratory; 3) a Distributed Data Analysis Laboratory, and 4) enabling Models and Routines for High Performance Computing Grids. Using the advantages of the coordinated operation and efficient communication between the involved computational modelling, research and data analysis expert teams and their related research infrastructures, EMDAF will provide a 1) flexible, 2) scientific user oriented, 3) continuously developing and fast upgrading computational and data analysis service to support and intensify the European planetary scientific research. At the beginning EMDAF will create a set of demonstrators and operational tests of this service in key areas of European planetary science. This work will aim at the following objectives: (a) Development and implementation of tools for distant interactive communication between the planetary scientists and computing experts (including related RIs); (b) Development of standard routine packages, and user-friendly interfaces for operation of the existing numerical codes and data analysis algorithms by the specialized planetary scientists; (c) Development of a prototype of numerical modelling services "on demand" for space missions and planetary researchers; (d) Development of a prototype of data analysis services "on demand" for space missions and planetary researchers; (e) Development of a prototype of coordinated interconnected simulations of planetary phenomena and objects (global multi-model simulators); (f) Providing the demonstrators of a coordinated use of high performance computing facilities (super-computer networks), done in cooperation with European HPC Grid DEISA.
Hahn, P; Dullweber, F; Unglaub, F; Spies, C K
2014-06-01
Searching for relevant publications is becoming more difficult with the increasing number of scientific articles. Text mining as a specific form of computer-based data analysis may be helpful in this context. Highlighting relations between authors and finding relevant publications concerning a specific subject using text analysis programs are illustrated graphically by 2 performed examples. © Georg Thieme Verlag KG Stuttgart · New York.
ERIC Educational Resources Information Center
Tuncer, Murat
2013-01-01
Present research investigates reciprocal relations amidst computer self-efficacy, scientific research and information literacy self-efficacy. Research findings have demonstrated that according to standardized regression coefficients, computer self-efficacy has a positive effect on information literacy self-efficacy. Likewise it has been detected…
ERIC Educational Resources Information Center
Hansen, John; Barnett, Michael; MaKinster, James; Keating, Thomas
2004-01-01
The increased availability of computational modeling software has created opportunities for students to engage in scientific inquiry through constructing computer-based models of scientific phenomena. However, despite the growing trend of integrating technology into science curricula, educators need to understand what aspects of these technologies…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prowell, Stacy J; Symons, Christopher T
2015-01-01
Producing trusted results from high-performance codes is essential for policy and has significant economic impact. We propose combining rigorous analytical methods with machine learning techniques to achieve the goal of repeatable, trustworthy scientific computing.
The emergence of spatial cyberinfrastructure.
Wright, Dawn J; Wang, Shaowen
2011-04-05
Cyberinfrastructure integrates advanced computer, information, and communication technologies to empower computation-based and data-driven scientific practice and improve the synthesis and analysis of scientific data in a collaborative and shared fashion. As such, it now represents a paradigm shift in scientific research that has facilitated easy access to computational utilities and streamlined collaboration across distance and disciplines, thereby enabling scientific breakthroughs to be reached more quickly and efficiently. Spatial cyberinfrastructure seeks to resolve longstanding complex problems of handling and analyzing massive and heterogeneous spatial datasets as well as the necessity and benefits of sharing spatial data flexibly and securely. This article provides an overview and potential future directions of spatial cyberinfrastructure. The remaining four articles of the special feature are introduced and situated in the context of providing empirical examples of how spatial cyberinfrastructure is extending and enhancing scientific practice for improved synthesis and analysis of both physical and social science data. The primary focus of the articles is spatial analyses using distributed and high-performance computing, sensor networks, and other advanced information technology capabilities to transform massive spatial datasets into insights and knowledge.
The emergence of spatial cyberinfrastructure
Wright, Dawn J.; Wang, Shaowen
2011-01-01
Cyberinfrastructure integrates advanced computer, information, and communication technologies to empower computation-based and data-driven scientific practice and improve the synthesis and analysis of scientific data in a collaborative and shared fashion. As such, it now represents a paradigm shift in scientific research that has facilitated easy access to computational utilities and streamlined collaboration across distance and disciplines, thereby enabling scientific breakthroughs to be reached more quickly and efficiently. Spatial cyberinfrastructure seeks to resolve longstanding complex problems of handling and analyzing massive and heterogeneous spatial datasets as well as the necessity and benefits of sharing spatial data flexibly and securely. This article provides an overview and potential future directions of spatial cyberinfrastructure. The remaining four articles of the special feature are introduced and situated in the context of providing empirical examples of how spatial cyberinfrastructure is extending and enhancing scientific practice for improved synthesis and analysis of both physical and social science data. The primary focus of the articles is spatial analyses using distributed and high-performance computing, sensor networks, and other advanced information technology capabilities to transform massive spatial datasets into insights and knowledge. PMID:21467227
Space and Earth Sciences, Computer Systems, and Scientific Data Analysis Support, Volume 1
NASA Technical Reports Server (NTRS)
Estes, Ronald H. (Editor)
1993-01-01
This Final Progress Report covers the specific technical activities of Hughes STX Corporation for the last contract triannual period of 1 June through 30 Sep. 1993, in support of assigned task activities at Goddard Space Flight Center (GSFC). It also provides a brief summary of work throughout the contract period of performance on each active task. Technical activity is presented in Volume 1, while financial and level-of-effort data is presented in Volume 2. Technical support was provided to all Division and Laboratories of Goddard's Space Sciences and Earth Sciences Directorates. Types of support include: scientific programming, systems programming, computer management, mission planning, scientific investigation, data analysis, data processing, data base creation and maintenance, instrumentation development, and management services. Mission and instruments supported include: ROSAT, Astro-D, BBXRT, XTE, AXAF, GRO, COBE, WIND, UIT, SMM, STIS, HEIDI, DE, URAP, CRRES, Voyagers, ISEE, San Marco, LAGEOS, TOPEX/Poseidon, Pioneer-Venus, Galileo, Cassini, Nimbus-7/TOMS, Meteor-3/TOMS, FIFE, BOREAS, TRMM, AVHRR, and Landsat. Accomplishments include: development of computing programs for mission science and data analysis, supercomputer applications support, computer network support, computational upgrades for data archival and analysis centers, end-to-end management for mission data flow, scientific modeling and results in the fields of space and Earth physics, planning and design of GSFC VO DAAC and VO IMS, fabrication, assembly, and testing of mission instrumentation, and design of mission operations center.
NASA Astrophysics Data System (ADS)
Fiala, L.; Lokajicek, M.; Tumova, N.
2015-05-01
This volume of the IOP Conference Series is dedicated to scientific contributions presented at the 16th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2014), this year the motto was ''bridging disciplines''. The conference took place on September 1-5, 2014, at the Faculty of Civil Engineering, Czech Technical University in Prague, Czech Republic. The 16th edition of ACAT explored the boundaries of computing system architectures, data analysis algorithmics, automatic calculations, and theoretical calculation technologies. It provided a forum for confronting and exchanging ideas among these fields, where new approaches in computing technologies for scientific research were explored and promoted. This year's edition of the workshop brought together over 140 participants from all over the world. The workshop's 16 invited speakers presented key topics on advanced computing and analysis techniques in physics. During the workshop, 60 talks and 40 posters were presented in three tracks: Computing Technology for Physics Research, Data Analysis - Algorithms and Tools, and Computations in Theoretical Physics: Techniques and Methods. The round table enabled discussions on expanding software, knowledge sharing and scientific collaboration in the respective areas. ACAT 2014 was generously sponsored by Western Digital, Brookhaven National Laboratory, Hewlett Packard, DataDirect Networks, M Computers, Bright Computing, Huawei and PDV-Systemhaus. Special appreciations go to the track liaisons Lorenzo Moneta, Axel Naumann and Grigory Rubtsov for their work on the scientific program and the publication preparation. ACAT's IACC would also like to express its gratitude to all referees for their work on making sure the contributions are published in the proceedings. Our thanks extend to the conference liaisons Andrei Kataev and Jerome Lauret who worked with the local contacts and made this conference possible as well as to the program coordinator Federico Carminati and the conference chair Denis Perret-Gallix for their global supervision. Further information on ACAT 2014 can be found at http://www.particle.cz/acat2014
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schlicher, Bob G; Kulesz, James J; Abercrombie, Robert K
A principal tenant of the scientific method is that experiments must be repeatable and relies on ceteris paribus (i.e., all other things being equal). As a scientific community, involved in data sciences, we must investigate ways to establish an environment where experiments can be repeated. We can no longer allude to where the data comes from, we must add rigor to the data collection and management process from which our analysis is conducted. This paper describes a computing environment to support repeatable scientific big data experimentation of world-wide scientific literature, and recommends a system that is housed at the Oakmore » Ridge National Laboratory in order to provide value to investigators from government agencies, academic institutions, and industry entities. The described computing environment also adheres to the recently instituted digital data management plan mandated by multiple US government agencies, which involves all stages of the digital data life cycle including capture, analysis, sharing, and preservation. It particularly focuses on the sharing and preservation of digital research data. The details of this computing environment are explained within the context of cloud services by the three layer classification of Software as a Service , Platform as a Service , and Infrastructure as a Service .« less
Data, Analysis, and Visualization | Computational Science | NREL
Data, Analysis, and Visualization Data, Analysis, and Visualization Data management, data analysis . At NREL, our data management, data analysis, and scientific visualization capabilities help move the approaches to image analysis and computer vision. Data Management and Big Data Systems, software, and tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lingerfelt, Eric J; Endeve, Eirik; Hui, Yawei
Improvements in scientific instrumentation allow imaging at mesoscopic to atomic length scales, many spectroscopic modes, and now--with the rise of multimodal acquisition systems and the associated processing capability--the era of multidimensional, informationally dense data sets has arrived. Technical issues in these combinatorial scientific fields are exacerbated by computational challenges best summarized as a necessity for drastic improvement in the capability to transfer, store, and analyze large volumes of data. The Bellerophon Environment for Analysis of Materials (BEAM) platform provides material scientists the capability to directly leverage the integrated computational and analytical power of High Performance Computing (HPC) to perform scalablemore » data analysis and simulation and manage uploaded data files via an intuitive, cross-platform client user interface. This framework delivers authenticated, "push-button" execution of complex user workflows that deploy data analysis algorithms and computational simulations utilizing compute-and-data cloud infrastructures and HPC environments like Titan at the Oak Ridge Leadershp Computing Facility (OLCF).« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saffer, Shelley
2014-12-01
This is a final report of the DOE award DE-SC0001132, Advanced Artificial Science. The development of an artificial science and engineering research infrastructure to facilitate innovative computational modeling, analysis, and application to interdisciplinary areas of scientific investigation. This document describes the achievements of the goals, and resulting research made possible by this award.
OMPC: an Open-Source MATLAB®-to-Python Compiler
Jurica, Peter; van Leeuwen, Cees
2008-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB®, the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB®-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB® functions into Python programs. The imported MATLAB® modules will run independently of MATLAB®, relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB®. OMPC is available at http://ompc.juricap.com. PMID:19225577
Whole earth modeling: developing and disseminating scientific software for computational geophysics.
NASA Astrophysics Data System (ADS)
Kellogg, L. H.
2016-12-01
Historically, a great deal of specialized scientific software for modeling and data analysis has been developed by individual researchers or small groups of scientists working on their own specific research problems. As the magnitude of available data and computer power has increased, so has the complexity of scientific problems addressed by computational methods, creating both a need to sustain existing scientific software, and expand its development to take advantage of new algorithms, new software approaches, and new computational hardware. To that end, communities like the Computational Infrastructure for Geodynamics (CIG) have been established to support the use of best practices in scientific computing for solid earth geophysics research and teaching. Working as a scientific community enables computational geophysicists to take advantage of technological developments, improve the accuracy and performance of software, build on prior software development, and collaborate more readily. The CIG community, and others, have adopted an open-source development model, in which code is developed and disseminated by the community in an open fashion, using version control and software repositories like Git. One emerging issue is how to adequately identify and credit the intellectual contributions involved in creating open source scientific software. The traditional method of disseminating scientific ideas, peer reviewed publication, was not designed for review or crediting scientific software, although emerging publication strategies such software journals are attempting to address the need. We are piloting an integrated approach in which authors are identified and credited as scientific software is developed and run. Successful software citation requires integration with the scholarly publication and indexing mechanisms as well, to assign credit, ensure discoverability, and provide provenance for software.
Analysis of the flight dynamics of the Solar Maximum Mission (SMM) off-sun scientific pointing
NASA Technical Reports Server (NTRS)
Pitone, D. S.; Klein, J. R.
1989-01-01
Algorithms are presented which were created and implemented by the Goddard Space Flight Center's (GSFC's) Solar Maximum Mission (SMM) attitude operations team to support large-angle spacecraft pointing at scientific objectives. The mission objective of the post-repair SMM satellite was to study solar phenomena. However, because the scientific instruments, such as the Coronagraph/Polarimeter (CP) and the Hard X ray Burst Spectrometer (HXRBS), were able to view objects other than the Sun, attitude operations support for attitude pointing at large angles from the nominal solar-pointing attitudes was required. Subsequently, attitude support for SMM was provided for scientific objectives such as Comet Halley, Supernova 1987A, Cygnus X-1, and the Crab Nebula. In addition, the analysis was extended to include the reverse problem, computing the right ascension and declination of a body given the off-Sun angles. This analysis led to the computation of the orbits of seven new solar comets seen in the field-of-view (FOV) of the CP. The activities necessary to meet these large-angle attitude-pointing sequences, such as slew sequence planning, viewing-period prediction, and tracking-bias computation are described. Analysis is presented for the computation of maneuvers and pointing parameters relative to the SMM-unique, Sun-centered reference frame. Finally, science data and independent attitude solutions are used to evaluate the large-angle pointing performance.
Analysis of the flight dynamics of the Solar Maximum Mission (SMM) off-sun scientific pointing
NASA Technical Reports Server (NTRS)
Pitone, D. S.; Klein, J. R.; Twambly, B. J.
1990-01-01
Algorithms are presented which were created and implemented by the Goddard Space Flight Center's (GSFC's) Solar Maximum Mission (SMM) attitude operations team to support large-angle spacecraft pointing at scientific objectives. The mission objective of the post-repair SMM satellite was to study solar phenomena. However, because the scientific instruments, such as the Coronagraph/Polarimeter (CP) and the Hard X-ray Burst Spectrometer (HXRBS), were able to view objects other than the Sun, attitude operations support for attitude pointing at large angles from the nominal solar-pointing attitudes was required. Subsequently, attitude support for SMM was provided for scientific objectives such as Comet Halley, Supernova 1987A, Cygnus X-1, and the Crab Nebula. In addition, the analysis was extended to include the reverse problem, computing the right ascension and declination of a body given the off-Sun angles. This analysis led to the computation of the orbits of seven new solar comets seen in the field-of-view (FOV) of the CP. The activities necessary to meet these large-angle attitude-pointing sequences, such as slew sequence planning, viewing-period prediction, and tracking-bias computation are described. Analysis is presented for the computation of maneuvers and pointing parameters relative to the SMM-unique, Sun-centered reference frame. Finally, science data and independent attitude solutions are used to evaluate the larg-angle pointing performance.
Adaptation of XMM-Newton SAS to GRID and VO architectures via web
NASA Astrophysics Data System (ADS)
Ibarra, A.; de La Calle, I.; Gabriel, C.; Salgado, J.; Osuna, P.
2008-10-01
The XMM-Newton Scientific Analysis Software (SAS) is a robust software that has allowed users to produce good scientific results since the beginning of the mission. This has been possible given the SAS capability to evolve with the advent of new technologies and adapt to the needs of the scientific community. The prototype of the Remote Interface for Science Analysis (RISA) presented here, is one such example, which provides remote analysis of XMM-Newton data with access to all the existing SAS functionality, while making use of GRID computing technology. This new technology has recently emerged within the astrophysical community to tackle the ever lasting problem of computer power for the reduction of large amounts of data.
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deelman, Ewa; Carothers, Christopher; Mandal, Anirban
Here we report that computational science is well established as the third pillar of scientific discovery and is on par with experimentation and theory. However, as we move closer toward the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations alike, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of domain scientists. Therefore, workflow management systems are absolutely necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns the performance optimization of complex calculation andmore » data analysis tasks. The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.« less
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows
Deelman, Ewa; Carothers, Christopher; Mandal, Anirban; ...
2015-07-14
Here we report that computational science is well established as the third pillar of scientific discovery and is on par with experimentation and theory. However, as we move closer toward the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations alike, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of domain scientists. Therefore, workflow management systems are absolutely necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns the performance optimization of complex calculation andmore » data analysis tasks. The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.« less
The role of dedicated data computing centers in the age of cloud computing
NASA Astrophysics Data System (ADS)
Caramarcu, Costin; Hollowell, Christopher; Strecker-Kellogg, William; Wong, Antonio; Zaytsev, Alexandr
2017-10-01
Brookhaven National Laboratory (BNL) anticipates significant growth in scientific programs with large computing and data storage needs in the near future and has recently reorganized support for scientific computing to meet these needs. A key component is the enhanced role of the RHIC-ATLAS Computing Facility (RACF) in support of high-throughput and high-performance computing (HTC and HPC) at BNL. This presentation discusses the evolving role of the RACF at BNL, in light of its growing portfolio of responsibilities and its increasing integration with cloud (academic and for-profit) computing activities. We also discuss BNL’s plan to build a new computing center to support the new responsibilities of the RACF and present a summary of the cost benefit analysis done, including the types of computing activities that benefit most from a local data center vs. cloud computing. This analysis is partly based on an updated cost comparison of Amazon EC2 computing services and the RACF, which was originally conducted in 2012.
OMPC: an Open-Source MATLAB-to-Python Compiler.
Jurica, Peter; van Leeuwen, Cees
2009-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB((R)), the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB((R))-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB((R)) functions into Python programs. The imported MATLAB((R)) modules will run independently of MATLAB((R)), relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB((R)). OMPC is available at http://ompc.juricap.com.
NASA Astrophysics Data System (ADS)
Wang, Jianxiong
2014-06-01
This volume of Journal of Physics: Conference Series is dedicated to scientific contributions presented at the 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2013) which took place on 16-21 May 2013 at the Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, China. The workshop series brings together computer science researchers and practitioners, and researchers from particle physics and related fields to explore and confront the boundaries of computing and of automatic data analysis and theoretical calculation techniques. This year's edition of the workshop brought together over 120 participants from all over the world. 18 invited speakers presented key topics on the universe in computer, Computing in Earth Sciences, multivariate data analysis, automated computation in Quantum Field Theory as well as computing and data analysis challenges in many fields. Over 70 other talks and posters presented state-of-the-art developments in the areas of the workshop's three tracks: Computing Technologies, Data Analysis Algorithms and Tools, and Computational Techniques in Theoretical Physics. The round table discussions on open-source, knowledge sharing and scientific collaboration stimulate us to think over the issue in the respective areas. ACAT 2013 was generously sponsored by the Chinese Academy of Sciences (CAS), National Natural Science Foundation of China (NFSC), Brookhaven National Laboratory in the USA (BNL), Peking University (PKU), Theoretical Physics Cernter for Science facilities of CAS (TPCSF-CAS) and Sugon. We would like to thank all the participants for their scientific contributions and for the en- thusiastic participation in all its activities of the workshop. Further information on ACAT 2013 can be found at http://acat2013.ihep.ac.cn. Professor Jianxiong Wang Institute of High Energy Physics Chinese Academy of Science Details of committees and sponsors are available in the PDF
Stone, John E; Hallock, Michael J; Phillips, James C; Peterson, Joseph R; Luthey-Schulten, Zaida; Schulten, Klaus
2016-05-01
Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers.
Northwest Trajectory Analysis Capability: A Platform for Enhancing Computational Biophysics Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peterson, Elena S.; Stephan, Eric G.; Corrigan, Abigail L.
2008-07-30
As computational resources continue to increase, the ability of computational simulations to effectively complement, and in some cases replace, experimentation in scientific exploration also increases. Today, large-scale simulations are recognized as an effective tool for scientific exploration in many disciplines including chemistry and biology. A natural side effect of this trend has been the need for an increasingly complex analytical environment. In this paper, we describe Northwest Trajectory Analysis Capability (NTRAC), an analytical software suite developed to enhance the efficiency of computational biophysics analyses. Our strategy is to layer higher-level services and introduce improved tools within the user’s familiar environmentmore » without preventing researchers from using traditional tools and methods. Our desire is to share these experiences to serve as an example for effectively analyzing data intensive large scale simulation data.« less
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute- intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build multi-user hosting cloud computing infrastructure for GISpark. The virtual storage systems such as HDFS, Ceph, MongoDB are combined and adopted for spatiotemporal data storage management. Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also integrated with scientific computing environment (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark in conjunction with the scientific computing environment, exploratory spatial data analysis tools, temporal data management and analysis systems make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides spatiotemporal computational model and advanced geospatial visualization tools that deals with other domains related with spatial property. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.
Long-term Stable Conservative Multiscale Methods for Vortex Flows
2017-10-31
Computational and Applied Mathematics and Engeneering, Eccomas 2016 (Crete, June, 2016) - M. A. Olshanskii, Scientific computing seminar of Math ...UMass Dartmouth (October 2015) - L. Rebholz, Applied Math Seminar Talk, University of Alberta (October 2015) - L. Rebholz, Colloquium Talk, Scientific...Colloquium, (November 2016) - L. Rebholz, Joint Math Meetings 2017, Special session on recent advances in numerical analysis of PDEs, Atlanta GA
ERIC Educational Resources Information Center
Cottrell, William B.; And Others
The Nuclear Safety Information Center (NSIC) is a highly sophisticated scientific information center operated at Oak Ridge National Laboratory (ORNL) for the U.S. Atomic Energy Commission. Its information file, which consists of both data and bibliographic information, is computer stored and numerous programs have been developed to facilitate the…
Automated Detection of Events of Scientific Interest
NASA Technical Reports Server (NTRS)
James, Mark
2007-01-01
A report presents a slightly different perspective of the subject matter of Fusing Symbolic and Numerical Diagnostic Computations (NPO-42512), which appears elsewhere in this issue of NASA Tech Briefs. Briefly, the subject matter is the X-2000 Anomaly Detection Language, which is a developmental computing language for fusing two diagnostic computer programs one implementing a numerical analysis method, the other implementing a symbolic analysis method into a unified event-based decision analysis software system for real-time detection of events. In the case of the cited companion NASA Tech Briefs article, the contemplated events that one seeks to detect would be primarily failures or other changes that could adversely affect the safety or success of a spacecraft mission. In the case of the instant report, the events to be detected could also include natural phenomena that could be of scientific interest. Hence, the use of X- 2000 Anomaly Detection Language could contribute to a capability for automated, coordinated use of multiple sensors and sensor-output-data-processing hardware and software to effect opportunistic collection and analysis of scientific data.
1977-07-01
on an IBM 370/165 computer at The University of Kentucky using the Fortran IV, G level compiler and should be easily implemented on other computers...order as the columns of T. 3.5.3 Subroutines NROOT and EIGEN Subroutines NROOT and EIGEN are a set of subroutines from the IBM Scientific Subroutine...November 1975). [10] System/360 Scientific Subroutine Package, Version III, Fifth Edition (August 1970), IBM Corporation, Technical Publications
ERIC Educational Resources Information Center
Bekmezci, Mehmet; Celik, Ismail; Sahin, Ismail; Kiray, Ahmet; Akturk, Ahmet Oguz
2015-01-01
In this research, students' scientific attitude, computer anxiety, educational use of the Internet, academic achievement, and problematic use of the Internet are analyzed based on different variables (gender, parents' educational level and daily access to the Internet). The research group involves 361 students from two middle schools which are…
Computational science: shifting the focus from tools to models
Hinsen, Konrad
2014-01-01
Computational techniques have revolutionized many aspects of scientific research over the last few decades. Experimentalists use computation for data analysis, processing ever bigger data sets. Theoreticians compute predictions from ever more complex models. However, traditional articles do not permit the publication of big data sets or complex models. As a consequence, these crucial pieces of information no longer enter the scientific record. Moreover, they have become prisoners of scientific software: many models exist only as software implementations, and the data are often stored in proprietary formats defined by the software. In this article, I argue that this emphasis on software tools over models and data is detrimental to science in the long term, and I propose a means by which this can be reversed. PMID:25309728
Stone, John E.; Hallock, Michael J.; Phillips, James C.; Peterson, Joseph R.; Luthey-Schulten, Zaida; Schulten, Klaus
2016-01-01
Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers. PMID:27516922
Programmers, professors, and parasites: credit and co-authorship in computer science.
Solomon, Justin
2009-12-01
This article presents an in-depth analysis of past and present publishing practices in academic computer science to suggest the establishment of a more consistent publishing standard. Historical precedent for academic publishing in computer science is established through the study of anecdotes as well as statistics collected from databases of published computer science papers. After examining these facts alongside information about analogous publishing situations and standards in other scientific fields, the article concludes with a list of basic principles that should be adopted in any computer science publishing standard. These principles would contribute to the reliability and scientific nature of academic publications in computer science and would allow for more straightforward discourse in future publications.
Key Lessons in Building "Data Commons": The Open Science Data Cloud Ecosystem
NASA Astrophysics Data System (ADS)
Patterson, M.; Grossman, R.; Heath, A.; Murphy, M.; Wells, W.
2015-12-01
Cloud computing technology has created a shift around data and data analysis by allowing researchers to push computation to data as opposed to having to pull data to an individual researcher's computer. Subsequently, cloud-based resources can provide unique opportunities to capture computing environments used both to access raw data in its original form and also to create analysis products which may be the source of data for tables and figures presented in research publications. Since 2008, the Open Cloud Consortium (OCC) has operated the Open Science Data Cloud (OSDC), which provides scientific researchers with computational resources for storing, sharing, and analyzing large (terabyte and petabyte-scale) scientific datasets. OSDC has provided compute and storage services to over 750 researchers in a wide variety of data intensive disciplines. Recently, internal users have logged about 2 million core hours each month. The OSDC also serves the research community by colocating these resources with access to nearly a petabyte of public scientific datasets in a variety of fields also accessible for download externally by the public. In our experience operating these resources, researchers are well served by "data commons," meaning cyberinfrastructure that colocates data archives, computing, and storage infrastructure and supports essential tools and services for working with scientific data. In addition to the OSDC public data commons, the OCC operates a data commons in collaboration with NASA and is developing a data commons for NOAA datasets. As cloud-based infrastructures for distributing and computing over data become more pervasive, we ask, "What does it mean to publish data in a data commons?" Here we present the OSDC perspective and discuss several services that are key in architecting data commons, including digital identifier services.
Architectural Strategies for Enabling Data-Driven Science at Scale
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Law, E. S.; Doyle, R. J.; Little, M. M.
2017-12-01
The analysis of large data collections from NASA or other agencies is often executed through traditional computational and data analysis approaches, which require users to bring data to their desktops and perform local data analysis. Alternatively, data are hauled to large computational environments that provide centralized data analysis via traditional High Performance Computing (HPC). Scientific data archives, however, are not only growing massive, but are also becoming highly distributed. Neither traditional approach provides a good solution for optimizing analysis into the future. Assumptions across the NASA mission and science data lifecycle, which historically assume that all data can be collected, transmitted, processed, and archived, will not scale as more capable instruments stress legacy-based systems. New paradigms are needed to increase the productivity and effectiveness of scientific data analysis. This paradigm must recognize that architectural and analytical choices are interrelated, and must be carefully coordinated in any system that aims to allow efficient, interactive scientific exploration and discovery to exploit massive data collections, from point of collection (e.g., onboard) to analysis and decision support. The most effective approach to analyzing a distributed set of massive data may involve some exploration and iteration, putting a premium on the flexibility afforded by the architectural framework. The framework should enable scientist users to assemble workflows efficiently, manage the uncertainties related to data analysis and inference, and optimize deep-dive analytics to enhance scalability. In many cases, this "data ecosystem" needs to be able to integrate multiple observing assets, ground environments, archives, and analytics, evolving from stewardship of measurements of data to using computational methodologies to better derive insight from the data that may be fused with other sets of data. This presentation will discuss architectural strategies, including a 2015-2016 NASA AIST Study on Big Data, for evolving scientific research towards massively distributed data-driven discovery. It will include example use cases across earth science, planetary science, and other disciplines.
Climate Change Discourse in Mass Media: Application of Computer-Assisted Content Analysis
ERIC Educational Resources Information Center
Kirilenko, Andrei P.; Stepchenkova, Svetlana O.
2012-01-01
Content analysis of mass media publications has become a major scientific method used to analyze public discourse on climate change. We propose a computer-assisted content analysis method to extract prevalent themes and analyze discourse changes over an extended period in an objective and quantifiable manner. The method includes the following: (1)…
A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.
2014-12-01
Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distribution of scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth Science observing satellites and the magnitude of data from climate model output, is predicted to grow into the tens of petabytes challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges. Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while managing the uncertainties of scientific conclusions derived from such capabilities. This talk will provide an overview of JPL's efforts in developing a comprehensive architectural approach to data science.
Big data computing: Building a vision for ARS information management
USDA-ARS?s Scientific Manuscript database
Improvements are needed within the ARS to increase scientific capacity and keep pace with new developments in computer technologies that support data acquisition and analysis. Enhancements in computing power and IT infrastructure are needed to provide scientists better access to high performance com...
Performance analysis of a dual-tree algorithm for computing spatial distance histograms
Chen, Shaoping; Tu, Yi-Cheng; Xia, Yuni
2011-01-01
Many scientific and engineering fields produce large volume of spatiotemporal data. The storage, retrieval, and analysis of such data impose great challenges to database systems design. Analysis of scientific spatiotemporal data often involves computing functions of all point-to-point interactions. One such analytics, the Spatial Distance Histogram (SDH), is of vital importance to scientific discovery. Recently, algorithms for efficient SDH processing in large-scale scientific databases have been proposed. These algorithms adopt a recursive tree-traversing strategy to process point-to-point distances in the visited tree nodes in batches, thus require less time when compared to the brute-force approach where all pairwise distances have to be computed. Despite the promising experimental results, the complexity of such algorithms has not been thoroughly studied. In this paper, we present an analysis of such algorithms based on a geometric modeling approach. The main technique is to transform the analysis of point counts into a problem of quantifying the area of regions where pairwise distances can be processed in batches by the algorithm. From the analysis, we conclude that the number of pairwise distances that are left to be processed decreases exponentially with more levels of the tree visited. This leads to the proof of a time complexity lower than the quadratic time needed for a brute-force algorithm and builds the foundation for a constant-time approximate algorithm. Our model is also general in that it works for a wide range of point spatial distributions, histogram types, and space-partitioning options in building the tree. PMID:21804753
37 CFR 6.1 - International schedule of classes of goods and services.
Code of Federal Regulations, 2012 CFR
2012-07-01
...; entertainment; sporting and cultural activities. 42. Scientific and technological services and research and design relating thereto; industrial analysis and research services; design and development of computer...-operated); cutlery; side arms; razors. 9. Scientific, nautical, surveying, photographic, cinematographic...
37 CFR 6.1 - International schedule of classes of goods and services.
Code of Federal Regulations, 2014 CFR
2014-07-01
...; entertainment; sporting and cultural activities. 42. Scientific and technological services and research and design relating thereto; industrial analysis and research services; design and development of computer...); cutlery; side arms; razors. 9. Scientific, nautical, surveying, photographic, cinematographic, optical...
37 CFR 6.1 - International schedule of classes of goods and services.
Code of Federal Regulations, 2013 CFR
2013-07-01
...; entertainment; sporting and cultural activities. 42. Scientific and technological services and research and design relating thereto; industrial analysis and research services; design and development of computer...); cutlery; side arms; razors. 9. Scientific, nautical, surveying, photographic, cinematographic, optical...
Haidar, Azzam; Jagode, Heike; Vaccaro, Phil; ...
2018-03-22
The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haidar, Azzam; Jagode, Heike; Vaccaro, Phil
The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
Evolution of the Virtualized HPC Infrastructure of Novosibirsk Scientific Center
NASA Astrophysics Data System (ADS)
Adakin, A.; Anisenkov, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Korol, A.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Skovpen, K.; Sukharev, A.; Zaytsev, A.
2012-12-01
Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies, and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of computing farms involved in its research field, currently we've got several computing facilities hosted by NSC institutes, each optimized for a particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. A dedicated optical network with the initial bandwidth of 10 Gb/s connecting these three facilities was built in order to make it possible to share the computing resources among the research communities, thus increasing the efficiency of operating the existing computing facilities and offering a common platform for building the computing infrastructure for future scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technology based on XEN and KVM platforms. This contribution gives a thorough review of the present status and future development prospects for the NSC virtualized computing infrastructure and the experience gained while using it for running production data analysis jobs related to HEP experiments being carried out at BINP, especially the KEDR detector experiment at the VEPP-4M electron-positron collider.
On Establishing Big Data Wave Breakwaters with Analytics (Invited)
NASA Astrophysics Data System (ADS)
Riedel, M.
2013-12-01
The Research Data Alliance Big Data Analytics (RDA-BDA) Interest Group seeks to develop community based recommendations on feasible data analytics approaches to address scientific community needs of utilizing large quantities of data. RDA-BDA seeks to analyze different scientific domain applications and their potential use of various big data analytics techniques. A systematic classification of feasible combinations of analysis algorithms, analytical tools, data and resource characteristics and scientific queries will be covered in these recommendations. These combinations are complex since a wide variety of different data analysis algorithms exist (e.g. specific algorithms using GPUs of analyzing brain images) that need to work together with multiple analytical tools reaching from simple (iterative) map-reduce methods (e.g. with Apache Hadoop or Twister) to sophisticated higher level frameworks that leverage machine learning algorithms (e.g. Apache Mahout). These computational analysis techniques are often augmented with visual analytics techniques (e.g. computational steering on large-scale high performance computing platforms) to put the human judgement into the analysis loop or new approaches with databases that are designed to support new forms of unstructured or semi-structured data as opposed to the rather tradtional structural databases (e.g. relational databases). More recently, data analysis and underpinned analytics frameworks also have to consider energy footprints of underlying resources. To sum up, the aim of this talk is to provide pieces of information to understand big data analytics in the context of science and engineering using the aforementioned classification as the lighthouse and as the frame of reference for a systematic approach. This talk will provide insights about big data analytics methods in context of science within varios communities and offers different views of how approaches of correlation and causality offer complementary methods to advance in science and engineering today. The RDA Big Data Analytics Group seeks to understand what approaches are not only technically feasible, but also scientifically feasible. The lighthouse Goal of the RDA Big Data Analytics Group is a classification of clever combinations of various Technologies and scientific applications in order to provide clear recommendations to the scientific community what approaches are technicalla and scientifically feasible.
Reproducible research in vadose zone sciences
USDA-ARS?s Scientific Manuscript database
A significant portion of present-day soil and Earth science research is computational, involving complex data analysis pipelines, advanced mathematical and statistical models, and sophisticated computer codes. Opportunities for scientific progress are greatly diminished if reproducing and building o...
Harnessing the power of emerging petascale platforms
NASA Astrophysics Data System (ADS)
Mellor-Crummey, John
2007-07-01
As part of the US Department of Energy's Scientific Discovery through Advanced Computing (SciDAC-2) program, science teams are tackling problems that require computational simulation and modeling at the petascale. A grand challenge for computer science is to develop software technology that makes it easier to harness the power of these systems to aid scientific discovery. As part of its activities, the SciDAC-2 Center for Scalable Application Development Software (CScADS) is building open source software tools to support efficient scientific computing on the emerging leadership-class platforms. In this paper, we describe two tools for performance analysis and tuning that are being developed as part of CScADS: a tool for analyzing scalability and performance, and a tool for optimizing loop nests for better node performance. We motivate these tools by showing how they apply to S3D, a turbulent combustion code under development at Sandia National Laboratory. For S3D, our node performance analysis tool helped uncover several performance bottlenecks. Using our loop nest optimization tool, we transformed S3D's most costly loop nest to reduce execution time by a factor of 2.94 for a processor working on a 503 domain.
Exploiting graphics processing units for computational biology and bioinformatics.
Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H
2010-09-01
Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of generalpurpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin; ...
2015-02-19
In this paper, commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources, however not all scientists have access to sufficient high-end computing systems, may of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context tomore » price. We evaluate the raw performance of different services of AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of the scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evlauation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.« less
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin
In this paper, commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources, however not all scientists have access to sufficient high-end computing systems, may of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context tomore » price. We evaluate the raw performance of different services of AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of the scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evlauation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.« less
76 FR 41234 - Advanced Scientific Computing Advisory Committee Charter Renewal
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-13
... Secretariat, General Services Administration, notice is hereby given that the Advanced Scientific Computing... advice and recommendations concerning the Advanced Scientific Computing program in response only to... Advanced Scientific Computing Research program and recommendations based thereon; --Advice on the computing...
Airborne Cloud Computing Environment (ACCE)
NASA Technical Reports Server (NTRS)
Hardman, Sean; Freeborn, Dana; Crichton, Dan; Law, Emily; Kay-Im, Liz
2011-01-01
Airborne Cloud Computing Environment (ACCE) is JPL's internal investment to improve the return on airborne missions. Improve development performance of the data system. Improve return on the captured science data. The investment is to develop a common science data system capability for airborne instruments that encompasses the end-to-end lifecycle covering planning, provisioning of data system capabilities, and support for scientific analysis in order to improve the quality, cost effectiveness, and capabilities to enable new scientific discovery and research in earth observation.
Visualization techniques to aid in the analysis of multi-spectral astrophysical data sets
NASA Technical Reports Server (NTRS)
Brugel, Edward W.; Domik, Gitta O.; Ayres, Thomas R.
1993-01-01
The goal of this project was to support the scientific analysis of multi-spectral astrophysical data by means of scientific visualization. Scientific visualization offers its greatest value if it is not used as a method separate or alternative to other data analysis methods but rather in addition to these methods. Together with quantitative analysis of data, such as offered by statistical analysis, image or signal processing, visualization attempts to explore all information inherent in astrophysical data in the most effective way. Data visualization is one aspect of data analysis. Our taxonomy as developed in Section 2 includes identification and access to existing information, preprocessing and quantitative analysis of data, visual representation and the user interface as major components to the software environment of astrophysical data analysis. In pursuing our goal to provide methods and tools for scientific visualization of multi-spectral astrophysical data, we therefore looked at scientific data analysis as one whole process, adding visualization tools to an already existing environment and integrating the various components that define a scientific data analysis environment. As long as the software development process of each component is separate from all other components, users of data analysis software are constantly interrupted in their scientific work in order to convert from one data format to another, or to move from one storage medium to another, or to switch from one user interface to another. We also took an in-depth look at scientific visualization and its underlying concepts, current visualization systems, their contributions, and their shortcomings. The role of data visualization is to stimulate mental processes different from quantitative data analysis, such as the perception of spatial relationships or the discovery of patterns or anomalies while browsing through large data sets. Visualization often leads to an intuitive understanding of the meaning of data values and their relationships by sacrificing accuracy in interpreting the data values. In order to be accurate in the interpretation, data values need to be measured, computed on, and compared to theoretical or empirical models (quantitative analysis). If visualization software hampers quantitative analysis (which happens with some commercial visualization products), its use is greatly diminished for astrophysical data analysis. The software system STAR (Scientific Toolkit for Astrophysical Research) was developed as a prototype during the course of the project to better understand the pragmatic concerns raised in the project. STAR led to a better understanding on the importance of collaboration between astrophysicists and computer scientists.
CAD/CAM and scientific data management at Dassault
NASA Technical Reports Server (NTRS)
Bohn, P.
1984-01-01
The history of CAD/CAM and scientific data management at Dassault are presented. Emphasis is put on the targets of the now commercially available software CATIA. The links with scientific computations such as aerodynamics and structural analysis are presented. Comments are made on the principles followed within the company. The consequences of the approximative nature of scientific data are examined. Consequence of the new history function is mainly its protection against copy or alteration. Future plans at Dassault for scientific data appear to be in opposite directions compared to some general tendencies.
ERIC Educational Resources Information Center
Blaser, Mark; Larsen, Jamie
1996-01-01
Presents five interactive, computer-based activities that mimic scientific tests used by sport researchers to help companies design high-performance athletic shoes, including impact tests, flexion tests, friction tests, video analysis, and computer modeling. Provides a platform for teachers to build connections between chemistry (polymer science),…
NASA Astrophysics Data System (ADS)
Teodorescu, Liliana; Britton, David; Glover, Nigel; Heinrich, Gudrun; Lauret, Jérôme; Naumann, Axel; Speer, Thomas; Teixeira-Dias, Pedro
2012-06-01
ACAT2011 This volume of Journal of Physics: Conference Series is dedicated to scientific contributions presented at the 14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2011) which took place on 5-7 September 2011 at Brunel University, UK. The workshop series, which began in 1990 in Lyon, France, brings together computer science researchers and practitioners, and researchers from particle physics and related fields in order to explore and confront the boundaries of computing and of automatic data analysis and theoretical calculation techniques. It is a forum for the exchange of ideas among the fields, exploring and promoting cutting-edge computing, data analysis and theoretical calculation techniques in fundamental physics research. This year's edition of the workshop brought together over 100 participants from all over the world. 14 invited speakers presented key topics on computing ecosystems, cloud computing, multivariate data analysis, symbolic and automatic theoretical calculations as well as computing and data analysis challenges in astrophysics, bioinformatics and musicology. Over 80 other talks and posters presented state-of-the art developments in the areas of the workshop's three tracks: Computing Technologies, Data Analysis Algorithms and Tools, and Computational Techniques in Theoretical Physics. Panel and round table discussions on data management and multivariate data analysis uncovered new ideas and collaboration opportunities in the respective areas. This edition of ACAT was generously sponsored by the Science and Technology Facility Council (STFC), the Institute for Particle Physics Phenomenology (IPPP) at Durham University, Brookhaven National Laboratory in the USA and Dell. We would like to thank all the participants of the workshop for the high level of their scientific contributions and for the enthusiastic participation in all its activities which were, ultimately, the key factors in the success of the workshop. Further information on ACAT 2011 can be found at http://acat2011.cern.ch Dr Liliana Teodorescu Brunel University ACATgroup The PDF also contains details of the workshop's committees and sponsors.
Toward Theory-Based Instruction in Scientific Problem Solving.
ERIC Educational Resources Information Center
Heller, Joan I.; And Others
Several empirical and theoretical analyses related to scientific problem-solving are reviewed, including: detailed studies of individuals at different levels of expertise, and computer models simulating some aspects of human information processing during problem solving. Analysis of these studies has revealed many facets about the nature of the…
NASA Astrophysics Data System (ADS)
Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.
2007-12-01
Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but is needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.
The Magellan Final Report on Cloud Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
,; Coghlan, Susan; Yelick, Katherine
The goal of Magellan, a project funded through the U.S. Department of Energy (DOE) Office of Advanced Scientific Computing Research (ASCR), was to investigate the potential role of cloud computing in addressing the computing needs for the DOE Office of Science (SC), particularly related to serving the needs of mid- range computing and future data-intensive computing workloads. A set of research questions was formed to probe various aspects of cloud computing from performance, usability, and cost. To address these questions, a distributed testbed infrastructure was deployed at the Argonne Leadership Computing Facility (ALCF) and the National Energy Research Scientific Computingmore » Center (NERSC). The testbed was designed to be flexible and capable enough to explore a variety of computing models and hardware design points in order to understand the impact for various scientific applications. During the project, the testbed also served as a valuable resource to application scientists. Applications from a diverse set of projects such as MG-RAST (a metagenomics analysis server), the Joint Genome Institute, the STAR experiment at the Relativistic Heavy Ion Collider, and the Laser Interferometer Gravitational Wave Observatory (LIGO), were used by the Magellan project for benchmarking within the cloud, but the project teams were also able to accomplish important production science utilizing the Magellan cloud resources.« less
The application of cloud computing to scientific workflows: a study of cost and performance.
Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S
2013-01-28
The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.
Robert Leland - Associate Lab Director, Scientific Computing and Energy
, applied mathematics, visualization, data, and analysis of energy systems, technologies, policies and Energy Analysis directorate. Leland earned his Ph.D. in mathematics from Oxford University in 1989
ERIC Educational Resources Information Center
Estes, Charles R.
1994-01-01
Discusses theoretical versus applied science and the use of the scientific method for analysis of social issues. Topics addressed include the use of simulation and modeling; the growth in computer power, including nanotechnology; distributed computing; self-evolving programs; spiritual matters; human engineering, i.e., molding individuals;…
Scientific computations section monthly report, November 1993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buckner, M.R.
1993-12-30
This progress report from the Savannah River Technology Center contains abstracts from papers from the computational modeling, applied statistics, applied physics, experimental thermal hydraulics, and packaging and transportation groups. Specific topics covered include: engineering modeling and process simulation, criticality methods and analysis, plutonium disposition.
76 FR 31945 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-02
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... teleconference meeting of the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal [email protected] . FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing...
Building Cognition: The Construction of Computational Representations for Scientific Discovery.
Chandrasekharan, Sanjay; Nersessian, Nancy J
2015-11-01
Novel computational representations, such as simulation models of complex systems and video games for scientific discovery (Foldit, EteRNA etc.), are dramatically changing the way discoveries emerge in science and engineering. The cognitive roles played by such computational representations in discovery are not well understood. We present a theoretical analysis of the cognitive roles such representations play, based on an ethnographic study of the building of computational models in a systems biology laboratory. Specifically, we focus on a case of model-building by an engineer that led to a remarkable discovery in basic bioscience. Accounting for such discoveries requires a distributed cognition (DC) analysis, as DC focuses on the roles played by external representations in cognitive processes. However, DC analyses by and large have not examined scientific discovery, and they mostly focus on memory offloading, particularly how the use of existing external representations changes the nature of cognitive tasks. In contrast, we study discovery processes and argue that discoveries emerge from the processes of building the computational representation. The building process integrates manipulations in imagination and in the representation, creating a coupled cognitive system of model and modeler, where the model is incorporated into the modeler's imagination. This account extends DC significantly, and we present some of the theoretical and application implications of this extended account. Copyright © 2014 Cognitive Science Society, Inc.
Cardiology office computer use: primer, pointers, pitfalls.
Shepard, R B; Blum, R I
1986-10-01
An office computer is a utility, like an automobile, with benefits and costs that are both direct and hidden and potential for disaster. For the cardiologist or cardiovascular surgeon, the increasing power and decreasing costs of computer hardware and the availability of software make use of an office computer system an increasingly attractive possibility. Management of office business functions is common; handling and scientific analysis of practice medical information are less common. The cardiologist can also access national medical information systems for literature searches and for interactive further education. Selection and testing of programs and the entire computer system before purchase of computer hardware will reduce the chances of disappointment or serious problems. Personnel pretraining and planning for office information flow and medical information security are necessary. Some cardiologists design their own office systems, buy hardware and software as needed, write programs for themselves and carry out the implementation themselves. For most cardiologists, the better course will be to take advantage of the professional experience of expert advisors. This article provides a starting point from which the practicing cardiologist can approach considering, specifying or implementing an office computer system for business functions and for scientific analysis of practice results.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
75 FR 9887 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-04
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building...
76 FR 9765 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-22
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Office of Science... Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub. L. 92... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research, SC-21/Germantown Building...
77 FR 45345 - DOE/Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-31
... Recompetition results for Scientific Discovery through Advanced Computing (SciDAC) applications Co-design Public... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Office of... the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub...
75 FR 64720 - DOE/Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-20
... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Department of... the Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L.... FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21...
Computing through Scientific Abstractions in SysBioPS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chin, George; Stephan, Eric G.; Gracio, Deborah K.
2004-10-13
Today, biologists and bioinformaticists have a tremendous amount of computational power at their disposal. With the availability of supercomputers, burgeoning scientific databases and digital libraries such as GenBank and PubMed, and pervasive computational environments such as the Grid, biologists have access to a wealth of computational capabilities and scientific data at hand. Yet, the rapid development of computational technologies has far exceeded the typical biologist’s ability to effectively apply the technology in their research. Computational sciences research and development efforts such as the Biology Workbench, BioSPICE (Biological Simulation Program for Intra-Cellular Evaluation), and BioCoRE (Biological Collaborative Research Environment) are importantmore » in connecting biologists and their scientific problems to computational infrastructures. On the Computational Cell Environment and Heuristic Entity-Relationship Building Environment projects at the Pacific Northwest National Laboratory, we are jointly developing a new breed of scientific problem solving environment called SysBioPSE that will allow biologists to access and apply computational resources in the scientific research context. In contrast to other computational science environments, SysBioPSE operates as an abstraction layer above a computational infrastructure. The goal of SysBioPSE is to allow biologists to apply computational resources in the context of the scientific problems they are addressing and the scientific perspectives from which they conduct their research. More specifically, SysBioPSE allows biologists to capture and represent scientific concepts and theories and experimental processes, and to link these views to scientific applications, data repositories, and computer systems.« less
FAST: A multi-processed environment for visualization of computational fluid dynamics
NASA Technical Reports Server (NTRS)
Bancroft, Gordon V.; Merritt, Fergus J.; Plessel, Todd C.; Kelaita, Paul G.; Mccabe, R. Kevin
1991-01-01
Three-dimensional, unsteady, multi-zoned fluid dynamics simulations over full scale aircraft are typical of the problems being investigated at NASA Ames' Numerical Aerodynamic Simulation (NAS) facility on CRAY2 and CRAY-YMP supercomputers. With multiple processor workstations available in the 10-30 Mflop range, we feel that these new developments in scientific computing warrant a new approach to the design and implementation of analysis tools. These larger, more complex problems create a need for new visualization techniques not possible with the existing software or systems available as of this writing. The visualization techniques will change as the supercomputing environment, and hence the scientific methods employed, evolves even further. The Flow Analysis Software Toolkit (FAST), an implementation of a software system for fluid mechanics analysis, is discussed.
75 FR 43518 - Advanced Scientific Computing Advisory Committee; Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-26
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee; Meeting AGENCY: Office of... Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463, 86 Stat. 770...: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building; U. S...
Display system for imaging scientific telemetric information
NASA Technical Reports Server (NTRS)
Zabiyakin, G. I.; Rykovanov, S. N.
1979-01-01
A system for imaging scientific telemetric information, based on the M-6000 minicomputer and the SIGD graphic display, is described. Two dimensional graphic display of telemetric information and interaction with the computer, in analysis and processing of telemetric parameters displayed on the screen is provided. The running parameter information output method is presented. User capabilities in the analysis and processing of telemetric information imaged on the display screen and the user language are discussed and illustrated.
Technologies for Large Data Management in Scientific Computing
NASA Astrophysics Data System (ADS)
Pace, Alberto
2014-01-01
In recent years, intense usage of computing has been the main strategy of investigations in several scientific research projects. The progress in computing technology has opened unprecedented opportunities for systematic collection of experimental data and the associated analysis that were considered impossible only few years ago. This paper focuses on the strategies in use: it reviews the various components that are necessary for an effective solution that ensures the storage, the long term preservation, and the worldwide distribution of large quantities of data that are necessary in a large scientific research project. The paper also mentions several examples of data management solutions used in High Energy Physics for the CERN Large Hadron Collider (LHC) experiments in Geneva, Switzerland which generate more than 30,000 terabytes of data every year that need to be preserved, analyzed, and made available to a community of several tenth of thousands scientists worldwide.
NASA Astrophysics Data System (ADS)
Seamon, E.; Gessler, P. E.; Flathers, E.
2015-12-01
The creation and use of large amounts of data in scientific investigations has become common practice. Data collection and analysis for large scientific computing efforts are not only increasing in volume as well as number, the methods and analysis procedures are evolving toward greater complexity (Bell, 2009, Clarke, 2009, Maimon, 2010). In addition, the growth of diverse data-intensive scientific computing efforts (Soni, 2011, Turner, 2014, Wu, 2008) has demonstrated the value of supporting scientific data integration. Efforts to bridge this gap between the above perspectives have been attempted, in varying degrees, with modular scientific computing analysis regimes implemented with a modest amount of success (Perez, 2009). This constellation of effects - 1) an increasing growth in the volume and amount of data, 2) a growing data-intensive science base that has challenging needs, and 3) disparate data organization and integration efforts - has created a critical gap. Namely, systems of scientific data organization and management typically do not effectively enable integrated data collaboration or data-intensive science-based communications. Our research efforts attempt to address this gap by developing a modular technology framework for data science integration efforts - with climate variation as the focus. The intention is that this model, if successful, could be generalized to other application areas. Our research aim focused on the design and implementation of a modular, deployable technology architecture for data integration. Developed using aspects of R, interactive python, SciDB, THREDDS, Javascript, and varied data mining and machine learning techniques, the Modular Data Response Framework (MDRF) was implemented to explore case scenarios for bio-climatic variation as they relate to pacific northwest ecosystem regions. Our preliminary results, using historical NETCDF climate data for calibration purposes across the inland pacific northwest region (Abatzoglou, Brown, 2011), show clear ecosystems shifting over a ten-year period (2001-2011), based on multiple supervised classifier methods for bioclimatic indicators.
Educational and Scientific Applications of Climate Model Diagnostic Analyzer
NASA Astrophysics Data System (ADS)
Lee, S.; Pan, L.; Zhai, C.; Tang, B.; Kubar, T. L.; Zhang, J.; Bao, Q.
2016-12-01
Climate Model Diagnostic Analyzer (CMDA) is a web-based information system designed for the climate modeling and model analysis community to analyze climate data from models and observations. CMDA provides tools to diagnostically analyze climate data for model validation and improvement, and to systematically manage analysis provenance for sharing results with other investigators. CMDA utilizes cloud computing resources, multi-threading computing, machine-learning algorithms, web service technologies, and provenance-supporting technologies to address technical challenges that the Earth science modeling and model analysis community faces in evaluating and diagnosing climate models. As CMDA infrastructure and technology have matured, we have developed the educational and scientific applications of CMDA. Educationally, CMDA supported the summer school of the JPL Center for Climate Sciences for three years since 2014. In the summer school, the students work on group research projects where CMDA provide datasets and analysis tools. Each student is assigned to a virtual machine with CMDA installed in Amazon Web Services. A provenance management system for CMDA is developed to keep track of students' usages of CMDA, and to recommend datasets and analysis tools for their research topic. The provenance system also allows students to revisit their analysis results and share them with their group. Scientifically, we have developed several science use cases of CMDA covering various topics, datasets, and analysis types. Each use case developed is described and listed in terms of a scientific goal, datasets used, the analysis tools used, scientific results discovered from the use case, an analysis result such as output plots and data files, and a link to the exact analysis service call with all the input arguments filled. For example, one science use case is the evaluation of NCAR CAM5 model with MODIS total cloud fraction. The analysis service used is Difference Plot Service of Two Variables, and the datasets used are NCAR CAM total cloud fraction and MODIS total cloud fraction. The scientific highlight of the use case is that the CAM5 model overall does a fairly decent job at simulating total cloud cover, though simulates too few clouds especially near and offshore of the eastern ocean basins where low clouds are dominant.
Capturing Petascale Application Characteristics with the Sequoia Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vetter, Jeffrey S; Bhatia, Nikhil; Grobelny, Eric M
2005-01-01
Characterization of the computation, communication, memory, and I/O demands of current scientific applications is crucial for identifying which technologies will enable petascale scientific computing. In this paper, we present the Sequoia Toolkit for characterizing HPC applications. The Sequoia Toolkit consists of the Sequoia trace capture library and the Sequoia Event Analysis Library, or SEAL, that facilitates the development of tools for analyzing Sequoia event traces. Using the Sequoia Toolkit, we have characterized the behavior of application runs with up to 2048 application processes. To illustrate the use of the Sequoia Toolkit, we present a preliminary characterization of LAMMPS, a molecularmore » dynamics application of great interest to the computational biology community.« less
Visualization techniques to aid in the analysis of multispectral astrophysical data sets
NASA Technical Reports Server (NTRS)
Brugel, E. W.; Domik, Gitta O.; Ayres, T. R.
1993-01-01
The goal of this project was to support the scientific analysis of multi-spectral astrophysical data by means of scientific visualization. Scientific visualization offers its greatest value if it is not used as a method separate or alternative to other data analysis methods but rather in addition to these methods. Together with quantitative analysis of data, such as offered by statistical analysis, image or signal processing, visualization attempts to explore all information inherent in astrophysical data in the most effective way. Data visualization is one aspect of data analysis. Our taxonomy as developed in Section 2 includes identification and access to existing information, preprocessing and quantitative analysis of data, visual representation and the user interface as major components to the software environment of astrophysical data analysis. In pursuing our goal to provide methods and tools for scientific visualization of multi-spectral astrophysical data, we therefore looked at scientific data analysis as one whole process, adding visualization tools to an already existing environment and integrating the various components that define a scientific data analysis environment. As long as the software development process of each component is separate from all other components, users of data analysis software are constantly interrupted in their scientific work in order to convert from one data format to another, or to move from one storage medium to another, or to switch from one user interface to another. We also took an in-depth look at scientific visualization and its underlying concepts, current visualization systems, their contributions and their shortcomings. The role of data visualization is to stimulate mental processes different from quantitative data analysis, such as the perception of spatial relationships or the discovery of patterns or anomalies while browsing through large data sets. Visualization often leads to an intuitive understanding of the meaning of data values and their relationships by sacrificing accuracy in interpreting the data values. In order to be accurate in the interpretation, data values need to be measured, computed on, and compared to theoretical or empirical models (quantitative analysis). If visualization software hampers quantitative analysis (which happens with some commercial visualization products), its use is greatly diminished for astrophysical data analysis. The software system STAR (Scientific Toolkit for Astrophysical Research) was developed as a prototype during the course of the project to better understand the pragmatic concerns raised in the project. STAR led to a better understanding on the importance of collaboration between astrophysicists and computer scientists. Twenty-one examples of the use of visualization for astrophysical data are included with this report. Sixteen publications related to efforts performed during or initiated through work on this project are listed at the end of this report.
ArcGIS Framework for Scientific Data Analysis and Serving
NASA Astrophysics Data System (ADS)
Xu, H.; Ju, W.; Zhang, J.
2015-12-01
ArcGIS is a platform for managing, visualizing, analyzing, and serving geospatial data. Scientific data as part of the geospatial data features multiple dimensions (X, Y, time, and depth) and large volume. Multidimensional mosaic dataset (MDMD), a newly enhanced data model in ArcGIS, models the multidimensional gridded data (e.g. raster or image) as a hypercube and enables ArcGIS's capabilities to handle the large volume and near-real time scientific data. Built on top of geodatabase, the MDMD stores the dimension values and the variables (2D arrays) in a geodatabase table which allows accessing a slice or slices of the hypercube through a simple query and supports animating changes along time or vertical dimension using ArcGIS desktop or web clients. Through raster types, MDMD can manage not only netCDF, GRIB, and HDF formats but also many other formats or satellite data. It is scalable and can handle large data volume. The parallel geo-processing engine makes the data ingestion fast and easily. Raster function, definition of a raster processing algorithm, is a very important component in ArcGIS platform for on-demand raster processing and analysis. The scientific data analytics is achieved through the MDMD and raster function templates which perform on-demand scientific computation with variables ingested in the MDMD. For example, aggregating monthly average from daily data; computing total rainfall of a year; calculating heat index for forecasting data, and identifying fishing habitat zones etc. Addtionally, MDMD with the associated raster function templates can be served through ArcGIS server as image services which provide a framework for on-demand server side computation and analysis, and the published services can be accessed by multiple clients such as ArcMap, ArcGIS Online, JavaScript, REST, WCS, and WMS. This presentation will focus on the MDMD model and raster processing templates. In addtion, MODIS land cover, NDFD weather service, and HYCOM ocean model will be used to illustrate how ArcGIS platform and MDMD model can facilitate scientific data visualization and analytics and how the analysis results can be shared to more audience through ArcGIS Online and Portal.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasenkamp, Daren; Sim, Alexander; Wehner, Michael
Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, whilemore » we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.« less
NASA Technical Reports Server (NTRS)
Thomas, Valerie L.; Koblinsky, Chester J.; Webster, Ferris; Zlotnicki, Victor; Green, James L.
1987-01-01
The Space Physics Analysis Network (SPAN) is a multi-mission, correlative data comparison network which links space and Earth science research and data analysis computers. It provides a common working environment for sharing computer resources, sharing computer peripherals, solving proprietary problems, and providing the potential for significant time and cost savings for correlative data analysis. This is one of a series of discipline-specific SPAN documents which are intended to complement the SPAN primer and SPAN Management documents. Their purpose is to provide the discipline scientists with a comprehensive set of documents to assist in the use of SPAN for discipline specific scientific research.
Conformational Analysis of Drug Molecules: A Practical Exercise in the Medicinal Chemistry Course
ERIC Educational Resources Information Center
Yuriev, Elizabeth; Chalmers, David; Capuano, Ben
2009-01-01
Medicinal chemistry is a specialized, scientific discipline. Computational chemistry and structure-based drug design constitute important themes in the education of medicinal chemists. This problem-based task is associated with structure-based drug design lectures. It requires students to use computational techniques to investigate conformational…
NASA Technical Reports Server (NTRS)
2004-01-01
Since its founding in 1992, Global Science & Technology, Inc. (GST), of Greenbelt, Maryland, has been developing technologies and providing services in support of NASA scientific research. GST specialties include scientific analysis, science data and information systems, data visualization, communications, networking and Web technologies, computer science, and software system engineering. As a longtime contractor to Goddard Space Flight Center s Earth Science Directorate, GST scientific, engineering, and information technology staff have extensive qualifications with the synthesis of satellite, in situ, and Earth science data for weather- and climate-related projects. GST s experience in this arena is end-to-end, from building satellite ground receiving systems and science data systems, to product generation and research and analysis.
MSL: Facilitating automatic and physical analysis of published scientific literature in PDF format.
Ahmed, Zeeshan; Dandekar, Thomas
2015-01-01
Published scientific literature contains millions of figures, including information about the results obtained from different scientific experiments e.g. PCR-ELISA data, microarray analysis, gel electrophoresis, mass spectrometry data, DNA/RNA sequencing, diagnostic imaging (CT/MRI and ultrasound scans), and medicinal imaging like electroencephalography (EEG), magnetoencephalography (MEG), echocardiography (ECG), positron-emission tomography (PET) images. The importance of biomedical figures has been widely recognized in scientific and medicine communities, as they play a vital role in providing major original data, experimental and computational results in concise form. One major challenge for implementing a system for scientific literature analysis is extracting and analyzing text and figures from published PDF files by physical and logical document analysis. Here we present a product line architecture based bioinformatics tool 'Mining Scientific Literature (MSL)', which supports the extraction of text and images by interpreting all kinds of published PDF files using advanced data mining and image processing techniques. It provides modules for the marginalization of extracted text based on different coordinates and keywords, visualization of extracted figures and extraction of embedded text from all kinds of biological and biomedical figures using applied Optimal Character Recognition (OCR). Moreover, for further analysis and usage, it generates the system's output in different formats including text, PDF, XML and images files. Hence, MSL is an easy to install and use analysis tool to interpret published scientific literature in PDF format.
Computer Synthesis Approaches of Hyperboloid Gear Drives with Linear Contact
NASA Astrophysics Data System (ADS)
Abadjiev, Valentin; Kawasaki, Haruhisa
2014-09-01
The computer design has improved forming different type software for scientific researches in the field of gearing theory as well as performing an adequate scientific support of the gear drives manufacture. Here are attached computer programs that are based on mathematical models as a result of scientific researches. The modern gear transmissions require the construction of new mathematical approaches to their geometric, technological and strength analysis. The process of optimization, synthesis and design is based on adequate iteration procedures to find out an optimal solution by varying definite parameters. The study is dedicated to accepted methodology in the creation of soft- ware for the synthesis of a class high reduction hyperboloid gears - Spiroid and Helicon ones (Spiroid and Helicon are trademarks registered by the Illinois Tool Works, Chicago, Ill). The developed basic computer products belong to software, based on original mathematical models. They are based on the two mathematical models for the synthesis: "upon a pitch contact point" and "upon a mesh region". Computer programs are worked out on the basis of the described mathematical models, and the relations between them are shown. The application of the shown approaches to the synthesis of commented gear drives is illustrated.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-08
... analysis, survey methodology, geospatial analysis, econometrics, cognitive psychology, and computer science... following disciplines: demography, economics, geography, psychology, statistics, survey methodology, social... expertise in such areas as demography, economics, geography, psychology, statistics, survey methodology...
Integrating multiple scientific computing needs via a Private Cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Brunetti, R.; Lusso, S.; Vallero, S.
2014-06-01
In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud facility eases the management and maintenance of heterogeneous use cases such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and others. It allows to dynamically and efficiently allocate resources to any application and to tailor the virtual machines according to the applications' requirements. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques; for example, rolling updates can be performed easily and minimizing the downtime. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 site and a dynamically expandable PROOF-based Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The Private Cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem (used in two different configurations for worker- and service-class hypervisors) and the OpenWRT Linux distribution (used for network virtualization). A future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and by using mainstream contextualization tools like CloudInit.
Active Flash: Out-of-core Data Analytics on Flash Storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boboila, Simona; Kim, Youngjae; Vazhkudai, Sudharshan S
2012-01-01
Next generation science will increasingly come to rely on the ability to perform efficient, on-the-fly analytics of data generated by high-performance computing (HPC) simulations, modeling complex physical phenomena. Scientific computing workflows are stymied by the traditional chaining of simulation and data analysis, creating multiple rounds of redundant reads and writes to the storage system, which grows in cost with the ever-increasing gap between compute and storage speeds in HPC clusters. Recent HPC acquisitions have introduced compute node-local flash storage as a means to alleviate this I/O bottleneck. We propose a novel approach, Active Flash, to expedite data analysis pipelines bymore » migrating to the location of the data, the flash device itself. We argue that Active Flash has the potential to enable true out-of-core data analytics by freeing up both the compute core and the associated main memory. By performing analysis locally, dependence on limited bandwidth to a central storage system is reduced, while allowing this analysis to proceed in parallel with the main application. In addition, offloading work from the host to the more power-efficient controller reduces peak system power usage, which is already in the megawatt range and poses a major barrier to HPC system scalability. We propose an architecture for Active Flash, explore energy and performance trade-offs in moving computation from host to storage, demonstrate the ability of appropriate embedded controllers to perform data analysis and reduction tasks at speeds sufficient for this application, and present a simulation study of Active Flash scheduling policies. These results show the viability of the Active Flash model, and its capability to potentially have a transformative impact on scientific data analysis.« less
Computer applications in scientific balloon quality control
NASA Astrophysics Data System (ADS)
Seely, Loren G.; Smith, Michael S.
Seal defects and seal tensile strength are primary determinants of product quality in scientific balloon manufacturing; they therefore require a unit of quality measure. The availability of inexpensive and powerful data-processing tools can serve as the basis of a quality-trends-discerning analysis of products. The results of one such analysis are presently given in graphic form for use on the production floor. Software descriptions and their sample outputs are presented, together with a summary of overall and long-term effects of these methods on product quality.
Exploring Two Approaches for an End-to-End Scientific Analysis Workflow
NASA Astrophysics Data System (ADS)
Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; Paterno, Marc; Sehrish, Saba
2015-12-01
The scientific discovery process can be advanced by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally, it is important for scientists to be able to share their workflows with collaborators. We have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC); the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In this paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.
Scalable Kernel Methods and Algorithms for General Sequence Analysis
ERIC Educational Resources Information Center
Kuksa, Pavel
2011-01-01
Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…
Image improvement and three-dimensional reconstruction using holographic image processing
NASA Technical Reports Server (NTRS)
Stroke, G. W.; Halioua, M.; Thon, F.; Willasch, D. H.
1977-01-01
Holographic computing principles make possible image improvement and synthesis in many cases of current scientific and engineering interest. Examples are given for the improvement of resolution in electron microscopy and 3-D reconstruction in electron microscopy and X-ray crystallography, following an analysis of optical versus digital computing in such applications.
ERIC Educational Resources Information Center
Perry, James L.; Kraemer, Kenneth L.
1978-01-01
Argues that innovation attributes, together with policies associated with the diffusion on an innovation, account for significant differences in diffusion patterns. An empirical analysis of this thesis focuses on the diffusion of computer applications software in local government. Available from Elsevier Scientific Publishing Co., Box 211,…
Science-Driven Computing: NERSC's Plan for 2006-2010
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simon, Horst D.; Kramer, William T.C.; Bailey, David H.
NERSC has developed a five-year strategic plan focusing on three components: Science-Driven Systems, Science-Driven Services, and Science-Driven Analytics. (1) Science-Driven Systems: Balanced introduction of the best new technologies for complete computational systems--computing, storage, networking, visualization and analysis--coupled with the activities necessary to engage vendors in addressing the DOE computational science requirements in their future roadmaps. (2) Science-Driven Services: The entire range of support activities, from high-quality operations and user services to direct scientific support, that enable a broad range of scientists to effectively use NERSC systems in their research. NERSC will concentrate on resources needed to realize the promise ofmore » the new highly scalable architectures for scientific discovery in multidisciplinary computational science projects. (3) Science-Driven Analytics: The architectural and systems enhancements and services required to integrate NERSC's powerful computational and storage resources to provide scientists with new tools to effectively manipulate, visualize, and analyze the huge data sets derived from simulations and experiments.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sewell, Christopher Meyer
This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.
Inconsistencies in Numerical Simulations of Dynamical Systems Using Interval Arithmetic
NASA Astrophysics Data System (ADS)
Nepomuceno, Erivelton G.; Peixoto, Márcia L. C.; Martins, Samir A. M.; Rodrigues, Heitor M.; Perc, Matjaž
Over the past few decades, interval arithmetic has been attracting widespread interest from the scientific community. With the expansion of computing power, scientific computing is encountering a noteworthy shift from floating-point arithmetic toward increased use of interval arithmetic. Notwithstanding the significant reliability of interval arithmetic, this paper presents a theoretical inconsistency in a simulation of dynamical systems using a well-known implementation of arithmetic interval. We have observed that two natural interval extensions present an empty intersection during a finite time range, which is contrary to the fundamental theorem of interval analysis. We have proposed a procedure to at least partially overcome this problem, based on the union of the two generated pseudo-orbits. This paper also shows a successful case of interval arithmetic application in the reduction of interval width size on the simulation of discrete map. The implications of our findings on the reliability of scientific computing using interval arithmetic have been properly addressed using two numerical examples.
Genten: Software for Generalized Tensor Decompositions v. 1.0.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phipps, Eric T.; Kolda, Tamara G.; Dunlavy, Daniel
Tensors, or multidimensional arrays, are a powerful mathematical means of describing multiway data. This software provides computational means for decomposing or approximating a given tensor in terms of smaller tensors of lower dimension, focusing on decomposition of large, sparse tensors. These techniques have applications in many scientific areas, including signal processing, linear algebra, computer vision, numerical analysis, data mining, graph analysis, neuroscience and more. The software is designed to take advantage of parallelism present emerging computer architectures such has multi-core CPUs, many-core accelerators such as the Intel Xeon Phi, and computation-oriented GPUs to enable efficient processing of large tensors.
Software and the Scientist: Coding and Citation Practices in Geodynamics
NASA Astrophysics Data System (ADS)
Hwang, Lorraine; Fish, Allison; Soito, Laura; Smith, MacKenzie; Kellogg, Louise H.
2017-11-01
In geodynamics as in other scientific areas, computation has become a core component of research, complementing field observation, laboratory analysis, experiment, and theory. Computational tools for data analysis, mapping, visualization, modeling, and simulation are essential for all aspects of the scientific workflow. Specialized scientific software is often developed by geodynamicists for their own use, and this effort represents a distinctive intellectual contribution. Drawing on a geodynamics community that focuses on developing and disseminating scientific software, we assess the current practices of software development and attribution, as well as attitudes about the need and best practices for software citation. We analyzed publications by participants in the Computational Infrastructure for Geodynamics and conducted mixed method surveys of the solid earth geophysics community. From this we learned that coding skills are typically learned informally. Participants considered good code as trusted, reusable, readable, and not overly complex and considered a good coder as one that participates in the community in an open and reasonable manor contributing to both long- and short-term community projects. Participants strongly supported citing software reflected by the high rate a software package was named in the literature and the high rate of citations in the references. However, lacking are clear instructions from developers on how to cite and education of users on what to cite. In addition, citations did not always lead to discoverability of the resource. A unique identifier to the software package itself, community education, and citation tools would contribute to better attribution practices.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Habib, Salman; Roser, Robert; Gerber, Richard
The U.S. Department of Energy (DOE) Office of Science (SC) Offices of High Energy Physics (HEP) and Advanced Scientific Computing Research (ASCR) convened a programmatic Exascale Requirements Review on June 10–12, 2015, in Bethesda, Maryland. This report summarizes the findings, results, and recommendations derived from that meeting. The high-level findings and observations are as follows. Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude — and in some cases greatermore » — than that available currently. The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. Data rates and volumes from experimental facilities are also straining the current HEP infrastructure in its ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. A close integration of high-performance computing (HPC) simulation and data analysis will greatly aid in interpreting the results of HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. Long-range planning between HEP and ASCR will be required to meet HEP’s research needs. To best use ASCR HPC resources, the experimental HEP program needs (1) an established, long-term plan for access to ASCR computational and data resources, (2) the ability to map workflows to HPC resources, (3) the ability for ASCR facilities to accommodate workflows run by collaborations potentially comprising thousands of individual members, (4) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, (5) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.« less
Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoo, Wucherl; Koo, Michelle; Cao, Yu
Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze the performance involving terabytes or petabytes of workflow data or measurement data of the executions, from complex workflows over a large number of nodes and multiple parallel task executions. To help identify performance bottlenecks or debug the performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework, using state-ofthe-more » art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply the most sophisticated statistical tools and data mining methods on the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze a large amount of different types of logs and measurements. To illustrate the functionality of the big data analysis framework, we conduct case studies on the workflows from an astronomy project known as the Palomar Transient Factory (PTF) and the job logs from the genome analysis scientific cluster. Our study processed many terabytes of system logs and application performance measurements collected on the HPC systems at NERSC. The implementation of our tool is generic enough to be used for analyzing the performance of other HPC systems and Big Data workows.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.
2008-05-04
This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroicmore » effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.« less
MSL: Facilitating automatic and physical analysis of published scientific literature in PDF format
Ahmed, Zeeshan; Dandekar, Thomas
2018-01-01
Published scientific literature contains millions of figures, including information about the results obtained from different scientific experiments e.g. PCR-ELISA data, microarray analysis, gel electrophoresis, mass spectrometry data, DNA/RNA sequencing, diagnostic imaging (CT/MRI and ultrasound scans), and medicinal imaging like electroencephalography (EEG), magnetoencephalography (MEG), echocardiography (ECG), positron-emission tomography (PET) images. The importance of biomedical figures has been widely recognized in scientific and medicine communities, as they play a vital role in providing major original data, experimental and computational results in concise form. One major challenge for implementing a system for scientific literature analysis is extracting and analyzing text and figures from published PDF files by physical and logical document analysis. Here we present a product line architecture based bioinformatics tool ‘Mining Scientific Literature (MSL)’, which supports the extraction of text and images by interpreting all kinds of published PDF files using advanced data mining and image processing techniques. It provides modules for the marginalization of extracted text based on different coordinates and keywords, visualization of extracted figures and extraction of embedded text from all kinds of biological and biomedical figures using applied Optimal Character Recognition (OCR). Moreover, for further analysis and usage, it generates the system’s output in different formats including text, PDF, XML and images files. Hence, MSL is an easy to install and use analysis tool to interpret published scientific literature in PDF format. PMID:29721305
NASA Astrophysics Data System (ADS)
Beggrow, Elizabeth P.; Ha, Minsu; Nehm, Ross H.; Pearl, Dennis; Boone, William J.
2014-02-01
The landscape of science education is being transformed by the new Framework for Science Education (National Research Council, A framework for K-12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices—such as explanation, argumentation, and communication—in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63). Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures; (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations, and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new Framework for Science Education.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Hack, James; Riley, Katherine
The mission of the U.S. Department of Energy Office of Science (DOE SC) is the delivery of scientific discoveries and major scientific tools to transform our understanding of nature and to advance the energy, economic, and national security missions of the United States. To achieve these goals in today’s world requires investments in not only the traditional scientific endeavors of theory and experiment, but also in computational science and the facilities that support large-scale simulation and data analysis. The Advanced Scientific Computing Research (ASCR) program addresses these challenges in the Office of Science. ASCR’s mission is to discover, develop, andmore » deploy computational and networking capabilities to analyze, model, simulate, and predict complex phenomena important to DOE. ASCR supports research in computational science, three high-performance computing (HPC) facilities — the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory and Leadership Computing Facilities at Argonne (ALCF) and Oak Ridge (OLCF) National Laboratories — and the Energy Sciences Network (ESnet) at Berkeley Lab. ASCR is guided by science needs as it develops research programs, computers, and networks at the leading edge of technologies. As we approach the era of exascale computing, technology changes are creating challenges for science programs in SC for those who need to use high performance computing and data systems effectively. Numerous significant modifications to today’s tools and techniques will be needed to realize the full potential of emerging computing systems and other novel computing architectures. To assess these needs and challenges, ASCR held a series of Exascale Requirements Reviews in 2015–2017, one with each of the six SC program offices,1 and a subsequent Crosscut Review that sought to integrate the findings from each. Participants at the reviews were drawn from the communities of leading domain scientists, experts in computer science and applied mathematics, ASCR facility staff, and DOE program managers in ASCR and the respective program offices. The purpose of these reviews was to identify mission-critical scientific problems within the DOE Office of Science (including experimental facilities) and determine the requirements for the exascale ecosystem that would be needed to address those challenges. The exascale ecosystem includes exascale computing systems, high-end data capabilities, efficient software at scale, libraries, tools, and other capabilities. This effort will contribute to the development of a strategic roadmap for ASCR compute and data facility investments and will help the ASCR Facility Division establish partnerships with Office of Science stakeholders. It will also inform the Office of Science research needs and agenda. The results of the six reviews have been published in reports available on the web at http://exascaleage.org/. This report presents a summary of the individual reports and of common and crosscutting findings, and it identifies opportunities for productive collaborations among the DOE SC program offices.« less
ERIC Educational Resources Information Center
Sins, Patrick H. M.; Savelsbergh, Elwin R.; van Joolingen, Wouter R.
2005-01-01
Although computer modelling is widely advocated as a way to offer students a deeper understanding of complex phenomena, the process of modelling is rather complex itself and needs scaffolding. In order to offer adequate support, a thorough understanding of the reasoning processes students employ and of difficulties they encounter during a…
Novel 3D/VR interactive environment for MD simulations, visualization and analysis.
Doblack, Benjamin N; Allis, Tim; Dávila, Lilian P
2014-12-18
The increasing development of computing (hardware and software) in the last decades has impacted scientific research in many fields including materials science, biology, chemistry and physics among many others. A new computational system for the accurate and fast simulation and 3D/VR visualization of nanostructures is presented here, using the open-source molecular dynamics (MD) computer program LAMMPS. This alternative computational method uses modern graphics processors, NVIDIA CUDA technology and specialized scientific codes to overcome processing speed barriers common to traditional computing methods. In conjunction with a virtual reality system used to model materials, this enhancement allows the addition of accelerated MD simulation capability. The motivation is to provide a novel research environment which simultaneously allows visualization, simulation, modeling and analysis. The research goal is to investigate the structure and properties of inorganic nanostructures (e.g., silica glass nanosprings) under different conditions using this innovative computational system. The work presented outlines a description of the 3D/VR Visualization System and basic components, an overview of important considerations such as the physical environment, details on the setup and use of the novel system, a general procedure for the accelerated MD enhancement, technical information, and relevant remarks. The impact of this work is the creation of a unique computational system combining nanoscale materials simulation, visualization and interactivity in a virtual environment, which is both a research and teaching instrument at UC Merced.
Novel 3D/VR Interactive Environment for MD Simulations, Visualization and Analysis
Doblack, Benjamin N.; Allis, Tim; Dávila, Lilian P.
2014-01-01
The increasing development of computing (hardware and software) in the last decades has impacted scientific research in many fields including materials science, biology, chemistry and physics among many others. A new computational system for the accurate and fast simulation and 3D/VR visualization of nanostructures is presented here, using the open-source molecular dynamics (MD) computer program LAMMPS. This alternative computational method uses modern graphics processors, NVIDIA CUDA technology and specialized scientific codes to overcome processing speed barriers common to traditional computing methods. In conjunction with a virtual reality system used to model materials, this enhancement allows the addition of accelerated MD simulation capability. The motivation is to provide a novel research environment which simultaneously allows visualization, simulation, modeling and analysis. The research goal is to investigate the structure and properties of inorganic nanostructures (e.g., silica glass nanosprings) under different conditions using this innovative computational system. The work presented outlines a description of the 3D/VR Visualization System and basic components, an overview of important considerations such as the physical environment, details on the setup and use of the novel system, a general procedure for the accelerated MD enhancement, technical information, and relevant remarks. The impact of this work is the creation of a unique computational system combining nanoscale materials simulation, visualization and interactivity in a virtual environment, which is both a research and teaching instrument at UC Merced. PMID:25549300
Web Services Provide Access to SCEC Scientific Research Application Software
NASA Astrophysics Data System (ADS)
Gupta, N.; Gupta, V.; Okaya, D.; Kamb, L.; Maechling, P.
2003-12-01
Web services offer scientific communities a new paradigm for sharing research codes and communicating results. While there are formal technical definitions of what constitutes a web service, for a user community such as the Southern California Earthquake Center (SCEC), we may conceptually consider a web service to be functionality provided on-demand by an application which is run on a remote computer located elsewhere on the Internet. The value of a web service is that it can (1) run a scientific code without the user needing to install and learn the intricacies of running the code; (2) provide the technical framework which allows a user's computer to talk to the remote computer which performs the service; (3) provide the computational resources to run the code; and (4) bundle several analysis steps and provide the end results in digital or (post-processed) graphical form. Within an NSF-sponsored ITR project coordinated by SCEC, we are constructing web services using architectural protocols and programming languages (e.g., Java). However, because the SCEC community has a rich pool of scientific research software (written in traditional languages such as C and FORTRAN), we also emphasize making existing scientific codes available by constructing web service frameworks which wrap around and directly run these codes. In doing so we attempt to broaden community usage of these codes. Web service wrapping of a scientific code can be done using a "web servlet" construction or by using a SOAP/WSDL-based framework. This latter approach is widely adopted in IT circles although it is subject to rapid evolution. Our wrapping framework attempts to "honor" the original codes with as little modification as is possible. For versatility we identify three methods of user access: (A) a web-based GUI (written in HTML and/or Java applets); (B) a Linux/OSX/UNIX command line "initiator" utility (shell-scriptable); and (C) direct access from within any Java application (and with the correct API interface from within C++ and/or C/Fortran). This poster presentation will provide descriptions of the following selected web services and their origin as scientific application codes: 3D community velocity models for Southern California, geocoordinate conversions (latitude/longitude to UTM), execution of GMT graphical scripts, data format conversions (Gocad to Matlab format), and implementation of Seismic Hazard Analysis application programs that calculate hazard curve and hazard map data sets.
Web-based interactive visualization in a Grid-enabled neuroimaging application using HTML5.
Siewert, René; Specovius, Svenja; Wu, Jie; Krefting, Dagmar
2012-01-01
Interactive visualization and correction of intermediate results are required in many medical image analysis pipelines. To allow certain interaction in the remote execution of compute- and data-intensive applications, new features of HTML5 are used. They allow for transparent integration of user interaction into Grid- or Cloud-enabled scientific workflows. Both 2D and 3D visualization and data manipulation can be performed through a scientific gateway without the need to install specific software or web browser plugins. The possibilities of web-based visualization are presented along the FreeSurfer-pipeline, a popular compute- and data-intensive software tool for quantitative neuroimaging.
Cellular automaton supercomputing
NASA Technical Reports Server (NTRS)
Wolfram, Stephen
1987-01-01
Many of the models now used in science and engineering are over a century old. And most of them can be implemented on modern digital computers only with considerable difficulty. Some new basic models are discussed which are much more directly suitable for digital computer simulation. The fundamental principle is that the models considered herein are as suitable as possible for implementation on digital computers. It is then a matter of scientific analysis to determine whether such models can reproduce the behavior seen in physical and other systems. Such analysis was carried out in several cases, and the results are very encouraging.
NASA Technical Reports Server (NTRS)
Schwan, Karsten
1997-01-01
This final report has four sections. We first describe the actual scientific results attained by our research team, followed by a description of the high performance computing research enhancing those results and prompted by the scientific tasks being undertaken. Next, we describe our research in data and program visualization motivated by the scientific research and also enabling it. Last, we comment on the indirect effects this research effort has had on our work, in terms of follow up or additional funding, student training, etc.
NASA Astrophysics Data System (ADS)
Christensen, C.; Summa, B.; Scorzelli, G.; Lee, J. W.; Venkat, A.; Bremer, P. T.; Pascucci, V.
2017-12-01
Massive datasets are becoming more common due to increasingly detailed simulations and higher resolution acquisition devices. Yet accessing and processing these huge data collections for scientific analysis is still a significant challenge. Solutions that rely on extensive data transfers are increasingly untenable and often impossible due to lack of sufficient storage at the client side as well as insufficient bandwidth to conduct such large transfers, that in some cases could entail petabytes of data. Large-scale remote computing resources can be useful, but utilizing such systems typically entails some form of offline batch processing with long delays, data replications, and substantial cost for any mistakes. Both types of workflows can severely limit the flexible exploration and rapid evaluation of new hypotheses that are crucial to the scientific process and thereby impede scientific discovery. In order to facilitate interactivity in both analysis and visualization of these massive data ensembles, we introduce a dynamic runtime system suitable for progressive computation and interactive visualization of arbitrarily large, disparately located spatiotemporal datasets. Our system includes an embedded domain-specific language (EDSL) that allows users to express a wide range of data analysis operations in a simple and abstract manner. The underlying runtime system transparently resolves issues such as remote data access and resampling while at the same time maintaining interactivity through progressive and interruptible processing. Computations involving large amounts of data can be performed remotely in an incremental fashion that dramatically reduces data movement, while the client receives updates progressively thereby remaining robust to fluctuating network latency or limited bandwidth. This system facilitates interactive, incremental analysis and visualization of massive remote datasets up to petabytes in size. Our system is now available for general use in the community through both docker and anaconda.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amerio, S.; Behari, S.; Boyd, J.
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and D0 experiments each have approximately 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 and beyond. To achieve this goal, we have implemented a system that utilizes virtualization, automated validation, and migration to new standards inmore » both software and data storage technology and leverages resources available from currently-running experiments at Fermilab. Lastly, these efforts have also provided useful lessons in ensuring long-term data access for numerous experiments, and enable high-quality scientific output for years to come.« less
Data preservation at the Fermilab Tevatron
NASA Astrophysics Data System (ADS)
Amerio, S.; Behari, S.; Boyd, J.; Brochmann, M.; Culbertson, R.; Diesburg, M.; Freeman, J.; Garren, L.; Greenlee, H.; Herner, K.; Illingworth, R.; Jayatilaka, B.; Jonckheere, A.; Li, Q.; Naymola, S.; Oleynik, G.; Sakumoto, W.; Varnes, E.; Vellidis, C.; Watts, G.; White, S.
2017-04-01
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and D0 experiments each have approximately 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 and beyond. To achieve this goal, we have implemented a system that utilizes virtualization, automated validation, and migration to new standards in both software and data storage technology and leverages resources available from currently-running experiments at Fermilab. These efforts have also provided useful lessons in ensuring long-term data access for numerous experiments, and enable high-quality scientific output for years to come.
Exploring Two Approaches for an End-to-End Scientific Analysis Workflow
Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; ...
2015-12-23
The advance of the scientific discovery process is accomplished by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally,more » it is important for scientists to be able to share their workflows with collaborators. Moreover we have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC), the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In our paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.« less
The application of computer image analysis in life sciences and environmental engineering
NASA Astrophysics Data System (ADS)
Mazur, R.; Lewicki, A.; Przybył, K.; Zaborowicz, M.; Koszela, K.; Boniecki, P.; Mueller, W.; Raba, B.
2014-04-01
The main aim of the article was to present research on the application of computer image analysis in Life Science and Environmental Engineering. The authors used different methods of computer image analysis in developing of an innovative biotest in modern biomonitoring of water quality. Created tools were based on live organisms such as bioindicators Lemna minor L. and Hydra vulgaris Pallas as well as computer image analysis method in the assessment of negatives reactions during the exposition of the organisms to selected water toxicants. All of these methods belong to acute toxicity tests and are particularly essential in ecotoxicological assessment of water pollutants. Developed bioassays can be used not only in scientific research but are also applicable in environmental engineering and agriculture in the study of adverse effects on water quality of various compounds used in agriculture and industry.
Cultural and Technological Issues and Solutions for Geodynamics Software Citation
NASA Astrophysics Data System (ADS)
Heien, E. M.; Hwang, L.; Fish, A. E.; Smith, M.; Dumit, J.; Kellogg, L. H.
2014-12-01
Computational software and custom-written codes play a key role in scientific research and teaching, providing tools to perform data analysis and forward modeling through numerical computation. However, development of these codes is often hampered by the fact that there is no well-defined way for the authors to receive credit or professional recognition for their work through the standard methods of scientific publication and subsequent citation of the work. This in turn may discourage researchers from publishing their codes or making them easier for other scientists to use. We investigate the issues involved in citing software in a scientific context, and introduce features that should be components of a citation infrastructure, particularly oriented towards the codes and scientific culture in the area of geodynamics research. The codes used in geodynamics are primarily specialized numerical modeling codes for continuum mechanics problems; they may be developed by individual researchers, teams of researchers, geophysicists in collaboration with computational scientists and applied mathematicians, or by coordinated community efforts such as the Computational Infrastructure for Geodynamics. Some but not all geodynamics codes are open-source. These characteristics are common to many areas of geophysical software development and use. We provide background on the problem of software citation and discuss some of the barriers preventing adoption of such citations, including social/cultural barriers, insufficient technological support infrastructure, and an overall lack of agreement about what a software citation should consist of. We suggest solutions in an initial effort to create a system to support citation of software and promotion of scientific software development.
Data management and analysis for the Earth System Grid
NASA Astrophysics Data System (ADS)
Williams, D. N.; Ananthakrishnan, R.; Bernholdt, D. E.; Bharathi, S.; Brown, D.; Chen, M.; Chervenak, A. L.; Cinquini, L.; Drach, R.; Foster, I. T.; Fox, P.; Hankin, S.; Henson, V. E.; Jones, P.; Middleton, D. E.; Schwidder, J.; Schweitzer, R.; Schuler, R.; Shoshani, A.; Siebenlist, F.; Sim, A.; Strand, W. G.; Wilhelmi, N.; Su, M.
2008-07-01
The international climate community is expected to generate hundreds of petabytes of simulation data within the next five to seven years. This data must be accessed and analyzed by thousands of analysts worldwide in order to provide accurate and timely estimates of the likely impact of climate change on physical, biological, and human systems. Climate change is thus not only a scientific challenge of the first order but also a major technological challenge. In order to address this technological challenge, the Earth System Grid Center for Enabling Technologies (ESG-CET) has been established within the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC)-2 program, with support from the offices of Advanced Scientific Computing Research and Biological and Environmental Research. ESG-CET's mission is to provide climate researchers worldwide with access to the data, information, models, analysis tools, and computational capabilities required to make sense of enormous climate simulation datasets. Its specific goals are to (1) make data more useful to climate researchers by developing Grid technology that enhances data usability; (2) meet specific distributed database, data access, and data movement needs of national and international climate projects; (3) provide a universal and secure web-based data access portal for broad multi-model data collections; and (4) provide a wide-range of Grid-enabled climate data analysis tools and diagnostic methods to international climate centers and U.S. government agencies. Building on the successes of the previous Earth System Grid (ESG) project, which has enabled thousands of researchers to access tens of terabytes of data from a small number of ESG sites, ESG-CET is working to integrate a far larger number of distributed data providers, high-bandwidth wide-area networks, and remote computers in a highly collaborative problem-solving environment.
Erdemir, Ahmet; Guess, Trent M.; Halloran, Jason P.; Modenese, Luca; Reinbolt, Jeffrey A.; Thelen, Darryl G.; Umberger, Brian R.
2016-01-01
Objective The overall goal of this document is to demonstrate that dissemination of models and analyses for assessing the reproducibility of simulation results can be incorporated in the scientific review process in biomechanics. Methods As part of a special issue on model sharing and reproducibility in IEEE Transactions on Biomedical Engineering, two manuscripts on computational biomechanics were submitted: A. Rajagopal et al., IEEE Trans. Biomed. Eng., 2016 and A. Schmitz and D. Piovesan, IEEE Trans. Biomed. Eng., 2016. Models used in these studies were shared with the scientific reviewers and the public. In addition to the standard review of the manuscripts, the reviewers downloaded the models and performed simulations that reproduced results reported in the studies. Results There was general agreement between simulation results of the authors and those of the reviewers. Discrepancies were resolved during the necessary revisions. The manuscripts and instructions for download and simulation were updated in response to the reviewers’ feedback; changes that may otherwise have been missed if explicit model sharing and simulation reproducibility analysis were not conducted in the review process. Increased burden on the authors and the reviewers, to facilitate model sharing and to repeat simulations, were noted. Conclusion When the authors of computational biomechanics studies provide access to models and data, the scientific reviewers can download and thoroughly explore the model, perform simulations, and evaluate simulation reproducibility beyond the traditional manuscript-only review process. Significance Model sharing and reproducibility analysis in scholarly publishing will result in a more rigorous review process, which will enhance the quality of modeling and simulation studies and inform future users of computational models. PMID:28072567
Symbolic-numeric interface: A review
NASA Technical Reports Server (NTRS)
Ng, E. W.
1980-01-01
A survey of the use of a combination of symbolic and numerical calculations is presented. Symbolic calculations primarily refer to the computer processing of procedures from classical algebra, analysis, and calculus. Numerical calculations refer to both numerical mathematics research and scientific computation. This survey is intended to point out a large number of problem areas where a cooperation of symbolic and numerical methods is likely to bear many fruits. These areas include such classical operations as differentiation and integration, such diverse activities as function approximations and qualitative analysis, and such contemporary topics as finite element calculations and computation complexity. It is contended that other less obvious topics such as the fast Fourier transform, linear algebra, nonlinear analysis and error analysis would also benefit from a synergistic approach.
Scientific Services on the Cloud
NASA Astrophysics Data System (ADS)
Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong
Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.
Center for Center for Technology for Advanced Scientific Component Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostadin, Damevski
A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technologymore » for Advanced Scientific Component Software (TASCS)1 tackles these these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.« less
NASA Astrophysics Data System (ADS)
Clementi, Enrico
2012-06-01
This is the introductory chapter to the AIP Proceedings volume "Theory and Applications of Computational Chemistry: The First Decade of the Second Millennium" where we discuss the evolution of "computational chemistry". Very early variational computational chemistry developments are reported in Sections 1 to 7, and 11, 12 by recalling some of the computational chemistry contributions by the author and his collaborators (from late 1950 to mid 1990); perturbation techniques are not considered in this already extended work. Present day's computational chemistry is partly considered in Sections 8 to 10 where more recent studies by the author and his collaborators are discussed, including the Hartree-Fock-Heitler-London method; a more general discussion on present day computational chemistry is presented in Section 14. The following chapters of this AIP volume provide a view of modern computational chemistry. Future computational chemistry developments can be extrapolated from the chapters of this AIP volume; further, in Sections 13 and 15 present an overall analysis on computational chemistry, obtained from the Global Simulation approach, by considering the evolution of scientific knowledge confronted with the opportunities offered by modern computers.
Computational Science: A Research Methodology for the 21st Century
NASA Astrophysics Data System (ADS)
Orbach, Raymond L.
2004-03-01
Computational simulation - a means of scientific discovery that employs computer systems to simulate a physical system according to laws derived from theory and experiment - has attained peer status with theory and experiment. Important advances in basic science are accomplished by a new "sociology" for ultrascale scientific computing capability (USSCC), a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Expansion of current capabilities by factors of 100 - 1000 open up new vistas for scientific discovery: long term climatic variability and change, macroscopic material design from correlated behavior at the nanoscale, design and optimization of magnetic confinement fusion reactors, strong interactions on a computational lattice through quantum chromodynamics, and stellar explosions and element production. The "virtual prototype," made possible by this expansion, can markedly reduce time-to-market for industrial applications such as jet engines and safer, more fuel efficient cleaner cars. In order to develop USSCC, the National Energy Research Scientific Computing Center (NERSC) announced the competition "Innovative and Novel Computational Impact on Theory and Experiment" (INCITE), with no requirement for current DOE sponsorship. Fifty nine proposals for grand challenge scientific problems were submitted for a small number of awards. The successful grants, and their preliminary progress, will be described.
From the desktop to the grid: scalable bioinformatics via workflow conversion.
de la Garza, Luis; Veit, Johannes; Szolek, Andras; Röttig, Marc; Aiche, Stephan; Gesing, Sandra; Reinert, Knut; Kohlbacher, Oliver
2016-03-12
Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well defined tasks, each with well defined inputs, parameters, and outputs, offers the immediate benefit of identifying bottlenecks, pinpoint sections which could benefit from parallelization, among others. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. There are several engines that give users the ability to design and execute workflows. Each engine was created to address certain problems of a specific community, therefore each one has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free -an aspect that could potentially drive away members of the scientific community. We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of parameters, inputs, outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources. Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We strongly believe that our efforts not only decrease the turnaround time to obtain scientific results but also have a positive impact on reproducibility, thus elevating the quality of obtained scientific results.
NASA Astrophysics Data System (ADS)
Doyle, Paul; Mtenzi, Fred; Smith, Niall; Collins, Adrian; O'Shea, Brendan
2012-09-01
The scientific community is in the midst of a data analysis crisis. The increasing capacity of scientific CCD instrumentation and their falling costs is contributing to an explosive generation of raw photometric data. This data must go through a process of cleaning and reduction before it can be used for high precision photometric analysis. Many existing data processing pipelines either assume a relatively small dataset or are batch processed by a High Performance Computing centre. A radical overhaul of these processing pipelines is required to allow reduction and cleaning rates to process terabyte sized datasets at near capture rates using an elastic processing architecture. The ability to access computing resources and to allow them to grow and shrink as demand fluctuates is essential, as is exploiting the parallel nature of the datasets. A distributed data processing pipeline is required. It should incorporate lossless data compression, allow for data segmentation and support processing of data segments in parallel. Academic institutes can collaborate and provide an elastic computing model without the requirement for large centralized high performance computing data centers. This paper demonstrates how a base 10 order of magnitude improvement in overall processing time has been achieved using the "ACN pipeline", a distributed pipeline spanning multiple academic institutes.
A computer-aided movement analysis system.
Fioretti, S; Leo, T; Pisani, E; Corradini, M L
1990-08-01
Interaction with biomechanical data concerning human movement analysis implies the adoption of various experimental equipments and the choice of suitable models, data processing, and graphical data restitution techniques. The integration of measurement setups with the associated experimental protocols and the relative software procedures constitutes a computer-aided movement analysis (CAMA) system. In the present paper such integration is mapped onto the causes that limit the clinical acceptance of movement analysis methods. The structure of the system is presented. A specific CAMA system devoted to posture analysis is described in order to show the attainable features. Scientific results obtained with the support of the described system are also reported.
A scientific workflow framework for (13)C metabolic flux analysis.
Dalman, Tolga; Wiechert, Wolfgang; Nöh, Katharina
2016-08-20
Metabolic flux analysis (MFA) with (13)C labeling data is a high-precision technique to quantify intracellular reaction rates (fluxes). One of the major challenges of (13)C MFA is the interactivity of the computational workflow according to which the fluxes are determined from the input data (metabolic network model, labeling data, and physiological rates). Here, the workflow assembly is inevitably determined by the scientist who has to consider interacting biological, experimental, and computational aspects. Decision-making is context dependent and requires expertise, rendering an automated evaluation process hardly possible. Here, we present a scientific workflow framework (SWF) for creating, executing, and controlling on demand (13)C MFA workflows. (13)C MFA-specific tools and libraries, such as the high-performance simulation toolbox 13CFLUX2, are wrapped as web services and thereby integrated into a service-oriented architecture. Besides workflow steering, the SWF features transparent provenance collection and enables full flexibility for ad hoc scripting solutions. To handle compute-intensive tasks, cloud computing is supported. We demonstrate how the challenges posed by (13)C MFA workflows can be solved with our approach on the basis of two proof-of-concept use cases. Copyright © 2015 Elsevier B.V. All rights reserved.
Managing a tier-2 computer centre with a private cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, Stefano; Berzano, Dario; Brunetti, Riccardo; Lusso, Stefano; Vallero, Sara
2014-06-01
In a typical scientific computing centre, several applications coexist and share a single physical infrastructure. An underlying Private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites, interactive data analysis facilities and others), allowing dynamic allocation resources to any application. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques. Such infrastructures are being deployed in some large centres (see e.g. the CERN Agile Infrastructure project), but with several open-source tools reaching maturity this is becoming viable also for smaller sites. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 centre, an Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The private cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem and the OpenWRT Linux distribution (used for network virtualization); a future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and OCCI.
Visualization and Interaction in Research, Teaching, and Scientific Communication
NASA Astrophysics Data System (ADS)
Ammon, C. J.
2017-12-01
Modern computing provides many tools for exploring observations, numerical calculations, and theoretical relationships. The number of options is, in fact, almost overwhelming. But the choices provide those with modest programming skills opportunities to create unique views of scientific information and to develop deeper insights into their data, their computations, and the underlying theoretical data-model relationships. I present simple examples of using animation and human-computer interaction to explore scientific data and scientific-analysis approaches. I illustrate how valuable a little programming ability can free scientists from the constraints of existing tools and can facilitate the development of deeper appreciation data and models. I present examples from a suite of programming languages ranging from C to JavaScript including the Wolfram Language. JavaScript is valuable for sharing tools and insight (hopefully) with others because it is integrated into one of the most powerful communication tools in human history, the web browser. Although too much of that power is often spent on distracting advertisements, the underlying computation and graphics engines are efficient, flexible, and almost universally available in desktop and mobile computing platforms. Many are working to fulfill the browser's potential to become the most effective tool for interactive study. Open-source frameworks for visualizing everything from algorithms to data are available, but advance rapidly. One strategy for dealing with swiftly changing tools is to adopt common, open data formats that are easily adapted (often by framework or tool developers). I illustrate the use of animation and interaction in research and teaching with examples from earthquake seismology.
Joint the Center for Applied Scientific Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, Todd; Bremer, Timo; Van Essen, Brian
The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.
1978-01-17
approach to designing computers: Formal mathematical methods were applied and computers themselves began to be widely used in designing other...capital, labor resources and the funds of consumers. Analysis of the model indicates that at the present time the average complexity of production of...ALGORITHMIC COMPLETENESS AND COMPLEXITY OF MICROPROGRAMS Kiev KIBERNETIKA in Russian No 3, May/Jun 77 pp 1-15 manuscript received 22 Dec 76 G0LUNK0V
78 FR 41046 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-09
... Services Administration, notice is hereby given that the Advanced Scientific Computing Advisory Committee will be renewed for a two-year period beginning on July 1, 2013. The Committee will provide advice to the Director, Office of Science (DOE), on the Advanced Scientific Computing Research Program managed...
Fulcher, Ben D; Jones, Nick S
2017-11-22
Phenotype measurements frequently take the form of time series, but we currently lack a systematic method for relating these complex data streams to scientifically meaningful outcomes, such as relating the movement dynamics of organisms to their genotype or measurements of brain dynamics of a patient to their disease diagnosis. Previous work addressed this problem by comparing implementations of thousands of diverse scientific time-series analysis methods in an approach termed highly comparative time-series analysis. Here, we introduce hctsa, a software tool for applying this methodological approach to data. hctsa includes an architecture for computing over 7,700 time-series features and a suite of analysis and visualization algorithms to automatically select useful and interpretable time-series features for a given application. Using exemplar applications to high-throughput phenotyping experiments, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in time-series data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pike, Bill
Data—lots of data—generated in seconds and piling up on the internet, streaming and stored in countless databases. Big data is important for commerce, society and our nation’s security. Yet the volume, velocity, variety and veracity of data is simply too great for any single analyst to make sense of alone. It requires advanced, data-intensive computing. Simply put, data-intensive computing is the use of sophisticated computers to sort through mounds of information and present analysts with solutions in the form of graphics, scenarios, formulas, new hypotheses and more. This scientific capability is foundational to PNNL’s energy, environment and security missions. Seniormore » Scientist and Division Director Bill Pike and his team are developing analytic tools that are used to solve important national challenges, including cyber systems defense, power grid control systems, intelligence analysis, climate change and scientific exploration.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boyd, J.; Herner, K.; Jayatilaka, B.
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and DO experiments each have nearly 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 or beyond. To achieve this, we are implementing a system that utilizes virtualization, automated validation, and migration to new standards in bothmore » software and data storage technology as well as leveraging resources available from currently-running experiments at Fermilab. Furthermore, these efforts will provide useful lessons in ensuring long-term data access for numerous experiments throughout high-energy physics, and provide a roadmap for high-quality scientific output for years to come.« less
Data preservation at the Fermilab Tevatron
Amerio, S.; Behari, S.; Boyd, J.; ...
2017-01-22
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and D0 experiments each have approximately 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 and beyond. To achieve this goal, we have implemented a system that utilizes virtualization, automated validation, and migration to new standards inmore » both software and data storage technology and leverages resources available from currently-running experiments at Fermilab. Lastly, these efforts have also provided useful lessons in ensuring long-term data access for numerous experiments, and enable high-quality scientific output for years to come.« less
Data preservation at the Fermilab Tevatron
Boyd, J.; Herner, K.; Jayatilaka, B.; ...
2015-12-23
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and DO experiments each have nearly 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 or beyond. To achieve this, we are implementing a system that utilizes virtualization, automated validation, and migration to new standards in bothmore » software and data storage technology as well as leveraging resources available from currently-running experiments at Fermilab. Furthermore, these efforts will provide useful lessons in ensuring long-term data access for numerous experiments throughout high-energy physics, and provide a roadmap for high-quality scientific output for years to come.« less
Computational knowledge integration in biopharmaceutical research.
Ficenec, David; Osborne, Mark; Pradines, Joel; Richards, Dan; Felciano, Ramon; Cho, Raymond J; Chen, Richard O; Liefeld, Ted; Owen, James; Ruttenberg, Alan; Reich, Christian; Horvath, Joseph; Clark, Tim
2003-09-01
An initiative to increase biopharmaceutical research productivity by capturing, sharing and computationally integrating proprietary scientific discoveries with public knowledge is described. This initiative involves both organisational process change and multiple interoperating software systems. The software components rely on mutually supporting integration techniques. These include a richly structured ontology, statistical analysis of experimental data against stored conclusions, natural language processing of public literature, secure document repositories with lightweight metadata, web services integration, enterprise web portals and relational databases. This approach has already begun to increase scientific productivity in our enterprise by creating an organisational memory (OM) of internal research findings, accessible on the web. Through bringing together these components it has also been possible to construct a very large and expanding repository of biological pathway information linked to this repository of findings which is extremely useful in analysis of DNA microarray data. This repository, in turn, enables our research paradigm to be shifted towards more comprehensive systems-based understandings of drug action.
Data preservation at the Fermilab Tevatron
NASA Astrophysics Data System (ADS)
Boyd, J.; Herner, K.; Jayatilaka, B.; Roser, R.; Sakumoto, W.
2015-12-01
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and DO experiments each have nearly 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 or beyond. To achieve this, we are implementing a system that utilizes virtualization, automated validation, and migration to new standards in both software and data storage technology as well as leveraging resources available from currently-running experiments at Fermilab. These efforts will provide useful lessons in ensuring long-term data access for numerous experiments throughout high-energy physics, and provide a roadmap for high-quality scientific output for years to come.
Multi-year Content Analysis of User Facility Related Publications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patton, Robert M; Stahl, Christopher G; Hines, Jayson
2013-01-01
Scientific user facilities provide resources and support that enable scientists to conduct experiments or simulations pertinent to their respective research. Consequently, it is critical to have an informed understanding of the impact and contributions that these facilities have on scientific discoveries. Leveraging insight into scientific publications that acknowledge the use of these facilities enables more informed decisions by facility management and sponsors in regard to policy, resource allocation, and influencing the direction of science as well as more effectively understand the impact of a scientific user facility. This work discusses preliminary results of mining scientific publications that utilized resources atmore » the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL). These results show promise in identifying and leveraging multi-year trends and providing a higher resolution view of the impact that a scientific user facility may have on scientific discoveries.« less
Guidelines for Preparation of a Scientific Paper
Kosiba, Margaret M.
1988-01-01
Even the experienced scientific writer may have difficulty transferring research results to clear, concise, publishable words. To assist the beginning scientific writer, guidelines are proposed that will provide direction for determining a topic, developing protocols, collecting data, using computers for analysis and word processing, incorporating copyediting notations, consulting scientific writing manuals, and developing sound writing habits. Guidelines for writing each section of a research paper are described to help the writer prepare the title page, introduction, materials and methods, results, and discussion sections of the paper, as well as the acknowledgments and references. Procedures for writing the first draft and subsequent revisions include a checklist of structural and stylistic problems and common errors in English usage. PMID:3339646
Computational chemistry and cheminformatics: an essay on the future.
Glen, Robert Charles
2012-01-01
Computers have changed the way we do science. Surrounded by a sea of data and with phenomenal computing capacity, the methodology and approach to scientific problems is evolving into a partnership between experiment, theory and data analysis. Given the pace of change of the last twenty-five years, it seems folly to speculate on the future, but along with unpredictable leaps of progress there will be a continuous evolution of capability, which points to opportunities and improvements that will certainly appear as our discipline matures.
Integrating Data Base into the Elementary School Science Program.
ERIC Educational Resources Information Center
Schlenker, Richard M.
This document describes seven science activities that combine scientific principles and computers. The objectives for the activities are to show students how the computer can be used as a tool to store and arrange scientific data, provide students with experience using the computer as a tool to manage scientific data, and provide students with…
NASA Astrophysics Data System (ADS)
Rahman, P. A.
2018-05-01
This scientific paper deals with the two-level backbone computer networks with arbitrary topology. A specialized method, offered by the author for calculation of the stationary availability factor of the two-level backbone computer networks, based on the Markov reliability models for the set of the independent repairable elements with the given failure and repair rates and the methods of the discrete mathematics, is also discussed. A specialized algorithm, offered by the author for analysis of the network connectivity, taking into account different kinds of the network equipment failures, is also observed. Finally, this paper presents an example of calculation of the stationary availability factor for the backbone computer network with the given topology.
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N
2017-03-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.
Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.
2016-01-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692
Intelligence and Accidents: A Multilevel Model
2006-05-06
individuals with low scores. Analysis Procedures The HLM 6 computer program (Raudenbush, Bryk, Cheong, & Congdon , 2004) was employed to conduct the...Cheong, Y. F., & Congdon , R. (2004). HLM 6: Hierarchical linear and nonlinear modeling. Chicago: Scientific Software International. Reynolds, D. H
Science in the cloud (SIC): A use case in MRI connectomics
Gorgolewski, Krzysztof J.; Kleissas, Dean; Roncal, William Gray; Litt, Brian; Wandell, Brian; Poldrack, Russel A.; Wiener, Martin; Vogelstein, R. Jacob; Burns, Randal
2017-01-01
Abstract Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called ‘science in the cloud’ (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended. PMID:28327935
Science in the cloud (SIC): A use case in MRI connectomics.
Kiar, Gregory; Gorgolewski, Krzysztof J; Kleissas, Dean; Roncal, William Gray; Litt, Brian; Wandell, Brian; Poldrack, Russel A; Wiener, Martin; Vogelstein, R Jacob; Burns, Randal; Vogelstein, Joshua T
2017-05-01
Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called 'science in the cloud' (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended. © The Author 2017. Published by Oxford University Press.
Using text analysis to quantify the similarity and evolution of scientific disciplines
Dias, Laércio; Scharloth, Joachim
2018-01-01
We use an information-theoretic measure of linguistic similarity to investigate the organization and evolution of scientific fields. An analysis of almost 20 M papers from the past three decades reveals that the linguistic similarity is related but different from experts and citation-based classifications, leading to an improved view on the organization of science. A temporal analysis of the similarity of fields shows that some fields (e.g. computer science) are becoming increasingly central, but that on average the similarity between pairs of disciplines has not changed in the last decades. This suggests that tendencies of convergence (e.g. multi-disciplinarity) and divergence (e.g. specialization) of disciplines are in balance. PMID:29410857
Using text analysis to quantify the similarity and evolution of scientific disciplines.
Dias, Laércio; Gerlach, Martin; Scharloth, Joachim; Altmann, Eduardo G
2018-01-01
We use an information-theoretic measure of linguistic similarity to investigate the organization and evolution of scientific fields. An analysis of almost 20 M papers from the past three decades reveals that the linguistic similarity is related but different from experts and citation-based classifications, leading to an improved view on the organization of science. A temporal analysis of the similarity of fields shows that some fields (e.g. computer science) are becoming increasingly central, but that on average the similarity between pairs of disciplines has not changed in the last decades. This suggests that tendencies of convergence (e.g. multi-disciplinarity) and divergence (e.g. specialization) of disciplines are in balance.
Constructing Scientific Arguments Using Evidence from Dynamic Computational Climate Models
ERIC Educational Resources Information Center
Pallant, Amy; Lee, Hee-Sun
2015-01-01
Modeling and argumentation are two important scientific practices students need to develop throughout school years. In this paper, we investigated how middle and high school students (N = 512) construct a scientific argument based on evidence from computational models with which they simulated climate change. We designed scientific argumentation…
Trends in Social Science: The Impact of Computational and Simulative Models
NASA Astrophysics Data System (ADS)
Conte, Rosaria; Paolucci, Mario; Cecconi, Federico
This paper discusses current progress in the computational social sciences. Specifically, it examines the following questions: Are the computational social sciences exhibiting positive or negative developments? What are the roles of agent-based models and simulation (ABM), network analysis, and other "computational" methods within this dynamic? (Conte, The necessity of intelligent agents in social simulation, Advances in Complex Systems, 3(01n04), 19-38, 2000; Conte 2010; Macy, Annual Review of Sociology, 143-166, 2002). Are there objective indicators of scientific growth that can be applied to different scientific areas, allowing for comparison among them? In this paper, some answers to these questions are presented and discussed. In particular, comparisons among different disciplines in the social and computational sciences are shown, taking into account their respective growth trends in the number of publication citations over the last few decades (culled from Google Scholar). After a short discussion of the methodology adopted, results of keyword-based queries are presented, unveiling some unexpected local impacts of simulation on the takeoff of traditionally poorly productive disciplines.
NASA Astrophysics Data System (ADS)
McAuliffe, C.; Ledley, T.; Dahlman, L.; Haddad, N.
2007-12-01
One of the challenges faced by Earth science teachers, particularly in K-12 settings, is that of connecting scientific research to classroom experiences. Helping teachers and students analyze Web-based scientific data is one way to bring scientific research to the classroom. The Earth Exploration Toolbook (EET) was developed as an online resource to accomplish precisely that. The EET consists of chapters containing step-by-step instructions for accessing Web-based scientific data and for using a software analysis tool to explore issues or concepts in science, technology, and mathematics. For example, in one EET chapter, users download Earthquake data from the USGS and bring it into a geographic information system (GIS), analyzing factors affecting the distribution of earthquakes. The goal of the EET Workshops project is to provide professional development that enables teachers to incorporate Web-based scientific data and analysis tools in ways that meet their curricular needs. In the EET Workshops project, Earth science teachers participate in a pair of workshops that are conducted in a combined teleconference and Web-conference format. In the first workshop, the EET Data Analysis Workshop, participants are introduced to the National Science Digital Library (NSDL) and the Digital Library for Earth System Education (DLESE). They also walk through an Earth Exploration Toolbook (EET) chapter and discuss ways to use Earth science datasets and tools with their students. In a follow-up second workshop, the EET Implementation Workshop, teachers share how they used these materials in the classroom by describing the projects and activities that they carried out with students. The EET Workshops project offers unique and effective professional development. Participants work at their own Internet-connected computers, and dial into a toll-free group teleconference for step-by-step facilitation and interaction. They also receive support via Elluminate, a Web-conferencing software program. The software allows participants to see the facilitator's computer as the analysis techniques of an EET chapter are demonstrated. If needed, the facilitator can also view individual participant's computers, assisting with technical difficulties. In addition, it enables a large number of end users, often widely distributed, to engage in interactive, real-time instruction. In this presentation, we will describe the elements of an EET Workshop pair, highlighting the capabilities and use of Elluminate. We will share lessons learned through several years of conducting this type of professional development. We will also share findings from survey data gathered from teachers who have participated in our workshops.
Parallel processing for scientific computations
NASA Technical Reports Server (NTRS)
Alkhatib, Hasan S.
1995-01-01
The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.
Spinal Cord Injury-Induced Dysautonomia via Plasticity in Paravertebral Sympathetic Postganglionic
2015-10-01
for future study Data analysis and publications Major Task 3: Data analysis and publications months % completion/ Completion dates Subtask 1...Data analysis 6-36 25% Subtask 2: Manuscript writing and submission 24-36 10% Milestone(s) Achieved: Dissemination of scientific results. b. What...ganglia by computational simulation and dynamic-clamp analysis . Journal of neurophysiology 92, 2659- 2671, doi:10.1152/jn.00470.2004 (2004). 14 Llewellyn
Opal web services for biomedical applications.
Ren, Jingyuan; Williams, Nadya; Clementi, Luca; Krishnan, Sriram; Li, Wilfred W
2010-07-01
Biomedical applications have become increasingly complex, and they often require large-scale high-performance computing resources with a large number of processors and memory. The complexity of application deployment and the advances in cluster, grid and cloud computing require new modes of support for biomedical research. Scientific Software as a Service (sSaaS) enables scalable and transparent access to biomedical applications through simple standards-based Web interfaces. Towards this end, we built a production web server (http://ws.nbcr.net) in August 2007 to support the bioinformatics application called MEME. The server has grown since to include docking analysis with AutoDock and AutoDock Vina, electrostatic calculations using PDB2PQR and APBS, and off-target analysis using SMAP. All the applications on the servers are powered by Opal, a toolkit that allows users to wrap scientific applications easily as web services without any modification to the scientific codes, by writing simple XML configuration files. Opal allows both web forms-based access and programmatic access of all our applications. The Opal toolkit currently supports SOAP-based Web service access to a number of popular applications from the National Biomedical Computation Resource (NBCR) and affiliated collaborative and service projects. In addition, Opal's programmatic access capability allows our applications to be accessed through many workflow tools, including Vision, Kepler, Nimrod/K and VisTrails. From mid-August 2007 to the end of 2009, we have successfully executed 239,814 jobs. The number of successfully executed jobs more than doubled from 205 to 411 per day between 2008 and 2009. The Opal-enabled service model is useful for a wide range of applications. It provides for interoperation with other applications with Web Service interfaces, and allows application developers to focus on the scientific tool and workflow development. Web server availability: http://ws.nbcr.net.
NASA Technical Reports Server (NTRS)
Sen, Syamal K.; Shaykhian, Gholam Ali
2011-01-01
MatLab(TradeMark)(MATrix LABoratory) is a numerical computation and simulation tool that is used by thousands Scientists and Engineers in many countries. MatLab does purely numerical calculations, which can be used as a glorified calculator or interpreter programming language; its real strength is in matrix manipulations. Computer algebra functionalities are achieved within the MatLab environment using "symbolic" toolbox. This feature is similar to computer algebra programs, provided by Maple or Mathematica to calculate with mathematical equations using symbolic operations. MatLab in its interpreter programming language form (command interface) is similar with well known programming languages such as C/C++, support data structures and cell arrays to define classes in object oriented programming. As such, MatLab is equipped with most of the essential constructs of a higher programming language. MatLab is packaged with an editor and debugging functionality useful to perform analysis of large MatLab programs and find errors. We believe there are many ways to approach real-world problems; prescribed methods to ensure foregoing solutions are incorporated in design and analysis of data processing and visualization can benefit engineers and scientist in gaining wider insight in actual implementation of their perspective experiments. This presentation will focus on data processing and visualizations aspects of engineering and scientific applications. Specifically, it will discuss methods and techniques to perform intermediate-level data processing covering engineering and scientific problems. MatLab programming techniques including reading various data files formats to produce customized publication-quality graphics, importing engineering and/or scientific data, organizing data in tabular format, exporting data to be used by other software programs such as Microsoft Excel, data presentation and visualization will be discussed.
COMPILATION OF GROUND-WATER MODELS
Ground-water modeling is a computer-based methodology for mathematical analysis of the mechanisms and controls of ground-water systems for the evaluation of policies, action, and designs that may affect such systems. n addition to satisfying scientific interest in the workings of...
Introduction to the LaRC central scientific computing complex
NASA Technical Reports Server (NTRS)
Shoosmith, John N.
1993-01-01
The computers and associated equipment that make up the Central Scientific Computing Complex of the Langley Research Center are briefly described. The electronic networks that provide access to the various components of the complex and a number of areas that can be used by Langley and contractors staff for special applications (scientific visualization, image processing, software engineering, and grid generation) are also described. Flight simulation facilities that use the central computers are described. Management of the complex, procedures for its use, and available services and resources are discussed. This document is intended for new users of the complex, for current users who wish to keep appraised of changes, and for visitors who need to understand the role of central scientific computers at Langley.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Allcock, William; Beggio, Chris
2014-10-17
U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at themore » DOE national laboratories. The report contains findings from that review.« less
GABBs: Cyberinfrastructure for Self-Service Geospatial Data Exploration, Computation, and Sharing
NASA Astrophysics Data System (ADS)
Song, C. X.; Zhao, L.; Biehl, L. L.; Merwade, V.; Villoria, N.
2016-12-01
Geospatial data are present everywhere today with the proliferation of location-aware computing devices. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. In addressing these needs, the Geospatial data Analysis Building Blocks (GABBs) project aims at building geospatial modeling, data analysis and visualization capabilities in an open source web platform, HUBzero. Funded by NSF's Data Infrastructure Building Blocks initiative, GABBs is creating a geospatial data architecture that integrates spatial data management, mapping and visualization, and interfaces in the HUBzero platform for scientific collaborations. The geo-rendering enabled Rappture toolkit, a generic Python mapping library, geospatial data exploration and publication tools, and an integrated online geospatial data management solution are among the software building blocks from the project. The GABBS software will be available through Amazon's AWS Marketplace VM images and open source. Hosting services are also available to the user community. The outcome of the project will enable researchers and educators to self-manage their scientific data, rapidly create GIS-enable tools, share geospatial data and tools on the web, and build dynamic workflows connecting data and tools, all without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the GABBs architecture, toolkits and libraries, and showcase the scientific use cases that utilize GABBs capabilities, as well as the challenges and solutions for GABBs to interoperate with other cyberinfrastructure platforms.
NASA Technical Reports Server (NTRS)
VanZandt, John
1994-01-01
The usage model of supercomputers for scientific applications, such as computational fluid dynamics (CFD), has changed over the years. Scientific visualization has moved scientists away from looking at numbers to looking at three-dimensional images, which capture the meaning of the data. This change has impacted the system models for computing. This report details the model which is used by scientists at NASA's research centers.
ERIC Educational Resources Information Center
Adams, Stephen T.
2004-01-01
Although one role of computers in science education is to help students learn specific science concepts, computers are especially intriguing as a vehicle for fostering the development of epistemological knowledge about the nature of scientific knowledge--what it means to "know" in a scientific sense (diSessa, 1985). In this vein, the…
Template Interfaces for Agile Parallel Data-Intensive Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramakrishnan, Lavanya; Gunter, Daniel; Pastorello, Gilerto Z.
Tigres provides a programming library to compose and execute large-scale data-intensive scientific workflows from desktops to supercomputers. DOE User Facilities and large science collaborations are increasingly generating large enough data sets that it is no longer practical to download them to a desktop to operate on them. They are instead stored at centralized compute and storage resources such as high performance computing (HPC) centers. Analysis of this data requires an ability to run on these facilities, but with current technologies, scaling an analysis to an HPC center and to a large data set is difficult even for experts. Tigres ismore » addressing the challenge of enabling collaborative analysis of DOE Science data through a new concept of reusable "templates" that enable scientists to easily compose, run and manage collaborative computational tasks. These templates define common computation patterns used in analyzing a data set.« less
Analysis of citation networks as a new tool for scientific research
Vasudevan, R. K.; Ziatdinov, M.; Chen, C.; ...
2016-12-06
The rapid growth of scientific publications necessitates new methods to understand the direction of scientific research within fields of study, ascertain the importance of particular groups, authors, or institutions, compute metrics that can determine the importance (centrality) of particular seminal papers, and provide insight into the social (collaboration) networks that are present. We present one such method based on analysis of citation networks, using the freely available CiteSpace Program. We use citation network analysis on three examples, including a single material that has been widely explored in the last decade (BiFeO 3), two small subfields with a minimal number ofmore » authors (flexoelectricity and Kitaev physics), and a much wider field with thousands of publications pertaining to a single technique (scanning tunneling microscopy). Interpretation of the analysis and key insights into the fields, such as whether the fields are experiencing resurgence or stagnation, are discussed, and author or collaboration networks that are prominent are determined. Such methods represent a paradigm shift in our way of dealing with the large volume of scientific publications and could change the way literature searches and reviews are conducted, as well as how the impact of specific work is assessed.« less
Analysis of citation networks as a new tool for scientific research
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vasudevan, R. K.; Ziatdinov, M.; Chen, C.
The rapid growth of scientific publications necessitates new methods to understand the direction of scientific research within fields of study, ascertain the importance of particular groups, authors, or institutions, compute metrics that can determine the importance (centrality) of particular seminal papers, and provide insight into the social (collaboration) networks that are present. We present one such method based on analysis of citation networks, using the freely available CiteSpace Program. We use citation network analysis on three examples, including a single material that has been widely explored in the last decade (BiFeO 3), two small subfields with a minimal number ofmore » authors (flexoelectricity and Kitaev physics), and a much wider field with thousands of publications pertaining to a single technique (scanning tunneling microscopy). Interpretation of the analysis and key insights into the fields, such as whether the fields are experiencing resurgence or stagnation, are discussed, and author or collaboration networks that are prominent are determined. Such methods represent a paradigm shift in our way of dealing with the large volume of scientific publications and could change the way literature searches and reviews are conducted, as well as how the impact of specific work is assessed.« less
Deelman, E.; Callaghan, S.; Field, E.; Francoeur, H.; Graves, R.; Gupta, N.; Gupta, V.; Jordan, T.H.; Kesselman, C.; Maechling, P.; Mehringer, J.; Mehta, G.; Okaya, D.; Vahi, K.; Zhao, L.
2006-01-01
This paper discusses the process of building an environment where large-scale, complex, scientific analysis can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build to the system, describe their functionality and interactions. We show the results of running the CyberShake analysis that included over 250,000 jobs using resources available through SCEC and the TeraGrid. ?? 2006 IEEE.
Enhancements to VTK enabling Scientific Visualization in Immersive Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick; Jhaveri, Sankhesh; Chaudhary, Aashish
Modern scientific, engineering and medical computational sim- ulations, as well as experimental and observational data sens- ing/measuring devices, produce enormous amounts of data. While statistical analysis provides insight into this data, scientific vi- sualization is tactically important for scientific discovery, prod- uct design and data analysis. These benefits are impeded, how- ever, when scientific visualization algorithms are implemented from scratch—a time-consuming and redundant process in im- mersive application development. This process can greatly ben- efit from leveraging the state-of-the-art open-source Visualization Toolkit (VTK) and its community. Over the past two (almost three) decades, integrating VTK with a virtual reality (VR)more » environment has only been attempted to varying degrees of success. In this pa- per, we demonstrate two new approaches to simplify this amalga- mation of an immersive interface with visualization rendering from VTK. In addition, we cover several enhancements to VTK that pro- vide near real-time updates and efficient interaction. Finally, we demonstrate the combination of VTK with both Vrui and OpenVR immersive environments in example applications.« less
NASA Technical Reports Server (NTRS)
1971-01-01
Detailed information on the spacecraft performance, mission operations, and tracking and data acquisition is presented for the Mariner Venus 1967 and Mariner Venus 1967 extension projects. Scientific and engineering results and conclusions are discussed, and include the scientific mission, encounter with Venus, observations near Earth, and cruise phase of the mission. Flight path analysis, spacecraft subsystems, and mission-related hardware and computer program development are covered. The scientific experiments carried by Mariner 5 were ultraviolet photometer, solar plasma probe, helium magnetometer, trapped radiation detector, S-band radio occultation, dual-frequency radio propagation, and celestial mechanics. The engineering experience gained by converting a space Mariner Mars 1964 spacecraft into one flown to Venus is also described.
76 FR 64330 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-18
... talks on HPC Reliability, Diffusion on Complex Networks, and Reversible Software Execution Systems Report from Applied Math Workshop on Mathematics for the Analysis, Simulation, and Optimization of Complex Systems Report from ASCR-BES Workshop on Data Challenges from Next Generation Facilities Public...
Using the High-Level Based Program Interface to Facilitate the Large Scale Scientific Computing
Shang, Yizi; Shang, Ling; Gao, Chuanchang; Lu, Guiming; Ye, Yuntao; Jia, Dongdong
2014-01-01
This paper is to make further research on facilitating the large-scale scientific computing on the grid and the desktop grid platform. The related issues include the programming method, the overhead of the high-level program interface based middleware, and the data anticipate migration. The block based Gauss Jordan algorithm as a real example of large-scale scientific computing is used to evaluate those issues presented above. The results show that the high-level based program interface makes the complex scientific applications on large-scale scientific platform easier, though a little overhead is unavoidable. Also, the data anticipation migration mechanism can improve the efficiency of the platform which needs to process big data based scientific applications. PMID:24574931
Defining Computational Thinking for Mathematics and Science Classrooms
ERIC Educational Resources Information Center
Weintrop, David; Beheshti, Elham; Horn, Michael; Orton, Kai; Jona, Kemi; Trouille, Laura; Wilensky, Uri
2016-01-01
Science and mathematics are becoming computational endeavors. This fact is reflected in the recently released Next Generation Science Standards and the decision to include "computational thinking" as a core scientific practice. With this addition, and the increased presence of computation in mathematics and scientific contexts, a new…
ERIC Educational Resources Information Center
Halbauer, Siegfried
1976-01-01
It was considered that students of intensive scientific Russian courses could learn vocabulary more efficiently if they were taught word stems and how to combine them with prefixes and suffixes to form scientific words. The computer programs developed to identify the most important stems is discussed. (Text is in German.) (FB)
Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division and Scientific Visualization Group
2018-05-07
Summer Lecture Series 2008: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in both experimental and computational sciences. Wes Bethel, who heads the Scientific Visualization Group in the Computational Research Division, presents an overview of visualization and computer graphics, current research challenges, and future directions for the field.
NASA Astrophysics Data System (ADS)
Michaelis, A.; Wang, W.; Melton, F. S.; Votava, P.; Milesi, C.; Hashimoto, H.; Nemani, R. R.; Hiatt, S. H.
2009-12-01
As the length and diversity of the global earth observation data records grow, modeling and analyses of biospheric conditions increasingly requires multiple terabytes of data from a diversity of models and sensors. With network bandwidth beginning to flatten, transmission of these data from centralized data archives presents an increasing challenge, and costs associated with local storage and management of data and compute resources are often significant for individual research and application development efforts. Sharing community valued intermediary data sets, results and codes from individual efforts with others that are not in direct funded collaboration can also be a challenge with respect to time, cost and expertise. We purpose a modeling, data and knowledge center that houses NASA satellite data, climate data and ancillary data where a focused community may come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform, named Ecosystem Modeling Center (EMC). With the recent development of new technologies for secure hardware virtualization, an opportunity exists to create specific modeling, analysis and compute environments that are customizable, “archiveable” and transferable. Allowing users to instantiate such environments on large compute infrastructures that are directly connected to large data archives may significantly reduce costs and time associated with scientific efforts by alleviating users from redundantly retrieving and integrating data sets and building modeling analysis codes. The EMC platform also provides the possibility for users receiving indirect assistance from expertise through prefabricated compute environments, potentially reducing study “ramp up” times.
NASA Technical Reports Server (NTRS)
Vallee, J.; Gibbs, B.
1976-01-01
Between August 1975 and March 1976, two NASA projects with geographically separated participants used a computer-conferencing system developed by the Institute for the Future for portions of their work. Monthly usage statistics for the system were collected in order to examine the group and individual participation figures for all conferences. The conference transcripts were analysed to derive observations about the use of the medium. In addition to the results of these analyses, the attitudes of users and the major components of the costs of computer conferencing are discussed.
Diversity of social ties in scientific collaboration networks
NASA Astrophysics Data System (ADS)
Shi, Quan; Xu, Bo; Xu, Xiaomin; Xiao, Yanghua; Wang, Wei; Wang, Hengshan
2011-11-01
Diversity is one of the important perspectives to characterize behaviors of individuals in social networks. It is intuitively believed that diversity of social ties accounts for competition advantage and idea innovation. However, quantitative evidences in a real large social network can be rarely found in the previous research. Thanks to the availability of scientific publication records on WWW; now we can construct a large scientific collaboration network, which provides us a chance to gain insight into the diversity of relationships in a real social network through statistical analysis. In this article, we dedicate our efforts to perform empirical analysis on a scientific collaboration network extracted from DBLP, an online bibliographic database in computer science, in a systematical way, finding the following: distributions of diversity indices tend to decay in an exponential or Gaussian way; diversity indices are not trivially correlated to existing vertex importance measures; authors of diverse social ties tend to connect to each other and these authors are generally more competitive than others.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chase Qishi; Zhu, Michelle Mengxia
The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models featuremore » diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100Gpbs Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.« less
Software for Planning Scientific Activities on Mars
NASA Technical Reports Server (NTRS)
Ai-Chang, Mitchell; Bresina, John; Jonsson, Ari; Hsu, Jennifer; Kanefsky, Bob; Morris, Paul; Rajan, Kanna; Yglesias, Jeffrey; Charest, Len; Maldague, Pierre
2003-01-01
Mixed-Initiative Activity Plan Generator (MAPGEN) is a ground-based computer program for planning and scheduling the scientific activities of instrumented exploratory robotic vehicles, within the limitations of available resources onboard the vehicle. MAPGEN is a combination of two prior software systems: (1) an activity-planning program, APGEN, developed at NASA s Jet Propulsion Laboratory and (2) the Europa planner/scheduler from NASA Ames Research Center. MAPGEN performs all of the following functions: Automatic generation of plans and schedules for scientific and engineering activities; Testing of hypotheses (or what-if analyses of various scenarios); Editing of plans; Computation and analysis of resources; and Enforcement and maintenance of constraints, including resolution of temporal and resource conflicts among planned activities. MAPGEN can be used in either of two modes: one in which the planner/scheduler is turned off and only the basic APGEN functionality is utilized, or one in which both component programs are used to obtain the full planning, scheduling, and constraint-maintenance functionality.
Laptop Use, Interactive Science Software, and Science Learning Among At-Risk Students
NASA Astrophysics Data System (ADS)
Zheng, Binbin; Warschauer, Mark; Hwang, Jin Kyoung; Collins, Penelope
2014-08-01
This year-long, quasi-experimental study investigated the impact of the use of netbook computers and interactive science software on fifth-grade students' science learning processes, academic achievement, and interest in further science, technology, engineering, and mathematics (STEM) study within a linguistically diverse school district in California. Analysis of students' state standardized science test scores indicated that the program helped close gaps in scientific achievement between at-risk learners (i.e., English learners, Hispanics, and free/reduced-lunch recipients) and their counterparts. Teacher and student interviews and classroom observations suggested that computer-supported visual representations and interactions supported diverse learners' scientific understanding and inquiry and enabled more individualized and differentiated instruction. Finally, interviews revealed that the program had a positive impact on students' motivation in science and on their interest in pursuing science-related careers. This study suggests that technology-facilitated science instruction is beneficial for improving at-risk students' science achievement, scaffolding students' scientific understanding, and strengthening students' motivation to pursue STEM-related careers.
Data Mining as a Service (DMaaS)
NASA Astrophysics Data System (ADS)
Tejedor, E.; Piparo, D.; Mascetti, L.; Moscicki, J.; Lamanna, M.; Mato, P.
2016-10-01
Data Mining as a Service (DMaaS) is a software and computing infrastructure that allows interactive mining of scientific data in the cloud. It allows users to run advanced data analyses by leveraging the widely adopted Jupyter notebook interface. Furthermore, the system makes it easier to share results and scientific code, access scientific software, produce tutorials and demonstrations as well as preserve the analyses of scientists. This paper describes how a first pilot of the DMaaS service is being deployed at CERN, starting from the notebook interface that has been fully integrated with the ROOT analysis framework, in order to provide all the tools for scientists to run their analyses. Additionally, we characterise the service backend, which combines a set of IT services such as user authentication, virtual computing infrastructure, mass storage, file synchronisation, development portals or batch systems. The added value acquired by the combination of the aforementioned categories of services is discussed, focusing on the opportunities offered by the CERNBox synchronisation service and its massive storage backend, EOS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aurich, Maike K.; Fleming, Ronan M. T.; Thiele, Ines
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorialsmore » explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.« less
Scientific Visualization, Seeing the Unseeable
LBNL
2017-12-09
June 24, 2008 Berkeley Lab lecture: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in bo... June 24, 2008 Berkeley Lab lecture: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in both experimental and computational sciences. Wes Bethel, who heads the Scientific Visualization Group in the Computational Research Division, presents an overview of visualization and computer graphics, current research challenges, and future directions for the field.
Bringing Legacy Visualization Software to Modern Computing Devices via Application Streaming
NASA Astrophysics Data System (ADS)
Fisher, Ward
2014-05-01
Planning software compatibility across forthcoming generations of computing platforms is a problem commonly encountered in software engineering and development. While this problem can affect any class of software, data analysis and visualization programs are particularly vulnerable. This is due in part to their inherent dependency on specialized hardware and computing environments. A number of strategies and tools have been designed to aid software engineers with this task. While generally embraced by developers at 'traditional' software companies, these methodologies are often dismissed by the scientific software community as unwieldy, inefficient and unnecessary. As a result, many important and storied scientific software packages can struggle to adapt to a new computing environment; for example, one in which much work is carried out on sub-laptop devices (such as tablets and smartphones). Rewriting these packages for a new platform often requires significant investment in terms of development time and developer expertise. In many cases, porting older software to modern devices is neither practical nor possible. As a result, replacement software must be developed from scratch, wasting resources better spent on other projects. Enabled largely by the rapid rise and adoption of cloud computing platforms, 'Application Streaming' technologies allow legacy visualization and analysis software to be operated wholly from a client device (be it laptop, tablet or smartphone) while retaining full functionality and interactivity. It mitigates much of the developer effort required by other more traditional methods while simultaneously reducing the time it takes to bring the software to a new platform. This work will provide an overview of Application Streaming and how it compares against other technologies which allow scientific visualization software to be executed from a remote computer. We will discuss the functionality and limitations of existing application streaming frameworks and how a developer might prepare their software for application streaming. We will also examine the secondary benefits realized by moving legacy software to the cloud. Finally, we will examine the process by which a legacy Java application, the Integrated Data Viewer (IDV), is to be adapted for tablet computing via Application Streaming.
Practical Use of Computationally Frugal Model Analysis Methods
Hill, Mary C.; Kavetski, Dmitri; Clark, Martyn; ...
2015-03-21
Computationally frugal methods of model analysis can provide substantial benefits when developing models of groundwater and other environmental systems. Model analysis includes ways to evaluate model adequacy and to perform sensitivity and uncertainty analysis. Frugal methods typically require 10s of parallelizable model runs; their convenience allows for other uses of the computational effort. We suggest that model analysis be posed as a set of questions used to organize methods that range from frugal to expensive (requiring 10,000 model runs or more). This encourages focus on method utility, even when methods have starkly different theoretical backgrounds. We note that many frugalmore » methods are more useful when unrealistic process-model nonlinearities are reduced. Inexpensive diagnostics are identified for determining when frugal methods are advantageous. Examples from the literature are used to demonstrate local methods and the diagnostics. We suggest that the greater use of computationally frugal model analysis methods would allow questions such as those posed in this work to be addressed more routinely, allowing the environmental sciences community to obtain greater scientific insight from the many ongoing and future modeling efforts« less
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special- purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allows, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
Department of Defense In-House RDT and E Activities: Management Analysis Report for Fiscal Year 1993
1994-11-01
A worldwide unique lab because it houses a high - speed modeling and simulation system, a prototype...E Division, San Diego, CA: High Performance Computing Laboratory providing a wide range of advanced computer systems for the scientific investigation...Machines CM-200 and a 256-node Thinking Machines CM-S. The CM-5 is in a very large memory, ( high performance 32 Gbytes, >4 0 OFlop) coafiguration,
Methods for design and evaluation of parallel computating systems (The PISCES project)
NASA Technical Reports Server (NTRS)
Pratt, Terrence W.; Wise, Robert; Haught, Mary JO
1989-01-01
The PISCES project started in 1984 under the sponsorship of the NASA Computational Structural Mechanics (CSM) program. A PISCES 1 programming environment and parallel FORTRAN were implemented in 1984 for the DEC VAX (using UNIX processes to simulate parallel processes). This system was used for experimentation with parallel programs for scientific applications and AI (dynamic scene analysis) applications. PISCES 1 was ported to a network of Apollo workstations by N. Fitzgerald.
Documentation of the data analysis system for the gamma ray monitor aboard OSO-H
NASA Technical Reports Server (NTRS)
Croteau, S.; Buck, A.; Higbie, P.; Kantauskis, J.; Foss, S.; Chupp, D.; Forrest, D. J.; Suri, A.; Gleske, I.
1973-01-01
The programming system is presented which was developed to prepare the data from the gamma ray monitor on OSO-7 for scientific analysis. The detector, data, and objectives are described in detail. Programs presented include; FEEDER, PASS-1, CAL1, CAL2, PASS-3, Van Allen Belt Predict Program, Computation Center Plot Routine, and Response Function Programs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of manymore » computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.« less
Gimeno-Blanes, Francisco J.; Blanco-Velasco, Manuel; Barquero-Pérez, Óscar; García-Alberola, Arcadi; Rojo-Álvarez, José L.
2016-01-01
Great effort has been devoted in recent years to the development of sudden cardiac risk predictors as a function of electric cardiac signals, mainly obtained from the electrocardiogram (ECG) analysis. But these prediction techniques are still seldom used in clinical practice, partly due to its limited diagnostic accuracy and to the lack of consensus about the appropriate computational signal processing implementation. This paper addresses a three-fold approach, based on ECG indices, to structure this review on sudden cardiac risk stratification. First, throughout the computational techniques that had been widely proposed for obtaining these indices in technical literature. Second, over the scientific evidence, that although is supported by observational clinical studies, they are not always representative enough. And third, via the limited technology transfer of academy-accepted algorithms, requiring further meditation for future systems. We focus on three families of ECG derived indices which are tackled from the aforementioned viewpoints, namely, heart rate turbulence (HRT), heart rate variability (HRV), and T-wave alternans. In terms of computational algorithms, we still need clearer scientific evidence, standardizing, and benchmarking, siting on advanced algorithms applied over large and representative datasets. New scenarios like electronic health recordings, big data, long-term monitoring, and cloud databases, will eventually open new frameworks to foresee suitable new paradigms in the near future. PMID:27014083
Exploring Cloud Computing for Large-scale Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Guang; Han, Binh; Yin, Jian
This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on-demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often just requires a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amount of data in the terabyte or even petabyte range and require special high performance hardware with low latency connections to complete computation in a reasonable amount of time. To address thesemore » challenges, we build an infrastructure that can dynamically select high performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a system biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.« less
Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Dean N.; Silva, Claudio
2013-09-30
For the past three years, a large analysis and visualization effort—funded by the Department of Energy’s Office of Biological and Environmental Research (BER), the National Aeronautics and Space Administration (NASA), and the National Oceanic and Atmospheric Administration (NOAA)—has brought together a wide variety of industry-standard scientific computing libraries and applications to create Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT) to serve the global climate simulation and observational research communities. To support interactive analysis and visualization, all components connect through a provenance application–programming interface to capture meaningful history and workflow. Components can be loosely coupled into the framework for fast integrationmore » or tightly coupled for greater system functionality and communication with other components. The overarching goal of UV-CDAT is to provide a new paradigm for access to and analysis of massive, distributed scientific data collections by leveraging distributed data architectures located throughout the world. The UV-CDAT framework addresses challenges in analysis and visualization and incorporates new opportunities, including parallelism for better efficiency, higher speed, and more accurate scientific inferences. Today, it provides more than 600 users access to more analysis and visualization products than any other single source.« less
Data and Communications in Basic Energy Sciences: Creating a Pathway for Scientific Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nugent, Peter E.; Simonson, J. Michael
2011-10-24
This report is based on the Department of Energy (DOE) Workshop on “Data and Communications in Basic Energy Sciences: Creating a Pathway for Scientific Discovery” that was held at the Bethesda Marriott in Maryland on October 24-25, 2011. The workshop brought together leading researchers from the Basic Energy Sciences (BES) facilities and Advanced Scientific Computing Research (ASCR). The workshop was co-sponsored by these two Offices to identify opportunities and needs for data analysis, ownership, storage, mining, provenance and data transfer at light sources, neutron sources, microscopy centers and other facilities. Their charge was to identify current and anticipated issues inmore » the acquisition, analysis, communication and storage of experimental data that could impact the progress of scientific discovery, ascertain what knowledge, methods and tools are needed to mitigate present and projected shortcomings and to create the foundation for information exchanges and collaboration between ASCR and BES supported researchers and facilities. The workshop was organized in the context of the impending data tsunami that will be produced by DOE’s BES facilities. Current facilities, like SLAC National Accelerator Laboratory’s Linac Coherent Light Source, can produce up to 18 terabytes (TB) per day, while upgraded detectors at Lawrence Berkeley National Laboratory’s Advanced Light Source will generate ~10TB per hour. The expectation is that these rates will increase by over an order of magnitude in the coming decade. The urgency to develop new strategies and methods in order to stay ahead of this deluge and extract the most science from these facilities was recognized by all. The four focus areas addressed in this workshop were: Workflow Management - Experiment to Science: Identifying and managing the data path from experiment to publication. Theory and Algorithms: Recognizing the need for new tools for computation at scale, supporting large data sets and realistic theoretical models. Visualization and Analysis: Supporting near-real-time feedback for experiment optimization and new ways to extract and communicate critical information from large data sets. Data Processing and Management: Outlining needs in computational and communication approaches and infrastructure needed to handle unprecedented data volume and information content. It should be noted that almost all participants recognized that there were unlikely to be any turn-key solutions available due to the unique, diverse nature of the BES community, where research at adjacent beamlines at a given light source facility often span everything from biology to materials science to chemistry using scattering, imaging and/or spectroscopy. However, it was also noted that advances supported by other programs in data research, methodologies, and tool development could be implemented on reasonable time scales with modest effort. Adapting available standard file formats, robust workflows, and in-situ analysis tools for user facility needs could pay long-term dividends. Workshop participants assessed current requirements as well as future challenges and made the following recommendations in order to achieve the ultimate goal of enabling transformative science in current and future BES facilities: Theory and analysis components should be integrated seamlessly within experimental workflow. Develop new algorithms for data analysis based on common data formats and toolsets. Move analysis closer to experiment. Move the analysis closer to the experiment to enable real-time (in-situ) streaming capabilities, live visualization of the experiment and an increase of the overall experimental efficiency. Match data management access and capabilities with advancements in detectors and sources. Remove bottlenecks, provide interoperability across different facilities/beamlines and apply forefront mathematical techniques to more efficiently extract science from the experiments. This workshop report examines and reviews the status of several BES facilities and highlights the successes and shortcomings of the current data and communication pathways for scientific discovery. It then ascertains what methods and tools are needed to mitigate present and projected data bottlenecks to science over the next 10 years. The goal of this report is to create the foundation for information exchanges and collaborations among ASCR and BES supported researchers, the BES scientific user facilities, and ASCR computing and networking facilities. To jumpstart these activities, there was a strong desire to see a joint effort between ASCR and BES along the lines of the highly successful Scientific Discovery through Advanced Computing (SciDAC) program in which integrated teams of engineers, scientists and computer scientists were engaged to tackle a complete end-to-end workflow solution at one or more beamlines, to ascertain what challenges will need to be addressed in order to handle future increases in data« less
An Overview of the Computational Physics and Methods Group at Los Alamos National Laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
CCS Division was formed to strengthen the visibility and impact of computer science and computational physics research on strategic directions for the Laboratory. Both computer science and computational science are now central to scientific discovery and innovation. They have become indispensable tools for all other scientific missions at the Laboratory. CCS Division forms a bridge between external partners and Laboratory programs, bringing new ideas and technologies to bear on today’s important problems and attracting high-quality technical staff members to the Laboratory. The Computational Physics and Methods Group CCS-2 conducts methods research and develops scientific software aimed at the latest andmore » emerging HPC systems.« less
[Earth Science Technology Office's Computational Technologies Project
NASA Technical Reports Server (NTRS)
Fischer, James (Technical Monitor); Merkey, Phillip
2005-01-01
This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
Big Data Ecosystems Enable Scientific Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Critchlow, Terence J.; Kleese van Dam, Kerstin
Over the past 5 years, advances in experimental, sensor and computational technologies have driven the exponential growth in the volumes, acquisition rates, variety and complexity of scientific data. As noted by Hey et al in their 2009 e-book The Fourth Paradigm, this availability of large-quantities of scientifically meaningful data has given rise to a new scientific methodology - data intensive science. Data intensive science is the ability to formulate and evaluate hypotheses using data and analysis to extend, complement and, at times, replace experimentation, theory, or simulation. This new approach to science no longer requires scientists to interact directly withmore » the objects of their research; instead they can utilize digitally captured, reduced, calibrated, analyzed, synthesized and visualized results - allowing them carry out 'experiments' in data.« less
2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation
Warren, Michael S.
2014-01-01
We report on improvements made over the past two decades to our adaptive treecode N-body method (HOT). A mathematical and computational approach to the cosmological N-body problem is described, with performance and scalability measured up to 256k (2 18 ) processors. We present error analysis and scientific application results from a series of more than ten 69 billion (4096 3 ) particle cosmological simulations, accounting for 4×10 20 floating point operations. These results include the first simulations using the new constraints on the standard model of cosmology from the Planck satellite. Our simulations set a new standard for accuracy andmore » scientific throughput, while meeting or exceeding the computational efficiency of the latest generation of hybrid TreePM N-body methods.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gallarno, George; Rogers, James H; Maxwell, Don E
The high computational capability of graphics processing units (GPUs) is enabling and driving the scientific discovery process at large-scale. The world s second fastest supercomputer for open science, Titan, has more than 18,000 GPUs that computational scientists use to perform scientific simu- lations and data analysis. Understanding of GPU reliability characteristics, however, is still in its nascent stage since GPUs have only recently been deployed at large-scale. This paper presents a detailed study of GPU errors and their impact on system operations and applications, describing experiences with the 18,688 GPUs on the Titan supercom- puter as well as lessons learnedmore » in the process of efficient operation of GPUs at scale. These experiences are helpful to HPC sites which already have large-scale GPU clusters or plan to deploy GPUs in the future.« less
Computers and Computation. Readings from Scientific American.
ERIC Educational Resources Information Center
Fenichel, Robert R.; Weizenbaum, Joseph
A collection of articles from "Scientific American" magazine has been put together at this time because the current period in computer science is one of consolidation rather than innovation. A few years ago, computer science was moving so swiftly that even the professional journals were more archival than informative; but today it is…
The impact of computer science in molecular medicine: enabling high-throughput research.
de la Iglesia, Diana; García-Remesal, Miguel; de la Calle, Guillermo; Kulikowski, Casimir; Sanz, Ferran; Maojo, Víctor
2013-01-01
The Human Genome Project and the explosion of high-throughput data have transformed the areas of molecular and personalized medicine, which are producing a wide range of studies and experimental results and providing new insights for developing medical applications. Research in many interdisciplinary fields is resulting in data repositories and computational tools that support a wide diversity of tasks: genome sequencing, genome-wide association studies, analysis of genotype-phenotype interactions, drug toxicity and side effects assessment, prediction of protein interactions and diseases, development of computational models, biomarker discovery, and many others. The authors of the present paper have developed several inventories covering tools, initiatives and studies in different computational fields related to molecular medicine: medical informatics, bioinformatics, clinical informatics and nanoinformatics. With these inventories, created by mining the scientific literature, we have carried out several reviews of these fields, providing researchers with a useful framework to locate, discover, search and integrate resources. In this paper we present an analysis of the state-of-the-art as it relates to computational resources for molecular medicine, based on results compiled in our inventories, as well as results extracted from a systematic review of the literature and other scientific media. The present review is based on the impact of their related publications and the available data and software resources for molecular medicine. It aims to provide information that can be useful to support ongoing research and work to improve diagnostics and therapeutics based on molecular-level insights.
Multimedia Image Technology and Computer Aided Manufacturing Engineering Analysis
NASA Astrophysics Data System (ADS)
Nan, Song
2018-03-01
Since the reform and opening up, with the continuous development of science and technology in China, more and more advanced science and technology have emerged under the trend of diversification. Multimedia imaging technology, for example, has a significant and positive impact on computer aided manufacturing engineering in China. From the perspective of scientific and technological advancement and development, the multimedia image technology has a very positive influence on the application and development of computer-aided manufacturing engineering, whether in function or function play. Therefore, this paper mainly starts from the concept of multimedia image technology to analyze the application of multimedia image technology in computer aided manufacturing engineering.
Computational perspectives in the history of science: to the memory of Peter Damerow.
Laubichler, Manfred D; Maienschein, Jane; Renn, Jürgen
2013-03-01
Computational methods and perspectives can transform the history of science by enabling the pursuit of novel types of questions, dramatically expanding the scale of analysis (geographically and temporally), and offering novel forms of publication that greatly enhance access and transparency. This essay presents a brief summary of a computational research system for the history of science, discussing its implications for research, education, and publication practices and its connections to the open-access movement and similar transformations in the natural and social sciences that emphasize big data. It also argues that computational approaches help to reconnect the history of science to individual scientific disciplines.
Zhou, Ji; Applegate, Christopher; Alonso, Albor Dobon; Reynolds, Daniel; Orford, Simon; Mackiewicz, Michal; Griffiths, Simon; Penfield, Steven; Pullen, Nick
2017-01-01
Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch image processing, multiple traits analyses, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic. Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different computing platforms. To facilitate diverse scientific communities, we provide three software versions, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and a well-commented interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat ( Triticum aestivum ) growth studies at the Norwich Research Park (NRP, UK). By quantifying a number of growth phenotypes over time, we have identified diverse plant growth patterns between different genotypes under several experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices (e.g. smartphones and digital cameras) and still produced reliable biological outputs, we therefore believe that our automated analysis workflow and customised computer vision based feature extraction software implementation can facilitate a broader plant research community for their growth and development studies. Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, we believe that our software not only can contribute to biological research, but also demonstrates how to utilise existing open numeric and scientific libraries (e.g. Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, in a efficient and effective way. Leaf-GP is a sophisticated software application that provides three approaches to quantify growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and HPC, with open Python-based scientific libraries preinstalled. Our work presents the advancement of how to integrate computer vision, image analysis, machine learning and software engineering in plant phenomics software implementation. To serve the plant research community, our modulated source code, detailed comments, executables (.exe for Windows; .app for Mac), and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.
Reducing Time to Science: Unidata and JupyterHub Technology Using the Jetstream Cloud
NASA Astrophysics Data System (ADS)
Chastang, J.; Signell, R. P.; Fischer, J. L.
2017-12-01
Cloud computing can accelerate scientific workflows, discovery, and collaborations by reducing research and data friction. We describe the deployment of Unidata and JupyterHub technologies on the NSF-funded XSEDE Jetstream cloud. With the aid of virtual machines and Docker technology, we deploy a Unidata JupyterHub server co-located with a Local Data Manager (LDM), THREDDS data server (TDS), and RAMADDA geoscience content management system. We provide Jupyter Notebooks and the pre-built Python environments needed to run them. The notebooks can be used for instruction and as templates for scientific experimentation and discovery. We also supply a large quantity of NCEP forecast model results to allow data-proximate analysis and visualization. In addition, users can transfer data using Globus command line tools, and perform their own data-proximate analysis and visualization with Notebook technology. These data can be shared with others via a dedicated TDS server for scientific distribution and collaboration. There are many benefits of this approach. Not only is the cloud computing environment fast, reliable and scalable, but scientists can analyze, visualize, and share data using only their web browser. No local specialized desktop software or a fast internet connection is required. This environment will enable scientists to spend less time managing their software and more time doing science.
THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS
2017-09-01
Evan Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, Saul Perlmutter. Scientific Computing Meets Big Data Technology: An Astronomy ...Processing Astronomy Imagery Using Big Data Technology. IEEE Transaction on Big Data, 2016. Approved for Public Release; Distribution Unlimited. 22 [93
Fitting the Jigsaw of Citation: Information Visualization in Domain Analysis.
ERIC Educational Resources Information Center
Chen, Chaomei; Paul, Ray J.; O'Keefe, Bob
2001-01-01
Discusses the role of information visualization in modeling and representing intellectual structures associated with scientific disciplines and visualizes the domain of computer graphics based on bibliographic data from author cocitation patterns. Highlights include author cocitation maps, citation time lines, animation of a high-dimensional…
NASA Astrophysics Data System (ADS)
Memon, Shahbaz; Vallot, Dorothée; Zwinger, Thomas; Neukirchen, Helmut
2017-04-01
Scientific communities generate complex simulations through orchestration of semi-structured analysis pipelines which involves execution of large workflows on multiple, distributed and heterogeneous computing and data resources. Modeling ice dynamics of glaciers requires workflows consisting of many non-trivial, computationally expensive processing tasks which are coupled to each other. From this domain, we present an e-Science use case, a workflow, which requires the execution of a continuum ice flow model and a discrete element based calving model in an iterative manner. Apart from the execution, this workflow also contains data format conversion tasks that support the execution of ice flow and calving by means of transition through sequential, nested and iterative steps. Thus, the management and monitoring of all the processing tasks including data management and transfer of the workflow model becomes more complex. From the implementation perspective, this workflow model was initially developed on a set of scripts using static data input and output references. In the course of application usage when more scripts or modifications introduced as per user requirements, the debugging and validation of results were more cumbersome to achieve. To address these problems, we identified a need to have a high-level scientific workflow tool through which all the above mentioned processes can be achieved in an efficient and usable manner. We decided to make use of the e-Science middleware UNICORE (Uniform Interface to Computing Resources) that allows seamless and automated access to different heterogenous and distributed resources which is supported by a scientific workflow engine. Based on this, we developed a high-level scientific workflow model for coupling of massively parallel High-Performance Computing (HPC) jobs: a continuum ice sheet model (Elmer/Ice) and a discrete element calving and crevassing model (HiDEM). In our talk we present how the use of a high-level scientific workflow middleware enables reproducibility of results more convenient and also provides a reusable and portable workflow template that can be deployed across different computing infrastructures. Acknowledgements This work was kindly supported by NordForsk as part of the Nordic Center of Excellence (NCoE) eSTICC (eScience Tools for Investigating Climate Change at High Northern Latitudes) and the Top-level Research Initiative NCoE SVALI (Stability and Variation of Arctic Land Ice).
MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models
Aurich, Maike K.; Fleming, Ronan M. T.; Thiele, Ines
2016-08-03
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorialsmore » explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.« less
Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing
NASA Astrophysics Data System (ADS)
Meng, Xiang
The domain of nanoscale optical science and technology is a combination of the classical world of electromagnetics and the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allows the optical structures to be scaled down to nanoscale size or even to the atomic level, which are far smaller than the wavelength they are designed for. These nanostructures can have unique, controllable, and tunable optical properties and their interactions with quantum materials can have important near-field and far-field optical response. Undoubtedly, these optical properties can have many important applications, ranging from the efficient and tunable light sources, detectors, filters, modulators, high-speed all-optical switches; to the next-generation classical and quantum computation, and biophotonic medical sensors. This emerging research of nanoscience, known as nanophotonics, is a highly interdisciplinary field requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales where the nature of the nanostructured matter controls the interactions. In addition, the fast advancements in the computing capabilities, such as parallel computing, also become as a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, and the design for these devices require extensive memory and extremely long core hours. Thus distributed computing platforms associated with parallel computing are required for faster designs processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses the computing machines to analyze and solve otherwise intractable scientific challenges. In particular, parallel computing are forms of computation operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently. In this dissertation, we report a series of new nanophotonic developments using the advanced parallel computing techniques. The applications include the structure optimizations at the nanoscale to control both the electromagnetic response of materials, and to manipulate nanoscale structures for enhanced field concentration, which enable breakthroughs in imaging, sensing systems (chapter 3 and 4) and improve the spatial-temporal resolutions of spectroscopies (chapter 5). We also report the investigations on the confinement study of optical-matter interactions at the quantum mechanical regime, where the size-dependent novel properties enhanced a wide range of technologies from the tunable and efficient light sources, detectors, to other nanophotonic elements with enhanced functionality (chapter 6 and 7).
Nuclear Fuel Depletion Analysis Using Matlab Software
NASA Astrophysics Data System (ADS)
Faghihi, F.; Nematollahi, M. R.
Coupled first order IVPs are frequently used in many parts of engineering and sciences. In this article, we presented a code including three computer programs which are joint with the Matlab software to solve and plot the solutions of the first order coupled stiff or non-stiff IVPs. Some engineering and scientific problems related to IVPs are given and fuel depletion (production of the 239Pu isotope) in a Pressurized Water Nuclear Reactor (PWR) are computed by the present code.
2008-02-01
journal article. Didactic coursework requirements for the PhD degree have been completed at this time as well as successful presentation of the...Libraries", Modern Software Tools in Scientific Computing. Birkhauser Press, pp. 163-202, 1997. [5] Doyley MM, Weaver JB, Van Houten EEW, Kennedy FE...data from MR, x-ray computed tomography (CT) and digital photography have been used to successfully drive the algorithm in two-dimensional (2D) work
Using Microsoft PowerPoint as an Astronomical Image Analysis Tool
NASA Astrophysics Data System (ADS)
Beck-Winchatz, Bernhard
2006-12-01
Engaging students in the analysis of authentic scientific data is an effective way to teach them about the scientific process and to develop their problem solving, teamwork and communication skills. In astronomy several image processing and analysis software tools have been developed for use in school environments. However, the practical implementation in the classroom is often difficult because the teachers may not have the comfort level with computers necessary to install and use these tools, they may not have adequate computer privileges and/or support, and they may not have the time to learn how to use specialized astronomy software. To address this problem, we have developed a set of activities in which students analyze astronomical images using basic tools provided in PowerPoint. These include measuring sizes, distances, and angles, and blinking images. In contrast to specialized software, PowerPoint is broadly available on school computers. Many teachers are already familiar with PowerPoint, and the skills developed while learning how to analyze astronomical images are highly transferable. We will discuss several practical examples of measurements, including the following: -Variations in the distances to the sun and moon from their angular sizes -Magnetic declination from images of shadows -Diameter of the moon from lunar eclipse images -Sizes of lunar craters -Orbital radii of the Jovian moons and mass of Jupiter -Supernova and comet searches -Expansion rate of the universe from images of distant galaxies
NASA Astrophysics Data System (ADS)
Bergey, Bradley W.; Ketelhut, Diane Jass; Liang, Senfeng; Natarajan, Uma; Karakus, Melissa
2015-10-01
The primary aim of the study was to examine whether performance on a science assessment in an immersive virtual environment was associated with changes in scientific inquiry self-efficacy. A secondary aim of the study was to examine whether performance on the science assessment was equitable for students with different levels of computer game self-efficacy, including whether gender differences were observed. We examined 407 middle school students' scientific inquiry self-efficacy and computer game self-efficacy before and after completing a computer game-like assessment about a science mystery. Results from path analyses indicated that prior scientific inquiry self-efficacy predicted achievement on end-of-module questions, which in turn predicted change in scientific inquiry self-efficacy. By contrast, computer game self-efficacy was neither predictive of nor predicted by performance on the science assessment. While boys had higher computer game self-efficacy compared to girls, multi-group analyses suggested only minor gender differences in how efficacy beliefs related to performance. Implications for assessments with virtual environments and future design and research are discussed.
Nyström, Pär; Falck-Ytter, Terje; Gredebäck, Gustaf
2016-06-01
This article describes a new open source scientific workflow system, the TimeStudio Project, dedicated to the behavioral and brain sciences. The program is written in MATLAB and features a graphical user interface for the dynamic pipelining of computer algorithms developed as TimeStudio plugins. TimeStudio includes both a set of general plugins (for reading data files, modifying data structures, visualizing data structures, etc.) and a set of plugins specifically developed for the analysis of event-related eyetracking data as a proof of concept. It is possible to create custom plugins to integrate new or existing MATLAB code anywhere in a workflow, making TimeStudio a flexible workbench for organizing and performing a wide range of analyses. The system also features an integrated sharing and archiving tool for TimeStudio workflows, which can be used to share workflows both during the data analysis phase and after scientific publication. TimeStudio thus facilitates the reproduction and replication of scientific studies, increases the transparency of analyses, and reduces individual researchers' analysis workload. The project website ( http://timestudioproject.com ) contains the latest releases of TimeStudio, together with documentation and user forums.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hules, John
This 1998 annual report from the National Scientific Energy Research Computing Center (NERSC) presents the year in review of the following categories: Computational Science; Computer Science and Applied Mathematics; and Systems and Services. Also presented are science highlights in the following categories: Basic Energy Sciences; Biological and Environmental Research; Fusion Energy Sciences; High Energy and Nuclear Physics; and Advanced Scientific Computing Research and Other Projects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Computational Research Division, Lawrence Berkeley National Laboratory; NERSC, Lawrence Berkeley National Laboratory; Computer Science Department, University of California, Berkeley
2009-05-04
We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4x when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads permore » MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.« less
Android application and REST server system for quasar spectrum presentation and analysis
NASA Astrophysics Data System (ADS)
Wasiewicz, P.; Pietralik, K.; Hryniewicz, K.
2017-08-01
This paper describes the implementation of a system consisting of a mobile application and RESTful architecture server intended for the analysis and presentation of quasars' spectrum. It also depicts the quasar's characteristics and significance to the scientific community, the source for acquiring astronomical objects' spectral data, used software solutions as well as presents the aspect of Cloud Computing and various possible deployment configurations.
The microcomputer scientific software series 3: general linear model--analysis of variance.
Harold M. Rauscher
1985-01-01
A BASIC language set of programs, designed for use on microcomputers, is presented. This set of programs will perform the analysis of variance for any statistical model describing either balanced or unbalanced designs. The program computes and displays the degrees of freedom, Type I sum of squares, and the mean square for the overall model, the error, and each factor...
Hypergraph-Based Combinatorial Optimization of Matrix-Vector Multiplication
ERIC Educational Resources Information Center
Wolf, Michael Maclean
2009-01-01
Combinatorial scientific computing plays an important enabling role in computational science, particularly in high performance scientific computing. In this thesis, we will describe our work on optimizing matrix-vector multiplication using combinatorial techniques. Our research has focused on two different problems in combinatorial scientific…
ERIC Educational Resources Information Center
Evans, C. D.
This paper describes the experiences of the industrial research laboratory of Kodak Ltd. in finding and providing a computer terminal most suited to its very varied requirements. These requirements include bibliographic and scientific data searching and access to a number of worldwide computing services for scientific computing work. The provision…
Amplify scientific discovery with artificial intelligence
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gil, Yolanda; Greaves, Mark T.; Hendler, James
Computing innovations have fundamentally changed many aspects of scientific inquiry. For example, advances in robotics, high-end computing, networking, and databases now underlie much of what we do in science such as gene sequencing, general number crunching, sharing information between scientists, and analyzing large amounts of data. As computing has evolved at a rapid pace, so too has its impact in science, with the most recent computing innovations repeatedly being brought to bear to facilitate new forms of inquiry. Recently, advances in Artificial Intelligence (AI) have deeply penetrated many consumer sectors, including for example Apple’s Siri™ speech recognition system, real-time automatedmore » language translation services, and a new generation of self-driving cars and self-navigating drones. However, AI has yet to achieve comparable levels of penetration in scientific inquiry, despite its tremendous potential in aiding computers to help scientists tackle tasks that require scientific reasoning. We contend that advances in AI will transform the practice of science as we are increasingly able to effectively and jointly harness human and machine intelligence in the pursuit of major scientific challenges.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahrens, J.P.; Shapiro, L.G.; Tanimoto, S.L.
1997-04-01
This paper describes a computing environment which supports computer-based scientific research work. Key features include support for automatic distributed scheduling and execution and computer-based scientific experimentation. A new flexible and extensible scheduling technique that is responsive to a user`s scheduling constraints, such as the ordering of program results and the specification of task assignments and processor utilization levels, is presented. An easy-to-use constraint language for specifying scheduling constraints, based on the relational database query language SQL, is described along with a search-based algorithm for fulfilling these constraints. A set of performance studies show that the environment can schedule and executemore » program graphs on a network of workstations as the user requests. A method for automatically generating computer-based scientific experiments is described. Experiments provide a concise method of specifying a large collection of parameterized program executions. The environment achieved significant speedups when executing experiments; for a large collection of scientific experiments an average speedup of 3.4 on an average of 5.5 scheduled processors was obtained.« less
NAS (Numerical Aerodynamic Simulation Program) technical summaries, March 1989 - February 1990
NASA Technical Reports Server (NTRS)
1990-01-01
Given here are selected scientific results from the Numerical Aerodynamic Simulation (NAS) Program's third year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP supercomputer. Topics covered include flow field analysis of fighter wing configurations, large-scale ocean modeling, the Space Shuttle flow field, advanced computational fluid dynamics (CFD) codes for rotary-wing airloads and performance prediction, turbulence modeling of separated flows, airloads and acoustics of rotorcraft, vortex-induced nonlinearities on submarines, and standing oblique detonation waves.
Investigation of Storage Options for Scientific Computing on Grid and Cloud Facilities
NASA Astrophysics Data System (ADS)
Garzoglio, Gabriele
2012-12-01
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on “bare metal” nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
Investigation of storage options for scientific computing on Grid and Cloud facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storagemore » server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on bare metal nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.« less
High throughput computing: a solution for scientific analysis
O'Donnell, M.
2011-01-01
handle job failures due to hardware, software, or network interruptions (obviating the need to manually resubmit the job after each stoppage); be affordable; and most importantly, allow us to complete very large, complex analyses that otherwise would not even be possible. In short, we envisioned a job-management system that would take advantage of unused FORT CPUs within a local area network (LAN) to effectively distribute and run highly complex analytical processes. What we found was a solution that uses High Throughput Computing (HTC) and High Performance Computing (HPC) systems to do exactly that (Figure 1).
ERIC Educational Resources Information Center
Gegner, Julie A.; Mackay, Donald H. J.; Mayer, Richard E.
2009-01-01
High school students can access original scientific research articles on the Internet, but may have trouble understanding them. To address this problem of online literacy, the authors developed a computer-based prototype for guiding students' comprehension of scientific articles. High school students were asked to read an original scientific…
ERIC Educational Resources Information Center
Weiss, Charles J.
2017-01-01
The Scientific Computing for Chemists course taught at Wabash College teaches chemistry students to use the Python programming language, Jupyter notebooks, and a number of common Python scientific libraries to process, analyze, and visualize data. Assuming no prior programming experience, the course introduces students to basic programming and…
Computational chemistry in pharmaceutical research: at the crossroads.
Bajorath, Jürgen
2012-01-01
Computational approaches are an integral part of pharmaceutical research. However, there are many of unsolved key questions that limit the scientific progress in the still evolving computational field and its impact on drug discovery. Importantly, a number of these questions are not new but date back many years. Hence, it might be difficult to conclusively answer them in the foreseeable future. Moreover, the computational field as a whole is characterized by a high degree of heterogeneity and so is, unfortunately, the quality of its scientific output. In light of this situation, it is proposed that changes in scientific standards and culture should be seriously considered now in order to lay a foundation for future progress in computational research.
[Earth and Space Sciences Project Services for NASA HPCC
NASA Technical Reports Server (NTRS)
Merkey, Phillip
2002-01-01
This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
Software Reuse Methods to Improve Technological Infrastructure for e-Science
NASA Technical Reports Server (NTRS)
Marshall, James J.; Downs, Robert R.; Mattmann, Chris A.
2011-01-01
Social computing has the potential to contribute to scientific research. Ongoing developments in information and communications technology improve capabilities for enabling scientific research, including research fostered by social computing capabilities. The recent emergence of e-Science practices has demonstrated the benefits from improvements in the technological infrastructure, or cyber-infrastructure, that has been developed to support science. Cloud computing is one example of this e-Science trend. Our own work in the area of software reuse offers methods that can be used to improve new technological development, including cloud computing capabilities, to support scientific research practices. In this paper, we focus on software reuse and its potential to contribute to the development and evaluation of information systems and related services designed to support new capabilities for conducting scientific research.
Multidimensional Environmental Data Resource Brokering on Computational Grids and Scientific Clouds
NASA Astrophysics Data System (ADS)
Montella, Raffaele; Giunta, Giulio; Laccetti, Giuliano
Grid computing has widely evolved over the past years, and its capabilities have found their way even into business products and are no longer relegated to scientific applications. Today, grid computing technology is not restricted to a set of specific grid open source or industrial products, but rather it is comprised of a set of capabilities virtually within any kind of software to create shared and highly collaborative production environments. These environments are focused on computational (workload) capabilities and the integration of information (data) into those computational capabilities. An active grid computing application field is the fully virtualization of scientific instruments in order to increase their availability and decrease operational and maintaining costs. Computational and information grids allow to manage real-world objects in a service-oriented way using industrial world-spread standards.
78 FR 6087 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-29
... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building... Theory and Experiment (INCITE) Public Comment (10-minute rule) Public Participation: The meeting is open...
Computational Science in Armenia (Invited Talk)
NASA Astrophysics Data System (ADS)
Marandjian, H.; Shoukourian, Yu.
This survey is devoted to the development of informatics and computer science in Armenia. The results in theoretical computer science (algebraic models, solutions to systems of general form recursive equations, the methods of coding theory, pattern recognition and image processing), constitute the theoretical basis for developing problem-solving-oriented environments. As examples can be mentioned: a synthesizer of optimized distributed recursive programs, software tools for cluster-oriented implementations of two-dimensional cellular automata, a grid-aware web interface with advanced service trading for linear algebra calculations. In the direction of solving scientific problems that require high-performance computing resources, examples of completed projects include the field of physics (parallel computing of complex quantum systems), astrophysics (Armenian virtual laboratory), biology (molecular dynamics study of human red blood cell membrane), meteorology (implementing and evaluating the Weather Research and Forecast Model for the territory of Armenia). The overview also notes that the Institute for Informatics and Automation Problems of the National Academy of Sciences of Armenia has established a scientific and educational infrastructure, uniting computing clusters of scientific and educational institutions of the country and provides the scientific community with access to local and international computational resources, that is a strong support for computational science in Armenia.
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
ASCR Cybersecurity for Scientific Computing Integrity - Research Pathways and Ideas Workshop
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisert, Sean; Potok, Thomas E.; Jones, Todd
At the request of the U.S. Department of Energy's (DOE) Office of Science (SC) Advanced Scientific Computing Research (ASCR) program office, a workshop was held June 2-3, 2015, in Gaithersburg, MD, to identify potential long term (10 to +20 year) cybersecurity fundamental basic research and development challenges, strategies and roadmap facing future high performance computing (HPC), networks, data centers, and extreme-scale scientific user facilities. This workshop was a follow-on to the workshop held January 7-9, 2015, in Rockville, MD, that examined higher level ideas about scientific computing integrity specific to the mission of the DOE Office of Science. Issues includedmore » research computation and simulation that takes place on ASCR computing facilities and networks, as well as network-connected scientific instruments, such as those run by various DOE Office of Science programs. Workshop participants included researchers and operational staff from DOE national laboratories, as well as academic researchers and industry experts. Participants were selected based on the submission of abstracts relating to the topics discussed in the previous workshop report [1] and also from other ASCR reports, including "Abstract Machine Models and Proxy Architectures for Exascale Computing" [27], the DOE "Preliminary Conceptual Design for an Exascale Computing Initiative" [28], and the January 2015 machine learning workshop [29]. The workshop was also attended by several observers from DOE and other government agencies. The workshop was divided into three topic areas: (1) Trustworthy Supercomputing, (2) Extreme-Scale Data, Knowledge, and Analytics for Understanding and Improving Cybersecurity, and (3) Trust within High-end Networking and Data Centers. Participants were divided into three corresponding teams based on the category of their abstracts. The workshop began with a series of talks from the program manager and workshop chair, followed by the leaders for each of the three topics and a representative of each of the four major DOE Office of Science Advanced Scientific Computing Research Facilities: the Argonne Leadership Computing Facility (ALCF), the Energy Sciences Network (ESnet), the National Energy Research Scientific Computing Center (NERSC), and the Oak Ridge Leadership Computing Facility (OLCF). The rest of the workshop consisted of topical breakout discussions and focused writing periods that produced much of this report.« less
Distributed storage and cloud computing: a test case
NASA Astrophysics Data System (ADS)
Piano, S.; Delia Ricca, G.
2014-06-01
Since 2003 the computing farm hosted by the INFN Tier3 facility in Trieste supports the activities of many scientific communities. Hundreds of jobs from 45 different VOs, including those of the LHC experiments, are processed simultaneously. Given that normally the requirements of the different computational communities are not synchronized, the probability that at any given time the resources owned by one of the participants are not fully utilized is quite high. A balanced compensation should in principle allocate the free resources to other users, but there are limits to this mechanism. In fact, the Trieste site may not hold the amount of data needed to attract enough analysis jobs, and even in that case there could be a lack of bandwidth for their access. The Trieste ALICE and CMS computing groups, in collaboration with other Italian groups, aim to overcome the limitations of existing solutions using two approaches: sharing the data among all the participants taking full advantage of GARR-X wide area networks (10 GB/s) and integrating the resources dedicated to batch analysis with the ones reserved for dynamic interactive analysis, through modern solutions as cloud computing.
Building Cognition: The Construction of Computational Representations for Scientific Discovery
ERIC Educational Resources Information Center
Chandrasekharan, Sanjay; Nersessian, Nancy J.
2015-01-01
Novel computational representations, such as simulation models of complex systems and video games for scientific discovery (Foldit, EteRNA etc.), are dramatically changing the way discoveries emerge in science and engineering. The cognitive roles played by such computational representations in discovery are not well understood. We present a…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hey, Tony; Agarwal, Deborah; Borgman, Christine
The Advanced Scientific Computing Advisory Committee (ASCAC) was charged to form a standing subcommittee to review the Department of Energy’s Office of Scientific and Technical Information (OSTI) and to begin by assessing the quality and effectiveness of OSTI’s recent and current products and services and to comment on its mission and future directions in the rapidly changing environment for scientific publication and data. The Committee met with OSTI staff and reviewed available products, services and other materials. This report summaries their initial findings and recommendations.
Paramedir: A Tool for Programmable Performance Analysis
NASA Technical Reports Server (NTRS)
Jost, Gabriele; Labarta, Jesus; Gimenez, Judit
2004-01-01
Performance analysis of parallel scientific applications is time consuming and requires great expertise in areas such as programming paradigms, system software, and computer hardware architectures. In this paper we describe a tool that facilitates the programmability of performance metric calculations thereby allowing the automation of the analysis and reducing the application development time. We demonstrate how the system can be used to capture knowledge and intuition acquired by advanced parallel programmers in order to be transferred to novice users.
1990-12-07
Fundaqao Calouste Gulbenkian, Instituto Gulbenkian de Ci~ncia, Centro de C6lculo Cientifico , Coimbra, 1973. 28, Dirac, P. A. M., Spinors in Hilbert Space...Office of Scientific Research grants 1965 Mathematical Association of America Editorial Prize for the article entitled: "Linear Transformations on...matrices" 1966 L.R. Ford Memorial Prize awarded by the Mathematical Association of America for the article , "Permanents" 1989 Outstanding Computer
A comparative analysis of soft computing techniques for gene prediction.
Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand
2013-07-01
The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important and is currently the focus of many research efforts. Beside its scientific interest in the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques in gene prediction. First, the problem of gene prediction and its challenges are described. These are followed by different soft computing techniques along with their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.
The Quantitative Analysis of User Behavior Online - Data, Models and Algorithms
NASA Astrophysics Data System (ADS)
Raghavan, Prabhakar
By blending principles from mechanism design, algorithms, machine learning and massive distributed computing, the search industry has become good at optimizing monetization on sound scientific principles. This represents a successful and growing partnership between computer science and microeconomics. When it comes to understanding how online users respond to the content and experiences presented to them, we have more of a lacuna in the collaboration between computer science and certain social sciences. We will use a concrete technical example from image search results presentation, developing in the process some algorithmic and machine learning problems of interest in their own right. We then use this example to motivate the kinds of studies that need to grow between computer science and the social sciences; a critical element of this is the need to blend large-scale data analysis with smaller-scale eye-tracking and "individualized" lab studies.
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Howser, Lona M.
1993-01-01
The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different than the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and X-based debugger, CXdb. The document id intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained with the vendor manuals. It is appropriate for both the novice and the experienced user.
ERIC Educational Resources Information Center
Jacobson, Michael J.; Taylor, Charlotte E.; Richards, Deborah
2016-01-01
In this paper, we propose computational scientific inquiry (CSI) as an innovative model for learning important scientific knowledge and new practices for "doing" science. This approach involves the use of a "game-like" virtual world for students to experience virtual biological fieldwork in conjunction with using an agent-based…
ERIC Educational Resources Information Center
Hulshof, Casper D.; de Jong, Ton
2006-01-01
Students encounter many obstacles during scientific discovery learning with computer-based simulations. It is hypothesized that an effective type of support, that does not interfere with the scientific discovery learning process, should be delivered on a "just-in-time" base. This study explores the effect of facilitating access to…
Christensen, A. J.; Srinivasan, V.; Hart, J. C.; ...
2018-03-17
Sustainable crop production is a contributing factor to current and future food security. Innovative technologies are needed to design strategies that will achieve higher crop yields on less land and with fewer resources. Computational modeling coupled with advanced scientific visualization enables researchers to explore and interact with complex agriculture, nutrition, and climate data to predict how crops will respond to untested environments. These virtual observations and predictions can direct the development of crop ideotypes designed to meet future yield and nutritional demands. This review surveys modeling strategies for the development of crop ideotypes and scientific visualization technologies that have ledmore » to discoveries in “big data” analysis. Combined modeling and visualization approaches have been used to realistically simulate crops and to guide selection that immediately enhances crop quantity and quality under challenging environmental conditions. Lastly, this survey of current and developing technologies indicates that integrative modeling and advanced scientific visualization may help overcome challenges in agriculture and nutrition data as large-scale and multidimensional data become available in these fields.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christensen, A. J.; Srinivasan, V.; Hart, J. C.
Sustainable crop production is a contributing factor to current and future food security. Innovative technologies are needed to design strategies that will achieve higher crop yields on less land and with fewer resources. Computational modeling coupled with advanced scientific visualization enables researchers to explore and interact with complex agriculture, nutrition, and climate data to predict how crops will respond to untested environments. These virtual observations and predictions can direct the development of crop ideotypes designed to meet future yield and nutritional demands. This review surveys modeling strategies for the development of crop ideotypes and scientific visualization technologies that have ledmore » to discoveries in “big data” analysis. Combined modeling and visualization approaches have been used to realistically simulate crops and to guide selection that immediately enhances crop quantity and quality under challenging environmental conditions. Lastly, this survey of current and developing technologies indicates that integrative modeling and advanced scientific visualization may help overcome challenges in agriculture and nutrition data as large-scale and multidimensional data become available in these fields.« less
Christensen, A J; Srinivasan, Venkatraman; Hart, John C; Marshall-Colon, Amy
2018-05-01
Sustainable crop production is a contributing factor to current and future food security. Innovative technologies are needed to design strategies that will achieve higher crop yields on less land and with fewer resources. Computational modeling coupled with advanced scientific visualization enables researchers to explore and interact with complex agriculture, nutrition, and climate data to predict how crops will respond to untested environments. These virtual observations and predictions can direct the development of crop ideotypes designed to meet future yield and nutritional demands. This review surveys modeling strategies for the development of crop ideotypes and scientific visualization technologies that have led to discoveries in "big data" analysis. Combined modeling and visualization approaches have been used to realistically simulate crops and to guide selection that immediately enhances crop quantity and quality under challenging environmental conditions. This survey of current and developing technologies indicates that integrative modeling and advanced scientific visualization may help overcome challenges in agriculture and nutrition data as large-scale and multidimensional data become available in these fields.
Christensen, A J; Srinivasan, Venkatraman; Hart, John C; Marshall-Colon, Amy
2018-01-01
Abstract Sustainable crop production is a contributing factor to current and future food security. Innovative technologies are needed to design strategies that will achieve higher crop yields on less land and with fewer resources. Computational modeling coupled with advanced scientific visualization enables researchers to explore and interact with complex agriculture, nutrition, and climate data to predict how crops will respond to untested environments. These virtual observations and predictions can direct the development of crop ideotypes designed to meet future yield and nutritional demands. This review surveys modeling strategies for the development of crop ideotypes and scientific visualization technologies that have led to discoveries in “big data” analysis. Combined modeling and visualization approaches have been used to realistically simulate crops and to guide selection that immediately enhances crop quantity and quality under challenging environmental conditions. This survey of current and developing technologies indicates that integrative modeling and advanced scientific visualization may help overcome challenges in agriculture and nutrition data as large-scale and multidimensional data become available in these fields. PMID:29562368
NASA Technical Reports Server (NTRS)
1986-01-01
Digital Imaging is the computer processed numerical representation of physical images. Enhancement of images results in easier interpretation. Quantitative digital image analysis by Perceptive Scientific Instruments, locates objects within an image and measures them to extract quantitative information. Applications are CAT scanners, radiography, microscopy in medicine as well as various industrial and manufacturing uses. The PSICOM 327 performs all digital image analysis functions. It is based on Jet Propulsion Laboratory technology, is accurate and cost efficient.
JPRS Report, Science & Technology, USSR: Computers
1987-09-29
Reliability of Protected Systems (L.S. Stoykova, O.A. Yushchenko; KIBERNETIKA, No 5, Sep-Oct 86) U Decision Making Based on Analysis of a Decision...34 published by the Central Scientific Research Institute for Information and Technoeconomic Research on Material and Technical Supply (TsNIITEIMS) of the...was said becomes clear after a subconscious analysis of the context. We have built our device according to the same pattern. In contrast to its
Cazzaniga, Paolo; Nobile, Marco S.; Besozzi, Daniela; Bellini, Matteo; Mauri, Giancarlo
2014-01-01
The introduction of general-purpose Graphics Processing Units (GPUs) is boosting scientific applications in Bioinformatics, Systems Biology, and Computational Biology. In these fields, the use of high-performance computing solutions is motivated by the need of performing large numbers of in silico analysis to study the behavior of biological systems in different conditions, which necessitate a computing power that usually overtakes the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analysis, and show that GPU-accelerated parallel simulations of this model can increase the computational performances up to a 181× speedup compared to the corresponding sequential simulations. PMID:25025072
Interoperability of GADU in using heterogeneous Grid resources for bioinformatics applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sulakhe, D.; Rodriguez, A.; Wilde, M.
2008-03-01
Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual datamore » system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.« less
ERIC Educational Resources Information Center
Chan, Kit Yu Karen; Yang, Sylvia; Maliska, Max E.; Grunbaum, Daniel
2012-01-01
The National Science Education Standards have highlighted the importance of active learning and reflection for contemporary scientific methods in K-12 classrooms, including the use of models. Computer modeling and visualization are tools that researchers employ in their scientific inquiry process, and often computer models are used in…
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
ERIC Educational Resources Information Center
Younge, Andrew J.
2016-01-01
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob
2003-01-01
The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the perfor- mance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.
Challenges of Big Data Analysis.
Fan, Jianqing; Han, Fang; Liu, Han
2014-06-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize on the viability of the sparsest solution in high-confidence set and point out that exogeneous assumptions in most statistical methods for Big Data can not be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
Challenges of Big Data Analysis
Fan, Jianqing; Han, Fang; Liu, Han
2014-01-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize on the viability of the sparsest solution in high-confidence set and point out that exogeneous assumptions in most statistical methods for Big Data can not be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469
Microgravity sciences application visiting scientist program
NASA Technical Reports Server (NTRS)
Glicksman, Martin; Vanalstine, James
1995-01-01
Marshall Space Flight Center pursues scientific research in the area of low-gravity effects on materials and processes. To facilitate these Government performed research responsibilities, a number of supplementary research tasks were accomplished by a group of specialized visiting scientists. They participated in work on contemporary research problems with specific objectives related to current or future space flight experiments and defined and established independent programs of research which were based on scientific peer review and the relevance of the defined research to NASA microgravity for implementing a portion of the national program. The programs included research in the following areas: protein crystal growth, X-ray crystallography and computer analysis of protein crystal structure, optimization and analysis of protein crystal growth techniques, and design and testing of flight hardware.
Entering the 'big data' era in medicinal chemistry: molecular promiscuity analysis revisited.
Hu, Ye; Bajorath, Jürgen
2017-06-01
The 'big data' concept plays an increasingly important role in many scientific fields. Big data involves more than unprecedentedly large volumes of data that become available. Different criteria characterizing big data must be carefully considered in computational data mining, as we discuss herein focusing on medicinal chemistry. This is a scientific discipline where big data is beginning to emerge and provide new opportunities. For example, the ability of many drugs to specifically interact with multiple targets, termed promiscuity, forms the molecular basis of polypharmacology, a hot topic in drug discovery. Compound promiscuity analysis is an area that is much influenced by big data phenomena. Different results are obtained depending on chosen data selection and confidence criteria, as we also demonstrate.
The European perspective for LSST
NASA Astrophysics Data System (ADS)
Gangler, Emmanuel
2017-06-01
LSST is a next generation telescope that will produce an unprecedented data flow. The project goal is to deliver data products such as images and catalogs thus enabling scientific analysis for a wide community of users. As a large scale survey, LSST data will be complementary with other facilities in a wide range of scientific domains, including data from ESA or ESO. European countries have invested in LSST since 2007, in the construction of the camera as well as in the computing effort. This latter will be instrumental in designing the next step: how to distribute LSST data to Europe. Astroinformatics challenges for LSST indeed includes not only the analysis of LSST big data, but also the practical efficiency of the data access.
NASA Technical Reports Server (NTRS)
Oliger, Joseph
1992-01-01
The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on 6 June 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under a cooperative agreement with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. A flexible scientific staff is provided through a university faculty visitor program, a post doctoral program, and a student visitor program. Not only does this provide appropriate expertise but it also introduces scientists outside of NASA to NASA problems. A small group of core RIACS staff provides continuity and interacts with an ARC technical monitor and scientific advisory group to determine the RIACS mission. RIACS activities are reviewed and monitored by a USRA advisory council and ARC technical monitor. Research at RIACS is currently being done in the following areas: Parallel Computing; Advanced Methods for Scientific Computing; Learning Systems; High Performance Networks and Technology; Graphics, Visualization, and Virtual Environments.
NASA Astrophysics Data System (ADS)
Develaki, Maria
2017-11-01
Scientific reasoning is particularly pertinent to science education since it is closely related to the content and methodologies of science and contributes to scientific literacy. Much of the research in science education investigates the appropriate framework and teaching methods and tools needed to promote students' ability to reason and evaluate in a scientific way. This paper aims (a) to contribute to an extended understanding of the nature and pedagogical importance of model-based reasoning and (b) to exemplify how using computer simulations can support students' model-based reasoning. We provide first a background for both scientific reasoning and computer simulations, based on the relevant philosophical views and the related educational discussion. This background suggests that the model-based framework provides an epistemologically valid and pedagogically appropriate basis for teaching scientific reasoning and for helping students develop sounder reasoning and decision-taking abilities and explains how using computer simulations can foster these abilities. We then provide some examples illustrating the use of computer simulations to support model-based reasoning and evaluation activities in the classroom. The examples reflect the procedure and criteria for evaluating models in science and demonstrate the educational advantages of their application in classroom reasoning activities.
Applied Mathematics at the U.S. Department of Energy: Past, Present and a View to the Future
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, D L; Bell, J; Estep, D
2008-02-15
Over the past half-century, the Applied Mathematics program in the U.S. Department of Energy's Office of Advanced Scientific Computing Research has made significant, enduring advances in applied mathematics that have been essential enablers of modern computational science. Motivated by the scientific needs of the Department of Energy and its predecessors, advances have been made in mathematical modeling, numerical analysis of differential equations, optimization theory, mesh generation for complex geometries, adaptive algorithms and other important mathematical areas. High-performance mathematical software libraries developed through this program have contributed as much or more to the performance of modern scientific computer codes as themore » high-performance computers on which these codes run. The combination of these mathematical advances and the resulting software has enabled high-performance computers to be used for scientific discovery in ways that could only be imagined at the program's inception. Our nation, and indeed our world, face great challenges that must be addressed in coming years, and many of these will be addressed through the development of scientific understanding and engineering advances yet to be discovered. The U.S. Department of Energy (DOE) will play an essential role in providing science-based solutions to many of these problems, particularly those that involve the energy, environmental and national security needs of the country. As the capability of high-performance computers continues to increase, the types of questions that can be answered by applying this huge computational power become more varied and more complex. It will be essential that we find new ways to develop and apply the mathematics necessary to enable the new scientific and engineering discoveries that are needed. In August 2007, a panel of experts in applied, computational and statistical mathematics met for a day and a half in Berkeley, California to understand the mathematical developments required to meet the future science and engineering needs of the DOE. It is important to emphasize that the panelists were not asked to speculate only on advances that might be made in their own research specialties. Instead, the guidance this panel was given was to consider the broad science and engineering challenges that the DOE faces and identify the corresponding advances that must occur across the field of mathematics for these challenges to be successfully addressed. As preparation for the meeting, each panelist was asked to review strategic planning and other informational documents available for one or more of the DOE Program Offices, including the Offices of Science, Nuclear Energy, Fossil Energy, Environmental Management, Legacy Management, Energy Efficiency & Renewable Energy, Electricity Delivery & Energy Reliability and Civilian Radioactive Waste Management as well as the National Nuclear Security Administration. The panelists reported on science and engineering needs for each of these offices, and then discussed and identified mathematical advances that will be required if these challenges are to be met. A review of DOE challenges in energy, the environment and national security brings to light a broad and varied array of questions that the DOE must answer in the coming years. A representative subset of such questions includes: (1) Can we predict the operating characteristics of a clean coal power plant? (2) How stable is the plasma containment in a tokamak? (3) How quickly is climate change occurring and what are the uncertainties in the predicted time scales? (4) How quickly can an introduced bio-weapon contaminate the agricultural environment in the US? (5) How do we modify models of the atmosphere and clouds to incorporate newly collected data of possibly of new types? (6) How quickly can the United States recover if part of the power grid became inoperable? (7) What are optimal locations and communication protocols for sensing devices in a remote-sensing network? (8) How can new materials be designed with a specified desirable set of properties? In comparing and contrasting these and other questions of importance to DOE, the panel found that while the scientific breadth of the requirements is enormous, a central theme emerges: Scientists are being asked to identify or provide technology, or to give expert analysis to inform policy-makers that requires the scientific understanding of increasingly complex physical and engineered systems. In addition, as the complexity of the systems of interest increases, neither experimental observation nor mathematical and computational modeling alone can access all components of the system over the entire range of scales or conditions needed to provide the required scientific understanding.« less
Evaluating a Web-Based Video Corpus through an Analysis of User Interactions
ERIC Educational Resources Information Center
Caws, Catherine G.
2013-01-01
As shown by several studies, successful integration of technology in language learning requires a holistic approach in order to scientifically understand what learners do when working with web-based technology (cf. Raby, 2007). Additionally, a growing body of research in computer assisted language learning (CALL) evaluation, design and…
Information Storage and Retrieval Scientific Report No. ISR-22.
ERIC Educational Resources Information Center
Salton, Gerard
The twenty-second in a series, this report describes research in information organization and retrieval conducted by the Department of Computer Science at Cornell University. The report covers work carried out during the period summer 1972 through summer 1974 and is divided into four parts: indexing theory, automatic content analysis, feedback…
Agile parallel bioinformatics workflow management using Pwrake.
Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro
2011-09-08
In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error.Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
Agile parallel bioinformatics workflow management using Pwrake
2011-01-01
Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows. PMID:21899774
Dynamic computer model for the metallogenesis and tectonics of the Circum-North Pacific
Scotese, Christopher R.; Nokleberg, Warren J.; Monger, James W.H.; Norton, Ian O.; Parfenov, Leonid M.; Khanchuk, Alexander I.; Bundtzen, Thomas K.; Dawson, Kenneth M.; Eremin, Roman A.; Frolov, Yuri F.; Fujita, Kazuya; Goryachev, Nikolai A.; Pozdeev, Anany I.; Ratkin, Vladimir V.; Rodinov, Sergey M.; Rozenblum, Ilya S.; Scholl, David W.; Shpikerman, Vladimir I.; Sidorov, Anatoly A.; Stone, David B.
2001-01-01
The digital files on this report consist of a dynamic computer model of the metallogenesis and tectonics of the Circum-North Pacific, and background articles, figures, and maps. The tectonic part of the dynamic computer model is derived from a major analysis of the tectonic evolution of the Circum-North Pacific which is also contained in directory tectevol. The dynamic computer model and associated materials on this CD-ROM are part of a project on the major mineral deposits, metallogenesis, and tectonics of the Russian Far East, Alaska, and the Canadian Cordillera. The project provides critical information on bedrock geology and geophysics, tectonics, major metalliferous mineral resources, metallogenic patterns, and crustal origin and evolution of mineralizing systems for this region. The major scientific goals and benefits of the project are to: (1) provide a comprehensive international data base on the mineral resources of the region that is the first, extensive knowledge available in English; (2) provide major new interpretations of the origin and crustal evolution of mineralizing systems and their host rocks, thereby enabling enhanced, broad-scale tectonic reconstructions and interpretations; and (3) promote trade and scientific and technical exchanges between North America and Eastern Asia.
Parallel Tensor Compression for Large-Scale Scientific Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan
As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memorymore » parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.« less
Geospatial-enabled Data Exploration and Computation through Data Infrastructure Building Blocks
NASA Astrophysics Data System (ADS)
Song, C. X.; Biehl, L. L.; Merwade, V.; Villoria, N.
2015-12-01
Geospatial data are present everywhere today with the proliferation of location-aware computing devices and sensors. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. The GABBs project aims at enabling broader access to geospatial data exploration and computation by developing spatial data infrastructure building blocks that leverage capabilities of end-to-end application service and virtualized computing framework in HUBzero. Funded by NSF Data Infrastructure Building Blocks (DIBBS) initiative, GABBs provides a geospatial data architecture that integrates spatial data management, mapping and visualization and will make it available as open source. The outcome of the project will enable users to rapidly create tools and share geospatial data and tools on the web for interactive exploration of data without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the development of geospatial data infrastructure building blocks and the scientific use cases that help drive the software development, as well as seek feedback from the user communities.
Methodical and technological aspects of creation of interactive computer learning systems
NASA Astrophysics Data System (ADS)
Vishtak, N. M.; Frolov, D. A.
2017-01-01
The article presents a methodology for the development of an interactive computer training system for training power plant. The methods used in the work are a generalization of the content of scientific and methodological sources on the use of computer-based training systems in vocational education, methods of system analysis, methods of structural and object-oriented modeling of information systems. The relevance of the development of the interactive computer training systems in the preparation of the personnel in the conditions of the educational and training centers is proved. Development stages of the computer training systems are allocated, factors of efficient use of the interactive computer training system are analysed. The algorithm of work performance at each development stage of the interactive computer training system that enables one to optimize time, financial and labor expenditure on the creation of the interactive computer training system is offered.
Hera - The HEASARC's New Data Analysis Service
NASA Technical Reports Server (NTRS)
Pence, William
2006-01-01
Hera is the new computer service provided by the HEASARC at the NASA Goddard Space Flight Center that enables qualified student and professional astronomical researchers to immediately begin analyzing scientific data from high-energy astrophysics missions. All the necessary resources needed to do the data analysis are freely provided by Hera, including: * the latest version of the hundreds of scientific analysis programs in the HEASARC's HEASOFT package, as well as most of the programs in the Chandra CIAO package and the XMM-Newton SAS package. * high speed access to the terabytes of data in the HEASARC's high energy astrophysics Browse data archive. * a cluster of fast Linw workstations to run the software * ample local disk space to temporarily store the data and results. Some of the many features and different modes of using Hera are illustrated in this poster presentation.
EdGCM: Research Tools for Training the Climate Change Generation
NASA Astrophysics Data System (ADS)
Chandler, M. A.; Sohl, L. E.; Zhou, J.; Sieber, R.
2011-12-01
Climate scientists employ complex computer simulations of the Earth's physical systems to prepare climate change forecasts, study the physical mechanisms of climate, and to test scientific hypotheses and computer parameterizations. The Intergovernmental Panel on Climate Change 4th Assessment Report (2007) demonstrates unequivocally that policy makers rely heavily on such Global Climate Models (GCMs) to assess the impacts of potential economic and emissions scenarios. However, true climate modeling capabilities are not disseminated to the majority of world governments or U.S. researchers - let alone to the educators who will be training the students who are about to be presented with a world full of climate change stakeholders. The goal is not entirely quixotic; in fact, by the mid-1990's prominent climate scientists were predicting with certainty that schools and politicians would "soon" be running GCMs on laptops [Randall, 1996]. For a variety of reasons this goal was never achieved (nor even really attempted). However, around the same time NASA and the National Science Foundation supported a small pilot project at Columbia University to show the potential of putting sophisticated computer climate models - not just "demos" or "toy models" - into the hands of non-specialists. The Educational Global Climate Modeling Project (EdGCM) gave users access to a real global climate model and provided them with the opportunity to experience the details of climate model setup, model operation, post-processing and scientific visualization. EdGCM was designed for use in both research and education - it is a full-blown research GCM, but the ultimate goal is to develop a capability to embed these crucial technologies across disciplines, networks, platforms, and even across academia and industry. With this capability in place we can begin training the skilled workforce that is necessary to deal with the multitude of climate impacts that will occur over the coming decades. To further increase the educational potential of climate models, the EdGCM project has also created "EZgcm". Through a joint venture of NASA, Columbia University and McGill University EZgcm moves the focus toward a greater use of Web 1.0 and Web 2.0-based technologies. It shifts the educational objectives towards a greater emphasis on teaching students how science is conducted and what role science plays in assessing climate change. That is, students learn about the steps of the scientific process as conveyed by climate modeling research: constructing a hypothesis, designing an experiment, running a computer model, using scientific visualization to support analysis, communicating the results of that analysis, and role playing the scientific peer review process. This is in stark contrast to what they learn from the political debate over climate change, which they often confuse with a scientific debate.
A toolbox and a record for scientific model development
NASA Technical Reports Server (NTRS)
Ellman, Thomas
1994-01-01
Scientific computation can benefit from software tools that facilitate construction of computational models, control the application of models, and aid in revising models to handle new situations. Existing environments for scientific programming provide only limited means of handling these tasks. This paper describes a two pronged approach for handling these tasks: (1) designing a 'Model Development Toolbox' that includes a basic set of model constructing operations; and (2) designing a 'Model Development Record' that is automatically generated during model construction. The record is subsequently exploited by tools that control the application of scientific models and revise models to handle new situations. Our two pronged approach is motivated by our belief that the model development toolbox and record should be highly interdependent. In particular, a suitable model development record can be constructed only when models are developed using a well defined set of operations. We expect this research to facilitate rapid development of new scientific computational models, to help ensure appropriate use of such models and to facilitate sharing of such models among working computational scientists. We are testing this approach by extending SIGMA, and existing knowledge-based scientific software design tool.
The Petascale Data Storage Institute
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, Garth; Long, Darrell; Honeyman, Peter
2013-07-01
Petascale computing infrastructures for scientific discovery make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability.The Petascale Data Storage Institute focuses on the data storage problems found in petascale scientific computing environments, with special attention to community issues such as interoperability, community buy-in, and shared tools.The Petascale Data Storage Institute is a collaboration between researchers at Carnegie Mellon University, National Energy Research Scientific Computing Center, Pacific Northwest National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratory, Los Alamos National Laboratory, University of Michigan, and the University of California at Santa Cruz.
The need for scientific software engineering in the pharmaceutical industry
NASA Astrophysics Data System (ADS)
Luty, Brock; Rose, Peter W.
2017-03-01
Scientific software engineering is a distinct discipline from both computational chemistry project support and research informatics. A scientific software engineer not only has a deep understanding of the science of drug discovery but also the desire, skills and time to apply good software engineering practices. A good team of scientific software engineers can create a software foundation that is maintainable, validated and robust. If done correctly, this foundation enable the organization to investigate new and novel computational ideas with a very high level of efficiency.
The need for scientific software engineering in the pharmaceutical industry.
Luty, Brock; Rose, Peter W
2017-03-01
Scientific software engineering is a distinct discipline from both computational chemistry project support and research informatics. A scientific software engineer not only has a deep understanding of the science of drug discovery but also the desire, skills and time to apply good software engineering practices. A good team of scientific software engineers can create a software foundation that is maintainable, validated and robust. If done correctly, this foundation enable the organization to investigate new and novel computational ideas with a very high level of efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kerstin
Scientific user facilities—particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more—operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity inmore » the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kersten
Scientific user facilities---particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more---operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity inmore » the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.« less
Data Crosscutting Requirements Review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kleese van Dam, Kerstin; Shoshani, Arie; Plata, Charity
2013-04-01
In April 2013, a diverse group of researchers from the U.S. Department of Energy (DOE) scientific community assembled to assess data requirements associated with DOE-sponsored scientific facilities and large-scale experiments. Participants in the review included facilities staff, program managers, and scientific experts from the offices of Basic Energy Sciences, Biological and Environmental Research, High Energy Physics, and Advanced Scientific Computing Research. As part of the meeting, review participants discussed key issues associated with three distinct aspects of the data challenge: 1) processing, 2) management, and 3) analysis. These discussions identified commonalities and differences among the needs of varied scientific communities.more » They also helped to articulate gaps between current approaches and future needs, as well as the research advances that will be required to close these gaps. Moreover, the review provided a rare opportunity for experts from across the Office of Science to learn about their collective expertise, challenges, and opportunities. The "Data Crosscutting Requirements Review" generated specific findings and recommendations for addressing large-scale data crosscutting requirements.« less
A web portal for hydrodynamical, cosmological simulations
NASA Astrophysics Data System (ADS)
Ragagnin, A.; Dolag, K.; Biffi, V.; Cadolle Bel, M.; Hammer, N. J.; Krukau, A.; Petkova, M.; Steinborn, D.
2017-07-01
This article describes a data centre hosting a web portal for accessing and sharing the output of large, cosmological, hydro-dynamical simulations with a broad scientific community. It also allows users to receive related scientific data products by directly processing the raw simulation data on a remote computing cluster. The data centre has a multi-layer structure: a web portal, a job control layer, a computing cluster and a HPC storage system. The outer layer enables users to choose an object from the simulations. Objects can be selected by visually inspecting 2D maps of the simulation data, by performing highly compounded and elaborated queries or graphically by plotting arbitrary combinations of properties. The user can run analysis tools on a chosen object. These services allow users to run analysis tools on the raw simulation data. The job control layer is responsible for handling and performing the analysis jobs, which are executed on a computing cluster. The innermost layer is formed by a HPC storage system which hosts the large, raw simulation data. The following services are available for the users: (I) CLUSTERINSPECT visualizes properties of member galaxies of a selected galaxy cluster; (II) SIMCUT returns the raw data of a sub-volume around a selected object from a simulation, containing all the original, hydro-dynamical quantities; (III) SMAC creates idealized 2D maps of various, physical quantities and observables of a selected object; (IV) PHOX generates virtual X-ray observations with specifications of various current and upcoming instruments.
75 FR 65639 - Center for Scientific Review; Notice of Closed Meetings
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-26
...: Computational Biology Special Emphasis Panel A. Date: October 29, 2010. Time: 2 p.m. to 3:30 p.m. Agenda: To.... Name of Committee: Center for Scientific Review Special Emphasis Panel; Member Conflict: Computational...
ERIC Educational Resources Information Center
Abdullah, Sopiah; Shariff, Adilah
2008-01-01
The purpose of the study was to investigate the effects of inquiry-based computer simulation with heterogeneous-ability cooperative learning (HACL) and inquiry-based computer simulation with friendship cooperative learning (FCL) on (a) scientific reasoning (SR) and (b) conceptual understanding (CU) among Form Four students in Malaysian Smart…
The Methods of Cognitive Visualization for the Astronomical Databases Analyzing Tools Development
NASA Astrophysics Data System (ADS)
Vitkovskiy, V.; Gorohov, V.
2008-08-01
There are two kinds of computer graphics: the illustrative one and the cognitive one. Appropriate the cognitive pictures not only make evident and clear the sense of complex and difficult scientific concepts, but promote, --- and not so very rarely, --- a birth of a new knowledge. On the basis of the cognitive graphics concept, we worked out the SW-system for visualization and analysis. It allows to train and to aggravate intuition of researcher, to raise his interest and motivation to the creative, scientific cognition, to realize process of dialogue with the very problems simultaneously.
Telescience testbed pilot program, volume 1: Executive summary
NASA Technical Reports Server (NTRS)
Leiner, Barry M.
1989-01-01
Space Station Freedom and its associated labs, coupled with the availability of new computing and communications technologies, have the potential for significantly enhancing scientific research. A Telescience Testbed Pilot Program (TTPP), aimed at developing the experience base to deal with issues in the design of the future information system of the Space Station era. The testbeds represented four scientific disciplines (astronomy and astrophysics, earth sciences, life sciences, and microgravity sciences) and studied issues in payload design, operation, and data analysis. This volume, of a 3 volume set, which all contain the results of the TTPP, is the executive summary.
National Fusion Collaboratory: Grid Computing for Simulations and Experiments
NASA Astrophysics Data System (ADS)
Greenwald, Martin
2004-05-01
The National Fusion Collaboratory Project is creating a computational grid designed to advance scientific understanding and innovation in magnetic fusion research by facilitating collaborations, enabling more effective integration of experiments, theory and modeling and allowing more efficient use of experimental facilities. The philosophy of FusionGrid is that data, codes, analysis routines, visualization tools, and communication tools should be thought of as network available services, easily used by the fusion scientist. In such an environment, access to services is stressed rather than portability. By building on a foundation of established computer science toolkits, deployment time can be minimized. These services all share the same basic infrastructure that allows for secure authentication and resource authorization which allows stakeholders to control their own resources such as computers, data and experiments. Code developers can control intellectual property, and fair use of shared resources can be demonstrated and controlled. A key goal is to shield scientific users from the implementation details such that transparency and ease-of-use are maximized. The first FusionGrid service deployed was the TRANSP code, a widely used tool for transport analysis. Tools for run preparation, submission, monitoring and management have been developed and shared among a wide user base. This approach saves user sites from the laborious effort of maintaining such a large and complex code while at the same time reducing the burden on the development team by avoiding the need to support a large number of heterogeneous installations. Shared visualization and A/V tools are being developed and deployed to enhance long-distance collaborations. These include desktop versions of the Access Grid, a highly capable multi-point remote conferencing tool and capabilities for sharing displays and analysis tools over local and wide-area networks.
NASA Technical Reports Server (NTRS)
Vallee, J.; Wilson, T.
1976-01-01
Results are reported of the first experiments for a computer conference management information system at the National Aeronautics and Space Administration. Between August 1975 and March 1976, two NASA projects with geographically separated participants (NASA scientists) used the PLANET computer conferencing system for portions of their work. The first project was a technology assessment of future transportation systems. The second project involved experiments with the Communication Technology Satellite. As part of this project, pre- and postlaunch operations were discussed in a computer conference. These conferences also provided the context for an analysis of the cost of computer conferencing. In particular, six cost components were identified: (1) terminal equipment, (2) communication with a network port, (3) network connection, (4) computer utilization, (5) data storage and (6) administrative overhead.
NASA Technical Reports Server (NTRS)
Tighe, R. J.; Shen, M. Y. H.
1984-01-01
The Nimbus 7 ERB MATRIX Tape is a computer program in which radiances and irradiances are converted into fluxes which are used to compute the basic scientific output parameters, emitted flux, albedo, and net radiation. They are spatially averaged and presented as time averages over one-day, six-day, and monthly periods. MATRIX data for the period November 16, 1978 through October 31, 1979 are presented. Described are the Earth Radiation Budget experiment, the Science Quality Control Report, Items checked by the MATRIX Science Quality Control Program, and Science Quality Control Data Analysis Report. Additional material from the detailed scientific quality control of the tapes which may be very useful to a user of the MATRIX tapes is included. Known errors and data problems and some suggestions on how to use the data for further climatologic and atmospheric physics studies are also discussed.
OOI CyberInfrastructure - Next Generation Oceanographic Research
NASA Astrophysics Data System (ADS)
Farcas, C.; Fox, P.; Arrott, M.; Farcas, E.; Klacansky, I.; Krueger, I.; Meisinger, M.; Orcutt, J.
2008-12-01
Software has become a key enabling technology for scientific discovery, observation, modeling, and exploitation of natural phenomena. New value emerges from the integration of individual subsystems into networked federations of capabilities exposed to the scientific community. Such data-intensive interoperability networks are crucial for future scientific collaborative research, as they open up new ways of fusing data from different sources and across various domains, and analysis on wide geographic areas. The recently established NSF OOI program, through its CyberInfrastructure component addresses this challenge by providing broad access from sensor networks for data acquisition up to computational grids for massive computations and binding infrastructure facilitating policy management and governance of the emerging system-of-scientific-systems. We provide insight into the integration core of this effort, namely, a hierarchic service-oriented architecture for a robust, performant, and maintainable implementation. We first discuss the relationship between data management and CI crosscutting concerns such as identity management, policy and governance, which define the organizational contexts for data access and usage. Next, we detail critical services including data ingestion, transformation, preservation, inventory, and presentation. To address interoperability issues between data represented in various formats we employ a semantic framework derived from the Earth System Grid technology, a canonical representation for scientific data based on DAP/OPeNDAP, and related data publishers such as ERDDAP. Finally, we briefly present the underlying transport based on a messaging infrastructure over the AMQP protocol, and the preservation based on a distributed file system through SDSC iRODS.
The Center for Computational Biology: resources, achievements, and challenges
Dinov, Ivo D; Thompson, Paul M; Woods, Roger P; Van Horn, John D; Shattuck, David W; Parker, D Stott
2011-01-01
The Center for Computational Biology (CCB) is a multidisciplinary program where biomedical scientists, engineers, and clinicians work jointly to combine modern mathematical and computational techniques, to perform phenotypic and genotypic studies of biological structure, function, and physiology in health and disease. CCB has developed a computational framework built around the Manifold Atlas, an integrated biomedical computing environment that enables statistical inference on biological manifolds. These manifolds model biological structures, features, shapes, and flows, and support sophisticated morphometric and statistical analyses. The Manifold Atlas includes tools, workflows, and services for multimodal population-based modeling and analysis of biological manifolds. The broad spectrum of biomedical topics explored by CCB investigators include the study of normal and pathological brain development, maturation and aging, discovery of associations between neuroimaging and genetic biomarkers, and the modeling, analysis, and visualization of biological shape, form, and size. CCB supports a wide range of short-term and long-term collaborations with outside investigators, which drive the center's computational developments and focus the validation and dissemination of CCB resources to new areas and scientific domains. PMID:22081221
The Center for Computational Biology: resources, achievements, and challenges.
Toga, Arthur W; Dinov, Ivo D; Thompson, Paul M; Woods, Roger P; Van Horn, John D; Shattuck, David W; Parker, D Stott
2012-01-01
The Center for Computational Biology (CCB) is a multidisciplinary program where biomedical scientists, engineers, and clinicians work jointly to combine modern mathematical and computational techniques, to perform phenotypic and genotypic studies of biological structure, function, and physiology in health and disease. CCB has developed a computational framework built around the Manifold Atlas, an integrated biomedical computing environment that enables statistical inference on biological manifolds. These manifolds model biological structures, features, shapes, and flows, and support sophisticated morphometric and statistical analyses. The Manifold Atlas includes tools, workflows, and services for multimodal population-based modeling and analysis of biological manifolds. The broad spectrum of biomedical topics explored by CCB investigators include the study of normal and pathological brain development, maturation and aging, discovery of associations between neuroimaging and genetic biomarkers, and the modeling, analysis, and visualization of biological shape, form, and size. CCB supports a wide range of short-term and long-term collaborations with outside investigators, which drive the center's computational developments and focus the validation and dissemination of CCB resources to new areas and scientific domains.
NASA Technical Reports Server (NTRS)
Oliger, Joseph
1993-01-01
The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on 6 June 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under contract with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. A flexible scientific staff is provided through a university faculty visitor program, a post doctoral program, and a student visitor program. Not only does this provide appropriate expertise but it also introduces scientists outside of NASA to NASA problems. A small group of core RIACS staff provides continuity and interacts with an ARC technical monitor and scientific advisory group to determine the RIACS mission. RIACS activities are reviewed and monitored by a USRA advisory council and ARC technical monitor. Research at RIACS is currently being done in the following areas: Parallel Computing, Advanced Methods for Scientific Computing, High Performance Networks and Technology, and Learning Systems. Parallel compiler techniques, adaptive numerical methods for flows in complicated geometries, and optimization were identified as important problems to investigate for ARC's involvement in the Computational Grand Challenges of the next decade.
Audie, J; Boyd, C
2010-01-01
The case for peptide-based drugs is compelling. Due to their chemical, physical and conformational diversity, and relatively unproblematic toxicity and immunogenicity, peptides represent excellent starting material for drug discovery. Nature has solved many physiological and pharmacological problems through the use of peptides, polypeptides and proteins. If nature could solve such a diversity of challenging biological problems through the use of peptides, it seems reasonable to infer that human ingenuity will prove even more successful. And this, indeed, appears to be the case, as a number of scientific and methodological advances are making peptides and peptide-based compounds ever more promising pharmacological agents. Chief among these advances are powerful chemical and biological screening technologies for lead identification and optimization, methods for enhancing peptide in vivo stability, bioavailability and cell-permeability, and new delivery technologies. Other advances include the development and experimental validation of robust computational methods for peptide lead identification and optimization. Finally, scientific analysis, biology and chemistry indicate the prospect of designing relatively small peptides to therapeutically modulate so-called 'undruggable' protein-protein interactions. Taken together a clear picture is emerging: through the synergistic use of the scientific imagination and the computational, chemical and biological methods that are currently available, effective peptide therapeutics for novel targets can be designed that surpass even the proven peptidic designs of nature.
Use of cloud computing in biomedicine.
Sobeslav, Vladimir; Maresova, Petra; Krejcar, Ondrej; Franca, Tanos C C; Kuca, Kamil
2016-12-01
Nowadays, biomedicine is characterised by a growing need for processing of large amounts of data in real time. This leads to new requirements for information and communication technologies (ICT). Cloud computing offers a solution to these requirements and provides many advantages, such as cost savings, elasticity and scalability of using ICT. The aim of this paper is to explore the concept of cloud computing and the related use of this concept in the area of biomedicine. Authors offer a comprehensive analysis of the implementation of the cloud computing approach in biomedical research, decomposed into infrastructure, platform and service layer, and a recommendation for processing large amounts of data in biomedicine. Firstly, the paper describes the appropriate forms and technological solutions of cloud computing. Secondly, the high-end computing paradigm of cloud computing aspects is analysed. Finally, the potential and current use of applications in scientific research of this technology in biomedicine is discussed.
Supercomputing resources empowering superstack with interactive and integrated systems
NASA Astrophysics Data System (ADS)
Rückemann, Claus-Peter
2012-09-01
This paper presents the results from the development and implementation of Superstack algorithms to be dynamically used with integrated systems and supercomputing resources. Processing of geophysical data, thus named geoprocessing, is an essential part of the analysis of geoscientific data. The theory of Superstack algorithms and the practical application on modern computing architectures was inspired by developments introduced with processing of seismic data on mainframes and within the last years leading to high end scientific computing applications. There are several stacking algorithms known but with low signal to noise ratio in seismic data the use of iterative algorithms like the Superstack can support analysis and interpretation. The new Superstack algorithms are in use with wave theory and optical phenomena on highly performant computing resources for huge data sets as well as for sophisticated application scenarios in geosciences and archaeology.
AstrodyToolsWeb an e-Science project in Astrodynamics and Celestial Mechanics fields
NASA Astrophysics Data System (ADS)
López, R.; San-Juan, J. F.
2013-05-01
Astrodynamics Web Tools, AstrodyToolsWeb (http://tastrody.unirioja.es), is an ongoing collaborative Web Tools computing infrastructure project which has been specially designed to support scientific computation. AstrodyToolsWeb provides project collaborators with all the technical and human facilities in order to wrap, manage, and use specialized noncommercial software tools in Astrodynamics and Celestial Mechanics fields, with the aim of optimizing the use of resources, both human and material. However, this project is open to collaboration from the whole scientific community in order to create a library of useful tools and their corresponding theoretical backgrounds. AstrodyToolsWeb offers a user-friendly web interface in order to choose applications, introduce data, and select appropriate constraints in an intuitive and easy way for the user. After that, the application is executed in real time, whenever possible; then the critical information about program behavior (errors and logs) and output, including the postprocessing and interpretation of its results (graphical representation of data, statistical analysis or whatever manipulation therein), are shown via the same web interface or can be downloaded to the user's computer.
Geoinformation web-system for processing and visualization of large archives of geo-referenced data
NASA Astrophysics Data System (ADS)
Gordov, E. P.; Okladnikov, I. G.; Titov, A. G.; Shulgina, T. M.
2010-12-01
Developed working model of information-computational system aimed at scientific research in area of climate change is presented. The system will allow processing and analysis of large archives of geophysical data obtained both from observations and modeling. Accumulated experience of developing information-computational web-systems providing computational processing and visualization of large archives of geo-referenced data was used during the implementation (Gordov et al, 2007; Okladnikov et al, 2008; Titov et al, 2009). Functional capabilities of the system comprise a set of procedures for mathematical and statistical analysis, processing and visualization of data. At present five archives of data are available for processing: 1st and 2nd editions of NCEP/NCAR Reanalysis, ECMWF ERA-40 Reanalysis, JMA/CRIEPI JRA-25 Reanalysis, and NOAA-CIRES XX Century Global Reanalysis Version I. To provide data processing functionality a computational modular kernel and class library providing data access for computational modules were developed. Currently a set of computational modules for climate change indices approved by WMO is available. Also a special module providing visualization of results and writing to Encapsulated Postscript, GeoTIFF and ESRI shape files was developed. As a technological basis for representation of cartographical information in Internet the GeoServer software conforming to OpenGIS standards is used. Integration of GIS-functionality with web-portal software to provide a basis for web-portal’s development as a part of geoinformation web-system is performed. Such geoinformation web-system is a next step in development of applied information-telecommunication systems offering to specialists from various scientific fields unique opportunities of performing reliable analysis of heterogeneous geophysical data using approved computational algorithms. It will allow a wide range of researchers to work with geophysical data without specific programming knowledge and to concentrate on solving their specific tasks. The system would be of special importance for education in climate change domain. This work is partially supported by RFBR grant #10-07-00547, SB RAS Basic Program Projects 4.31.1.5 and 4.31.2.7, SB RAS Integration Projects 4 and 9.
An Expert Assistant for Computer Aided Parallelization
NASA Technical Reports Server (NTRS)
Jost, Gabriele; Chun, Robert; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit
2004-01-01
The prototype implementation of an expert system was developed to assist the user in the computer aided parallelization process. The system interfaces to tools for automatic parallelization and performance analysis. By fusing static program structure information and dynamic performance analysis data the expert system can help the user to filter, correlate, and interpret the data gathered by the existing tools. Sections of the code that show poor performance and require further attention are rapidly identified and suggestions for improvements are presented to the user. In this paper we describe the components of the expert system and discuss its interface to the existing tools. We present a case study to demonstrate the successful use in full scale scientific applications.
Algorithms for Haptic Rendering of 3D Objects
NASA Technical Reports Server (NTRS)
Basdogan, Cagatay; Ho, Chih-Hao; Srinavasan, Mandayam
2003-01-01
Algorithms have been developed to provide haptic rendering of three-dimensional (3D) objects in virtual (that is, computationally simulated) environments. The goal of haptic rendering is to generate tactual displays of the shapes, hardnesses, surface textures, and frictional properties of 3D objects in real time. Haptic rendering is a major element of the emerging field of computer haptics, which invites comparison with computer graphics. We have already seen various applications of computer haptics in the areas of medicine (surgical simulation, telemedicine, haptic user interfaces for blind people, and rehabilitation of patients with neurological disorders), entertainment (3D painting, character animation, morphing, and sculpting), mechanical design (path planning and assembly sequencing), and scientific visualization (geophysical data analysis and molecular manipulation).
Job Superscheduler Architecture and Performance in Computational Grid Environments
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Oliker, Leonid; Biswas, Rupak
2003-01-01
Computational grids hold great promise in utilizing geographically separated heterogeneous resources to solve large-scale complex scientific problems. However, a number of major technical hurdles, including distributed resource management and effective job scheduling, stand in the way of realizing these gains. In this paper, we propose a novel grid superscheduler architecture and three distributed job migration algorithms. We also model the critical interaction between the superscheduler and autonomous local schedulers. Extensive performance comparisons with ideal, central, and local schemes using real workloads from leading computational centers are conducted in a simulation environment. Additionally, synthetic workloads are used to perform a detailed sensitivity analysis of our superscheduler. Several key metrics demonstrate that substantial performance gains can be achieved via smart superscheduling in distributed computational grids.
NASA Technical Reports Server (NTRS)
Shapiro, Linda G.; Tanimoto, Steven L.; Ahrens, James P.
1996-01-01
The goal of this task was to create a design and prototype implementation of a database environment that is particular suited for handling the image, vision and scientific data associated with the NASA's EOC Amazon project. The focus was on a data model and query facilities that are designed to execute efficiently on parallel computers. A key feature of the environment is an interface which allows a scientist to specify high-level directives about how query execution should occur.
A Vision on the Status and Evolution of HEP Physics Software Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Canal, P.; Elvira, D.; Hatcher, R.
2013-07-28
This paper represents the vision of the members of the Fermilab Scientific Computing Division's Computational Physics Department (SCD-CPD) on the status and the evolution of various HEP software tools such as the Geant4 detector simulation toolkit, the Pythia and GENIE physics generators, and the ROOT data analysis framework. The goal of this paper is to contribute ideas to the Snowmass 2013 process toward the composition of a unified document on the current status and potential evolution of the physics software tools which are essential to HEP.
Chen, Xiaodong; Ren, Liqiang; Zheng, Bin; Liu, Hong
2013-01-01
The conventional optical microscopes have been used widely in scientific research and in clinical practice. The modern digital microscopic devices combine the power of optical imaging and computerized analysis, archiving and communication techniques. It has a great potential in pathological examinations for improving the efficiency and accuracy of clinical diagnosis. This chapter reviews the basic optical principles of conventional microscopes, fluorescence microscopes and electron microscopes. The recent developments and future clinical applications of advanced digital microscopic imaging methods and computer assisted diagnosis schemes are also discussed.
Simulation of the visual effects of power plant plumes
Evelyn F. Treiman; David B. Champion; Mona J. Wecksung; Glenn H. Moore; Andrew Ford; Michael D. Williams
1979-01-01
The Los Alamos Scientific Laboratory has developed a computer-assisted technique that can predict the visibility effects of potential energy sources in advance of their construction. This technique has been employed in an economic and environmental analysis comparing a single 3000 MW coal-fired power plant with six 500 MW coal-fired power plants located at hypothetical...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Almgren, Ann; DeMar, Phil; Vetter, Jeffrey
The widespread use of computing in the American economy would not be possible without a thoughtful, exploratory research and development (R&D) community pushing the performance edge of operating systems, computer languages, and software libraries. These are the tools and building blocks — the hammers, chisels, bricks, and mortar — of the smartphone, the cloud, and the computing services on which we rely. Engineers and scientists need ever-more specialized computing tools to discover new material properties for manufacturing, make energy generation safer and more efficient, and provide insight into the fundamentals of the universe, for example. The research division of themore » U.S. Department of Energy’s (DOE’s) Office of Advanced Scientific Computing and Research (ASCR Research) ensures that these tools and building blocks are being developed and honed to meet the extreme needs of modern science. See also http://exascaleage.org/ascr/ for additional information.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Türkoğlu, Emir Alper, E-mail: eaturkoglu@yandex.com; Ağrı İbrahim Çeçen University, Central Research and Application Laboratory, Ağrı; Kurt, Murat, E-mail: muratkurt60@hotmail.com
Ağrı İbrahim Çeçen University built a central research and application laboratory (CRAL) in the east of Turkey. The CRAL possesses 7 research and analysis laboratories, 12 experts and researchers, 8 standard rooms for guest researchers, a restaurant, a conference hall, a meeting room, a prey room and a computer laboratory. The CRAL aims certain collaborations between researchers, experts, clinicians and educators in the areas of biotechnology, bioimagining, food safety & quality, omic sciences such as genomics, proteomics and metallomics. It also intends to develop sustainable solutions in agriculture and animal husbandry, promote public health quality, collect scientific knowledge and keepmore » it for future generations, contribute scientific awareness of all stratums of society, provide consulting for small initiatives and industries. It has been collaborated several scientific foundations since 2011.« less
Tools for 3D scientific visualization in computational aerodynamics
NASA Technical Reports Server (NTRS)
Bancroft, Gordon; Plessel, Todd; Merritt, Fergus; Watson, Val
1989-01-01
The purpose is to describe the tools and techniques in use at the NASA Ames Research Center for performing visualization of computational aerodynamics, for example visualization of flow fields from computer simulations of fluid dynamics about vehicles such as the Space Shuttle. The hardware used for visualization is a high-performance graphics workstation connected to a super computer with a high speed channel. At present, the workstation is a Silicon Graphics IRIS 3130, the supercomputer is a CRAY2, and the high speed channel is a hyperchannel. The three techniques used for visualization are post-processing, tracking, and steering. Post-processing analysis is done after the simulation. Tracking analysis is done during a simulation but is not interactive, whereas steering analysis involves modifying the simulation interactively during the simulation. Using post-processing methods, a flow simulation is executed on a supercomputer and, after the simulation is complete, the results of the simulation are processed for viewing. The software in use and under development at NASA Ames Research Center for performing these types of tasks in computational aerodynamics is described. Workstation performance issues, benchmarking, and high-performance networks for this purpose are also discussed as well as descriptions of other hardware for digital video and film recording.
Tools and techniques for computational reproducibility.
Piccolo, Stephen R; Frampton, Michael B
2016-07-11
When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed-and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.
Interactive Visualization and Analysis of Geospatial Data Sets - TrikeND-iGlobe
NASA Astrophysics Data System (ADS)
Rosebrock, Uwe; Hogan, Patrick; Chandola, Varun
2013-04-01
The visualization of scientific datasets is becoming an ever-increasing challenge as advances in computing technologies have enabled scientists to build high resolution climate models that have produced petabytes of climate data. To interrogate and analyze these large datasets in real-time is a task that pushes the boundaries of computing hardware and software. But integration of climate datasets with geospatial data requires considerable amount of effort and close familiarity of various data formats and projection systems, which has prevented widespread utilization outside of climate community. TrikeND-iGlobe is a sophisticated software tool that bridges this gap, allows easy integration of climate datasets with geospatial datasets and provides sophisticated visualization and analysis capabilities. The objective for TrikeND-iGlobe is the continued building of an open source 4D virtual globe application using NASA World Wind technology that integrates analysis of climate model outputs with remote sensing observations as well as demographic and environmental data sets. This will facilitate a better understanding of global and regional phenomenon, and the impact analysis of climate extreme events. The critical aim is real-time interactive interrogation. At the data centric level the primary aim is to enable the user to interact with the data in real-time for the purpose of analysis - locally or remotely. TrikeND-iGlobe provides the basis for the incorporation of modular tools that provide extended interactions with the data, including sub-setting, aggregation, re-shaping, time series analysis methods and animation to produce publication-quality imagery. TrikeND-iGlobe may be run locally or can be accessed via a web interface supported by high-performance visualization compute nodes placed close to the data. It supports visualizing heterogeneous data formats: traditional geospatial datasets along with scientific data sets with geographic coordinates (NetCDF, HDF, etc.). It also supports multiple data access mechanisms, including HTTP, FTP, WMS, WCS, and Thredds Data Server (for NetCDF data and for scientific data, TrikeND-iGlobe supports various visualization capabilities, including animations, vector field visualization, etc. TrikeND-iGlobe is a collaborative open-source project, contributors include NASA (ARC-PX), ORNL (Oakridge National Laboratories), Unidata, Kansas University, CSIRO CMAR Australia and Geoscience Australia.
Quantum Testbeds Stakeholder Workshop (QTSW) Report meeting purpose and agenda.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hebner, Gregory A.
Quantum computing (QC) is a promising early-stage technology with the potential to provide scientific computing capabilities far beyond what is possible with even an Exascale computer in specific problems of relevance to the Office of Science. These include (but are not limited to) materials modeling, molecular dynamics, and quantum chromodynamics. However, commercial QC systems are not yet available and the technical maturity of current QC hardware, software, algorithms, and systems integration is woefully incomplete. Thus, there is a significant opportunity for DOE to define the technology building blocks, and solve the system integration issues to enable a revolutionary tool. Oncemore » realized, QC will have world changing impact on economic competitiveness, the scientific enterprise, and citizen well-being. Prior to this workshop, DOE / Office of Advanced Scientific Computing Research (ASCR) hosted a workshop in 2015 to explore QC scientific applications. The goal of that workshop was to assess the viability of QC technologies to meet the computational requirements in support of DOE’s science and energy mission and to identify the potential impact of these technologies.« less
Medical image computing for computer-supported diagnostics and therapy. Advances and perspectives.
Handels, H; Ehrhardt, J
2009-01-01
Medical image computing has become one of the most challenging fields in medical informatics. In image-based diagnostics of the future software assistance will become more and more important, and image analysis systems integrating advanced image computing methods are needed to extract quantitative image parameters to characterize the state and changes of image structures of interest (e.g. tumors, organs, vessels, bones etc.) in a reproducible and objective way. Furthermore, in the field of software-assisted and navigated surgery medical image computing methods play a key role and have opened up new perspectives for patient treatment. However, further developments are needed to increase the grade of automation, accuracy, reproducibility and robustness. Moreover, the systems developed have to be integrated into the clinical workflow. For the development of advanced image computing systems methods of different scientific fields have to be adapted and used in combination. The principal methodologies in medical image computing are the following: image segmentation, image registration, image analysis for quantification and computer assisted image interpretation, modeling and simulation as well as visualization and virtual reality. Especially, model-based image computing techniques open up new perspectives for prediction of organ changes and risk analysis of patients and will gain importance in diagnostic and therapy of the future. From a methodical point of view the authors identify the following future trends and perspectives in medical image computing: development of optimized application-specific systems and integration into the clinical workflow, enhanced computational models for image analysis and virtual reality training systems, integration of different image computing methods, further integration of multimodal image data and biosignals and advanced methods for 4D medical image computing. The development of image analysis systems for diagnostic support or operation planning is a complex interdisciplinary process. Image computing methods enable new insights into the patient's image data and have the future potential to improve medical diagnostics and patient treatment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
The Analysis of Search Results for the Clarification and Identification of Technology Emergence (AR-CITE) computer code examines a scientometric model that tracks the emergence of an identified technology from initial discovery (via original scientific and conference literature), through critical discoveries (via original scientific, conference literature and patents), transitioning through Technology Readiness Levels (TRLs) and ultimately on to commercial currency of citations, collaboration indicators, and on-line news patterns are identified. The combinations of four distinct and separate searchable on-line networked sources (i.e. scholarly publications and citation, world patents, news archives, and on-line mapping networks) are assembled to become one collective networkmore » (a dataset for analysis of relations). This established network becomes the basis from which to quickly analyze the temporal flow of activity (searchable events) for the subject domain to be clarified and identified.« less
Automated site characterization for robotic sample acquisition systems
NASA Astrophysics Data System (ADS)
Scholl, Marija S.; Eberlein, Susan J.
1993-04-01
A mobile, semiautonomous vehicle with multiple sensors and on-board intelligence is proposed for performing preliminary scientific investigations on extraterrestrial bodies prior to human exploration. Two technologies, a hybrid optical-digital computer system based on optical correlator technology and an image and instrument data analysis system, provide complementary capabilities that might be part of an instrument package for an intelligent robotic vehicle. The hybrid digital-optical vision system could perform real-time image classification tasks using an optical correlator with programmable matched filters under control of a digital microcomputer. The data analysis system would analyze visible and multiband imagery to extract mineral composition and textural information for geologic characterization. Together these technologies would support the site characterization needs of a robotic vehicle for both navigational and scientific purposes.
Entering the ‘big data’ era in medicinal chemistry: molecular promiscuity analysis revisited
Hu, Ye; Bajorath, Jürgen
2017-01-01
The ‘big data’ concept plays an increasingly important role in many scientific fields. Big data involves more than unprecedentedly large volumes of data that become available. Different criteria characterizing big data must be carefully considered in computational data mining, as we discuss herein focusing on medicinal chemistry. This is a scientific discipline where big data is beginning to emerge and provide new opportunities. For example, the ability of many drugs to specifically interact with multiple targets, termed promiscuity, forms the molecular basis of polypharmacology, a hot topic in drug discovery. Compound promiscuity analysis is an area that is much influenced by big data phenomena. Different results are obtained depending on chosen data selection and confidence criteria, as we also demonstrate. PMID:28670471
On-demand Simulation of Atmospheric Transport Processes on the AlpEnDAC Cloud
NASA Astrophysics Data System (ADS)
Hachinger, S.; Harsch, C.; Meyer-Arnek, J.; Frank, A.; Heller, H.; Giemsa, E.
2016-12-01
The "Alpine Environmental Data Analysis Centre" (AlpEnDAC) develops a data-analysis platform for high-altitude research facilities within the "Virtual Alpine Observatory" project (VAO). This platform, with its web portal, will support use cases going much beyond data management: On user request, the data are augmented with "on-demand" simulation results, such as air-parcel trajectories for tracing down the source of pollutants when they appear in high concentration. The respective back-end mechanism uses the Compute Cloud of the Leibniz Supercomputing Centre (LRZ) to transparently calculate results requested by the user, as far as they have not yet been stored in AlpEnDAC. The queuing-system operation model common in supercomputing is replaced by a model in which Virtual Machines (VMs) on the cloud are automatically created/destroyed, providing the necessary computing power immediately on demand. From a security point of view, this allows to perform simulations in a sandbox defined by the VM configuration, without direct access to a computing cluster. Within few minutes, the user receives conveniently visualized results. The AlpEnDAC infrastructure is distributed among two participating institutes [front-end at German Aerospace Centre (DLR), simulation back-end at LRZ], requiring an efficient mechanism for synchronization of measured and augmented data. We discuss our iRODS-based solution for these data-management tasks as well as the general AlpEnDAC framework. Our cloud-based offerings aim at making scientific computing for our users much more convenient and flexible than it has been, and to allow scientists without a broad background in scientific computing to benefit from complex numerical simulations.
Wyrm: A Brain-Computer Interface Toolbox in Python.
Venthur, Bastian; Dähne, Sven; Höhne, Johannes; Heller, Hendrik; Blankertz, Benjamin
2015-10-01
In the last years Python has gained more and more traction in the scientific community. Projects like NumPy, SciPy, and Matplotlib have created a strong foundation for scientific computing in Python and machine learning packages like scikit-learn or packages for data analysis like Pandas are building on top of it. In this paper we present Wyrm ( https://github.com/bbci/wyrm ), an open source BCI toolbox in Python. Wyrm is applicable to a broad range of neuroscientific problems. It can be used as a toolbox for analysis and visualization of neurophysiological data and in real-time settings, like an online BCI application. In order to prevent software defects, Wyrm makes extensive use of unit testing. We will explain the key aspects of Wyrm's software architecture and design decisions for its data structure, and demonstrate and validate the use of our toolbox by presenting our approach to the classification tasks of two different data sets from the BCI Competition III. Furthermore, we will give a brief analysis of the data sets using our toolbox, and demonstrate how we implemented an online experiment using Wyrm. With Wyrm we add the final piece to our ongoing effort to provide a complete, free and open source BCI system in Python.
Lowering the Barrier to Reproducible Research by Publishing Provenance from Common Analytical Tools
NASA Astrophysics Data System (ADS)
Jones, M. B.; Slaughter, P.; Walker, L.; Jones, C. S.; Missier, P.; Ludäscher, B.; Cao, Y.; McPhillips, T.; Schildhauer, M.
2015-12-01
Scientific provenance describes the authenticity, origin, and processing history of research products and promotes scientific transparency by detailing the steps in computational workflows that produce derived products. These products include papers, findings, input data, software products to perform computations, and derived data and visualizations. The geosciences community values this type of information, and, at least theoretically, strives to base conclusions on computationally replicable findings. In practice, capturing detailed provenance is laborious and thus has been a low priority; beyond a lab notebook describing methods and results, few researchers capture and preserve detailed records of scientific provenance. We have built tools for capturing and publishing provenance that integrate into analytical environments that are in widespread use by geoscientists (R and Matlab). These tools lower the barrier to provenance generation by automating capture of critical information as researchers prepare data for analysis, develop, test, and execute models, and create visualizations. The 'recordr' library in R and the `matlab-dataone` library in Matlab provide shared functions to capture provenance with minimal changes to normal working procedures. Researchers can capture both scripted and interactive sessions, tag and manage these executions as they iterate over analyses, and then prune and publish provenance metadata and derived products to the DataONE federation of archival repositories. Provenance traces conform to the ProvONE model extension of W3C PROV, enabling interoperability across tools and languages. The capture system supports fine-grained versioning of science products and provenance traces. By assigning global identifiers such as DOIs, reseachers can cite the computational processes used to reach findings. And, finally, DataONE has built a web portal to search, browse, and clearly display provenance relationships between input data, the software used to execute analyses and models, and derived data and products that arise from these computations. This provenance is vital to interpretation and understanding of science, and provides an audit trail that researchers can use to understand and replicate computational workflows in the geosciences.
High-throughput neuroimaging-genetics computational infrastructure
Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Hobel, Sam; Vespa, Paul; Woo Moon, Seok; Van Horn, John D.; Franco, Joseph; Toga, Arthur W.
2014-01-01
Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Result interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring validating, and disseminating of complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure1. PMID:24795619
Idle waves in high-performance computing
NASA Astrophysics Data System (ADS)
Markidis, Stefano; Vencels, Juris; Peng, Ivy Bo; Akhmetova, Dana; Laure, Erwin; Henri, Pierre
2015-01-01
The vast majority of parallel scientific applications distributes computation among processes that are in a busy state when computing and in an idle state when waiting for information from other processes. We identify the propagation of idle waves through processes in scientific applications with a local information exchange between the two processes. Idle waves are nondispersive and have a phase velocity inversely proportional to the average busy time. The physical mechanism enabling the propagation of idle waves is the local synchronization between two processes due to remote data dependency. This study provides a description of the large number of processes in parallel scientific applications as a continuous medium. This work also is a step towards an understanding of how localized idle periods can affect remote processes, leading to the degradation of global performance in parallel scientific applications.
Ammonia Oxidation by Abstraction of Three Hydrogen Atoms from a Mo–NH 3 Complex
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhattacharya, Papri; Heiden, Zachariah M.; Wiedner, Eric S.
We report ammonia oxidation by homolytic cleavage of all three H atoms from a Mo-15NH3 complex using the 2,4,6-tri-tert-butylphenoxyl radical to afford a Mo-alkylimido (Mo=15NR) complex (R = 2,4,6-tri-t-butylcyclohexa-2,5-dien-1-one). Reductive cleavage of Mo=15NR generates a terminal Mo≡N nitride, and a [Mo-15NH]+ complex is formed by protonation. Computational analysis describes the energetic profile for the stepwise removal of three H atoms from the Mo-15NH3 complex and the formation of Mo=15NR. Acknowledgment. This work was supported as part of the Center for Molecular Electrocatalysis, an Energy Frontier Re-search Center funded by the U.S. Department of Energy (U.S. DOE), Office of Science, Officemore » of Basic Energy Sciences. EPR and mass spectrometry experiments were performed using EMSL, a national scientific user facility sponsored by the DOE’s Office of Biological and Environmental Research and located at PNNL. The authors thank Dr. Eric D. Walter and Dr. Rosalie Chu for assistance in performing EPR and mass spectroscopy analysis, respectively. Computational resources provided by the National Energy Re-search Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. Pacific North-west National Laboratory is operated by Battelle for the U.S. DOE.« less
NASA Astrophysics Data System (ADS)
Blikstein, Paulo
The goal of this dissertation is to explore relations between content, representation, and pedagogy, so as to understand the impact of the nascent field of complexity sciences on science, technology, engineering and mathematics (STEM) learning. Wilensky & Papert coined the term "structurations" to express the relationship between knowledge and its representational infrastructure. A change from one representational infrastructure to another they call a "restructuration." The complexity sciences have introduced a novel and powerful structuration: agent-based modeling. In contradistinction to traditional mathematical modeling, which relies on equational descriptions of macroscopic properties of systems, agent-based modeling focuses on a few archetypical micro-behaviors of "agents" to explain emergent macro-behaviors of the agent collective. Specifically, this dissertation is about a series of studies of undergraduate students' learning of materials science, in which two structurations are compared (equational and agent-based), consisting of both design research and empirical evaluation. I have designed MaterialSim, a constructionist suite of computer models, supporting materials and learning activities designed within the approach of agent-based modeling, and over four years conducted an empirical inves3 tigation of an undergraduate materials science course. The dissertation is comprised of three studies: Study 1 - diagnosis . I investigate current representational and pedagogical practices in engineering classrooms. Study 2 - laboratory studies. I investigate the cognition of students engaging in scientific inquiry through programming their own scientific models. Study 3 - classroom implementation. I investigate the characteristics, advantages, and trajectories of scientific content knowledge that is articulated in epistemic forms and representational infrastructures unique to complexity sciences, as well as the feasibility of the integration of constructionist, agent-based learning environments in engineering classrooms. Data sources include classroom observations, interviews, videotaped sessions of model-building, questionnaires, analysis of computer-generated logfiles, and quantitative and qualitative analysis of artifacts. Results shows that (1) current representational and pedagogical practices in engineering classrooms were not up to the challenge of the complex content being taught, (2) by building their own scientific models, students developed a deeper understanding of core scientific concepts, and learned how to better identify unifying principles and behaviors in materials science, and (3) programming computer models was feasible within a regular engineering classroom.
Use of Emerging Grid Computing Technologies for the Analysis of LIGO Data
NASA Astrophysics Data System (ADS)
Koranda, Scott
2004-03-01
The LIGO Scientific Collaboration (LSC) today faces the challenge of enabling analysis of terabytes of LIGO data by hundreds of scientists from institutions all around the world. To meet this challenge the LSC is developing tools, infrastructure, applications, and expertise leveraging Grid Computing technologies available today, and making available to LSC scientists compute resources at sites across the United States and Europe. We use digital credentials for strong and secure authentication and authorization to compute resources and data. Building on top of products from the Globus project for high-speed data transfer and information discovery we have created the Lightweight Data Replicator (LDR) to securely and robustly replicate data to resource sites. We have deployed at our computing sites the Virtual Data Toolkit (VDT) Server and Client packages, developed in collaboration with our partners in the GriPhyN and iVDGL projects, providing uniform access to distributed resources for users and their applications. Taken together these Grid Computing technologies and infrastructure have formed the LSC DataGrid--a coherent and uniform environment across two continents for the analysis of gravitational-wave detector data. Much work, however, remains in order to scale current analyses and recent lessons learned need to be integrated into the next generation of Grid middleware.
Comparisons of some large scientific computers
NASA Technical Reports Server (NTRS)
Credeur, K. R.
1981-01-01
In 1975, the National Aeronautics and Space Administration (NASA) began studies to assess the technical and economic feasibility of developing a computer having sustained computational speed of one billion floating point operations per second and a working memory of at least 240 million words. Such a powerful computer would allow computational aerodynamics to play a major role in aeronautical design and advanced fluid dynamics research. Based on favorable results from these studies, NASA proceeded with developmental plans. The computer was named the Numerical Aerodynamic Simulator (NAS). To help insure that the estimated cost, schedule, and technical scope were realistic, a brief study was made of past large scientific computers. Large discrepancies between inception and operation in scope, cost, or schedule were studied so that they could be minimized with NASA's proposed new compter. The main computers studied were the ILLIAC IV, STAR 100, Parallel Element Processor Ensemble (PEPE), and Shuttle Mission Simulator (SMS) computer. Comparison data on memory and speed were also obtained on the IBM 650, 704, 7090, 360-50, 360-67, 360-91, and 370-195; the CDC 6400, 6600, 7600, CYBER 203, and CYBER 205; CRAY 1; and the Advanced Scientific Computer (ASC). A few lessons learned conclude the report.
USSR Report: Cybernetics, Computers and Automation Technology. No. 69.
1983-05-06
computers in multiprocessor and multistation design , control and scientific research automation systems. The results of comparing the efficiency of...Podvizhnaya, Scientific Research Institute of Control Computers, Severodonetsk] [Text] The most significant change in the design of the SM-2M compared to...UPRAVLYAYUSHCHIYE SISTEMY I MASHINY, Nov-Dec 82) 95 APPLICATIONS Kiev Automated Control System, Design Features and Prospects for Development (V. A
Yassin, Nisreen I R; Omran, Shaimaa; El Houby, Enas M F; Allam, Hemat
2018-03-01
The high incidence of breast cancer in women has increased significantly in the recent years. Physician experience of diagnosing and detecting breast cancer can be assisted by using some computerized features extraction and classification algorithms. This paper presents the conduction and results of a systematic review (SR) that aims to investigate the state of the art regarding the computer aided diagnosis/detection (CAD) systems for breast cancer. The SR was conducted using a comprehensive selection of scientific databases as reference sources, allowing access to diverse publications in the field. The scientific databases used are Springer Link (SL), Science Direct (SD), IEEE Xplore Digital Library, and PubMed. Inclusion and exclusion criteria were defined and applied to each retrieved work to select those of interest. From 320 studies retrieved, 154 studies were included. However, the scope of this research is limited to scientific and academic works and excludes commercial interests. This survey provides a general analysis of the current status of CAD systems according to the used image modalities and the machine learning based classifiers. Potential research studies have been discussed to create a more objective and efficient CAD systems. Copyright © 2017 Elsevier B.V. All rights reserved.
From cosmos to connectomes: the evolution of data-intensive science.
Burns, Randal; Vogelstein, Joshua T; Szalay, Alexander S
2014-09-17
The analysis of data requires computation: originally by hand and more recently by computers. Different models of computing are designed and optimized for different kinds of data. In data-intensive science, the scale and complexity of data exceeds the comfort zone of local data stores on scientific workstations. Thus, cloud computing emerges as the preeminent model, utilizing data centers and high-performance clusters, enabling remote users to access and query subsets of the data efficiently. We examine how data-intensive computational systems originally built for cosmology, the Sloan Digital Sky Survey (SDSS), are now being used in connectomics, at the Open Connectome Project. We list lessons learned and outline the top challenges we expect to face. Success in computational connectomics would drastically reduce the time between idea and discovery, as SDSS did in cosmology. Copyright © 2014 Elsevier Inc. All rights reserved.
Enabling Grid Computing resources within the KM3NeT computing model
NASA Astrophysics Data System (ADS)
Filippidis, Christos
2016-04-01
KM3NeT is a future European deep-sea research infrastructure hosting a new generation neutrino detectors that - located at the bottom of the Mediterranean Sea - will open a new window on the universe and answer fundamental questions both in particle physics and astrophysics. International collaborative scientific experiments, like KM3NeT, are generating datasets which are increasing exponentially in both complexity and volume, making their analysis, archival, and sharing one of the grand challenges of the 21st century. These experiments, in their majority, adopt computing models consisting of different Tiers with several computing centres and providing a specific set of services for the different steps of data processing such as detector calibration, simulation and data filtering, reconstruction and analysis. The computing requirements are extremely demanding and, usually, span from serial to multi-parallel or GPU-optimized jobs. The collaborative nature of these experiments demands very frequent WAN data transfers and data sharing among individuals and groups. In order to support the aforementioned demanding computing requirements we enabled Grid Computing resources, operated by EGI, within the KM3NeT computing model. In this study we describe our first advances in this field and the method for the KM3NeT users to utilize the EGI computing resources in a simulation-driven use-case.
Goscinski, Wojtek J.; McIntosh, Paul; Felzmann, Ulrich; Maksimenko, Anton; Hall, Christopher J.; Gureyev, Timur; Thompson, Darren; Janke, Andrew; Galloway, Graham; Killeen, Neil E. B.; Raniga, Parnesh; Kaluza, Owen; Ng, Amanda; Poudel, Govinda; Barnes, David G.; Nguyen, Toan; Bonnington, Paul; Egan, Gary F.
2014-01-01
The Multi-modal Australian ScienceS Imaging and Visualization Environment (MASSIVE) is a national imaging and visualization facility established by Monash University, the Australian Synchrotron, the Commonwealth Scientific Industrial Research Organization (CSIRO), and the Victorian Partnership for Advanced Computing (VPAC), with funding from the National Computational Infrastructure and the Victorian Government. The MASSIVE facility provides hardware, software, and expertise to drive research in the biomedical sciences, particularly advanced brain imaging research using synchrotron x-ray and infrared imaging, functional and structural magnetic resonance imaging (MRI), x-ray computer tomography (CT), electron microscopy and optical microscopy. The development of MASSIVE has been based on best practice in system integration methodologies, frameworks, and architectures. The facility has: (i) integrated multiple different neuroimaging analysis software components, (ii) enabled cross-platform and cross-modality integration of neuroinformatics tools, and (iii) brought together neuroimaging databases and analysis workflows. MASSIVE is now operational as a nationally distributed and integrated facility for neuroinfomatics and brain imaging research. PMID:24734019
Proceedings: Fourth Workshop on Mining Scientific Datasets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamath, C
Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratorymore » data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is matched only by the opportunities that await a practitioner.« less
Archiving Software Systems: Approaches to Preserve Computational Capabilities
NASA Astrophysics Data System (ADS)
King, T. A.
2014-12-01
A great deal of effort is made to preserve scientific data. Not only because data is knowledge, but it is often costly to acquire and is sometimes collected under unique circumstances. Another part of the science enterprise is the development of software to process and analyze the data. Developed software is also a large investment and worthy of preservation. However, the long term preservation of software presents some challenges. Software often requires a specific technology stack to operate. This can include software, operating systems and hardware dependencies. One past approach to preserve computational capabilities is to maintain ancient hardware long past its typical viability. On an archive horizon of 100 years, this is not feasible. Another approach to preserve computational capabilities is to archive source code. While this can preserve details of the implementation and algorithms, it may not be possible to reproduce the technology stack needed to compile and run the resulting applications. This future forward dilemma has a solution. Technology used to create clouds and process big data can also be used to archive and preserve computational capabilities. We explore how basic hardware, virtual machines, containers and appropriate metadata can be used to preserve computational capabilities and to archive functional software systems. In conjunction with data archives, this provides scientist with both the data and capability to reproduce the processing and analysis used to generate past scientific results.
Cloud-based Jupyter Notebooks for Water Data Analysis
NASA Astrophysics Data System (ADS)
Castronova, A. M.; Brazil, L.; Seul, M.
2017-12-01
The development and adoption of technologies by the water science community to improve our ability to openly collaborate and share workflows will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. Jupyter notebooks offer one solution by providing an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Adoption of this technology within the water sciences, coupled with publicly available datasets from agencies such as USGS, NASA, and EPA enables researchers to easily prototype and execute data intensive toolchains. Moreover, implementing this software stack in a cloud-based environment extends its native functionality to provide researchers a mechanism to build and execute toolchains that are too large or computationally demanding for typical desktop computers. Additionally, this cloud-based solution enables scientists to disseminate data processing routines alongside journal publications in an effort to support reproducibility. For example, these data collection and analysis toolchains can be shared, archived, and published using the HydroShare platform or downloaded and executed locally to reproduce scientific analysis. This work presents the design and implementation of a cloud-based Jupyter environment and its application for collecting, aggregating, and munging various datasets in a transparent, sharable, and self-documented manner. The goals of this work are to establish a free and open source platform for domain scientists to (1) conduct data intensive and computationally intensive collaborative research, (2) utilize high performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products. This presentation will discuss recent efforts towards achieving these goals, and describe the architectural design of the notebook server in an effort to support collaborative and reproducible science.
Accelerating scientific discovery : 2007 annual report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beckman, P.; Dave, P.; Drugan, C.
2008-11-14
As a gateway for scientific discovery, the Argonne Leadership Computing Facility (ALCF) works hand in hand with the world's best computational scientists to advance research in a diverse span of scientific domains, ranging from chemistry, applied mathematics, and materials science to engineering physics and life sciences. Sponsored by the U.S. Department of Energy's (DOE) Office of Science, researchers are using the IBM Blue Gene/L supercomputer at the ALCF to study and explore key scientific problems that underlie important challenges facing our society. For instance, a research team at the University of California-San Diego/ SDSC is studying the molecular basis ofmore » Parkinson's disease. The researchers plan to use the knowledge they gain to discover new drugs to treat the disease and to identify risk factors for other diseases that are equally prevalent. Likewise, scientists from Pratt & Whitney are using the Blue Gene to understand the complex processes within aircraft engines. Expanding our understanding of jet engine combustors is the secret to improved fuel efficiency and reduced emissions. Lessons learned from the scientific simulations of jet engine combustors have already led Pratt & Whitney to newer designs with unprecedented reductions in emissions, noise, and cost of ownership. ALCF staff members provide in-depth expertise and assistance to those using the Blue Gene/L and optimizing user applications. Both the Catalyst and Applications Performance Engineering and Data Analytics (APEDA) teams support the users projects. In addition to working with scientists running experiments on the Blue Gene/L, we have become a nexus for the broader global community. In partnership with the Mathematics and Computer Science Division at Argonne National Laboratory, we have created an environment where the world's most challenging computational science problems can be addressed. Our expertise in high-end scientific computing enables us to provide guidance for applications that are transitioning to petascale as well as to produce software that facilitates their development, such as the MPICH library, which provides a portable and efficient implementation of the MPI standard--the prevalent programming model for large-scale scientific applications--and the PETSc toolkit that provides a programming paradigm that eases the development of many scientific applications on high-end computers.« less
PyEEG: an open source Python module for EEG/MEG feature extraction.
Bao, Forrest Sheng; Liu, Xin; Zhang, Christina
2011-01-01
Computer-aided diagnosis of neural diseases from EEG signals (or other physiological signals that can be treated as time series, e.g., MEG) is an emerging field that has gained much attention in past years. Extracting features is a key component in the analysis of EEG signals. In our previous works, we have implemented many EEG feature extraction functions in the Python programming language. As Python is gaining more ground in scientific computing, an open source Python module for extracting EEG features has the potential to save much time for computational neuroscientists. In this paper, we introduce PyEEG, an open source Python module for EEG feature extraction.
Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning.
van Ginneken, Bram
2017-03-01
Half a century ago, the term "computer-aided diagnosis" (CAD) was introduced in the scientific literature. Pulmonary imaging, with chest radiography and computed tomography, has always been one of the focus areas in this field. In this study, I describe how machine learning became the dominant technology for tackling CAD in the lungs, generally producing better results than do classical rule-based approaches, and how the field is now rapidly changing: in the last few years, we have seen how even better results can be obtained with deep learning. The key differences among rule-based processing, machine learning, and deep learning are summarized and illustrated for various applications of CAD in the chest.
PyEEG: An Open Source Python Module for EEG/MEG Feature Extraction
Bao, Forrest Sheng; Liu, Xin; Zhang, Christina
2011-01-01
Computer-aided diagnosis of neural diseases from EEG signals (or other physiological signals that can be treated as time series, e.g., MEG) is an emerging field that has gained much attention in past years. Extracting features is a key component in the analysis of EEG signals. In our previous works, we have implemented many EEG feature extraction functions in the Python programming language. As Python is gaining more ground in scientific computing, an open source Python module for extracting EEG features has the potential to save much time for computational neuroscientists. In this paper, we introduce PyEEG, an open source Python module for EEG feature extraction. PMID:21512582
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
NASA Astrophysics Data System (ADS)
Klimentov, A.; Buncic, P.; De, K.; Jha, S.; Maeno, T.; Mount, R.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Porter, R. J.; Read, K. F.; Vaniachine, A.; Wells, J. C.; Wenaus, T.
2015-05-01
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(102) sites, O(105) cores, O(108) jobs per year, O(103) users, and ATLAS data volume is O(1017) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
Review and analysis of dense linear system solver package for distributed memory machines
NASA Technical Reports Server (NTRS)
Narang, H. N.
1993-01-01
A dense linear system solver package recently developed at the University of Texas at Austin for distributed memory machine (e.g. Intel Paragon) has been reviewed and analyzed. The package contains about 45 software routines, some written in FORTRAN, and some in C-language, and forms the basis for parallel/distributed solutions of systems of linear equations encountered in many problems of scientific and engineering nature. The package, being studied by the Computer Applications Branch of the Analysis and Computation Division, may provide a significant computational resource for NASA scientists and engineers in parallel/distributed computing. Since the package is new and not well tested or documented, many of its underlying concepts and implementations were unclear; our task was to review, analyze, and critique the package as a step in the process that will enable scientists and engineers to apply it to the solution of their problems. All routines in the package were reviewed and analyzed. Underlying theory or concepts which exist in the form of published papers or technical reports, or memos, were either obtained from the author, or from the scientific literature; and general algorithms, explanations, examples, and critiques have been provided to explain the workings of these programs. Wherever the things were still unclear, communications were made with the developer (author), either by telephone or by electronic mail, to understand the workings of the routines. Whenever possible, tests were made to verify the concepts and logic employed in their implementations. A detailed report is being separately documented to explain the workings of these routines.
PREFACE: New trends in Computer Simulations in Physics and not only in physics
NASA Astrophysics Data System (ADS)
Shchur, Lev N.; Krashakov, Serge A.
2016-02-01
In this volume we have collected papers based on the presentations given at the International Conference on Computer Simulations in Physics and beyond (CSP2015), held in Moscow, September 6-10, 2015. We hope that this volume will be helpful and scientifically interesting for readers. The Conference was organized for the first time with the common efforts of the Moscow Institute for Electronics and Mathematics (MIEM) of the National Research University Higher School of Economics, the Landau Institute for Theoretical Physics, and the Science Center in Chernogolovka. The name of the Conference emphasizes the multidisciplinary nature of computational physics. Its methods are applied to the broad range of current research in science and society. The choice of venue was motivated by the multidisciplinary character of the MIEM. It is a former independent university, which has recently become the part of the National Research University Higher School of Economics. The Conference Computer Simulations in Physics and beyond (CSP) is planned to be organized biannually. This year's Conference featured 99 presentations, including 21 plenary and invited talks ranging from the analysis of Irish myths with recent methods of statistical physics, to computing with novel quantum computers D-Wave and D-Wave2. This volume covers various areas of computational physics and emerging subjects within the computational physics community. Each section was preceded by invited talks presenting the latest algorithms and methods in computational physics, as well as new scientific results. Both parallel and poster sessions paid special attention to numerical methods, applications and results. For all the abstracts presented at the conference please follow the link http://csp2015.ac.ru/files/book5x.pdf
Container-Based Clinical Solutions for Portable and Reproducible Image Analysis.
Matelsky, Jordan; Kiar, Gregory; Johnson, Erik; Rivera, Corban; Toma, Michael; Gray-Roncal, William
2018-05-08
Medical imaging analysis depends on the reproducibility of complex computation. Linux containers enable the abstraction, installation, and configuration of environments so that software can be both distributed in self-contained images and used repeatably by tool consumers. While several initiatives in neuroimaging have adopted approaches for creating and sharing more reliable scientific methods and findings, Linux containers are not yet mainstream in clinical settings. We explore related technologies and their efficacy in this setting, highlight important shortcomings, demonstrate a simple use-case, and endorse the use of Linux containers for medical image analysis.
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
High-performance scientific computing in the cloud
NASA Astrophysics Data System (ADS)
Jorissen, Kevin; Vila, Fernando; Rehr, John
2011-03-01
Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.
NASA Astrophysics Data System (ADS)
Kaplinger, Brian Douglas
For the past few decades, both the scientific community and the general public have been becoming more aware that the Earth lives in a shooting gallery of small objects. We classify all of these asteroids and comets, known or unknown, that cross Earth's orbit as near-Earth objects (NEOs). A look at our geologic history tells us that NEOs have collided with Earth in the past, and we expect that they will continue to do so. With thousands of known NEOs crossing the orbit of Earth, there has been significant scientific interest in developing the capability to deflect an NEO from an impacting trajectory. This thesis applies the ideas of Smoothed Particle Hydrodynamics (SPH) theory to the NEO disruption problem. A simulation package was designed that allows efficacy simulation to be integrated into the mission planning and design process. This is done by applying ideas in high-performance computing (HPC) on the computer graphics processing unit (GPU). Rather than prove a concept through large standalone simulations on a supercomputer, a highly parallel structure allows for flexible, target dependent questions to be resolved. Built around nonclassified data and analysis, this computer package will allow academic institutions to better tackle the issue of NEO mitigation effectiveness.
magHD: a new approach to multi-dimensional data storage, analysis, display and exploitation
NASA Astrophysics Data System (ADS)
Angleraud, Christophe
2014-06-01
The ever increasing amount of data and processing capabilities - following the well- known Moore's law - is challenging the way scientists and engineers are currently exploiting large datasets. The scientific visualization tools, although quite powerful, are often too generic and provide abstract views of phenomena, thus preventing cross disciplines fertilization. On the other end, Geographic information Systems allow nice and visually appealing maps to be built but they often get very confused as more layers are added. Moreover, the introduction of time as a fourth analysis dimension to allow analysis of time dependent phenomena such as meteorological or climate models, is encouraging real-time data exploration techniques that allow spatial-temporal points of interests to be detected by integration of moving images by the human brain. Magellium is involved in high performance image processing chains for satellite image processing as well as scientific signal analysis and geographic information management since its creation (2003). We believe that recent work on big data, GPU and peer-to-peer collaborative processing can open a new breakthrough in data analysis and display that will serve many new applications in collaborative scientific computing, environment mapping and understanding. The magHD (for Magellium Hyper-Dimension) project aims at developing software solutions that will bring highly interactive tools for complex datasets analysis and exploration commodity hardware, targeting small to medium scale clusters with expansion capabilities to large cloud based clusters.
Examples of Effective Data Sharing in Scientific Publishing
Kitchin, John R.
2015-05-11
Here, we present a perspective on an approach to data sharing in scientific publications we have been developing in our group. The essence of the approach is that data can be embedded in a human-readable and machine-addressable way within the traditional publishing environment. We show this by example for both computational and experimental data. We articulate a need for new authoring tools to facilitate data sharing, and we discuss the tools we have been developing for this purpose. With these tools, data generation, analysis, and manuscript preparation can be deeply integrated, resulting in easier and better data sharing in scientificmore » publications.« less
Examples of Effective Data Sharing in Scientific Publishing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kitchin, John R.
Here, we present a perspective on an approach to data sharing in scientific publications we have been developing in our group. The essence of the approach is that data can be embedded in a human-readable and machine-addressable way within the traditional publishing environment. We show this by example for both computational and experimental data. We articulate a need for new authoring tools to facilitate data sharing, and we discuss the tools we have been developing for this purpose. With these tools, data generation, analysis, and manuscript preparation can be deeply integrated, resulting in easier and better data sharing in scientificmore » publications.« less
M4AST - A Tool for Asteroid Modelling
NASA Astrophysics Data System (ADS)
Birlan, Mirel; Popescu, Marcel; Irimiea, Lucian; Binzel, Richard
2016-10-01
M4AST (Modelling for asteroids) is an online tool devoted to the analysis and interpretation of reflection spectra of asteroids in the visible and near-infrared spectral intervals. It consists into a spectral database of individual objects and a set of routines for analysis which address scientific aspects such as: taxonomy, curve matching with laboratory spectra, space weathering models, and mineralogical diagnosis. Spectral data were obtained using groundbased facilities; part of these data are precompiled from the literature[1].The database is composed by permanent and temporary files. Each permanent file contains a header and two or three columns (wavelength, spectral reflectance, and the error on spectral reflectance). Temporary files can be uploaded anonymously, and are purged for the property of submitted data. The computing routines are organized in order to accomplish several scientific objectives: visualize spectra, compute the asteroid taxonomic class, compare an asteroid spectrum with similar spectra of meteorites, and computing mineralogical parameters. One facility of using the Virtual Observatory protocols was also developed.A new version of the service was released in June 2016. This new release of M4AST contains a database and facilities to model more than 6,000 spectra of asteroids. A new web-interface was designed. This development allows new functionalities into a user-friendly environment. A bridge system of access and exploiting the database SMASS-MIT (http://smass.mit.edu) allows the treatment and analysis of these data in the framework of M4AST environment.Reference:[1] M. Popescu, M. Birlan, and D.A. Nedelcu, "Modeling of asteroids: M4AST," Astronomy & Astrophysics 544, EDP Sciences, pp. A130, 2012.
Geographic information systems, remote sensing, and spatial analysis activities in Texas, 2002-07
Pearson, D.K.; Gary, R.H.; Wilson, Z.D.
2007-01-01
Geographic information system (GIS) technology has become an important tool for scientific investigation, resource management, and environmental planning. A GIS is a computer-aided system capable of collecting, storing, analyzing, and displaying spatially referenced digital data. GIS technology is particularly useful when analyzing a wide variety of spatial data such as with remote sensing and spatial analysis. Remote sensing involves collecting remotely sensed data, such as satellite imagery, aerial photography, or radar images, and analyzing the data to gather information or investigate trends about the environment or the Earth's surface. Spatial analysis combines remotely sensed, thematic, statistical, quantitative, and geographical data through overlay, modeling, and other analytical techniques to investigate specific research questions. It is the combination of data formats and analysis techniques that has made GIS an essential tool in scientific investigations. This document presents information about the technical capabilities and project activities of the U.S. Geological Survey (USGS) Texas Water Science Center (TWSC) GIS Workgroup from 2002 through 2007.
Activities of the Research Institute for Advanced Computer Science
NASA Technical Reports Server (NTRS)
Oliger, Joseph
1994-01-01
The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on June 6, 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under contract with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. Research at RIACS is currently being done in the following areas: (1) parallel computing; (2) advanced methods for scientific computing; (3) high performance networks; and (4) learning systems. RIACS technical reports are usually preprints of manuscripts that have been submitted to research journals or conference proceedings. A list of these reports for the period January 1, 1994 through December 31, 1994 is in the Reports and Abstracts section of this report.
The future of scientific workflows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deelman, Ewa; Peterka, Tom; Altintas, Ilkay
Today’s computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of Energy (DOE) invited a diverse group of domain and computer scientists from national laboratories supported by the Office of Science, the National Nuclear Security Administration, from industry, and from academia to review the workflow requirements of DOE’s science and national security missions, to assess the current state of the art in science workflows, to understand the impact of emerging extreme-scale computing systems on thosemore » workflows, and to develop requirements for automated workflow management in future and existing environments. This article is a summary of the opinions of over 50 leading researchers attending this workshop. We highlight use cases, computing systems, workflow needs and conclude by summarizing the remaining challenges this community sees that inhibit large-scale scientific workflows from becoming a mainstream tool for extreme-scale science.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas; Schuman, Catherine; Patton, Robert
The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshopmore » we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.« less
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan
2006-01-01
Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefi-ont sensors such as optical design simplicity and stability. However, the image-based approach is computational intensive, and therefore, specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures require numerous two-dimensional Fourier Transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64 Node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
Portable computing - A fielded interactive scientific application in a small off-the-shelf package
NASA Technical Reports Server (NTRS)
Groleau, Nicolas; Hazelton, Lyman; Frainier, Rich; Compton, Michael; Colombano, Silvano; Szolovits, Peter
1993-01-01
Experience with the design and implementation of a portable computing system for STS crew-conducted science is discussed. Principal-Investigator-in-a-Box (PI) will help the SLS-2 astronauts perform vestibular (human orientation system) experiments in flight. PI is an interactive system that provides data acquisition and analysis, experiment step rescheduling, and various other forms of reasoning to astronaut users. The hardware architecture of PI consists of a computer and an analog interface box. 'Off-the-shelf' equipment is employed in the system wherever possible in an effort to use widely available tools and then to add custom functionality and application codes to them. Other projects which can help prospective teams to learn more about portable computing in space are also discussed.
The International Conference on Vector and Parallel Computing (2nd)
1989-01-17
Computation of the SVD of Bidiagonal Matrices" ...................................... 11 " Lattice QCD -As a Large Scale Scientific Computation...vectorizcd for the IBM 3090 Vector Facility. In addition, elapsed times " Lattice QCD -As a Large Scale Scientific have been reduced by using 3090...benchmarked Lattice QCD on a large number ofcompu- come from the wavefront solver routine. This was exten- ters: CrayX-MP and Cray 2 (vector
Multi-threading: A new dimension to massively parallel scientific computation
NASA Astrophysics Data System (ADS)
Nielsen, Ida M. B.; Janssen, Curtis L.
2000-06-01
Multi-threading is becoming widely available for Unix-like operating systems, and the application of multi-threading opens new ways for performing parallel computations with greater efficiency. We here briefly discuss the principles of multi-threading and illustrate the application of multi-threading for a massively parallel direct four-index transformation of electron repulsion integrals. Finally, other potential applications of multi-threading in scientific computing are outlined.
A Federal Vision for Future Computing: A Nanotechnology-Inspired Grand Challenge
2016-07-29
Science Foundation (NSF), Department of Defense (DOD), National Institute of Standards and Technology (NIST), Intelligence Community (IC) Introduction...multiple Federal agencies: • Intelligent big data sensors that act autonomously and are programmable via the network for increased flexibility, and... intelligence for scientific discovery enabled by rapid extreme-scale data analysis, capable of understanding and making sense of results and thereby
Department of Defense In-House RDT and E Activities
1976-10-30
BALLISTIC TESTS.FAC AVAL FCR TESIS OF SP ELELTRONIC’ FIl’ CON EQUIP 4 RELATED SYSTEMS E COMPONFNTZ, 35 INSTALLATION: MEDICAL BIOENGINEERINC- R&D LABORATORY...ANALYSIS OF CHEMICAL AND METALLOGRAPHIC EFFECTS, MICROBIOLOGICAL EFFECTS, CLIMATIC ENVIRONMENTAL EFFECTS. TEST AND EVALUATE WARHEADS AND SPECIAL...CCMMUNICATI’N SYST:M INSTRUMENTED DROP ZONES ENGINEERING TEST FACILITY INSTRUMENTATION CALIBRATICN FACILITY SCIENTIFIC COMPUTER CENTER ENVIRONMENTAL TESY
Aeropropulsion 1987. Session 2: Aeropropulsion Structures Research
NASA Technical Reports Server (NTRS)
1987-01-01
Aeropropulsion systems present unique problems to the structural engineer. The extremes in operating temperatures, rotational effects, and behaviors of advanced material systems combine into complexities that require advances in many scientific disciplines involved in structural analysis and design procedures. This session provides an overview of the complexities of aeropropulsion structures and the theoretical, computational, and experimental research conducted to achieve the needed advances.
A Contrastive Study of the Rhetorical Organisation of English and Spanish PhD Thesis Introductions
ERIC Educational Resources Information Center
Soler-Monreal, Carmen; Carbonell-Olivares, Maria; Gil-Salom, Luz
2011-01-01
This paper presents an analysis of the introductory sections of a corpus of 20 doctoral theses on computing written in Spanish and in English. Our aim was to ascertain whether the theses, produced within the same scientific-technological area but by authors from different cultural and linguistic backgrounds, employed the same rhetorical strategies…
Computational Omics Funding Opportunity | Office of Cancer Clinical Proteomics Research
The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) and the NVIDIA Foundation are pleased to announce funding opportunities in the fight against cancer. Each organization has launched a request for proposals (RFP) that will collectively fund up to $2 million to help to develop a new generation of data-intensive scientific tools to find new ways to treat cancer.
ERIC Educational Resources Information Center
Brunelli, Lina, Ed.; Cicchitelli, Giuseppe, Ed.
This volume contains papers on computing and television as well as on the rapidly growing area of electronic communication among teachers and between teachers and students. Data analysis is now an important strand in school curricula in many nations. Many of the contributors discuss statistics in the schools from various perspectives. University…
Conceptual-level workflow modeling of scientific experiments using NMR as a case study
Verdi, Kacy K; Ellis, Heidi JC; Gryk, Michael R
2007-01-01
Background Scientific workflows improve the process of scientific experiments by making computations explicit, underscoring data flow, and emphasizing the participation of humans in the process when intuition and human reasoning are required. Workflows for experiments also highlight transitions among experimental phases, allowing intermediate results to be verified and supporting the proper handling of semantic mismatches and different file formats among the various tools used in the scientific process. Thus, scientific workflows are important for the modeling and subsequent capture of bioinformatics-related data. While much research has been conducted on the implementation of scientific workflows, the initial process of actually designing and generating the workflow at the conceptual level has received little consideration. Results We propose a structured process to capture scientific workflows at the conceptual level that allows workflows to be documented efficiently, results in concise models of the workflow and more-correct workflow implementations, and provides insight into the scientific process itself. The approach uses three modeling techniques to model the structural, data flow, and control flow aspects of the workflow. The domain of biomolecular structure determination using Nuclear Magnetic Resonance spectroscopy is used to demonstrate the process. Specifically, we show the application of the approach to capture the workflow for the process of conducting biomolecular analysis using Nuclear Magnetic Resonance (NMR) spectroscopy. Conclusion Using the approach, we were able to accurately document, in a short amount of time, numerous steps in the process of conducting an experiment using NMR spectroscopy. The resulting models are correct and precise, as outside validation of the models identified only minor omissions in the models. In addition, the models provide an accurate visual description of the control flow for conducting biomolecular analysis using NMR spectroscopy experiment. PMID:17263870
Conceptual-level workflow modeling of scientific experiments using NMR as a case study.
Verdi, Kacy K; Ellis, Heidi Jc; Gryk, Michael R
2007-01-30
Scientific workflows improve the process of scientific experiments by making computations explicit, underscoring data flow, and emphasizing the participation of humans in the process when intuition and human reasoning are required. Workflows for experiments also highlight transitions among experimental phases, allowing intermediate results to be verified and supporting the proper handling of semantic mismatches and different file formats among the various tools used in the scientific process. Thus, scientific workflows are important for the modeling and subsequent capture of bioinformatics-related data. While much research has been conducted on the implementation of scientific workflows, the initial process of actually designing and generating the workflow at the conceptual level has received little consideration. We propose a structured process to capture scientific workflows at the conceptual level that allows workflows to be documented efficiently, results in concise models of the workflow and more-correct workflow implementations, and provides insight into the scientific process itself. The approach uses three modeling techniques to model the structural, data flow, and control flow aspects of the workflow. The domain of biomolecular structure determination using Nuclear Magnetic Resonance spectroscopy is used to demonstrate the process. Specifically, we show the application of the approach to capture the workflow for the process of conducting biomolecular analysis using Nuclear Magnetic Resonance (NMR) spectroscopy. Using the approach, we were able to accurately document, in a short amount of time, numerous steps in the process of conducting an experiment using NMR spectroscopy. The resulting models are correct and precise, as outside validation of the models identified only minor omissions in the models. In addition, the models provide an accurate visual description of the control flow for conducting biomolecular analysis using NMR spectroscopy experiment.
Cloud parallel processing of tandem mass spectrometry based proteomics data.
Mohammed, Yassene; Mostovenko, Ekaterina; Henneman, Alex A; Marissen, Rob J; Deelder, André M; Palmblad, Magnus
2012-10-05
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
NASA Astrophysics Data System (ADS)
Añel, Juan A.
2017-03-01
Nowadays, the majority of the scientific community is not aware of the risks and problems associated with an inadequate use of computer systems for research, mostly for reproducibility of scientific results. Such reproducibility can be compromised by the lack of clear standards and insufficient methodological description of the computational details involved in an experiment. In addition, the inappropriate application or ignorance of copyright laws can have undesirable effects on access to aspects of great importance of the design of experiments and therefore to the interpretation of results.
The Scientific Image in Behavior Analysis.
Keenan, Mickey
2016-05-01
Throughout the history of science, the scientific image has played a significant role in communication. With recent developments in computing technology, there has been an increase in the kinds of opportunities now available for scientists to communicate in more sophisticated ways. Within behavior analysis, though, we are only just beginning to appreciate the importance of going beyond the printing press to elucidate basic principles of behavior. The aim of this manuscript is to stimulate appreciation of both the role of the scientific image and the opportunities provided by a quick response code (QR code) for enhancing the functionality of the printed page. I discuss the limitations of imagery in behavior analysis ("Introduction"), and I show examples of what can be done with animations and multimedia for teaching philosophical issues that arise when teaching about private events ("Private Events 1 and 2"). Animations are also useful for bypassing ethical issues when showing examples of challenging behavior ("Challenging Behavior"). Each of these topics can be accessed only by scanning the QR code provided. This contingency has been arranged to help the reader embrace this new technology. In so doing, I hope to show its potential for going beyond the limitations of the printing press.
77 FR 11139 - Center for Scientific Review; Notice of Closed Meetings
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-24
...: Center for Scientific Review Special Emphasis Panel; ``Genetics and Epigenetics of Disease.'' Date: March... Scientific Review Special Emphasis Panel; Small Business: Cell, Computational, and Molecular Biology. Date...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruebel, Oliver
2009-11-20
Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery form large, complex, and multivariate scientific data. The research coveredmore » in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics.Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework MATLAB and the visualization have been integrated, making advanced analysis tools accessible to biologist and enabling bioinformatic researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.« less
Profile of central research and application laboratory of Aǧrı İbrahim Çeçen University
NASA Astrophysics Data System (ADS)
Türkoǧlu, Emir Alper; Kurt, Murat; Tabay, Dilruba
2016-04-01
Aǧrı İbrahim Çeçen University built a central research and application laboratory (CRAL) in the east of Turkey. The CRAL possesses 7 research and analysis laboratories, 12 experts and researchers, 8 standard rooms for guest researchers, a restaurant, a conference hall, a meeting room, a prey room and a computer laboratory. The CRAL aims certain collaborations between researchers, experts, clinicians and educators in the areas of biotechnology, bioimagining, food safety & quality, omic sciences such as genomics, proteomics and metallomics. It also intends to develop sustainable solutions in agriculture and animal husbandry, promote public health quality, collect scientific knowledge and keep it for future generations, contribute scientific awareness of all stratums of society, provide consulting for small initiatives and industries. It has been collaborated several scientific foundations since 2011.
Scene analysis for a breadboard Mars robot functioning in an indoor environment
NASA Technical Reports Server (NTRS)
Levine, M. D.
1973-01-01
The problem is delt with of computer perception in an indoor laboratory environment containing rocks of various sizes. The sensory data processing is required for the NASA/JPL breadboard mobile robot that is a test system for an adaptive variably-autonomous vehicle that will conduct scientific explorations on the surface of Mars. Scene analysis is discussed in terms of object segmentation followed by feature extraction, which results in a representation of the scene in the robot's world model.
Software for visualization, analysis, and manipulation of laser scan images
NASA Astrophysics Data System (ADS)
Burnsides, Dennis B.
1997-03-01
The recent introduction of laser surface scanning to scientific applications presents a challenge to computer scientists and engineers. Full utilization of this two- dimensional (2-D) and three-dimensional (3-D) data requires advances in techniques and methods for data processing and visualization. This paper explores the development of software to support the visualization, analysis and manipulation of laser scan images. Specific examples presented are from on-going efforts at the Air Force Computerized Anthropometric Research and Design (CARD) Laboratory.
Silicon ribbon technology assessment 1978-1986 - A computer-assisted analysis using PECAN
NASA Technical Reports Server (NTRS)
Kran, A.
1978-01-01
The paper presents a 1978-1986 economic outlook for silicon ribbon technology based on the capillary action shaping technique. The outlook is presented within the framework of two sets of scenarios, which develop strategy for approaching the 1986 national energy capacity cost objective of $0.50/WE peak. The PECAN (Photovoltaic Energy Conversion Analysis) simulation technique is used to develop a 1986 sheet material price ($50/sq m) which apparently can be attained without further scientific breakthrough.
Neutron imaging data processing using the Mantid framework
NASA Astrophysics Data System (ADS)
Pouzols, Federico M.; Draper, Nicholas; Nagella, Sri; Yang, Erica; Sajid, Ahmed; Ross, Derek; Ritchie, Brian; Hill, John; Burca, Genoveva; Minniti, Triestino; Moreton-Smith, Christopher; Kockelmann, Winfried
2016-09-01
Several imaging instruments are currently being constructed at neutron sources around the world. The Mantid software project provides an extensible framework that supports high-performance computing for data manipulation, analysis and visualisation of scientific data. At ISIS, IMAT (Imaging and Materials Science & Engineering) will offer unique time-of-flight neutron imaging techniques which impose several software requirements to control the data reduction and analysis. Here we outline the extensions currently being added to Mantid to provide specific support for neutron imaging requirements.
Program Supports Scientific Visualization
NASA Technical Reports Server (NTRS)
Keith, Stephan
1994-01-01
Primary purpose of General Visualization System (GVS) computer program is to support scientific visualization of data generated by panel-method computer program PMARC_12 (inventory number ARC-13362) on Silicon Graphics Iris workstation. Enables user to view PMARC geometries and wakes as wire frames or as light shaded objects. GVS is written in C language.
Using POGIL to Help Students Learn to Program
ERIC Educational Resources Information Center
Hu, Helen H.; Shepherd, Tricia D.
2013-01-01
POGIL has been successfully implemented in a scientific computing course to teach science students how to program in Python. Following POGIL guidelines, the authors have developed guided inquiry activities that lead student teams to discover and understand programming concepts. With each iteration of the scientific computing course, the authors…
Ontology-Driven Discovery of Scientific Computational Entities
ERIC Educational Resources Information Center
Brazier, Pearl W.
2010-01-01
Many geoscientists use modern computational resources, such as software applications, Web services, scientific workflows and datasets that are readily available on the Internet, to support their research and many common tasks. These resources are often shared via human contact and sometimes stored in data portals; however, they are not necessarily…
Scientific Discovery through Advanced Computing in Plasma Science
NASA Astrophysics Data System (ADS)
Tang, William
2005-03-01
Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's ``Scientific Discovery through Advanced Computing'' (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of plasma turbulence in magnetically-confined high temperature plasmas. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to the computational science area.
Mangado, Nerea; Piella, Gemma; Noailly, Jérôme; Pons-Prats, Jordi; Ballester, Miguel Ángel González
2016-01-01
Computational modeling has become a powerful tool in biomedical engineering thanks to its potential to simulate coupled systems. However, real parameters are usually not accurately known, and variability is inherent in living organisms. To cope with this, probabilistic tools, statistical analysis and stochastic approaches have been used. This article aims to review the analysis of uncertainty and variability in the context of finite element modeling in biomedical engineering. Characterization techniques and propagation methods are presented, as well as examples of their applications in biomedical finite element simulations. Uncertainty propagation methods, both non-intrusive and intrusive, are described. Finally, pros and cons of the different approaches and their use in the scientific community are presented. This leads us to identify future directions for research and methodological development of uncertainty modeling in biomedical engineering. PMID:27872840
Mangado, Nerea; Piella, Gemma; Noailly, Jérôme; Pons-Prats, Jordi; Ballester, Miguel Ángel González
2016-01-01
Computational modeling has become a powerful tool in biomedical engineering thanks to its potential to simulate coupled systems. However, real parameters are usually not accurately known, and variability is inherent in living organisms. To cope with this, probabilistic tools, statistical analysis and stochastic approaches have been used. This article aims to review the analysis of uncertainty and variability in the context of finite element modeling in biomedical engineering. Characterization techniques and propagation methods are presented, as well as examples of their applications in biomedical finite element simulations. Uncertainty propagation methods, both non-intrusive and intrusive, are described. Finally, pros and cons of the different approaches and their use in the scientific community are presented. This leads us to identify future directions for research and methodological development of uncertainty modeling in biomedical engineering.
NASA Astrophysics Data System (ADS)
Michel, Robert G.; Cavallari, Jennifer M.; Znamenskaia, Elena; Yang, Karl X.; Sun, Tao; Bent, Gary
1999-12-01
This article is an electronic publication in Spectrochimica Acta Electronica (SAE), a section of Spectrochimica Acta Part B (SAB). The hardcopy text is accompanied by an electronic archive, stored on the CD-ROM accompanying this issue. The archive contains video clips. The main article discusses the scientific aspects of the subject and explains the purpose of the video files. Short, 15-30 s, digital video clips are easily controllable at the computer keyboard, which gives a speaker the ability to show fine details through the use of slow motion. Also, they are easily accessed from the computer hard drive for rapid extemporaneous presentation. In addition, they are easily transferred to the Internet for dissemination. From a pedagogical point of view, the act of making a video clip by a student allows for development of powers of observation, while the availability of the technology to make digital video clips gives a teacher the flexibility to demonstrate scientific concepts that would otherwise have to be done as 'live' demonstrations, with all the likely attendant misadventures. Our experience with digital video clips has been through their use in computer-based presentations by undergraduate and graduate students in analytical chemistry classes, and by high school and middle school teachers and their students in a variety of science and non-science classes. In physics teaching laboratories, we have used the hardware to capture digital video clips of dynamic processes, such as projectiles and pendulums, for later mathematical analysis.
Computer network access to scientific information systems for minority universities
NASA Astrophysics Data System (ADS)
Thomas, Valerie L.; Wakim, Nagi T.
1993-08-01
The evolution of computer networking technology has lead to the establishment of a massive networking infrastructure which interconnects various types of computing resources at many government, academic, and corporate institutions. A large segment of this infrastructure has been developed to facilitate information exchange and resource sharing within the scientific community. The National Aeronautics and Space Administration (NASA) supports both the development and the application of computer networks which provide its community with access to many valuable multi-disciplinary scientific information systems and on-line databases. Recognizing the need to extend the benefits of this advanced networking technology to the under-represented community, the National Space Science Data Center (NSSDC) in the Space Data and Computing Division at the Goddard Space Flight Center has developed the Minority University-Space Interdisciplinary Network (MU-SPIN) Program: a major networking and education initiative for Historically Black Colleges and Universities (HBCUs) and Minority Universities (MUs). In this paper, we will briefly explain the various components of the MU-SPIN Program while highlighting how, by providing access to scientific information systems and on-line data, it promotes a higher level of collaboration among faculty and students and NASA scientists.
Web-based analysis and publication of flow cytometry experiments.
Kotecha, Nikesh; Krutzik, Peter O; Irish, Jonathan M
2010-07-01
Cytobank is a Web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a Web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permission, from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at http://www.cytobank.org. (c) 2010 by John Wiley & Sons, Inc.
Web-Based Analysis and Publication of Flow Cytometry Experiments
Kotecha, Nikesh; Krutzik, Peter O.; Irish, Jonathan M.
2014-01-01
Cytobank is a web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permissions from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at www.cytobank.org PMID:20578106
When fragments link: a bibliometric perspective on the development of fragment-based drug discovery.
Romasanta, Angelo K S; van der Sijde, Peter; Hellsten, Iina; Hubbard, Roderick E; Keseru, Gyorgy M; van Muijlwijk-Koezen, Jacqueline; de Esch, Iwan J P
2018-05-05
Fragment-based drug discovery (FBDD) is a highly interdisciplinary field, rich in ideas integrated from pharmaceutical sciences, chemistry, biology, and physics, among others. To enrich our understanding of the development of the field, we used bibliometric techniques to analyze 3642 publications in FBDD, complementing accounts by key practitioners. Mapping its core papers, we found the transfer of knowledge from academia to industry. Co-authorship analysis showed that university-industry collaboration has grown over time. Moreover, we show how ideas from other scientific disciplines have been integrated into the FBDD paradigm. Keyword analysis showed that the field is organized into four interconnected practices: library design, fragment screening, computational methods, and optimization. This study highlights the importance of interactions among various individuals and institutions from diverse disciplines in newly emerging scientific fields. Copyright © 2018. Published by Elsevier Ltd.
ISCR Annual Report: Fical Year 2004
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGraw, J R
2005-03-03
Large-scale scientific computation and all of the disciplines that support and help to validate it have been placed at the focus of Lawrence Livermore National Laboratory (LLNL) by the Advanced Simulation and Computing (ASC) program of the National Nuclear Security Administration (NNSA) and the Scientific Discovery through Advanced Computing (SciDAC) initiative of the Office of Science of the Department of Energy (DOE). The maturation of computational simulation as a tool of scientific and engineering research is underscored in the November 2004 statement of the Secretary of Energy that, ''high performance computing is the backbone of the nation's science and technologymore » enterprise''. LLNL operates several of the world's most powerful computers--including today's single most powerful--and has undertaken some of the largest and most compute-intensive simulations ever performed. Ultrascale simulation has been identified as one of the highest priorities in DOE's facilities planning for the next two decades. However, computers at architectural extremes are notoriously difficult to use efficiently. Furthermore, each successful terascale simulation only points out the need for much better ways of interacting with the resulting avalanche of data. Advances in scientific computing research have, therefore, never been more vital to LLNL's core missions than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, LLNL must engage researchers at many academic centers of excellence. In Fiscal Year 2004, the Institute for Scientific Computing Research (ISCR) served as one of LLNL's main bridges to the academic community with a program of collaborative subcontracts, visiting faculty, student internships, workshops, and an active seminar series. The ISCR identifies researchers from the academic community for computer science and computational science collaborations with LLNL and hosts them for short- and long-term visits with the aim of encouraging long-term academic research agendas that address LLNL's research priorities. Through such collaborations, ideas and software flow in both directions, and LLNL cultivates its future workforce. The Institute strives to be LLNL's ''eyes and ears'' in the computer and information sciences, keeping the Laboratory aware of and connected to important external advances. It also attempts to be the ''feet and hands'' that carry those advances into the Laboratory and incorporates them into practice. ISCR research participants are integrated into LLNL's Computing and Applied Research (CAR) Department, especially into its Center for Applied Scientific Computing (CASC). In turn, these organizations address computational challenges arising throughout the rest of the Laboratory. Administratively, the ISCR flourishes under LLNL's University Relations Program (URP). Together with the other five institutes of the URP, it navigates a course that allows LLNL to benefit from academic exchanges while preserving national security. While it is difficult to operate an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and worth the continued effort.« less
A characterization of workflow management systems for extreme-scale applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia
We present that the automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compellingmore » case for a new generation of advances in high-performance computing, commonly termed as extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.« less
A characterization of workflow management systems for extreme-scale applications
Ferreira da Silva, Rafael; Filgueira, Rosa; Pietri, Ilia; ...
2017-02-16
We present that the automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compellingmore » case for a new generation of advances in high-performance computing, commonly termed as extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.« less
1974-09-24
Transonic Flows with Imbedded Shock Waves", Boeing Scientific Research Laboratories Document D1-82-1053 (1971); also as invited lecture series for AGARD...Past Thin Lifting Airfoils", Boeing Scientific Research Laboratories Document D180-2298-1, June 1971. 5. Krupp, J. A. and Ia-man, 9. M., "Computation...Aerodynamics and Marine Sciences Laboratory, Boeing Scientific Research Laboratories, June 1971. 7. Krupp, J. A., "Documentation for Program TSONIC", Technical
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keyes, D E; McGraw, J R
2006-02-02
Large-scale scientific computation and all of the disciplines that support and help validate it have been placed at the focus of Lawrence Livermore National Laboratory (LLNL) by the Advanced Simulation and Computing (ASC) program of the National Nuclear Security Administration (NNSA) and the Scientific Discovery through Advanced Computing (SciDAC) initiative of the Office of Science of the Department of Energy (DOE). The maturation of simulation as a fundamental tool of scientific and engineering research is underscored in the President's Information Technology Advisory Committee (PITAC) June 2005 finding that ''computational science has become critical to scientific leadership, economic competitiveness, and nationalmore » security''. LLNL operates several of the world's most powerful computers--including today's single most powerful--and has undertaken some of the largest and most compute-intensive simulations ever performed, most notably the molecular dynamics simulation that sustained more than 100 Teraflop/s and won the 2005 Gordon Bell Prize. Ultrascale simulation has been identified as one of the highest priorities in DOE's facilities planning for the next two decades. However, computers at architectural extremes are notoriously difficult to use in an efficient manner. Furthermore, each successful terascale simulation only points out the need for much better ways of interacting with the resulting avalanche of data. Advances in scientific computing research have, therefore, never been more vital to the core missions of LLNL than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, LLNL must engage researchers at many academic centers of excellence. In FY 2005, the Institute for Scientific Computing Research (ISCR) served as one of LLNL's main bridges to the academic community with a program of collaborative subcontracts, visiting faculty, student internships, workshops, and an active seminar series. The ISCR identifies researchers from the academic community for computer science and computational science collaborations with LLNL and hosts them for both brief and extended visits with the aim of encouraging long-term academic research agendas that address LLNL research priorities. Through these collaborations, ideas and software flow in both directions, and LLNL cultivates its future workforce. The Institute strives to be LLNL's ''eyes and ears'' in the computer and information sciences, keeping the Laboratory aware of and connected to important external advances. It also attempts to be the ''hands and feet'' that carry those advances into the Laboratory and incorporate them into practice. ISCR research participants are integrated into LLNL's Computing Applications and Research (CAR) Department, especially into its Center for Applied Scientific Computing (CASC). In turn, these organizations address computational challenges arising throughout the rest of the Laboratory. Administratively, the ISCR flourishes under LLNL's University Relations Program (URP). Together with the other four institutes of the URP, the ISCR navigates a course that allows LLNL to benefit from academic exchanges while preserving national security. While it is difficult to operate an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and worth the continued effort. The pages of this annual report summarize the activities of the faculty members, postdoctoral researchers, students, and guests from industry and other laboratories who participated in LLNL's computational mission under the auspices of the ISCR during FY 2005.« less
Environmentalists and the Computer.
ERIC Educational Resources Information Center
Baron, Robert C.
1982-01-01
Review characteristics, applications, and limitations of computers, including word processing, data/record keeping, scientific and industrial, and educational applications. Discusses misuse of computers and role of computers in environmental management. (JN)
An empirical analysis of journal policy effectiveness for computational reproducibility.
Stodden, Victoria; Seiler, Jennifer; Ma, Zhaokun
2018-03-13
A key component of scientific communication is sufficient information for other researchers in the field to reproduce published findings. For computational and data-enabled research, this has often been interpreted to mean making available the raw data from which results were generated, the computer code that generated the findings, and any additional information needed such as workflows and input parameters. Many journals are revising author guidelines to include data and code availability. This work evaluates the effectiveness of journal policy that requires the data and code necessary for reproducibility be made available postpublication by the authors upon request. We assess the effectiveness of such a policy by ( i ) requesting data and code from authors and ( ii ) attempting replication of the published findings. We chose a random sample of 204 scientific papers published in the journal Science after the implementation of their policy in February 2011. We found that we were able to obtain artifacts from 44% of our sample and were able to reproduce the findings for 26%. We find this policy-author remission of data and code postpublication upon request-an improvement over no policy, but currently insufficient for reproducibility.
An empirical analysis of journal policy effectiveness for computational reproducibility
Seiler, Jennifer; Ma, Zhaokun
2018-01-01
A key component of scientific communication is sufficient information for other researchers in the field to reproduce published findings. For computational and data-enabled research, this has often been interpreted to mean making available the raw data from which results were generated, the computer code that generated the findings, and any additional information needed such as workflows and input parameters. Many journals are revising author guidelines to include data and code availability. This work evaluates the effectiveness of journal policy that requires the data and code necessary for reproducibility be made available postpublication by the authors upon request. We assess the effectiveness of such a policy by (i) requesting data and code from authors and (ii) attempting replication of the published findings. We chose a random sample of 204 scientific papers published in the journal Science after the implementation of their policy in February 2011. We found that we were able to obtain artifacts from 44% of our sample and were able to reproduce the findings for 26%. We find this policy—author remission of data and code postpublication upon request—an improvement over no policy, but currently insufficient for reproducibility. PMID:29531050
Eppig, Janan T
2017-07-01
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. © The Author 2017. Published by Oxford University Press.
Eppig, Janan T.
2017-01-01
Abstract The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. PMID:28838066
Numerical methods for engine-airframe integration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murthy, S.N.B.; Paynter, G.C.
1986-01-01
Various papers on numerical methods for engine-airframe integration are presented. The individual topics considered include: scientific computing environment for the 1980s, overview of prediction of complex turbulent flows, numerical solutions of the compressible Navier-Stokes equations, elements of computational engine/airframe integrations, computational requirements for efficient engine installation, application of CAE and CFD techniques to complete tactical missile design, CFD applications to engine/airframe integration, and application of a second-generation low-order panel methods to powerplant installation studies. Also addressed are: three-dimensional flow analysis of turboprop inlet and nacelle configurations, application of computational methods to the design of large turbofan engine nacelles, comparison ofmore » full potential and Euler solution algorithms for aeropropulsive flow field computations, subsonic/transonic, supersonic nozzle flows and nozzle integration, subsonic/transonic prediction capabilities for nozzle/afterbody configurations, three-dimensional viscous design methodology of supersonic inlet systems for advanced technology aircraft, and a user's technology assessment.« less
ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus
Karp, Peter D.; Berger, Bonnie; Kovats, Diane; Lengauer, Thomas; Linial, Michal; Sabeti, Pardis; Hide, Winston; Rost, Burkhard
2015-01-01
Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains as well as 3-D protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature, and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2,000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology (ISMB) 2016, Orlando, Florida). PMID:26097686
ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus.
Karp, Peter D; Berger, Bonnie; Kovats, Diane; Lengauer, Thomas; Linial, Michal; Sabeti, Pardis; Hide, Winston; Rost, Burkhard
2015-01-01
Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains as well as 3-D protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature, and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2,000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology (ISMB) 2016, Orlando, Florida).
Massive Data, the Digitization of Science, and Reproducibility of Results
Stodden, Victoria
2018-04-27
As the scientific enterprise becomes increasingly computational and data-driven, the nature of the information communicated must change. Without inclusion of the code and data with published computational results, we are engendering a credibility crisis in science. Controversies such as ClimateGate, the microarray-based drug sensitivity clinical trials under investigation at Duke University, and retractions from prominent journals due to unverified code suggest the need for greater transparency in our computational science. In this talk I argue that the scientific method be restored to (1) a focus on error control as central to scientific communication and (2) complete communication of the underlying methodology producing the results, ie. reproducibility. I outline barriers to these goals based on recent survey work (Stodden 2010), and suggest solutions such as the âReproducible Research Standardâ (Stodden 2009), giving open licensing options designed to create an intellectual property framework for scientists consonant with longstanding scientific norms.
NASA Astrophysics Data System (ADS)
Khalili, N.; Valliappan, S.; Li, Q.; Russell, A.
2010-07-01
The use for mathematical models of natural phenomena has underpinned science and engineering for centuries, but until the advent of modern computers and computational methods, the full utility of most of these models remained outside the reach of the engineering communities. Since World War II, advances in computational methods have transformed the way engineering and science is undertaken throughout the world. Today, theories of mechanics of solids and fluids, electromagnetism, heat transfer, plasma physics, and other scientific disciplines are implemented through computational methods in engineering analysis, design, manufacturing, and in studying broad classes of physical phenomena. The discipline concerned with the application of computational methods is now a key area of research, education, and application throughout the world. In the early 1980's, the International Association for Computational Mechanics (IACM) was founded to promote activities related to computational mechanics and has made impressive progress. The most important scientific event of IACM is the World Congress on Computational Mechanics. The first was held in Austin (USA) in 1986 and then in Stuttgart (Germany) in 1990, Chiba (Japan) in 1994, Buenos Aires (Argentina) in 1998, Vienna (Austria) in 2002, Beijing (China) in 2004, Los Angeles (USA) in 2006 and Venice, Italy; in 2008. The 9th World Congress on Computational Mechanics is held in conjunction with the 4th Asian Pacific Congress on Computational Mechanics under the auspices of Australian Association for Computational Mechanics (AACM), Asian Pacific Association for Computational Mechanics (APACM) and International Association for Computational Mechanics (IACM). The 1st Asian Pacific Congress was in Sydney (Australia) in 2001, then in Beijing (China) in 2004 and Kyoto (Japan) in 2007. The WCCM/APCOM 2010 publications consist of a printed book of abstracts given to delegates, along with 247 full length peer reviewed papers published with free access online in IOP Conference Series: Materials Science and Engineering. The editors acknowledge the help of the paper reviewers in maintaining a high standard of assessment and the co-operation of the authors in complying with the requirements of the editors and the reviewers. We also would like to take this opportunity to thank the members of the Local Organising Committee and the International Scientific Committee for helping make WCCM/APCOM 2010 a successful event. We also thank The University of New South Wales, The University of Newcastle, the Centre for Infrastructure Engineering and Safety (CIES), IACM, APCAM, AACM for their financial support, along with the United States Association for Computational Mechanics for the Travel Awards made available. N. Khalili S. Valliappan Q. Li A. Russell 19 July 2010 Sydney, Australia
NASA Technical Reports Server (NTRS)
Rilee, Michael Lee; Kuo, Kwo-Sen
2017-01-01
The SpatioTemporal Adaptive Resolution Encoding (STARE) is a unifying scheme encoding geospatial and temporal information for organizing data on scalable computing/storage resources, minimizing expensive data transfers. STARE provides a compact representation that turns set-logic functions into integer operations, e.g. conditional sub-setting, taking into account representative spatiotemporal resolutions of the data in the datasets. STARE geo-spatiotemporally aligns data placements of diverse data on massive parallel resources to maximize performance. Automating important scientific functions (e.g. regridding) and computational functions (e.g. data placement) allows scientists to focus on domain-specific questions instead of expending their efforts and expertise on data processing. With STARE-enabled automation, SciDB (Scientific Database) plus STARE provides a database interface, reducing costly data preparation, increasing the volume and variety of interoperable data, and easing result sharing. Using SciDB plus STARE as part of an integrated analysis infrastructure dramatically eases combining diametrically different datasets.
Scaffolding Argumentation about Water Quality: A Mixed-Method Study in a Rural Middle School
ERIC Educational Resources Information Center
Belland, Brian R.; Gu, Jiangyue; Armbrust, Sara; Cook, Brant
2015-01-01
A common way for students to develop scientific argumentation abilities is through argumentation about socioscientific issues, defined as scientific problems with social, ethical, and moral aspects. Computer-based scaffolding can support students in this process. In this mixed method study, we examined the use and impact of computer based…
48 CFR 9904.410-60 - Illustrations.
Code of Federal Regulations, 2012 CFR
2012-10-01
... budgets for the other segment should be removed from B's G&A expense pool and transferred to the other...; all home office expenses allocated to Segment H are included in Segment H's G&A expense pool. (2) This... cost of scientific computer operations in its G&A expense pool. The scientific computer is used...
48 CFR 9904.410-60 - Illustrations.
Code of Federal Regulations, 2014 CFR
2014-10-01
... budgets for the other segment should be removed from B's G&A expense pool and transferred to the other...; all home office expenses allocated to Segment H are included in Segment H's G&A expense pool. (2) This... cost of scientific computer operations in its G&A expense pool. The scientific computer is used...
America COMPETES Act and the FY2010 Budget
2009-06-29
Outstanding Junior Investigator, Fusion Energy Sciences Plasma Physics Junior Faculty Development; Advanced Scientific Computing Research Early Career...the Fusion Energy Sciences Graduate Fellowships.2 If members of Congress agree with this contention, these America COMPETES Act programs were...Physics Outstanding Junior Investigator, Fusion Energy Sciences Plasma Physics Junior Faculty Development; Advanced Scientific Computing Research Early
Exascale computing and big data
Reed, Daniel A.; Dongarra, Jack
2015-06-25
Scientific discovery and engineering innovation requires unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates themore » openness of the scientific process.« less
Exascale computing and big data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reed, Daniel A.; Dongarra, Jack
Scientific discovery and engineering innovation requires unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates themore » openness of the scientific process.« less
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience
Stockton, David B.; Santamaria, Fidel
2015-01-01
We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project. PMID:26528175
NeuroManager: a workflow analysis based simulation management engine for computational neuroscience.
Stockton, David B; Santamaria, Fidel
2015-01-01
We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach (1) provides flexibility to adapt to a variety of neuroscience simulators, (2) simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and (3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in 22 stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project.
People’s Republic of China Scientific Abstracts, Number 195.
1978-08-01
regarding China. 17. Key Word« and Document Analysis. 17a. Descriptors China x Agricultural Science and Technology x Bio-Medical Sciences x...Chemistry Cybernetics, Computers, and Automation Technology x Earth Sciences 17b. ldentifiers/Open-Eoded Terms X X Engineering and Equipment...safety in astronautics, especially in reentry; 3. Producible with current technology ; *t. Provides an increased range of astronautical acti
Performance of the engineering analysis and data system 2 common file system
NASA Technical Reports Server (NTRS)
Debrunner, Linda S.
1993-01-01
The Engineering Analysis and Data System (EADS) was used from April 1986 to July 1993 to support large scale scientific and engineering computation (e.g. computational fluid dynamics) at Marshall Space Flight Center. The need for an updated system resulted in a RFP in June 1991, after which a contract was awarded to Cray Grumman. EADS II was installed in February 1993, and by July 1993 most users were migrated. EADS II is a network of heterogeneous computer systems supporting scientific and engineering applications. The Common File System (CFS) is a key component of this system. The CFS provides a seamless, integrated environment to the users of EADS II including both disk and tape storage. UniTree software is used to implement this hierarchical storage management system. The performance of the CFS suffered during the early months of the production system. Several of the performance problems were traced to software bugs which have been corrected. Other problems were associated with hardware. However, the use of NFS in UniTree UCFM software limits the performance of the system. The performance issues related to the CFS have led to a need to develop a greater understanding of the CFS organization. This paper will first describe the EADS II with emphasis on the CFS. Then, a discussion of mass storage systems will be presented, and methods of measuring the performance of the Common File System will be outlined. Finally, areas for further study will be identified and conclusions will be drawn.
Integrated instrumentation & computation environment for GRACE
NASA Astrophysics Data System (ADS)
Dhekne, P. S.
2002-03-01
The project GRACE (Gamma Ray Astrophysics with Coordinated Experiments) aims at setting up a state of the art Gamma Ray Observatory at Mt. Abu, Rajasthan for undertaking comprehensive scientific exploration over a wide spectral window (10's keV - 100's TeV) from a single location through 4 coordinated experiments. The cumulative data collection rate of all the telescopes is expected to be about 1 GB/hr, necessitating innovations in the data management environment. As real-time data acquisition and control as well as off-line data processing, analysis and visualization environment of these systems is based on the us cutting edge and affordable technologies in the field of computers, communications and Internet. We propose to provide a single, unified environment by seamless integration of instrumentation and computations by taking advantage of the recent advancements in Web based technologies. This new environment will allow researchers better acces to facilities, improve resource utilization and enhance collaborations by having identical environments for online as well as offline usage of this facility from any location. We present here a proposed implementation strategy for a platform independent web-based system that supplements automated functions with video-guided interactive and collaborative remote viewing, remote control through virtual instrumentation console, remote acquisition of telescope data, data analysis, data visualization and active imaging system. This end-to-end web-based solution will enhance collaboration among researchers at the national and international level for undertaking scientific studies, using the telescope systems of the GRACE project.
Reproducible research in palaeomagnetism
NASA Astrophysics Data System (ADS)
Lurcock, Pontus; Florindo, Fabio
2015-04-01
The reproducibility of research findings is attracting increasing attention across all scientific disciplines. In palaeomagnetism as elsewhere, computer-based analysis techniques are becoming more commonplace, complex, and diverse. Analyses can often be difficult to reproduce from scratch, both for the original researchers and for others seeking to build on the work. We present a palaeomagnetic plotting and analysis program designed to make reproducibility easier. Part of the problem is the divide between interactive and scripted (batch) analysis programs. An interactive desktop program with a graphical interface is a powerful tool for exploring data and iteratively refining analyses, but usually cannot operate without human interaction. This makes it impossible to re-run an analysis automatically, or to integrate it into a larger automated scientific workflow - for example, a script to generate figures and tables for a paper. In some cases the parameters of the analysis process itself are not saved explicitly, making it hard to repeat or improve the analysis even with human interaction. Conversely, non-interactive batch tools can be controlled by pre-written scripts and configuration files, allowing an analysis to be 'replayed' automatically from the raw data. However, this advantage comes at the expense of exploratory capability: iteratively improving an analysis entails a time-consuming cycle of editing scripts, running them, and viewing the output. Batch tools also tend to require more computer expertise from their users. PuffinPlot is a palaeomagnetic plotting and analysis program which aims to bridge this gap. First released in 2012, it offers both an interactive, user-friendly desktop interface and a batch scripting interface, both making use of the same core library of palaeomagnetic functions. We present new improvements to the program that help to integrate the interactive and batch approaches, allowing an analysis to be interactively explored and refined, then saved as a self-contained configuration which can be re-run without human interaction. PuffinPlot can thus be used as a component of a larger scientific workflow, integrated with workflow management tools such as Kepler, without compromising its capabilities as an exploratory tool. Since both PuffinPlot and the platform it runs on (Java) are Free/Open Source software, even the most fundamental components of an analysis can be verified and reproduced.
Evolution and Natural Selection: Learning by Playing and Reflecting
ERIC Educational Resources Information Center
Herrero, David; del Castillo, Héctor; Monjelat, Natalia; García-Varela, Ana Belén; Checa, Mirian; Gómez, Patricia
2014-01-01
Scientific literacy is more than the simple reproduction of traditional school science knowledge and requires a set of skills, among them identifying scientific issues, explaining phenomena scientifically and using scientific evidence. Several studies have indicated that playing computer games in the classroom can support the development of…
NASA Astrophysics Data System (ADS)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt; Larson, Krista; Sfiligoi, Igor; Rynge, Mats
2014-06-01
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared over the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by "Big Data" will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared overmore » the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by 'Big Data' will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lucas, Robert; Ang, James; Bergman, Keren
2014-02-10
Exascale computing systems are essential for the scientific fields that will transform the 21st century global economy, including energy, biotechnology, nanotechnology, and materials science. Progress in these fields is predicated on the ability to perform advanced scientific and engineering simulations, and analyze the deluge of data. On July 29, 2013, ASCAC was charged by Patricia Dehmer, the Acting Director of the Office of Science, to assemble a subcommittee to provide advice on exascale computing. This subcommittee was directed to return a list of no more than ten technical approaches (hardware and software) that will enable the development of a systemmore » that achieves the Department's goals for exascale computing. Numerous reports over the past few years have documented the technical challenges and the non¬-viability of simply scaling existing computer designs to reach exascale. The technical challenges revolve around energy consumption, memory performance, resilience, extreme concurrency, and big data. Drawing from these reports and more recent experience, this ASCAC subcommittee has identified the top ten computing technology advancements that are critical to making a capable, economically viable, exascale system.« less
NASA Astrophysics Data System (ADS)
Kuan, Wen-Hsuan; Tseng, Chi-Hung; Chen, Sufen; Wong, Ching-Chang
2016-06-01
We propose an integrated curriculum to establish essential abilities of computer programming for the freshmen of a physics department. The implementation of the graphical-based interfaces from Scratch to LabVIEW then to LabVIEW for Arduino in the curriculum `Computer-Assisted Instrumentation in the Design of Physics Laboratories' brings rigorous algorithm and syntax protocols together with imagination, communication, scientific applications and experimental innovation. The effectiveness of the curriculum was evaluated via statistical analysis of questionnaires, interview responses, the increase in student numbers majoring in physics, and performance in a competition. The results provide quantitative support that the curriculum remove huge barriers to programming which occur in text-based environments, helped students gain knowledge of programming and instrumentation, and increased the students' confidence and motivation to learn physics and computer languages.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nikolic, R J
This month's issue has the following articles: (1) Dawn of a New Era of Scientific Discovery - Commentary by Edward I. Moses; (2) At the Frontiers of Fundamental Science Research - Collaborators from national laboratories, universities, and international organizations are using the National Ignition Facility to probe key fundamental science questions; (3) Livermore Responds to Crisis in Post-Earthquake Japan - More than 70 Laboratory scientists provided round-the-clock expertise in radionuclide analysis and atmospheric dispersion modeling as part of the nation's support to Japan following the March 2011 earthquake and nuclear accident; (4) A Comprehensive Resource for Modeling, Simulation, and Experimentsmore » - A new Web-based resource called MIDAS is a central repository for material properties, experimental data, and computer models; and (5) Finding Data Needles in Gigabit Haystacks - Livermore computer scientists have developed a novel computer architecture based on 'persistent' memory to ease data-intensive computations.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strout, Michelle
Programming parallel machines is fraught with difficulties: the obfuscation of algorithms due to implementation details such as communication and synchronization, the need for transparency between language constructs and performance, the difficulty of performing program analysis to enable automatic parallelization techniques, and the existence of important "dusty deck" codes. The SAIMI project developed abstractions that enable the orthogonal specification of algorithms and implementation details within the context of existing DOE applications. The main idea is to enable the injection of small programming models such as expressions involving transcendental functions, polyhedral iteration spaces with sparse constraints, and task graphs into full programsmore » through the use of pragmas. These smaller, more restricted programming models enable orthogonal specification of many implementation details such as how to map the computation on to parallel processors, how to schedule the computation, and how to allocation storage for the computation. At the same time, these small programming models enable the expression of the most computationally intense and communication heavy portions in many scientific simulations. The ability to orthogonally manipulate the implementation for such computations will significantly ease performance programming efforts and expose transformation possibilities and parameter to automated approaches such as autotuning. At Colorado State University, the SAIMI project was supported through DOE grant DE-SC3956 from April 2010 through August 2015. The SAIMI project has contributed a number of important results to programming abstractions that enable the orthogonal specification of implementation details in scientific codes. This final report summarizes the research that was funded by the SAIMI project.« less
Scaling to diversity: The DERECHOS distributed infrastructure for analyzing and sharing data
NASA Astrophysics Data System (ADS)
Rilee, M. L.; Kuo, K. S.; Clune, T.; Oloso, A.; Brown, P. G.
2016-12-01
Integrating Earth Science data from diverse sources such as satellite imagery and simulation output can be expensive and time-consuming, limiting scientific inquiry and the quality of our analyses. Reducing these costs will improve innovation and quality in science. The current Earth Science data infrastructure focuses on downloading data based on requests formed from the search and analysis of associated metadata. And while the data products provided by archives may use the best available data sharing technologies, scientist end-users generally do not have such resources (including staff) available to them. Furthermore, only once an end-user has received the data from multiple diverse sources and has integrated them can the actual analysis and synthesis begin. The cost of getting from idea to where synthesis can start dramatically slows progress. In this presentation we discuss a distributed computational and data storage framework that eliminates much of the aforementioned cost. The SciDB distributed array database is central as it is optimized for scientific computing involving very large arrays, performing better than less specialized frameworks like Spark. Adding spatiotemporal functions to the SciDB creates a powerful platform for analyzing and integrating massive, distributed datasets. SciDB allows Big Earth Data analysis to be performed "in place" without the need for expensive downloads and end-user resources. Spatiotemporal indexing technologies such as the hierarchical triangular mesh enable the compute and storage affinity needed to efficiently perform co-located and conditional analyses minimizing data transfers. These technologies automate the integration of diverse data sources using the framework, a critical step beyond current metadata search and analysis. Instead of downloading data into their idiosyncratic local environments, end-users can generate and share data products integrated from diverse multiple sources using a common shared environment, turning distributed active archive centers (DAACs) from warehouses into distributed active analysis centers.
Korb, Werner; Geißler, Norman; Strauß, Gero
2015-03-01
Engineering a medical technology is a complex process, therefore it is important to include experts from different scientific fields. This is particularly true for the development of surgical technology, where the relevant scientific fields are surgery (medicine) and engineering (electrical engineering, mechanical engineering, computer science, etc.). Furthermore, the scientific field of human factors is important to ensure that a surgical technology is indeed functional, process-oriented, effective, efficient as well as user- and patient-oriented. Working in such trans- and inter-disciplinary teams can be challenging due to different working cultures. The intention of this paper is to propose an innovative cooperative working culture for the interdisciplinary field of computer-assisted surgery (CAS) based on more than ten years of research on the one hand and the interdisciplinary literature on working cultures and various organizational theories on the other hand. In this paper, a retrospective analysis of more than ten years of research work in inter- and trans-disciplinary teams in the field of CAS will be performed. This analysis is based on the documented observations of the authors, the study reports, protocols, lab reports and published publications. To additionally evaluate the scientific experience in an interdisciplinary research team, a literature analysis regarding scientific literature on trans- and inter-disciplinarity was performed. Own research and literature analyses were compared. Both the literature and the scientific experience in an interdisciplinary research team show that consensus finding is not always easy. It is, however, important to start trans- and interdisciplinary projects with a shared mental model and common goals, which include communication and leadership issues within the project teams, i.e. clear and unambiguous information about the individual responsibilities and objectives to attain. This is made necessary due to differing leadership cultures within the cooperating disciplines. Another research outcome is the relevance of a cooperative learning culture throughout the complete duration of the project. Based on this cooperation, new ideas and projects were developed, i.e. a training concept for surgical trainers including technological competence for surgeons. An adapted innovative paradigm for a cooperating working culture in CAS is based on a shared mental model and common goals from the very beginning of a project. All actors in trans- and inter-disciplinary teams need to be interested in cooperation. This will lead to a common view on patients and technology models. Copyright © 2015 Elsevier B.V. All rights reserved.
Idea Paper: The Lifecycle of Software for Scientific Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubey, Anshu; McInnes, Lois C.
The software lifecycle is a well researched topic that has produced many models to meet the needs of different types of software projects. However, one class of projects, software development for scientific computing, has received relatively little attention from lifecycle researchers. In particular, software for end-to-end computations for obtaining scientific results has received few lifecycle proposals and no formalization of a development model. An examination of development approaches employed by the teams implementing large multicomponent codes reveals a great deal of similarity in their strategies. This idea paper formalizes these related approaches into a lifecycle model for end-to-end scientific applicationmore » software, featuring loose coupling between submodels for development of infrastructure and scientific capability. We also invite input from stakeholders to converge on a model that captures the complexity of this development processes and provides needed lifecycle guidance to the scientific software community.« less
NASA Astrophysics Data System (ADS)
Casu, F.; Bonano, M.; de Luca, C.; Lanari, R.; Manunta, M.; Manzo, M.; Zinno, I.
2017-12-01
Since its launch in 2014, the Sentinel-1 (S1) constellation has played a key role on SAR data availability and dissemination all over the World. Indeed, the free and open access data policy adopted by the European Copernicus program together with the global coverage acquisition strategy, make the Sentinel constellation as a game changer in the Earth Observation scenario. Being the SAR data become ubiquitous, the technological and scientific challenge is focused on maximizing the exploitation of such huge data flow. In this direction, the use of innovative processing algorithms and distributed computing infrastructures, such as the Cloud Computing platforms, can play a crucial role. In this work we present a Cloud Computing solution for the advanced interferometric (DInSAR) processing chain based on the Parallel SBAS (P-SBAS) approach, aimed at processing S1 Interferometric Wide Swath (IWS) data for the generation of large spatial scale deformation time series in efficient, automatic and systematic way. Such a DInSAR chain ingests Sentinel 1 SLC images and carries out several processing steps, to finally compute deformation time series and mean deformation velocity maps. Different parallel strategies have been designed ad hoc for each processing step of the P-SBAS S1 chain, encompassing both multi-core and multi-node programming techniques, in order to maximize the computational efficiency achieved within a Cloud Computing environment and cut down the relevant processing times. The presented P-SBAS S1 processing chain has been implemented on the Amazon Web Services platform and a thorough analysis of the attained parallel performances has been performed to identify and overcome the major bottlenecks to the scalability. The presented approach is used to perform national-scale DInSAR analyses over Italy, involving the processing of more than 3000 S1 IWS images acquired from both ascending and descending orbits. Such an experiment confirms the big advantage of exploiting large computational and storage resources of Cloud Computing platforms for large scale DInSAR analysis. The presented Cloud Computing P-SBAS processing chain can be a precious tool in the perspective of developing operational services disposable for the EO scientific community related to hazard monitoring and risk prevention and mitigation.
The ImageJ ecosystem: an open platform for biomedical image analysis
Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368
The ImageJ ecosystem: An open platform for biomedical image analysis.
Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.
Snowflake: A Lightweight Portable Stencil DSL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Nathan; Driscoll, Michael; Markley, Charles
Stencil computations are not well optimized by general-purpose production compilers and the increased use of multicore, manycore, and accelerator-based systems makes the optimization problem even more challenging. In this paper we present Snowflake, a Domain Specific Language (DSL) for stencils that uses a 'micro-compiler' approach, i.e., small, focused, domain-specific code generators. The approach is similar to that used in image processing stencils, but Snowflake handles the much more complex stencils that arise in scientific computing, including complex boundary conditions, higher-order operators (larger stencils), higher dimensions, variable coefficients, non-unit-stride iteration spaces, and multiple input or output meshes. Snowflake is embedded inmore » the Python language, allowing it to interoperate with popular scientific tools like SciPy and iPython; it also takes advantage of built-in Python libraries for powerful dependence analysis as part of a just-in-time compiler. We demonstrate the power of the Snowflake language and the micro-compiler approach with a complex scientific benchmark, HPGMG, that exercises the generality of stencil support in Snowflake. By generating OpenMP comparable to, and OpenCL within a factor of 2x of hand-optimized HPGMG, Snowflake demonstrates that a micro-compiler can support diverse processor architectures and is performance-competitive whilst preserving a high-level Python implementation.« less
Snowflake: A Lightweight Portable Stencil DSL
Zhang, Nathan; Driscoll, Michael; Markley, Charles; ...
2017-05-01
Stencil computations are not well optimized by general-purpose production compilers and the increased use of multicore, manycore, and accelerator-based systems makes the optimization problem even more challenging. In this paper we present Snowflake, a Domain Specific Language (DSL) for stencils that uses a 'micro-compiler' approach, i.e., small, focused, domain-specific code generators. The approach is similar to that used in image processing stencils, but Snowflake handles the much more complex stencils that arise in scientific computing, including complex boundary conditions, higher-order operators (larger stencils), higher dimensions, variable coefficients, non-unit-stride iteration spaces, and multiple input or output meshes. Snowflake is embedded inmore » the Python language, allowing it to interoperate with popular scientific tools like SciPy and iPython; it also takes advantage of built-in Python libraries for powerful dependence analysis as part of a just-in-time compiler. We demonstrate the power of the Snowflake language and the micro-compiler approach with a complex scientific benchmark, HPGMG, that exercises the generality of stencil support in Snowflake. By generating OpenMP comparable to, and OpenCL within a factor of 2x of hand-optimized HPGMG, Snowflake demonstrates that a micro-compiler can support diverse processor architectures and is performance-competitive whilst preserving a high-level Python implementation.« less
Li, Y; Nielsen, P V
2011-12-01
There has been a rapid growth of scientific literature on the application of computational fluid dynamics (CFD) in the research of ventilation and indoor air science. With a 1000-10,000 times increase in computer hardware capability in the past 20 years, CFD has become an integral part of scientific research and engineering development of complex air distribution and ventilation systems in buildings. This review discusses the major and specific challenges of CFD in terms of turbulence modelling, numerical approximation, and boundary conditions relevant to building ventilation. We emphasize the growing need for CFD verification and validation, suggest ongoing needs for analytical and experimental methods to support the numerical solutions, and discuss the growing capacity of CFD in opening up new research areas. We suggest that CFD has not become a replacement for experiment and theoretical analysis in ventilation research, rather it has become an increasingly important partner. We believe that an effective scientific approach for ventilation studies is still to combine experiments, theory, and CFD. We argue that CFD verification and validation are becoming more crucial than ever as more complex ventilation problems are solved. It is anticipated that ventilation problems at the city scale will be tackled by CFD in the next 10 years. © 2011 John Wiley & Sons A/S.
Ng, C M
2013-10-01
The development of a population PK/PD model, an essential component for model-based drug development, is both time- and labor-intensive. A graphical-processing unit (GPU) computing technology has been proposed and used to accelerate many scientific computations. The objective of this study was to develop a hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization (MCPEM) estimation algorithm for population PK data analysis. A hybrid GPU-CPU implementation of the MCPEM algorithm (MCPEMGPU) and identical algorithm that is designed for the single CPU (MCPEMCPU) were developed using MATLAB in a single computer equipped with dual Xeon 6-Core E5690 CPU and a NVIDIA Tesla C2070 GPU parallel computing card that contained 448 stream processors. Two different PK models with rich/sparse sampling design schemes were used to simulate population data in assessing the performance of MCPEMCPU and MCPEMGPU. Results were analyzed by comparing the parameter estimation and model computation times. Speedup factor was used to assess the relative benefit of parallelized MCPEMGPU over MCPEMCPU in shortening model computation time. The MCPEMGPU consistently achieved shorter computation time than the MCPEMCPU and can offer more than 48-fold speedup using a single GPU card. The novel hybrid GPU-CPU implementation of parallelized MCPEM algorithm developed in this study holds a great promise in serving as the core for the next-generation of modeling software for population PK/PD analysis.
Explore the virtual side of earth science
,
1998-01-01
Scientists have always struggled to find an appropriate technology that could represent three-dimensional (3-D) data, facilitate dynamic analysis, and encourage on-the-fly interactivity. In the recent past, scientific visualization has increased the scientist's ability to visualize information, but it has not provided the interactive environment necessary for rapidly changing the model or for viewing the model in ways not predetermined by the visualization specialist. Virtual Reality Modeling Language (VRML 2.0) is a new environment for visualizing 3-D information spaces and is accessible through the Internet with current browser technologies. Researchers from the U.S. Geological Survey (USGS) are using VRML as a scientific visualization tool to help convey complex scientific concepts to various audiences. Kevin W. Laurent, computer scientist, and Maura J. Hogan, technical information specialist, have created a collection of VRML models available through the Internet at Virtual Earth Science (virtual.er.usgs.gov).
Reflections on biomedical informatics: from cybernetics to genomic medicine and nanomedicine.
Maojo, Victor; Kulikowski, Casimir A
2006-01-01
Expanding on our previous analysis of Biomedical Informatics (BMI), the present perspective ranges from cybernetics to nanomedicine, based on its scientific, historical, philosophical, theoretical, experimental, and technological aspects as they affect systems developments, simulation and modelling, education, and the impact on healthcare. We then suggest that BMI is still searching for strong basic scientific principles around which it can crystallize. As -omic biological knowledge increasingly impacts the future of medicine, ubiquitous computing and informatics become even more essential, not only for the technological infrastructure, but as a part of the scientific enterprise itself. The Virtual Physiological Human and investigations into nanomedicine will surely produce yet more unpredictable opportunities, leading to significant changes in biomedical research and practice. As a discipline involved in making such advances possible, BMI is likely to need to re-define itself and extend its research horizons to meet the new challenges.
Can the behavioral sciences self-correct? A social epistemic study.
Romero, Felipe
2016-12-01
Advocates of the self-corrective thesis argue that scientific method will refute false theories and find closer approximations to the truth in the long run. I discuss a contemporary interpretation of this thesis in terms of frequentist statistics in the context of the behavioral sciences. First, I identify experimental replications and systematic aggregation of evidence (meta-analysis) as the self-corrective mechanism. Then, I present a computer simulation study of scientific communities that implement this mechanism to argue that frequentist statistics may converge upon a correct estimate or not depending on the social structure of the community that uses it. Based on this study, I argue that methodological explanations of the "replicability crisis" in psychology are limited and propose an alternative explanation in terms of biases. Finally, I conclude suggesting that scientific self-correction should be understood as an interaction effect between inference methods and social structures. Copyright © 2016 Elsevier Ltd. All rights reserved.
Conceptual model of iCAL4LA: Proposing the components using comparative analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Zulaiha; Mutalib, Ariffin Abdul
2016-08-01
This paper discusses an on-going study that initiates an initial process in determining the common components for a conceptual model of interactive computer-assisted learning that is specifically designed for low achieving children. This group of children needs a specific learning support that can be used as an alternative learning material in their learning environment. In order to develop the conceptual model, this study extracts the common components from 15 strongly justified computer assisted learning studies. A comparative analysis has been conducted to determine the most appropriate components by using a set of specific indication classification to prioritize the applicability. The results of the extraction process reveal 17 common components for consideration. Later, based on scientific justifications, 16 of them were selected as the proposed components for the model.
Xarray: multi-dimensional data analysis in Python
NASA Astrophysics Data System (ADS)
Hoyer, Stephan; Hamman, Joe; Maussion, Fabien
2017-04-01
xarray (http://xarray.pydata.org) is an open source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays, which are the bread and butter of modern geoscientific data analysis. Key features of the package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib, Cartopy), out-of-core computation on datasets that don't fit into memory, a wide range of input/output options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. In this contribution we will present the key features of the library and demonstrate its great potential for a wide range of applications, from (big-)data processing on super computers to data exploration in front of a classroom.
Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis.
Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E
2018-04-15
Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele. darrell.hurt@nih.gov.
A Python-based interface to examine motions in time series of solar images
NASA Astrophysics Data System (ADS)
Campos-Rozo, J. I.; Vargas Domínguez, S.
2017-10-01
Python is considered to be a mature programming language, besides of being widely accepted as an engaging option for scientific analysis in multiple areas, as will be presented in this work for the particular case of solar physics research. SunPy is an open-source library based on Python that has been recently developed to furnish software tools to solar data analysis and visualization. In this work we present a graphical user interface (GUI) based on Python and Qt to effectively compute proper motions for the analysis of time series of solar data. This user-friendly computing interface, that is intended to be incorporated to the Sunpy library, uses a local correlation tracking technique and some extra tools that allows the selection of different parameters to calculate, vizualize and analyze vector velocity fields of solar data, i.e. time series of solar filtergrams and magnetograms.
Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis
Weber, Nick; Liou, David; Dommer, Jennifer; MacMenamin, Philip; Quiñones, Mariam; Misner, Ian; Oler, Andrew J; Wan, Joe; Kim, Lewis; Coakley McCarthy, Meghan; Ezeji, Samuel; Noble, Karlynn; Hurt, Darrell E
2018-01-01
Abstract Motivation Widespread interest in the study of the microbiome has resulted in data proliferation and the development of powerful computational tools. However, many scientific researchers lack the time, training, or infrastructure to work with large datasets or to install and use command line tools. Results The National Institute of Allergy and Infectious Diseases (NIAID) has created Nephele, a cloud-based microbiome data analysis platform with standardized pipelines and a simple web interface for transforming raw data into biological insights. Nephele integrates common microbiome analysis tools as well as valuable reference datasets like the healthy human subjects cohort of the Human Microbiome Project (HMP). Nephele is built on the Amazon Web Services cloud, which provides centralized and automated storage and compute capacity, thereby reducing the burden on researchers and their institutions. Availability and implementation https://nephele.niaid.nih.gov and https://github.com/niaid/Nephele Contact darrell.hurt@nih.gov PMID:29028892
The markup is the model: reasoning about systems biology models in the Semantic Web era.
Kell, Douglas B; Mendes, Pedro
2008-06-07
Metabolic control analysis, co-invented by Reinhart Heinrich, is a formalism for the analysis of biochemical networks, and is a highly important intellectual forerunner of modern systems biology. Exchanging ideas and exchanging models are part of the international activities of science and scientists, and the Systems Biology Markup Language (SBML) allows one to perform the latter with great facility. Encoding such models in SBML allows their distributed analysis using loosely coupled workflows, and with the advent of the Internet the various software modules that one might use to analyze biochemical models can reside on entirely different computers and even on different continents. Optimization is at the core of many scientific and biotechnological activities, and Reinhart made many major contributions in this area, stimulating our own activities in the use of the methods of evolutionary computing for optimization.
Introductory Tools for Radiative Transfer Models
NASA Astrophysics Data System (ADS)
Feldman, D.; Kuai, L.; Natraj, V.; Yung, Y.
2006-12-01
Satellite data are currently so voluminous that, despite their unprecedented quality and potential for scientific application, only a small fraction is analyzed due to two factors: researchers' computational constraints and a relatively small number of researchers actively utilizing the data. Ultimately it is hoped that the terabytes of unanalyzed data being archived can receive scientific scrutiny but this will require a popularization of the methods associated with the analysis. Since a large portion of complexity is associated with the proper implementation of the radiative transfer model, it is reasonable and appropriate to make the model as accessible as possible to general audiences. Unfortunately, the algorithmic and conceptual details that are necessary for state-of-the-art analysis also tend to frustrate the accessibility for those new to remote sensing. Several efforts have been made to have web- based radiative transfer calculations, and these are useful for limited calculations, but analysis of more than a few spectra requires the utilization of home- or server-based computing resources. We present a system that is designed to allow for easier access to radiative transfer models with implementation on a home computing platform in the hopes that this system can be utilized in and expanded upon in advanced high school and introductory college settings. This learning-by-doing process is aided through the use of several powerful tools. The first is a wikipedia-style introduction to the salient features of radiative transfer that references the seminal works in the field and refers to more complicated calculations and algorithms sparingly5. The second feature is a technical forum, commonly referred to as a tiki-wiki, that addresses technical and conceptual questions through public postings, private messages, and a ranked searching routine. Together, these tools may be able to facilitate greater interest in the field of remote sensing.
NASA Astrophysics Data System (ADS)
Tiede, Dirk; Lang, Stefan
2010-11-01
In this paper we focus on the application of transferable, object-based image analysis algorithms for dwelling extraction in a camp for internally displaced people (IDP) in Darfur, Sudan along with innovative means for scientific visualisation of the results. Three very high spatial resolution satellite images (QuickBird: 2002, 2004, 2008) were used for: (1) extracting different types of dwellings and (2) calculating and visualizing added-value products such as dwelling density and camp structure. The results were visualized on virtual globes (Google Earth and ArcGIS Explorer) revealing the analysis results (analytical 3D views,) transformed into the third dimension (z-value). Data formats depend on virtual globe software including KML/KMZ (keyhole mark-up language) and ESRI 3D shapefiles streamed as ArcGIS Server-based globe service. In addition, means for improving overall performance of automated dwelling structures using grid computing techniques are discussed using examples from a similar study.
Knepper, Richard; Börner, Katy
2016-01-01
This paper presents the results of a study that compares resource usage with publication output using data about the consumption of CPU cycles from the Extreme Science and Engineering Discovery Environment (XSEDE) and resulting scientific publications for 2,691 institutions/teams. Specifically, the datasets comprise a total of 5,374,032,696 central processing unit (CPU) hours run in XSEDE during July 1, 2011 to August 18, 2015 and 2,882 publications that cite the XSEDE resource. Three types of studies were conducted: a geospatial analysis of XSEDE providers and consumers, co-authorship network analysis of XSEDE publications, and bi-modal network analysis of how XSEDE resources are used by different research fields. Resulting visualizations show that a diverse set of consumers make use of XSEDE resources, that users of XSEDE publish together frequently, and that the users of XSEDE with the highest resource usage tend to be "traditional" high-performance computing (HPC) community members from astronomy, atmospheric science, physics, chemistry, and biology.
Center for Technology for Advanced Scientific Componet Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Govindaraju, Madhusudhan
Advanced Scientific Computing Research Computer Science FY 2010Report Center for Technology for Advanced Scientific Component Software: Distributed CCA State University of New York, Binghamton, NY, 13902 Summary The overall objective of Binghamton's involvement is to work on enhancements of the CCA environment, motivated by the applications and research initiatives discussed in the proposal. This year we are working on re-focusing our design and development efforts to develop proof-of-concept implementations that have the potential to significantly impact scientific components. We worked on developing parallel implementations for non-hydrostatic code and worked on a model coupling interface for biogeochemical computations coded in MATLAB.more » We also worked on the design and implementation modules that will be required for the emerging MapReduce model to be effective for scientific applications. Finally, we focused on optimizing the processing of scientific datasets on multi-core processors. Research Details We worked on the following research projects that we are working on applying to CCA-based scientific applications. 1. Non-Hydrostatic Hydrodynamics: Non-static hydrodynamics are significantly more accurate at modeling internal waves that may be important in lake ecosystems. Non-hydrostatic codes, however, are significantly more computationally expensive, often prohibitively so. We have worked with Chin Wu at the University of Wisconsin to parallelize non-hydrostatic code. We have obtained a speed up of about 26 times maximum. Although this is significant progress, we hope to improve the performance further, such that it becomes a practical alternative to hydrostatic codes. 2. Model-coupling for water-based ecosystems: To answer pressing questions about water resources requires that physical models (hydrodynamics) be coupled with biological and chemical models. Most hydrodynamics codes are written in Fortran, however, while most ecologists work in MATLAB. This disconnect creates a great barrier. To address this, we are working on a model coupling interface that will allow biogeochemical computations written in MATLAB to couple with Fortran codes. This will greatly improve the productivity of ecosystem scientists. 2. Low overhead and Elastic MapReduce Implementation Optimized for Memory and CPU-Intensive Applications: Since its inception, MapReduce has frequently been associated with Hadoop and large-scale datasets. Its deployment at Amazon in the cloud, and its applications at Yahoo! for large-scale distributed document indexing and database building, among other tasks, have thrust MapReduce to the forefront of the data processing application domain. The applicability of the paradigm however extends far beyond its use with data intensive applications and diskbased systems, and can also be brought to bear in processing small but CPU intensive distributed applications. MapReduce however carries its own burdens. Through experiments using Hadoop in the context of diverse applications, we uncovered latencies and delay conditions potentially inhibiting the expected performance of a parallel execution in CPU-intensive applications. Furthermore, as it currently stands, MapReduce is favored for data-centric applications, and as such tends to be solely applied to disk-based applications. The paradigm, falls short in bringing its novelty to diskless systems dedicated to in-memory applications, and compute intensive programs processing much smaller data, but requiring intensive computations. In this project, we focused both on the performance of processing large-scale hierarchical data in distributed scientific applications, as well as the processing of smaller but demanding input sizes primarily used in diskless, and memory resident I/O systems. We designed LEMO-MR [1], a Low overhead, elastic, configurable for in- memory applications, and on-demand fault tolerance, an optimized implementation of MapReduce, for both on disk and in memory applications. We conducted experiments to identify not only the necessary components of this model, but also trade offs and factors to be considered. We have initial results to show the efficacy of our implementation in terms of potential speedup that can be achieved for representative data sets used by cloud applications. We have quantified the performance gains exhibited by our MapReduce implementation over Apache Hadoop in a compute intensive environment. 3. Cache Performance Optimization for Processing XML and HDF-based Application Data on Multi-core Processors: It is important to design and develop scientific middleware libraries to harness the opportunities presented by emerging multi-core processors. Implementations of scientific middleware and applications that do not adapt to the programming paradigm when executing on emerging processors can severely impact the overall performance. In this project, we focused on the utilization of the L2 cache, which is a critical shared resource on chip multiprocessors (CMP). The access pattern of the shared L2 cache, which is dependent on how the application schedules and assigns processing work to each thread, can either enhance or hurt the ability to hide memory latency on a multi-core processor. Therefore, while processing scientific datasets such as HDF5, it is essential to conduct fine-grained analysis of cache utilization, to inform scheduling decisions in multi-threaded programming. In this project, using the TAU toolkit for performance feedback from dual- and quad-core machines, we conducted performance analysis and recommendations on how processing threads can be scheduled on multi-core nodes to enhance the performance of a class of scientific applications that requires processing of HDF5 data. In particular, we quantified the gains associated with the use of the adaptations we have made to the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall application execution time [2]. References: 1. Zacharia Fadika, Madhusudhan Govindaraju, ``MapReduce Implementation for Memory-Based and Processing Intensive Applications'', accepted in 2nd IEEE International Conference on Cloud Computing Technology and Science, Indianapolis, USA, Nov 30 - Dec 3, 2010. 2. Rajdeep Bhowmik, Madhusudhan Govindaraju, ``Cache Performance Optimization for Processing XML-based Application Data on Multi-core Processors'', in proceedings of The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 17-20, 2010, Melbourne, Victoria, Australia. Contact Information: Madhusudhan Govindaraju Binghamton University State University of New York (SUNY) mgovinda@cs.binghamton.edu Phone: 607-777-4904« less
A feasibility study on porting the community land model onto accelerators using OpenACC
Wang, Dali; Wu, Wei; Winkler, Frank; ...
2014-01-01
As environmental models (such as Accelerated Climate Model for Energy (ACME), Parallel Reactive Flow and Transport Model (PFLOTRAN), Arctic Terrestrial Simulator (ATS), etc.) became more and more complicated, we are facing enormous challenges regarding to porting those applications onto hybrid computing architecture. OpenACC appears as a very promising technology, therefore, we have conducted a feasibility analysis on porting the Community Land Model (CLM), a terrestrial ecosystem model within the Community Earth System Models (CESM)). Specifically, we used automatic function testing platform to extract a small computing kernel out of CLM, then we apply this kernel into the actually CLM dataflowmore » procedure, and investigate the strategy of data parallelization and the benefit of data movement provided by current implementation of OpenACC. Even it is a non-intensive kernel, on a single 16-core computing node, the performance (based on the actual computation time using one GPU) of OpenACC implementation is 2.3 time faster than that of OpenMP implementation using single OpenMP thread, but it is 2.8 times slower than the performance of OpenMP implementation using 16 threads. On multiple nodes, MPI_OpenACC implementation demonstrated very good scalability on up to 128 GPUs on 128 computing nodes. This study also provides useful information for us to look into the potential benefits of “deep copy” capability and “routine” feature of OpenACC standards. In conclusion, we believe that our experience on the environmental model, CLM, can be beneficial to many other scientific research programs who are interested to porting their large scale scientific code using OpenACC onto high-end computers, empowered by hybrid computing architecture.« less
Introduction to computers: Reference guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ligon, F.V.
1995-04-01
The ``Introduction to Computers`` program establishes formal partnerships with local school districts and community-based organizations, introduces computer literacy to precollege students and their parents, and encourages students to pursue Scientific, Mathematical, Engineering, and Technical careers (SET). Hands-on assignments are given in each class, reinforcing the lesson taught. In addition, the program is designed to broaden the knowledge base of teachers in scientific/technical concepts, and Brookhaven National Laboratory continues to act as a liaison, offering educational outreach to diverse community organizations and groups. This manual contains the teacher`s lesson plans and the student documentation to this introduction to computer course.
ERIC Educational Resources Information Center
Ruzhitskaya, Lanika
2011-01-01
The presented research study investigated the effects of computer-supported inquiry-based learning and peer interaction methods on effectiveness of learning a scientific concept. The stellar parallax concept was selected as a basic, and yet important in astronomy, scientific construct, which is based on a straightforward relationship of several…
ERIC Educational Resources Information Center
Donna, Joel D.; Miller, Brant G.
2013-01-01
Technology plays a crucial role in facilitating collaboration within the scientific community. Cloud-computing applications, such as Google Drive, can be used to model such collaboration and support inquiry within the secondary science classroom. Little is known about pre-service teachers' beliefs related to the envisioned use of collaborative,…
ERIC Educational Resources Information Center
Kunsting, Josef; Wirth, Joachim; Paas, Fred
2011-01-01
Using a computer-based scientific discovery learning environment on buoyancy in fluids we investigated the "effects of goal specificity" (nonspecific goals vs. specific goals) for two goal types (problem solving goals vs. learning goals) on "strategy use" and "instructional efficiency". Our empirical findings close an important research gap,…
1987-10-01
include Security Classification) Instrumentation for scientific computing in neural networks, information science, artificial intelligence, and...instrumentation grant to purchase equipment for support of research in neural networks, information science, artificail intellignece , and applied mathematics...in Neural Networks, Information Science, Artificial Intelligence, and Applied Mathematics Contract AFOSR 86-0282 Principal Investigator: Stephen
The Observation of Bahasa Indonesia Official Computer Terms Implementation in Scientific Publication
NASA Astrophysics Data System (ADS)
Gunawan, D.; Amalia, A.; Lydia, M. S.; Muthaqin, M. I.
2018-03-01
The government of the Republic of Indonesia had issued a regulation to substitute computer terms in foreign language that have been used earlier into official computer terms in Bahasa Indonesia. This regulation was stipulated in Presidential Decree No. 2 of 2001 concerning the introduction of official computer terms in Bahasa Indonesia (known as Senarai Padanan Istilah/SPI). After sixteen years, people of Indonesia, particularly for academics, should have implemented the official computer terms in their official publications. This observation is conducted to discover the implementation of official computer terms usage in scientific publications which are written in Bahasa Indonesia. The data source used in this observation are the publications by the academics, particularly in computer science field. The method used in the observation is divided into four stages. The first stage is metadata harvesting by using Open Archive Initiative - Protocol for Metadata Harvesting (OAI-PMH). Second, converting the harvested document (in pdf format) to plain text. The third stage is text-preprocessing as the preparation of string matching. Then the final stage is searching the official computer terms based on 629 SPI terms by using Boyer-Moore algorithm. We observed that there are 240,781 foreign computer terms in 1,156 scientific publications from six universities. This result shows that the foreign computer terms are still widely used by the academics.
Evaluating the Efficacy of the Cloud for Cluster Computation
NASA Technical Reports Server (NTRS)
Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom
2012-01-01
Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Cloud Compute (EC2) that promises improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 ?s inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29 instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
An infrastructure for the integration of geoscience instruments and sensors on the Grid
NASA Astrophysics Data System (ADS)
Pugliese, R.; Prica, M.; Kourousias, G.; Del Linz, A.; Curri, A.
2009-04-01
The Grid, as a computing paradigm, has long been in the attention of both academia and industry[1]. The distributed and expandable nature of its general architecture result to scalability and more efficient utilisation of the computing infrastructures. The scientific community, including that of geosciences, often handles problems with very high requirements in data processing, transferring, and storing[2,3]. This has raised the interest on Grid technologies but these are often viewed solely as an access gateway to HPC. Suitable Grid infrastructures could provide the geoscience community with additional benefits like those of sharing, remote access and control of scientific systems. These systems can be scientific instruments, sensors, robots, cameras and any other device used in geosciences. The solution for practical, general, and feasible Grid-enabling of such devices requires non-intrusive extensions on core parts of the current Grid architecture. We propose an extended version of an architecture[4] that can serve as the solution to the problem. The solution we propose is called Grid Instrument Element (IE) [5]. It is an addition to the existing core Grid parts; the Computing Element (CE) and the Storage Element (SE) that serve the purposes that their name suggests. The IE that we will be referring to, and the related technologies have been developed in the EU project on the Deployment of Remote Instrumentation Infrastructure (DORII1). In DORII, partners of various scientific communities including those of Earthquake, Environmental science, and Experimental science, have adopted the technology of the Instrument Element in order to integrate to the Grid their devices. The Oceanographic and coastal observation and modelling Mediterranean Ocean Observing Network (OGS2), a DORII partner, is in the process of deploying the above mentioned Grid technologies on two types of observational modules: Argo profiling floats and a novel Autonomous Underwater Vehicle (AUV). In this paper i) we define the need for integration of instrumentation in the Grid, ii) we introduce the solution of the Instrument Element, iii) we demonstrate a suitable end-user web portal for accessing Grid resources, iv) we describe from the Grid-technological point of view the process of the integration to the Grid of two advanced environmental monitoring devices. References [1] M. Surridge, S. Taylor, D. De Roure, and E. Zaluska, "Experiences with GRIA—Industrial Applications on a Web Services Grid," e-Science and Grid Computing, First International Conference on e-Science and Grid Computing, 2005, pp. 98-105. [2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets," Journal of Network and Computer Applications, vol. 23, 2000, pp. 187-200. [3] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, "Data management and transfer in high-performance computational grid environments," Parallel Computing, vol. 28, 2002, pp. 749-771. [4] E. Frizziero, M. Gulmini, F. Lelli, G. Maron, A. Oh, S. Orlando, A. Petrucci, S. Squizzato, and S. Traldi, "Instrument Element: A New Grid component that Enables the Control of Remote Instrumentation," Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)-Volume 00, IEEE Computer Society Washington, DC, USA, 2006. [5] R. Ranon, L. De Marco, A. Senerchia, S. Gabrielli, L. Chittaro, R. Pugliese, L. Del Cano, F. Asnicar, and M. Prica, "A Web-based Tool for Collaborative Access to Scientific Instruments in Cyberinfrastructures." 1 The DORII project is supported by the European Commission within the 7th Framework Programme (FP7/2007-2013) under grant agreement no. RI-213110. URL: http://www.dorii.eu 2 Istituto Nazionale di Oceanografia e di Geofisica Sperimentale. URL: http://www.ogs.trieste.it
Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi
NASA Astrophysics Data System (ADS)
Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad
2015-05-01
Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanders, W.M.; Campbell, C.L.; Lester, J.V.
1979-09-01
The Los Alamos Scientific Laboratory is funded by the US Department of Agriculture to apply scientific and computer technology to solve agricultural problems. This report summarizes work during the period October 1, 1977, through September 30, 1978, on the application of computer technology to three areas: (1) surveillance of slaughterplants in Texas; (2) a pilot study of the New Mexico Brucellosis Eradication Program; and (3) the Market Cattle Identification program in Texas.
1977-05-10
apply this method of forecast- ing in the solution of all major scientific-technical problems of the na- tional economy. Citing the slow...the future, however, computers will "mature" and learn to recognize patterns in what amounts to a much more complex language—the language of visual...images. Photoelectronic tracking devices or "eyes" will allow the computer to take in information in a much more complex form and to perform opera
Comments on the Development of Computational Mathematics in Czechoslovakia and in the USSR.
1987-03-01
ACT (COusduMe an reverse .eld NE 4040604W SWi 1410011 6F 660" ambe The talk is an Invited lecture at Ale Conference on the History of Scientific and...Numeric Computations, May 13-15, 1987, Princeton, New Jersey. It present soon basic subjective observations about the history of numerical methods in...invited lecture at ACH Conference on the History of Scientific and Numeric Computations, May 13’-15, 1987, Princeton, New Jersey. It present some basic
DOE Office of Scientific and Technical Information (OSTI.GOV)
Drugan, C.
2009-12-07
The word 'breakthrough' aptly describes the transformational science and milestones achieved at the Argonne Leadership Computing Facility (ALCF) throughout 2008. The number of research endeavors undertaken at the ALCF through the U.S. Department of Energy's (DOE) Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program grew from 9 in 2007 to 20 in 2008. The allocation of computer time awarded to researchers on the Blue Gene/P also spiked significantly - from nearly 10 million processor hours in 2007 to 111 million in 2008. To support this research, we expanded the capabilities of Intrepid, an IBM Blue Gene/P systemmore » at the ALCF, to 557 teraflops (TF) for production use. Furthermore, we enabled breakthrough levels of productivity and capability in visualization and data analysis with Eureka, a powerful installation of NVIDIA Quadro Plex S4 external graphics processing units. Eureka delivered a quantum leap in visual compute density, providing more than 111 TF and more than 3.2 terabytes of RAM. On April 21, 2008, the dedication of the ALCF realized DOE's vision to bring the power of the Department's high performance computing to open scientific research. In June, the IBM Blue Gene/P supercomputer at the ALCF debuted as the world's fastest for open science and third fastest overall. No question that the science benefited from this growth and system improvement. Four research projects spearheaded by Argonne National Laboratory computer scientists and ALCF users were named to the list of top ten scientific accomplishments supported by DOE's Advanced Scientific Computing Research (ASCR) program. Three of the top ten projects used extensive grants of computing time on the ALCF's Blue Gene/P to model the molecular basis of Parkinson's disease, design proteins at atomic scale, and create enzymes. As the year came to a close, the ALCF was recognized with several prestigious awards at SC08 in November. We provided resources for Linear Scaling Divide-and-Conquer Electronic Structure Calculations for Thousand Atom Nanostructures, a collaborative effort between Argonne, Lawrence Berkeley National Laboratory, and Oak Ridge National Laboratory that received the ACM Gordon Bell Prize Special Award for Algorithmic Innovation. The ALCF also was named a winner in two of the four categories in the HPC Challenge best performance benchmark competition.« less
Climate Analytics as a Service. Chapter 11
NASA Technical Reports Server (NTRS)
Schnase, John L.
2016-01-01
Exascale computing, big data, and cloud computing are driving the evolution of large-scale information systems toward a model of data-proximal analysis. In response, we are developing a concept of climate analytics as a service (CAaaS) that represents a convergence of data analytics and archive management. With this approach, high-performance compute-storage implemented as an analytic system is part of a dynamic archive comprising both static and computationally realized objects. It is a system whose capabilities are framed as behaviors over a static data collection, but where queries cause results to be created, not found and retrieved. Those results can be the product of a complex analysis, but, importantly, they also can be tailored responses to the simplest of requests. NASA's MERRA Analytic Service and associated Climate Data Services API provide a real-world example of climate analytics delivered as a service in this way. Our experiences reveal several advantages to this approach, not the least of which is orders-of-magnitude time reduction in the data assembly task common to many scientific workflows.
New project to support scientific collaboration electronically
NASA Astrophysics Data System (ADS)
Clauer, C. R.; Rasmussen, C. E.; Niciejewski, R. J.; Killeen, T. L.; Kelly, J. D.; Zambre, Y.; Rosenberg, T. J.; Stauning, P.; Friis-Christensen, E.; Mende, S. B.; Weymouth, T. E.; Prakash, A.; McDaniel, S. E.; Olson, G. M.; Finholt, T. A.; Atkins, D. E.
A new multidisciplinary effort is linking research in the upper atmospheric and space, computer, and behavioral sciences to develop a prototype electronic environment for conducting team science worldwide. A real-world electronic collaboration testbed has been established to support scientific work centered around the experimental operations being conducted with instruments from the Sondrestrom Upper Atmospheric Research Facility in Kangerlussuaq, Greenland. Such group computing environments will become an important component of the National Information Infrastructure initiative, which is envisioned as the high-performance communications infrastructure to support national scientific research.
Big Data and Dementia: Charting the Route Ahead for Research, Ethics, and Policy
Ienca, Marcello; Vayena, Effy; Blasimme, Alessandro
2018-01-01
Emerging trends in pervasive computing and medical informatics are creating the possibility for large-scale collection, sharing, aggregation and analysis of unprecedented volumes of data, a phenomenon commonly known as big data. In this contribution, we review the existing scientific literature on big data approaches to dementia, as well as commercially available mobile-based applications in this domain. Our analysis suggests that big data approaches to dementia research and care hold promise for improving current preventive and predictive models, casting light on the etiology of the disease, enabling earlier diagnosis, optimizing resource allocation, and delivering more tailored treatments to patients with specific disease trajectories. Such promissory outlook, however, has not materialized yet, and raises a number of technical, scientific, ethical, and regulatory challenges. This paper provides an assessment of these challenges and charts the route ahead for research, ethics, and policy. PMID:29468161
Integrating advanced visualization technology into the planetary Geoscience workflow
NASA Astrophysics Data System (ADS)
Huffman, John; Forsberg, Andrew; Loomis, Andrew; Head, James; Dickson, James; Fassett, Caleb
2011-09-01
Recent advances in computer visualization have allowed us to develop new tools for analyzing the data gathered during planetary missions, which is important, since these data sets have grown exponentially in recent years to tens of terabytes in size. As part of the Advanced Visualization in Solar System Exploration and Research (ADVISER) project, we utilize several advanced visualization techniques created specifically with planetary image data in mind. The Geoviewer application allows real-time active stereo display of images, which in aggregate have billions of pixels. The ADVISER desktop application platform allows fast three-dimensional visualization of planetary images overlain on digital terrain models. Both applications include tools for easy data ingest and real-time analysis in a programmatic manner. Incorporation of these tools into our everyday scientific workflow has proved important for scientific analysis, discussion, and publication, and enabled effective and exciting educational activities for students from high school through graduate school.
Big Data and Dementia: Charting the Route Ahead for Research, Ethics, and Policy.
Ienca, Marcello; Vayena, Effy; Blasimme, Alessandro
2018-01-01
Emerging trends in pervasive computing and medical informatics are creating the possibility for large-scale collection, sharing, aggregation and analysis of unprecedented volumes of data, a phenomenon commonly known as big data. In this contribution, we review the existing scientific literature on big data approaches to dementia, as well as commercially available mobile-based applications in this domain. Our analysis suggests that big data approaches to dementia research and care hold promise for improving current preventive and predictive models, casting light on the etiology of the disease, enabling earlier diagnosis, optimizing resource allocation, and delivering more tailored treatments to patients with specific disease trajectories. Such promissory outlook, however, has not materialized yet, and raises a number of technical, scientific, ethical, and regulatory challenges. This paper provides an assessment of these challenges and charts the route ahead for research, ethics, and policy.
Privacy-preserving GWAS analysis on federated genomic datasets.
Constable, Scott D; Tang, Yuzhe; Wang, Shuang; Jiang, Xiaoqian; Chapin, Steve
2015-01-01
The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, high quality GWAS usually requires a large amount of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical information confidentiality (as data are being exchanged across institutional boundaries), which becomes an inhibiting factor for the practical use. We present a privacy-preserving GWAS framework on federated genomic datasets. Our method is to layer the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to mutually perform secure GWAS computations, but without exposing their private data outside. We demonstrate our technique by implementing a framework for minor allele frequency counting and χ2 statistics calculation, one of typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF) 1. Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations.
Institute for scientific computing research;fiscal year 1999 annual report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keyes, D
2000-03-28
Large-scale scientific computation, and all of the disciplines that support it and help to validate it, have been placed at the focus of Lawrence Livermore National Laboratory by the Accelerated Strategic Computing Initiative (ASCI). The Laboratory operates the computer with the highest peak performance in the world and has undertaken some of the largest and most compute-intensive simulations ever performed. Computers at the architectural extremes, however, are notoriously difficult to use efficiently. Even such successes as the Laboratory's two Bell Prizes awarded in November 1999 only emphasize the need for much better ways of interacting with the results of large-scalemore » simulations. Advances in scientific computing research have, therefore, never been more vital to the core missions of the Laboratory than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, the Laboratory must engage researchers at many academic centers of excellence. In FY 1999, the Institute for Scientific Computing Research (ISCR) has expanded the Laboratory's bridge to the academic community in the form of collaborative subcontracts, visiting faculty, student internships, a workshop, and a very active seminar series. ISCR research participants are integrated almost seamlessly with the Laboratory's Center for Applied Scientific Computing (CASC), which, in turn, addresses computational challenges arising throughout the Laboratory. Administratively, the ISCR flourishes under the Laboratory's University Relations Program (URP). Together with the other four Institutes of the URP, it must navigate a course that allows the Laboratory to benefit from academic exchanges while preserving national security. Although FY 1999 brought more than its share of challenges to the operation of an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and well worth the continued effort. A change of administration for the ISCR occurred during FY 1999. Acting Director John Fitzgerald retired from LLNL in August after 35 years of service, including the last two at helm of the ISCR. David Keyes, who has been a regular visitor in conjunction with ASCI scalable algorithms research since October 1997, overlapped with John for three months and serves half-time as the new Acting Director.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Y.; Cameron, K.W.
1998-11-24
Workload characterization has been proven an essential tool to architecture design and performance evaluation in both scientific and commercial computing areas. Traditional workload characterization techniques include FLOPS rate, cache miss ratios, CPI (cycles per instruction or IPC, instructions per cycle) etc. With the complexity of sophisticated modern superscalar microprocessors, these traditional characterization techniques are not powerful enough to pinpoint the performance bottleneck of an application on a specific microprocessor. They are also incapable of immediately demonstrating the potential performance benefit of any architectural or functional improvement in a new processor design. To solve these problems, many people rely on simulators,more » which have substantial constraints especially on large-scale scientific computing applications. This paper presents a new technique of characterizing applications at the instruction level using hardware performance counters. It has the advantage of collecting instruction-level characteristics in a few runs virtually without overhead or slowdown. A variety of instruction counts can be utilized to calculate some average abstract workload parameters corresponding to microprocessor pipelines or functional units. Based on the microprocessor architectural constraints and these calculated abstract parameters, the architectural performance bottleneck for a specific application can be estimated. In particular, the analysis results can provide some insight to the problem that only a small percentage of processor peak performance can be achieved even for many very cache-friendly codes. Meanwhile, the bottleneck estimation can provide suggestions about viable architectural/functional improvement for certain workloads. Eventually, these abstract parameters can lead to the creation of an analytical microprocessor pipeline model and memory hierarchy model.« less
Biomedical ontologies: toward scientific debate.
Maojo, V; Crespo, J; García-Remesal, M; de la Iglesia, D; Perez-Rey, D; Kulikowski, C
2011-01-01
Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.
Liu, Zhandong; Zheng, W Jim; Allen, Genevera I; Liu, Yin; Ruan, Jianhua; Zhao, Zhongming
2017-10-03
The 2016 International Conference on Intelligent Biology and Medicine (ICIBM 2016) was held on December 8-10, 2016 in Houston, Texas, USA. ICIBM included eight scientific sessions, four tutorials, one poster session, four highlighted talks and four keynotes that covered topics on 3D genomics structural analysis, next generation sequencing (NGS) analysis, computational drug discovery, medical informatics, cancer genomics, and systems biology. Here, we present a summary of the nine research articles selected from ICIBM 2016 program for publishing in BMC Bioinformatics.
Low Latency Workflow Scheduling and an Application of Hyperspectral Brightness Temperatures
NASA Astrophysics Data System (ADS)
Nguyen, P. T.; Chapman, D. R.; Halem, M.
2012-12-01
New system analytics for Big Data computing holds the promise of major scientific breakthroughs and discoveries from the exploration and mining of the massive data sets becoming available to the science community. However, such data intensive scientific applications face severe challenges in accessing, managing and analyzing petabytes of data. While the Hadoop MapReduce environment has been successfully applied to data intensive problems arising in business, there are still many scientific problem domains where limitations in the functionality of MapReduce systems prevent its wide adoption by those communities. This is mainly because MapReduce does not readily support the unique science discipline needs such as special science data formats, graphic and computational data analysis tools, maintaining high degrees of computational accuracies, and interfacing with application's existing components across heterogeneous computing processors. We address some of these limitations by exploiting the MapReduce programming model for satellite data intensive scientific problems and address scalability, reliability, scheduling, and data management issues when dealing with climate data records and their complex observational challenges. In addition, we will present techniques to support the unique Earth science discipline needs such as dealing with special science data formats (HDF and NetCDF). We have developed a Hadoop task scheduling algorithm that improves latency by 2x for a scientific workflow including the gridding of the EOS AIRS hyperspectral Brightness Temperatures (BT). This workflow processing algorithm has been tested at the Multicore Computing Center private Hadoop based Intel Nehalem cluster, as well as in a virtual mode under the Open Source Eucalyptus cloud. The 55TB AIRS hyperspectral L1b Brightness Temperature record has been gridded at the resolution of 0.5x1.0 degrees, and we have computed a 0.9 annual anti-correlation to the El Nino Southern oscillation in the Nino 4 region, as well as a 1.9 Kelvin decadal Arctic warming in the 4u and 12u spectral regions. Additionally, we will present the frequency of extreme global warming events by the use of a normalized maximum BT in a grid cell relative to its local standard deviation. A low-latency Hadoop scheduling environment maintains data integrity and fault tolerance in a MapReduce data intensive Cloud environment while improving the "time to solution" metric by 35% when compared to a more traditional parallel processing system for the same dataset. Our next step will be to improve the usability of our Hadoop task scheduling system, to enable rapid prototyping of data intensive experiments by means of processing "kernels". We will report on the performance and experience of implementing these experiments on the NEX testbed, and propose the use of a graphical directed acyclic graph (DAG) interface to help us develop on-demand scientific experiments. Our workflow system works within Hadoop infrastructure as a replacement for the FIFO or FairScheduler, thus the use of Apache "Pig" latin or other Apache tools may also be worth investigating on the NEX system to improve the usability of our workflow scheduling infrastructure for rapid experimentation.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
Virtual Observatory and Distributed Data Mining
NASA Astrophysics Data System (ADS)
Borne, Kirk D.
2012-03-01
New modes of discovery are enabled by the growth of data and computational resources (i.e., cyberinfrastructure) in the sciences. This cyberinfrastructure includes structured databases, virtual observatories (distributed data, as described in Section 20.2.1 of this chapter), high-performance computing (petascale machines), distributed computing (e.g., the Grid, the Cloud, and peer-to-peer networks), intelligent search and discovery tools, and innovative visualization environments. Data streams from experiments, sensors, and simulations are increasingly complex and growing in volume. This is true in most sciences, including astronomy, climate simulations, Earth observing systems, remote sensing data collections, and sensor networks. At the same time, we see an emerging confluence of new technologies and approaches to science, most clearly visible in the growing synergism of the four modes of scientific discovery: sensors-modeling-computing-data (Eastman et al. 2005). This has been driven by numerous developments, including the information explosion, development of large-array sensors, acceleration in high-performance computing (HPC) power, advances in algorithms, and efficient modeling techniques. Among these, the most extreme is the growth in new data. Specifically, the acquisition of data in all scientific disciplines is rapidly accelerating and causing a data glut (Bell et al. 2007). It has been estimated that data volumes double every year—for example, the NCSA (National Center for Supercomputing Applications) reported that their users cumulatively generated one petabyte of data over the first 19 years of NCSA operation, but they then generated their next one petabyte in the next year alone, and the data production has been growing by almost 100% each year after that (Butler 2008). The NCSA example is just one of many demonstrations of the exponential (annual data-doubling) growth in scientific data collections. In general, this putative data-doubling is an inevitable result of several compounding factors: the proliferation of data-generating devices, sensors, projects, and enterprises; the 18-month doubling of the digital capacity of these microprocessor-based sensors and devices (commonly referred to as "Moore’s law"); the move to digital for nearly all forms of information; the increase in human-generated data (both unstructured information on the web and structured data from experiments, models, and simulation); and the ever-expanding capability of higher density media to hold greater volumes of data (i.e., data production expands to fill the available storage space). These factors are consequently producing an exponential data growth rate, which will soon (if not already) become an insurmountable technical challenge even with the great advances in computation and algorithms. This technical challenge is compounded by the ever-increasing geographic dispersion of important data sources—the data collections are not stored uniformly at a single location, or with a single data model, or in uniform formats and modalities (e.g., images, databases, structured and unstructured files, and XML data sets)—the data are in fact large, distributed, heterogeneous, and complex. The greatest scientific research challenge with these massive distributed data collections is consequently extracting all of the rich information and knowledge content contained therein, thus requiring new approaches to scientific research. This emerging data-intensive and data-oriented approach to scientific research is sometimes called discovery informatics or X-informatics (where X can be any science, such as bio, geo, astro, chem, eco, or anything; Agresti 2003; Gray 2003; Borne 2010). This data-oriented approach to science is now recognized by some (e.g., Mahootian and Eastman 2009; Hey et al. 2009) as the fourth paradigm of research, following (historically) experiment/observation, modeling/analysis, and computational science.
Software Innovations Speed Scientific Computing
NASA Technical Reports Server (NTRS)
2012-01-01
To help reduce the time needed to analyze data from missions like those studying the Sun, Goddard Space Flight Center awarded SBIR funding to Tech-X Corporation of Boulder, Colorado. That work led to commercial technologies that help scientists accelerate their data analysis tasks. Thanks to its NASA work, the company doubled its number of headquarters employees to 70 and generated about $190,000 in revenue from its NASA-derived products.
Reinforcement learning with Marr.
Niv, Yael; Langdon, Angela
2016-10-01
To many, the poster child for David Marr's famous three levels of scientific inquiry is reinforcement learning-a computational theory of reward optimization, which readily prescribes algorithmic solutions that evidence striking resemblance to signals found in the brain, suggesting a straightforward neural implementation. Here we review questions that remain open at each level of analysis, concluding that the path forward to their resolution calls for inspiration across levels, rather than a focus on mutual constraints.
Margaret R. Holdaway
1994-01-01
Describes Geo-CLM, a computer application (for Mac or DOS) whose primary aim is to perform multiple kriging runs to interpolate the historic climatic record at research plots in the Lake States. It is an exploration and analysis tool. Addition capabilities include climatic databases, a flexible test mode, cross validation, lat/long conversion, English/metric units,...
Enabling a Scientific Cloud Marketplace: VGL (Invited)
NASA Astrophysics Data System (ADS)
Fraser, R.; Woodcock, R.; Wyborn, L. A.; Vote, J.; Rankine, T.; Cox, S. J.
2013-12-01
The Virtual Geophysics Laboratory (VGL) provides a flexible, web based environment where researchers can browse data and use a variety of scientific software packaged into tool kits that run in the Cloud. Both data and tool kits are published by multiple researchers and registered with the VGL infrastructure forming a data and application marketplace. The VGL provides the basic work flow of Discovery and Access to the disparate data sources and a Library for tool kits and scripting to drive the scientific codes. Computation is then performed on the Research or Commercial Clouds. Provenance information is collected throughout the work flow and can be published alongside the results allowing for experiment comparison and sharing with other researchers. VGL's "mix and match" approach to data, computational resources and scientific codes, enables a dynamic approach to scientific collaboration. VGL allows scientists to publish their specific contribution, be it data, code, compute or work flow, knowing the VGL framework will provide other components needed for a complete application. Other scientists can choose the pieces that suit them best to assemble an experiment. The coarse grain workflow of the VGL framework combined with the flexibility of the scripting library and computational toolkits allows for significant customisation and sharing amongst the community. The VGL utilises the cloud computational and storage resources from the Australian academic research cloud provided by the NeCTAR initiative and a large variety of data accessible from national and state agencies via the Spatial Information Services Stack (SISS - http://siss.auscope.org). VGL v1.2 screenshot - http://vgl.auscope.org
UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lockwood, Glenn K.; Yoo, Wucherl; Byna, Suren
I/O efficiency is essential to productivity in scientific computing, especially as many scientific domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex I/O subsystems in isolation fails to provide insight into critical questions: how do the I/O components interact, what are reasonable expectations for application performance, and what are the underlying causes of I/O performance problems? To address these questions while capitalizing on existing component-level characterization tools, we propose an approach that combines on-demand, modular synthesis of I/O characterization data into a unified monitoring and metricsmore » interface (UMAMI) to provide a normalized, holistic view of I/O behavior. We evaluate the feasibility of this approach by applying it to a month-long benchmarking study on two distinct largescale computing platforms. We present three case studies that highlight the importance of analyzing application I/O performance in context with both contemporaneous and historical component metrics, and we provide new insights into the factors affecting I/O performance. By demonstrating the generality of our approach, we lay the groundwork for a production-grade framework for holistic I/O analysis.« less
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
Klimentov, A.; Buncic, P.; De, K.; ...
2015-05-22
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Managementmore » System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10 2) sites, O(10 5) cores, O(10 8) jobs per year, O(10 3) users, and ATLAS data volume is O(10 17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.« less
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klimentov, A.; Buncic, P.; De, K.
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Managementmore » System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10 2) sites, O(10 5) cores, O(10 8) jobs per year, O(10 3) users, and ATLAS data volume is O(10 17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to setup and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.« less
Kong, Jun; Wang, Fusheng; Teodoro, George; Cooper, Lee; Moreno, Carlos S; Kurc, Tahsin; Pan, Tony; Saltz, Joel; Brat, Daniel
2013-12-01
In this paper, we present a novel framework for microscopic image analysis of nuclei, data management, and high performance computation to support translational research involving nuclear morphometry features, molecular data, and clinical outcomes. Our image analysis pipeline consists of nuclei segmentation and feature computation facilitated by high performance computing with coordinated execution in multi-core CPUs and Graphical Processor Units (GPUs). All data derived from image analysis are managed in a spatial relational database supporting highly efficient scientific queries. We applied our image analysis workflow to 159 glioblastomas (GBM) from The Cancer Genome Atlas dataset. With integrative studies, we found statistics of four specific nuclear features were significantly associated with patient survival. Additionally, we correlated nuclear features with molecular data and found interesting results that support pathologic domain knowledge. We found that Proneural subtype GBMs had the smallest mean of nuclear Eccentricity and the largest mean of nuclear Extent, and MinorAxisLength. We also found gene expressions of stem cell marker MYC and cell proliferation maker MKI67 were correlated with nuclear features. To complement and inform pathologists of relevant diagnostic features, we queried the most representative nuclear instances from each patient population based on genetic and transcriptional classes. Our results demonstrate that specific nuclear features carry prognostic significance and associations with transcriptional and genetic classes, highlighting the potential of high throughput pathology image analysis as a complementary approach to human-based review and translational research.
BioPig: a Hadoop-based analytic toolkit for large-scale sequence data.
Nordberg, Henrik; Bhatia, Karan; Wang, Kai; Wang, Zhong
2013-12-01
The recent revolution in sequencing technologies has led to an exponential growth of sequence data. As a result, most of the current bioinformatics tools become obsolete as they fail to scale with data. To tackle this 'data deluge', here we introduce the BioPig sequence analysis toolkit as one of the solutions that scale to data and computation. We built BioPig on the Apache's Hadoop MapReduce system and the Pig data flow language. Compared with traditional serial and MPI-based algorithms, BioPig has three major advantages: first, BioPig's programmability greatly reduces development time for parallel bioinformatics applications; second, testing BioPig with up to 500 Gb sequences demonstrates that it scales automatically with size of data; and finally, BioPig can be ported without modification on many Hadoop infrastructures, as tested with Magellan system at National Energy Research Scientific Computing Center and the Amazon Elastic Compute Cloud. In summary, BioPig represents a novel program framework with the potential to greatly accelerate data-intensive bioinformatics analysis.
iSPHERE - A New Approach to Collaborative Research and Cloud Computing
NASA Astrophysics Data System (ADS)
Al-Ubaidi, T.; Khodachenko, M. L.; Kallio, E. J.; Harry, A.; Alexeev, I. I.; Vázquez-Poletti, J. L.; Enke, H.; Magin, T.; Mair, M.; Scherf, M.; Poedts, S.; De Causmaecker, P.; Heynderickx, D.; Congedo, P.; Manolescu, I.; Esser, B.; Webb, S.; Ruja, C.
2015-10-01
The project iSPHERE (integrated Scientific Platform for HEterogeneous Research and Engineering) that has been proposed for Horizon 2020 (EINFRA-9- 2015, [1]) aims at creating a next generation Virtual Research Environment (VRE) that embraces existing and emerging technologies and standards in order to provide a versatile platform for scientific investigations and collaboration. The presentation will introduce the large project consortium, provide a comprehensive overview of iSPHERE's basic concepts and approaches and outline general user requirements that the VRE will strive to satisfy. An overview of the envisioned architecture will be given, focusing on the adapted Service Bus concept, i.e. the "Scientific Service Bus" as it is called in iSPHERE. The bus will act as a central hub for all communication and user access, and will be implemented in the course of the project. The agile approach [2] that has been chosen for detailed elaboration and documentation of user requirements, as well as for the actual implementation of the system, will be outlined and its motivation and basic structure will be discussed. The presentation will show which user communities will benefit and which concrete problems, scientific investigations are facing today, will be tackled by the system. Another focus of the presentation is iSPHERE's seamless integration of cloud computing resources and how these will benefit scientific modeling teams by providing a reliable and web based environment for cloud based model execution, storage of results, and comparison with measurements, including fully web based tools for data mining, analysis and visualization. Also the envisioned creation of a dedicated data model for experimental plasma physics will be discussed. It will be shown why the Scientific Service Bus provides an ideal basis to integrate a number of data models and communication protocols and to provide mechanisms for data exchange across multiple and even multidisciplinary platforms.
NASA Astrophysics Data System (ADS)
Pagnutti, Mary; Ryan, Robert E.; Cazenavette, George; Gold, Maxwell; Harlan, Ryan; Leggett, Edward; Pagnutti, James
2017-01-01
A comprehensive radiometric characterization of raw-data format imagery acquired with the Raspberry Pi 3 and V2.1 camera module is presented. The Raspberry Pi is a high-performance single-board computer designed to educate and solve real-world problems. This small computer supports a camera module that uses a Sony IMX219 8 megapixel CMOS sensor. This paper shows that scientific and engineering-grade imagery can be produced with the Raspberry Pi 3 and its V2.1 camera module. Raw imagery is shown to be linear with exposure and gain (ISO), which is essential for scientific and engineering applications. Dark frame, noise, and exposure stability assessments along with flat fielding results, spectral response measurements, and absolute radiometric calibration results are described. This low-cost imaging sensor, when calibrated to produce scientific quality data, can be used in computer vision, biophotonics, remote sensing, astronomy, high dynamic range imaging, and security applications, to name a few.
Yaniv, Ziv; Lowekamp, Bradley C; Johnson, Hans J; Beare, Richard
2018-06-01
Modern scientific endeavors increasingly require team collaborations to construct and interpret complex computational workflows. This work describes an image-analysis environment that supports the use of computational tools that facilitate reproducible research and support scientists with varying levels of software development skills. The Jupyter notebook web application is the basis of an environment that enables flexible, well-documented, and reproducible workflows via literate programming. Image-analysis software development is made accessible to scientists with varying levels of programming experience via the use of the SimpleITK toolkit, a simplified interface to the Insight Segmentation and Registration Toolkit. Additional features of the development environment include user friendly data sharing using online data repositories and a testing framework that facilitates code maintenance. SimpleITK provides a large number of examples illustrating educational and research-oriented image analysis workflows for free download from GitHub under an Apache 2.0 license: github.com/InsightSoftwareConsortium/SimpleITK-Notebooks .
Knowledge Discovery from Climate Data using Graph-Based Methods
NASA Astrophysics Data System (ADS)
Steinhaeuser, K.
2012-04-01
Climate and Earth sciences have recently experienced a rapid transformation from a historically data-poor to a data-rich environment, thus bringing them into the realm of the Fourth Paradigm of scientific discovery - a term coined by the late Jim Gray (Hey et al. 2009), the other three being theory, experimentation and computer simulation. In particular, climate-related observations from remote sensors on satellites and weather radars, in situ sensors and sensor networks, as well as outputs of climate or Earth system models from large-scale simulations, provide terabytes of spatio-temporal data. These massive and information-rich datasets offer a significant opportunity for advancing climate science and our understanding of the global climate system, yet current analysis techniques are not able to fully realize their potential benefits. We describe a class of computational approaches, specifically from the data mining and machine learning domains, which may be novel to the climate science domain and can assist in the analysis process. Computer scientists have developed spatial and spatio-temporal analysis techniques for a number of years now, and many of them may be applicable and/or adaptable to problems in climate science. We describe a large-scale, NSF-funded project aimed at addressing climate science question using computational analysis methods; team members include computer scientists, statisticians, and climate scientists from various backgrounds. One of the major thrusts is in the development of graph-based methods, and several illustrative examples of recent work in this area will be presented.
Heterogeneous high throughput scientific computing with APM X-Gene and Intel Xeon Phi
Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; ...
2015-05-22
Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. As a result, we report our experience on software porting, performance and energy efficiency and evaluatemore » the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).« less
Postdoctoral Fellow | Center for Cancer Research
The Neuro-Oncology Branch (NOB), Center for Cancer Research (CCR), National Cancer Institute (NCI) of the National Institutes of Health (NIH) is seeking outstanding postdoctoral candidates interested in studying metabolic and cell signaling pathways in the context of brain cancers through construction of computational models amenable to formal computational analysis and simulation. The ability to closely collaborate with the modern metabolomics center developed at CCR provides a unique opportunity for a postdoctoral candidate with a strong theoretical background and interest in demonstrating the incredible potential of computational approaches to solve problems from scientific disciplines and improve lives. The candidate will be given the opportunity to both construct data-driven models, as well as biologically validate the models by demonstrating the ability to predict the effects of altering tumor metabolism in laboratory and clinical settings.