[Earth Science Technology Office's Computational Technologies Project
NASA Technical Reports Server (NTRS)
Fischer, James (Technical Monitor); Merkey, Phillip
2005-01-01
This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
[Earth and Space Sciences Project Services for NASA HPCC
NASA Technical Reports Server (NTRS)
Merkey, Phillip
2002-01-01
This grant supported the effort to characterize the problem domain of the Earth Science Technology Office's Computational Technologies Project, to engage the Beowulf Cluster Computing Community as well as the High Performance Computing Research Community so that we can predict the applicability of said technologies to the scientific community represented by the CT project and formulate long term strategies to provide the computational resources necessary to attain the anticipated scientific objectives of the CT project. Specifically, the goal of the evaluation effort is to use the information gathered over the course of the Round-3 investigations to quantify the trends in scientific expectations, the algorithmic requirements and capabilities of high-performance computers to satisfy this anticipated need.
Whole earth modeling: developing and disseminating scientific software for computational geophysics.
NASA Astrophysics Data System (ADS)
Kellogg, L. H.
2016-12-01
Historically, a great deal of specialized scientific software for modeling and data analysis has been developed by individual researchers or small groups of scientists working on their own specific research problems. As the magnitude of available data and computer power has increased, so has the complexity of scientific problems addressed by computational methods, creating both a need to sustain existing scientific software, and expand its development to take advantage of new algorithms, new software approaches, and new computational hardware. To that end, communities like the Computational Infrastructure for Geodynamics (CIG) have been established to support the use of best practices in scientific computing for solid earth geophysics research and teaching. Working as a scientific community enables computational geophysicists to take advantage of technological developments, improve the accuracy and performance of software, build on prior software development, and collaborate more readily. The CIG community, and others, have adopted an open-source development model, in which code is developed and disseminated by the community in an open fashion, using version control and software repositories like Git. One emerging issue is how to adequately identify and credit the intellectual contributions involved in creating open source scientific software. The traditional method of disseminating scientific ideas, peer reviewed publication, was not designed for review or crediting scientific software, although emerging publication strategies such software journals are attempting to address the need. We are piloting an integrated approach in which authors are identified and credited as scientific software is developed and run. Successful software citation requires integration with the scholarly publication and indexing mechanisms as well, to assign credit, ensure discoverability, and provide provenance for software.
The Petascale Data Storage Institute
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, Garth; Long, Darrell; Honeyman, Peter
2013-07-01
Petascale computing infrastructures for scientific discovery make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability.The Petascale Data Storage Institute focuses on the data storage problems found in petascale scientific computing environments, with special attention to community issues such as interoperability, community buy-in, and shared tools.The Petascale Data Storage Institute is a collaboration between researchers at Carnegie Mellon University, National Energy Research Scientific Computing Center, Pacific Northwest National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratory, Los Alamos National Laboratory, University of Michigan, and the University of California at Santa Cruz.
Joint the Center for Applied Scientific Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, Todd; Bremer, Timo; Van Essen, Brian
The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.
Computer network access to scientific information systems for minority universities
NASA Astrophysics Data System (ADS)
Thomas, Valerie L.; Wakim, Nagi T.
1993-08-01
The evolution of computer networking technology has lead to the establishment of a massive networking infrastructure which interconnects various types of computing resources at many government, academic, and corporate institutions. A large segment of this infrastructure has been developed to facilitate information exchange and resource sharing within the scientific community. The National Aeronautics and Space Administration (NASA) supports both the development and the application of computer networks which provide its community with access to many valuable multi-disciplinary scientific information systems and on-line databases. Recognizing the need to extend the benefits of this advanced networking technology to the under-represented community, the National Space Science Data Center (NSSDC) in the Space Data and Computing Division at the Goddard Space Flight Center has developed the Minority University-Space Interdisciplinary Network (MU-SPIN) Program: a major networking and education initiative for Historically Black Colleges and Universities (HBCUs) and Minority Universities (MUs). In this paper, we will briefly explain the various components of the MU-SPIN Program while highlighting how, by providing access to scientific information systems and on-line data, it promotes a higher level of collaboration among faculty and students and NASA scientists.
Architectural Aspects of Grid Computing and its Global Prospects for E-Science Community
NASA Astrophysics Data System (ADS)
Ahmad, Mushtaq
2008-05-01
The paper reviews the imminent Architectural Aspects of Grid Computing for e-Science community for scientific research and business/commercial collaboration beyond physical boundaries. Grid Computing provides all the needed facilities; hardware, software, communication interfaces, high speed internet, safe authentication and secure environment for collaboration of research projects around the globe. It provides highly fast compute engine for those scientific and engineering research projects and business/commercial applications which are heavily compute intensive and/or require humongous amounts of data. It also makes possible the use of very advanced methodologies, simulation models, expert systems and treasure of knowledge available around the globe under the umbrella of knowledge sharing. Thus it makes possible one of the dreams of global village for the benefit of e-Science community across the globe.
NASA Astrophysics Data System (ADS)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt; Larson, Krista; Sfiligoi, Igor; Rynge, Mats
2014-06-01
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared over the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by "Big Data" will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt
Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared overmore » the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by 'Big Data' will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.« less
Introduction to computers: Reference guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ligon, F.V.
1995-04-01
The ``Introduction to Computers`` program establishes formal partnerships with local school districts and community-based organizations, introduces computer literacy to precollege students and their parents, and encourages students to pursue Scientific, Mathematical, Engineering, and Technical careers (SET). Hands-on assignments are given in each class, reinforcing the lesson taught. In addition, the program is designed to broaden the knowledge base of teachers in scientific/technical concepts, and Brookhaven National Laboratory continues to act as a liaison, offering educational outreach to diverse community organizations and groups. This manual contains the teacher`s lesson plans and the student documentation to this introduction to computer course.
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
ERIC Educational Resources Information Center
Younge, Andrew J.
2016-01-01
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…
What makes computational open source software libraries successful?
NASA Astrophysics Data System (ADS)
Bangerth, Wolfgang; Heister, Timo
2013-01-01
Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
Center for Center for Technology for Advanced Scientific Component Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostadin, Damevski
A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technologymore » for Advanced Scientific Component Software (TASCS)1 tackles these these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.« less
Must Invisible Colleges Be Invisible? An Approach to Examining Large Communities of Network Users.
ERIC Educational Resources Information Center
Ruth, Stephen R.; Gouet, Raul
1993-01-01
Discussion of characteristics of users of computer-mediated communication systems and scientific networks focuses on a study of the scientific community in Chile. Topics addressed include users and nonusers; productivity; educational level; academic specialty; age; gender; international connectivity; public policy issues; and future research…
Scientific Services on the Cloud
NASA Astrophysics Data System (ADS)
Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong
Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.
From the desktop to the grid: scalable bioinformatics via workflow conversion.
de la Garza, Luis; Veit, Johannes; Szolek, Andras; Röttig, Marc; Aiche, Stephan; Gesing, Sandra; Reinert, Knut; Kohlbacher, Oliver
2016-03-12
Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well defined tasks, each with well defined inputs, parameters, and outputs, offers the immediate benefit of identifying bottlenecks, pinpoint sections which could benefit from parallelization, among others. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. There are several engines that give users the ability to design and execute workflows. Each engine was created to address certain problems of a specific community, therefore each one has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free -an aspect that could potentially drive away members of the scientific community. We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of parameters, inputs, outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources. Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We strongly believe that our efforts not only decrease the turnaround time to obtain scientific results but also have a positive impact on reproducibility, thus elevating the quality of obtained scientific results.
Recent Enhancements to the Community Multiscale Air Quality Modeling System (CMAQ)
EPA’s Office of Research and Development, Computational Exposure Division held a webinar on January 31, 2017 to present the recent scientific and computational updates made by EPA to the Community Multi-Scale Air Quality Model (CMAQ). Topics covered included: (1) Improveme...
Software and the Scientist: Coding and Citation Practices in Geodynamics
NASA Astrophysics Data System (ADS)
Hwang, Lorraine; Fish, Allison; Soito, Laura; Smith, MacKenzie; Kellogg, Louise H.
2017-11-01
In geodynamics as in other scientific areas, computation has become a core component of research, complementing field observation, laboratory analysis, experiment, and theory. Computational tools for data analysis, mapping, visualization, modeling, and simulation are essential for all aspects of the scientific workflow. Specialized scientific software is often developed by geodynamicists for their own use, and this effort represents a distinctive intellectual contribution. Drawing on a geodynamics community that focuses on developing and disseminating scientific software, we assess the current practices of software development and attribution, as well as attitudes about the need and best practices for software citation. We analyzed publications by participants in the Computational Infrastructure for Geodynamics and conducted mixed method surveys of the solid earth geophysics community. From this we learned that coding skills are typically learned informally. Participants considered good code as trusted, reusable, readable, and not overly complex and considered a good coder as one that participates in the community in an open and reasonable manor contributing to both long- and short-term community projects. Participants strongly supported citing software reflected by the high rate a software package was named in the literature and the high rate of citations in the references. However, lacking are clear instructions from developers on how to cite and education of users on what to cite. In addition, citations did not always lead to discoverability of the resource. A unique identifier to the software package itself, community education, and citation tools would contribute to better attribution practices.
2014 Annual Report - Argonne Leadership Computing Facility
DOE Office of Scientific and Technical Information (OSTI.GOV)
Collins, James R.; Papka, Michael E.; Cerny, Beth A.
The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.
2015 Annual Report - Argonne Leadership Computing Facility
DOE Office of Scientific and Technical Information (OSTI.GOV)
Collins, James R.; Papka, Michael E.; Cerny, Beth A.
The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.
ERIC Educational Resources Information Center
Donna, Joel D.; Miller, Brant G.
2013-01-01
Technology plays a crucial role in facilitating collaboration within the scientific community. Cloud-computing applications, such as Google Drive, can be used to model such collaboration and support inquiry within the secondary science classroom. Little is known about pre-service teachers' beliefs related to the envisioned use of collaborative,…
Adaptation of XMM-Newton SAS to GRID and VO architectures via web
NASA Astrophysics Data System (ADS)
Ibarra, A.; de La Calle, I.; Gabriel, C.; Salgado, J.; Osuna, P.
2008-10-01
The XMM-Newton Scientific Analysis Software (SAS) is a robust software that has allowed users to produce good scientific results since the beginning of the mission. This has been possible given the SAS capability to evolve with the advent of new technologies and adapt to the needs of the scientific community. The prototype of the Remote Interface for Science Analysis (RISA) presented here, is one such example, which provides remote analysis of XMM-Newton data with access to all the existing SAS functionality, while making use of GRID computing technology. This new technology has recently emerged within the astrophysical community to tackle the ever lasting problem of computer power for the reduction of large amounts of data.
Building a Culture of Health Informatics Innovation and Entrepreneurship: A New Frontier.
Househ, Mowafa; Alshammari, Riyad; Almutairi, Mariam; Jamal, Amr; Alshoaib, Saleh
2015-01-01
Entrepreneurship and innovation within the health informatics (HI) scientific community are relatively sluggish when compared to other disciplines such as computer science and engineering. Healthcare in general, and specifically, the health informatics scientific community needs to embrace more innovative and entrepreneurial practices. In this paper, we explore the concepts of innovation and entrepreneurship as they apply to the health informatics scientific community. We also outline several strategies to improve the culture of innovation and entrepreneurship within the health informatics scientific community such as: (I) incorporating innovation and entrepreneurship in health informatics education; (II) creating strong linkages with industry and healthcare organizations; (III) supporting national health innovation and entrepreneurship competitions; (IV) creating a culture of innovation and entrepreneurship within healthcare organizations; (V) developing health informatics policies that support innovation and entrepreneurship based on internationally recognized standards; and (VI) develop an health informatics entrepreneurship ecosystem. With these changes, we conclude that embracing health innovation and entrepreneurship may be more readily accepted over the long-term within the health informatics scientific community.
Web Services Provide Access to SCEC Scientific Research Application Software
NASA Astrophysics Data System (ADS)
Gupta, N.; Gupta, V.; Okaya, D.; Kamb, L.; Maechling, P.
2003-12-01
Web services offer scientific communities a new paradigm for sharing research codes and communicating results. While there are formal technical definitions of what constitutes a web service, for a user community such as the Southern California Earthquake Center (SCEC), we may conceptually consider a web service to be functionality provided on-demand by an application which is run on a remote computer located elsewhere on the Internet. The value of a web service is that it can (1) run a scientific code without the user needing to install and learn the intricacies of running the code; (2) provide the technical framework which allows a user's computer to talk to the remote computer which performs the service; (3) provide the computational resources to run the code; and (4) bundle several analysis steps and provide the end results in digital or (post-processed) graphical form. Within an NSF-sponsored ITR project coordinated by SCEC, we are constructing web services using architectural protocols and programming languages (e.g., Java). However, because the SCEC community has a rich pool of scientific research software (written in traditional languages such as C and FORTRAN), we also emphasize making existing scientific codes available by constructing web service frameworks which wrap around and directly run these codes. In doing so we attempt to broaden community usage of these codes. Web service wrapping of a scientific code can be done using a "web servlet" construction or by using a SOAP/WSDL-based framework. This latter approach is widely adopted in IT circles although it is subject to rapid evolution. Our wrapping framework attempts to "honor" the original codes with as little modification as is possible. For versatility we identify three methods of user access: (A) a web-based GUI (written in HTML and/or Java applets); (B) a Linux/OSX/UNIX command line "initiator" utility (shell-scriptable); and (C) direct access from within any Java application (and with the correct API interface from within C++ and/or C/Fortran). This poster presentation will provide descriptions of the following selected web services and their origin as scientific application codes: 3D community velocity models for Southern California, geocoordinate conversions (latitude/longitude to UTM), execution of GMT graphical scripts, data format conversions (Gocad to Matlab format), and implementation of Seismic Hazard Analysis application programs that calculate hazard curve and hazard map data sets.
Scholarly literature and the press: scientific impact and social perception of physics computing
NASA Astrophysics Data System (ADS)
Pia, M. G.; Basaglia, T.; Bell, Z. W.; Dressendorfer, P. V.
2014-06-01
The broad coverage of the search for the Higgs boson in the mainstream media is a relative novelty for high energy physics (HEP) research, whose achievements have traditionally been limited to scholarly literature. This paper illustrates the results of a scientometric analysis of HEP computing in scientific literature, institutional media and the press, and a comparative overview of similar metrics concerning representative particle physics measurements. The picture emerging from these scientometric data documents the relationship between the scientific impact and the social perception of HEP physics research versus that of HEP computing. The results of this analysis suggest that improved communication of the scientific and social role of HEP computing via press releases from the major HEP laboratories would be beneficial to the high energy physics community.
ERIC Educational Resources Information Center
Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu
2013-01-01
With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…
The Computational Infrastructure for Geodynamics as a Community of Practice
NASA Astrophysics Data System (ADS)
Hwang, L.; Kellogg, L. H.
2016-12-01
Computational Infrastructure for Geodynamics (CIG), geodynamics.org, originated in 2005 out of community recognition that the efforts of individual or small groups of researchers to develop scientifically-sound software is impossible to sustain, duplicates effort, and makes it difficult for scientists to adopt state-of-the art computational methods that promote new discovery. As a community of practice, participants in CIG share an interest in computational modeling in geodynamics and work together on open source software to build the capacity to support complex, extensible, scalable, interoperable, reliable, and reusable software in an effort to increase the return on investment in scientific software development and increase the quality of the resulting software. The group interacts regularly to learn from each other and better their practices formally through webinar series, workshops, and tutorials and informally through listservs and hackathons. Over the past decade, we have learned that successful scientific software development requires at a minimum: collaboration between domain-expert researchers, software developers and computational scientists; clearly identified and committed lead developer(s); well-defined scientific and computational goals that are regularly evaluated and updated; well-defined benchmarks and testing throughout development; attention throughout development to usability and extensibility; understanding and evaluation of the complexity of dependent libraries; and managed user expectations through education, training, and support. CIG's code donation standards provide the basis for recently formalized best practices in software development (geodynamics.org/cig/dev/best-practices/). Best practices include use of version control; widely used, open source software libraries; extensive test suites; portable configuration and build systems; extensive documentation internal and external to the code; and structured, human readable input formats.
EarthCube Activities: Community Engagement Advancing Geoscience Research
NASA Astrophysics Data System (ADS)
Kinkade, D.
2015-12-01
Our ability to advance scientific research in order to better understand complex Earth systems, address emerging geoscience problems, and meet societal challenges is increasingly dependent upon the concept of Open Science and Data. Although these terms are relatively new to the world of research, Open Science and Data in this context may be described as transparency in the scientific process. This includes the discoverability, public accessibility and reusability of scientific data, as well as accessibility and transparency of scientific communication (www.openscience.org). Scientists and the US government alike are realizing the critical need for easy discovery and access to multidisciplinary data to advance research in the geosciences. The NSF-supported EarthCube project was created to meet this need. EarthCube is developing a community-driven common cyberinfrastructure for the purpose of accessing, integrating, analyzing, sharing and visualizing all forms of data and related resources through advanced technological and computational capabilities. Engaging the geoscience community in EarthCube's development is crucial to its success, and EarthCube is providing several opportunities for geoscience involvement. This presentation will provide an overview of the activities EarthCube is employing to entrain the community in the development process, from governance development and strategic planning, to technical needs gathering. Particular focus will be given to the collection of science-driven use cases as a means of capturing scientific and technical requirements. Such activities inform the development of key technical and computational components that collectively will form a cyberinfrastructure to meet the research needs of the geoscience community.
Changing from computing grid to knowledge grid in life-science grid.
Talukdar, Veera; Konar, Amit; Datta, Ayan; Choudhury, Anamika Roy
2009-09-01
Grid computing has a great potential to become a standard cyber infrastructure for life sciences that often require high-performance computing and large data handling, which exceeds the computing capacity of a single institution. Grid computer applies the resources of many computers in a network to a single problem at the same time. It is useful to scientific problems that require a great number of computer processing cycles or access to a large amount of data.As biologists,we are constantly discovering millions of genes and genome features, which are assembled in a library and distributed on computers around the world.This means that new, innovative methods must be developed that exploit the re-sources available for extensive calculations - for example grid computing.This survey reviews the latest grid technologies from the viewpoints of computing grid, data grid and knowledge grid. Computing grid technologies have been matured enough to solve high-throughput real-world life scientific problems. Data grid technologies are strong candidates for realizing a "resourceome" for bioinformatics. Knowledge grids should be designed not only from sharing explicit knowledge on computers but also from community formulation for sharing tacit knowledge among a community. By extending the concept of grid from computing grid to knowledge grid, it is possible to make use of a grid as not only sharable computing resources, but also as time and place in which people work together, create knowledge, and share knowledge and experiences in a community.
OMPC: an Open-Source MATLAB®-to-Python Compiler
Jurica, Peter; van Leeuwen, Cees
2008-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB®, the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB®-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB® functions into Python programs. The imported MATLAB® modules will run independently of MATLAB®, relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB®. OMPC is available at http://ompc.juricap.com. PMID:19225577
NASA Technical Reports Server (NTRS)
Keller, Richard M.
1991-01-01
The construction of scientific software models is an integral part of doing science, both within NASA and within the scientific community at large. Typically, model-building is a time-intensive and painstaking process, involving the design of very large, complex computer programs. Despite the considerable expenditure of resources involved, completed scientific models cannot easily be distributed and shared with the larger scientific community due to the low-level, idiosyncratic nature of the implemented code. To address this problem, we have initiated a research project aimed at constructing a software tool called the Scientific Modeling Assistant. This tool provides automated assistance to the scientist in developing, using, and sharing software models. We describe the Scientific Modeling Assistant, and also touch on some human-machine interaction issues relevant to building a successful tool of this type.
On Establishing Big Data Wave Breakwaters with Analytics (Invited)
NASA Astrophysics Data System (ADS)
Riedel, M.
2013-12-01
The Research Data Alliance Big Data Analytics (RDA-BDA) Interest Group seeks to develop community based recommendations on feasible data analytics approaches to address scientific community needs of utilizing large quantities of data. RDA-BDA seeks to analyze different scientific domain applications and their potential use of various big data analytics techniques. A systematic classification of feasible combinations of analysis algorithms, analytical tools, data and resource characteristics and scientific queries will be covered in these recommendations. These combinations are complex since a wide variety of different data analysis algorithms exist (e.g. specific algorithms using GPUs of analyzing brain images) that need to work together with multiple analytical tools reaching from simple (iterative) map-reduce methods (e.g. with Apache Hadoop or Twister) to sophisticated higher level frameworks that leverage machine learning algorithms (e.g. Apache Mahout). These computational analysis techniques are often augmented with visual analytics techniques (e.g. computational steering on large-scale high performance computing platforms) to put the human judgement into the analysis loop or new approaches with databases that are designed to support new forms of unstructured or semi-structured data as opposed to the rather tradtional structural databases (e.g. relational databases). More recently, data analysis and underpinned analytics frameworks also have to consider energy footprints of underlying resources. To sum up, the aim of this talk is to provide pieces of information to understand big data analytics in the context of science and engineering using the aforementioned classification as the lighthouse and as the frame of reference for a systematic approach. This talk will provide insights about big data analytics methods in context of science within varios communities and offers different views of how approaches of correlation and causality offer complementary methods to advance in science and engineering today. The RDA Big Data Analytics Group seeks to understand what approaches are not only technically feasible, but also scientifically feasible. The lighthouse Goal of the RDA Big Data Analytics Group is a classification of clever combinations of various Technologies and scientific applications in order to provide clear recommendations to the scientific community what approaches are technicalla and scientifically feasible.
Trends in life science grid: from computing grid to knowledge grid.
Konagaya, Akihiko
2006-12-18
Grid computing has great potential to become a standard cyberinfrastructure for life sciences which often require high-performance computing and large data handling which exceeds the computing capacity of a single institution. This survey reviews the latest grid technologies from the viewpoints of computing grid, data grid and knowledge grid. Computing grid technologies have been matured enough to solve high-throughput real-world life scientific problems. Data grid technologies are strong candidates for realizing "resourceome" for bioinformatics. Knowledge grids should be designed not only from sharing explicit knowledge on computers but also from community formulation for sharing tacit knowledge among a community. Extending the concept of grid from computing grid to knowledge grid, it is possible to make use of a grid as not only sharable computing resources, but also as time and place in which people work together, create knowledge, and share knowledge and experiences in a community.
Trends in life science grid: from computing grid to knowledge grid
Konagaya, Akihiko
2006-01-01
Background Grid computing has great potential to become a standard cyberinfrastructure for life sciences which often require high-performance computing and large data handling which exceeds the computing capacity of a single institution. Results This survey reviews the latest grid technologies from the viewpoints of computing grid, data grid and knowledge grid. Computing grid technologies have been matured enough to solve high-throughput real-world life scientific problems. Data grid technologies are strong candidates for realizing "resourceome" for bioinformatics. Knowledge grids should be designed not only from sharing explicit knowledge on computers but also from community formulation for sharing tacit knowledge among a community. Conclusion Extending the concept of grid from computing grid to knowledge grid, it is possible to make use of a grid as not only sharable computing resources, but also as time and place in which people work together, create knowledge, and share knowledge and experiences in a community. PMID:17254294
Sustaining Open Source Communities through Hackathons - An Example from the ASPECT Community
NASA Astrophysics Data System (ADS)
Heister, T.; Hwang, L.; Bangerth, W.; Kellogg, L. H.
2016-12-01
The ecosystem surrounding a successful scientific open source software package combines both social and technical aspects. Much thought has been given to the technology side of writing sustainable software for large infrastructure projects and software libraries, but less about building the human capacity to perpetuate scientific software used in computational modeling. One effective format for building capacity is regular multi-day hackathons. Scientific hackathons bring together a group of science domain users and scientific software contributors to make progress on a specific software package. Innovation comes through the chance to work with established and new collaborations. Especially in the domain sciences with small communities, hackathons give geographically distributed scientists an opportunity to connect face-to-face. They foster lively discussions amongst scientists with different expertise, promote new collaborations, and increase transparency in both the technical and scientific aspects of code development. ASPECT is an open source, parallel, extensible finite element code to simulate thermal convection, that began development in 2011 under the Computational Infrastructure for Geodynamics. ASPECT hackathons for the past 3 years have grown the number of authors to >50, training new code maintainers in the process. Hackathons begin with leaders establishing project-specific conventions for development, demonstrating the workflow for code contributions, and reviewing relevant technical skills. Each hackathon expands the developer community. Over 20 scientists add >6,000 lines of code during the >1 week event. Participants grow comfortable contributing to the repository and over half continue to contribute afterwards. A high return rate of participants ensures continuity and stability of the group as well as mentoring for novice members. We hope to build other software communities on this model, but anticipate each to bring their own unique challenges.
NASA Astrophysics Data System (ADS)
Añel, Juan A.
2017-03-01
Nowadays, the majority of the scientific community is not aware of the risks and problems associated with an inadequate use of computer systems for research, mostly for reproducibility of scientific results. Such reproducibility can be compromised by the lack of clear standards and insufficient methodological description of the computational details involved in an experiment. In addition, the inappropriate application or ignorance of copyright laws can have undesirable effects on access to aspects of great importance of the design of experiments and therefore to the interpretation of results.
NASA Astrophysics Data System (ADS)
Khodachenko, Maxim; Miller, Steven; Stoeckler, Robert; Topf, Florian
2010-05-01
Computational modeling and observational data analysis are two major aspects of the modern scientific research. Both appear nowadays under extensive development and application. Many of the scientific goals of planetary space missions require robust models of planetary objects and environments as well as efficient data analysis algorithms, to predict conditions for mission planning and to interpret the experimental data. Europe has great strength in these areas, but it is insufficiently coordinated; individual groups, models, techniques and algorithms need to be coupled and integrated. Existing level of scientific cooperation and the technical capabilities for operative communication, allow considerable progress in the development of a distributed international Research Infrastructure (RI) which is based on the existing in Europe computational modelling and data analysis centers, providing the scientific community with dedicated services in the fields of their computational and data analysis expertise. These services will appear as a product of the collaborative communication and joint research efforts of the numerical and data analysis experts together with planetary scientists. The major goal of the EUROPLANET-RI / EMDAF is to make computational models and data analysis algorithms associated with particular national RIs and teams, as well as their outputs, more readily available to their potential user community and more tailored to scientific user requirements, without compromising front-line specialized research on model and data analysis algorithms development and software implementation. This objective will be met through four keys subdivisions/tasks of EMAF: 1) an Interactive Catalogue of Planetary Models; 2) a Distributed Planetary Modelling Laboratory; 3) a Distributed Data Analysis Laboratory, and 4) enabling Models and Routines for High Performance Computing Grids. Using the advantages of the coordinated operation and efficient communication between the involved computational modelling, research and data analysis expert teams and their related research infrastructures, EMDAF will provide a 1) flexible, 2) scientific user oriented, 3) continuously developing and fast upgrading computational and data analysis service to support and intensify the European planetary scientific research. At the beginning EMDAF will create a set of demonstrators and operational tests of this service in key areas of European planetary science. This work will aim at the following objectives: (a) Development and implementation of tools for distant interactive communication between the planetary scientists and computing experts (including related RIs); (b) Development of standard routine packages, and user-friendly interfaces for operation of the existing numerical codes and data analysis algorithms by the specialized planetary scientists; (c) Development of a prototype of numerical modelling services "on demand" for space missions and planetary researchers; (d) Development of a prototype of data analysis services "on demand" for space missions and planetary researchers; (e) Development of a prototype of coordinated interconnected simulations of planetary phenomena and objects (global multi-model simulators); (f) Providing the demonstrators of a coordinated use of high performance computing facilities (super-computer networks), done in cooperation with European HPC Grid DEISA.
Computational Science in Armenia (Invited Talk)
NASA Astrophysics Data System (ADS)
Marandjian, H.; Shoukourian, Yu.
This survey is devoted to the development of informatics and computer science in Armenia. The results in theoretical computer science (algebraic models, solutions to systems of general form recursive equations, the methods of coding theory, pattern recognition and image processing), constitute the theoretical basis for developing problem-solving-oriented environments. As examples can be mentioned: a synthesizer of optimized distributed recursive programs, software tools for cluster-oriented implementations of two-dimensional cellular automata, a grid-aware web interface with advanced service trading for linear algebra calculations. In the direction of solving scientific problems that require high-performance computing resources, examples of completed projects include the field of physics (parallel computing of complex quantum systems), astrophysics (Armenian virtual laboratory), biology (molecular dynamics study of human red blood cell membrane), meteorology (implementing and evaluating the Weather Research and Forecast Model for the territory of Armenia). The overview also notes that the Institute for Informatics and Automation Problems of the National Academy of Sciences of Armenia has established a scientific and educational infrastructure, uniting computing clusters of scientific and educational institutions of the country and provides the scientific community with access to local and international computational resources, that is a strong support for computational science in Armenia.
OMPC: an Open-Source MATLAB-to-Python Compiler.
Jurica, Peter; van Leeuwen, Cees
2009-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB((R)), the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB((R))-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB((R)) functions into Python programs. The imported MATLAB((R)) modules will run independently of MATLAB((R)), relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB((R)). OMPC is available at http://ompc.juricap.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Almgren, Ann; DeMar, Phil; Vetter, Jeffrey
The widespread use of computing in the American economy would not be possible without a thoughtful, exploratory research and development (R&D) community pushing the performance edge of operating systems, computer languages, and software libraries. These are the tools and building blocks — the hammers, chisels, bricks, and mortar — of the smartphone, the cloud, and the computing services on which we rely. Engineers and scientists need ever-more specialized computing tools to discover new material properties for manufacturing, make energy generation safer and more efficient, and provide insight into the fundamentals of the universe, for example. The research division of themore » U.S. Department of Energy’s (DOE’s) Office of Advanced Scientific Computing and Research (ASCR Research) ensures that these tools and building blocks are being developed and honed to meet the extreme needs of modern science. See also http://exascaleage.org/ascr/ for additional information.« less
Gene regulation knowledge commons: community action takes care of DNA binding transcription factors
Tripathi, Sushil; Vercruysse, Steven; Chawla, Konika; Christie, Karen R.; Blake, Judith A.; Huntley, Rachael P.; Orchard, Sandra; Hermjakob, Henning; Thommesen, Liv; Lægreid, Astrid; Kuiper, Martin
2016-01-01
A large gap remains between the amount of knowledge in scientific literature and the fraction that gets curated into standardized databases, despite many curation initiatives. Yet the availability of comprehensive knowledge in databases is crucial for exploiting existing background knowledge, both for designing follow-up experiments and for interpreting new experimental data. Structured resources also underpin the computational integration and modeling of regulatory pathways, which further aids our understanding of regulatory dynamics. We argue how cooperation between the scientific community and professional curators can increase the capacity of capturing precise knowledge from literature. We demonstrate this with a project in which we mobilize biological domain experts who curate large amounts of DNA binding transcription factors, and show that they, although new to the field of curation, can make valuable contributions by harvesting reported knowledge from scientific papers. Such community curation can enhance the scientific epistemic process. Database URL: http://www.tfcheckpoint.org PMID:27270715
Reconfigurable Computing for Computational Science: A New Focus in High Performance Computing
2006-11-01
in the past decade. Researchers are regularly employing the power of large computing systems and parallel processing to tackle larger and more...complex problems in all of the physical sciences. For the past decade or so, most of this growth in computing power has been “free” with increased...the scientific computing community as a means to continued growth in computing capability. This paper offers a glimpse of the hardware and
Computational Infrastructure for Geodynamics (CIG)
NASA Astrophysics Data System (ADS)
Gurnis, M.; Kellogg, L. H.; Bloxham, J.; Hager, B. H.; Spiegelman, M.; Willett, S.; Wysession, M. E.; Aivazis, M.
2004-12-01
Solid earth geophysicists have a long tradition of writing scientific software to address a wide range of problems. In particular, computer simulations came into wide use in geophysics during the decade after the plate tectonic revolution. Solution schemes and numerical algorithms that developed in other areas of science, most notably engineering, fluid mechanics, and physics, were adapted with considerable success to geophysics. This software has largely been the product of individual efforts and although this approach has proven successful, its strength for solving problems of interest is now starting to show its limitations as we try to share codes and algorithms or when we want to recombine codes in novel ways to produce new science. With funding from the NSF, the US community has embarked on a Computational Infrastructure for Geodynamics (CIG) that will develop, support, and disseminate community-accessible software for the greater geodynamics community from model developers to end-users. The software is being developed for problems involving mantle and core dynamics, crustal and earthquake dynamics, magma migration, seismology, and other related topics. With a high level of community participation, CIG is leveraging state-of-the-art scientific computing into a suite of open-source tools and codes. The infrastructure that we are now starting to develop will consist of: (a) a coordinated effort to develop reusable, well-documented and open-source geodynamics software; (b) the basic building blocks - an infrastructure layer - of software by which state-of-the-art modeling codes can be quickly assembled; (c) extension of existing software frameworks to interlink multiple codes and data through a superstructure layer; (d) strategic partnerships with the larger world of computational science and geoinformatics; and (e) specialized training and workshops for both the geodynamics and broader Earth science communities. The CIG initiative has already started to leverage and develop long-term strategic partnerships with open source development efforts within the larger thrusts of scientific computing and geoinformatics. These strategic partnerships are essential as the frontier has moved into multi-scale and multi-physics problems in which many investigators now want to use simulation software for data interpretation, data assimilation, and hypothesis testing.
Experience Paper: Software Engineering and Community Codes Track in ATPESC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubey, Anshu; Riley, Katherine M.
Argonne Training Program in Extreme Scale Computing (ATPESC) was started by the Argonne National Laboratory with the objective of expanding the ranks of better prepared users of high performance computing (HPC) machines. One of the unique aspects of the program was inclusion of software engineering and community codes track. The inclusion was motivated by the observation that the projects with a good scientific and software process were better able to meet their scientific goals. In this paper we present our experience of running the software track from the beginning of the program until now. We discuss the motivations, the reception,more » and the evolution of the track over the years. We welcome discussion and input from the community to enhance the track in ATPESC, and also to facilitate inclusion of similar tracks in other HPC oriented training programs.« less
A quick guide for building a successful bioinformatics community.
Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas
2015-02-01
"Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).
ERIC Educational Resources Information Center
Howles, Trudy
2009-01-01
Student attrition and low graduation rates are critical problems in computer science education. Disappointing graduation rates and declining student interest have caught the attention of business leaders, researchers and universities. With weak graduation rates and little interest in scientific computing, many are concerned about the USA's ability…
USDA-ARS?s Scientific Manuscript database
Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...
The future of scientific workflows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deelman, Ewa; Peterka, Tom; Altintas, Ilkay
Today’s computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of Energy (DOE) invited a diverse group of domain and computer scientists from national laboratories supported by the Office of Science, the National Nuclear Security Administration, from industry, and from academia to review the workflow requirements of DOE’s science and national security missions, to assess the current state of the art in science workflows, to understand the impact of emerging extreme-scale computing systems on thosemore » workflows, and to develop requirements for automated workflow management in future and existing environments. This article is a summary of the opinions of over 50 leading researchers attending this workshop. We highlight use cases, computing systems, workflow needs and conclude by summarizing the remaining challenges this community sees that inhibit large-scale scientific workflows from becoming a mainstream tool for extreme-scale science.« less
ERIC Educational Resources Information Center
Sainz, Milagros; Lopez-Saez, Mercedes
2010-01-01
The dearth of women in technology and ICT-related fields continues to be a topic of interest for both the scientific community and decision-makers. Research on attitudes towards computers proves that women display more negative computer attitudes than men and also make less intense use of technology and computers than their male counterparts. For…
Idea Paper: The Lifecycle of Software for Scientific Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubey, Anshu; McInnes, Lois C.
The software lifecycle is a well researched topic that has produced many models to meet the needs of different types of software projects. However, one class of projects, software development for scientific computing, has received relatively little attention from lifecycle researchers. In particular, software for end-to-end computations for obtaining scientific results has received few lifecycle proposals and no formalization of a development model. An examination of development approaches employed by the teams implementing large multicomponent codes reveals a great deal of similarity in their strategies. This idea paper formalizes these related approaches into a lifecycle model for end-to-end scientific applicationmore » software, featuring loose coupling between submodels for development of infrastructure and scientific capability. We also invite input from stakeholders to converge on a model that captures the complexity of this development processes and provides needed lifecycle guidance to the scientific software community.« less
Bioinformatics and Astrophysics Cluster (BinAc)
NASA Astrophysics Data System (ADS)
Krüger, Jens; Lutz, Volker; Bartusch, Felix; Dilling, Werner; Gorska, Anna; Schäfer, Christoph; Walter, Thomas
2017-09-01
BinAC provides central high performance computing capacities for bioinformaticians and astrophysicists from the state of Baden-Württemberg. The bwForCluster BinAC is part of the implementation concept for scientific computing for the universities in Baden-Württemberg. Community specific support is offered through the bwHPC-C5 project.
NASA Astrophysics Data System (ADS)
Fairley, J. P.; Hinds, J. J.
2003-12-01
The advent of the World Wide Web in the early 1990s not only revolutionized the exchange of ideas and information within the scientific community, but also provided educators with a new array of teaching, informational, and promotional tools. Use of computer graphics and animation to explain concepts and processes can stimulate classroom participation and student interest in the geosciences, which has historically attracted students with strong spatial and visualization skills. In today's job market, graduates are expected to have knowledge of computers and the ability to use them for acquiring, processing, and visually analyzing data. Furthermore, in addition to promoting visibility and communication within the scientific community, computer graphics and the Internet can be informative and educational for the general public. Although computer skills are crucial for earth science students and educators, many pitfalls exist in implementing computer technology and web-based resources into research and classroom activities. Learning to use these new tools effectively requires a significant time commitment and careful attention to the source and reliability of the data presented. Furthermore, educators have a responsibility to ensure that students and the public understand the assumptions and limitations of the materials presented, rather than allowing them to be overwhelmed by "gee-whiz" aspects of the technology. We present three examples of computer technology in the earth sciences classroom: 1) a computer animation of water table response to well pumping, 2) a 3-D fly-through animation of a fault controlled valley, and 3) a virtual field trip for an introductory geology class. These examples demonstrate some of the challenges and benefits of these new tools, and encourage educators to expand the responsible use of computer technology for teaching and communicating scientific results to the general public.
What do computer scientists tweet? Analyzing the link-sharing practice on Twitter.
Schmitt, Marco; Jäschke, Robert
2017-01-01
Twitter communication has permeated every sphere of society. To highlight and share small pieces of information with possibly vast audiences or small circles of the interested has some value in almost any aspect of social life. But what is the value exactly for a scientific field? We perform a comprehensive study of computer scientists using Twitter and their tweeting behavior concerning the sharing of web links. Discerning the domains, hosts and individual web pages being tweeted and the differences between computer scientists and a Twitter sample enables us to look in depth at the Twitter-based information sharing practices of a scientific community. Additionally, we aim at providing a deeper understanding of the role and impact of altmetrics in computer science and give a glance at the publications mentioned on Twitter that are most relevant for the computer science community. Our results show a link sharing culture that concentrates more heavily on public and professional quality information than the Twitter sample does. The results also show a broad variety in linked sources and especially in linked publications with some publications clearly related to community-specific interests of computer scientists, while others with a strong relation to attention mechanisms in social media. This refers to the observation that Twitter is a hybrid form of social media between an information service and a social network service. Overall the computer scientists' style of usage seems to be more on the information-oriented side and to some degree also on professional usage. Therefore, altmetrics are of considerable use in analyzing computer science.
What do computer scientists tweet? Analyzing the link-sharing practice on Twitter
Schmitt, Marco
2017-01-01
Twitter communication has permeated every sphere of society. To highlight and share small pieces of information with possibly vast audiences or small circles of the interested has some value in almost any aspect of social life. But what is the value exactly for a scientific field? We perform a comprehensive study of computer scientists using Twitter and their tweeting behavior concerning the sharing of web links. Discerning the domains, hosts and individual web pages being tweeted and the differences between computer scientists and a Twitter sample enables us to look in depth at the Twitter-based information sharing practices of a scientific community. Additionally, we aim at providing a deeper understanding of the role and impact of altmetrics in computer science and give a glance at the publications mentioned on Twitter that are most relevant for the computer science community. Our results show a link sharing culture that concentrates more heavily on public and professional quality information than the Twitter sample does. The results also show a broad variety in linked sources and especially in linked publications with some publications clearly related to community-specific interests of computer scientists, while others with a strong relation to attention mechanisms in social media. This refers to the observation that Twitter is a hybrid form of social media between an information service and a social network service. Overall the computer scientists’ style of usage seems to be more on the information-oriented side and to some degree also on professional usage. Therefore, altmetrics are of considerable use in analyzing computer science. PMID:28636619
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schlicher, Bob G; Kulesz, James J; Abercrombie, Robert K
A principal tenant of the scientific method is that experiments must be repeatable and relies on ceteris paribus (i.e., all other things being equal). As a scientific community, involved in data sciences, we must investigate ways to establish an environment where experiments can be repeated. We can no longer allude to where the data comes from, we must add rigor to the data collection and management process from which our analysis is conducted. This paper describes a computing environment to support repeatable scientific big data experimentation of world-wide scientific literature, and recommends a system that is housed at the Oakmore » Ridge National Laboratory in order to provide value to investigators from government agencies, academic institutions, and industry entities. The described computing environment also adheres to the recently instituted digital data management plan mandated by multiple US government agencies, which involves all stages of the digital data life cycle including capture, analysis, sharing, and preservation. It particularly focuses on the sharing and preservation of digital research data. The details of this computing environment are explained within the context of cloud services by the three layer classification of Software as a Service , Platform as a Service , and Infrastructure as a Service .« less
A Quick Guide for Building a Successful Bioinformatics Community
Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas
2015-01-01
“Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas; Schuman, Catherine; Patton, Robert
The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshopmore » we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.« less
A future for systems and computational neuroscience in France?
Faugeras, Olivier; Frégnac, Yves; Samuelides, Manuel
2007-01-01
This special issue of the Journal of Physiology, Paris, is an outcome of NeuroComp'06, the first French conference in Computational Neuroscience. The preparation for this conference, held at Pont-à-Mousson in October 2006, was accompanied by a survey which has resulted in an up-to-date inventory of human resources and labs in France concerned with this emerging new field of research (see team directory in http://neurocomp.risc.cnrs.fr/). This thematic JPP issue gathers some of the key scientific presentations made on the occasion of this first interdisciplinary meeting, which should soon become recognized as a yearly national conference representative of a new scientific community. The present introductory paper presents the general scientific context of the conference and reviews some of the historical and conceptual foundations of Systems and Computational Neuroscience in France.
The StratusLab cloud distribution: Use-cases and support for scientific applications
NASA Astrophysics Data System (ADS)
Floros, E.
2012-04-01
The StratusLab project is integrating an open cloud software distribution that enables organizations to setup and provide their own private or public IaaS (Infrastructure as a Service) computing clouds. StratusLab distribution capitalizes on popular infrastructure virtualization solutions like KVM, the OpenNebula virtual machine manager, Claudia service manager and SlipStream deployment platform, which are further enhanced and expanded with additional components developed within the project. The StratusLab distribution covers the core aspects of a cloud IaaS architecture, namely Computing (life-cycle management of virtual machines), Storage, Appliance management and Networking. The resulting software stack provides a packaged turn-key solution for deploying cloud computing services. The cloud computing infrastructures deployed using StratusLab can support a wide range of scientific and business use cases. Grid computing has been the primary use case pursued by the project and for this reason the initial priority has been the support for the deployment and operation of fully virtualized production-level grid sites; a goal that has already been achieved by operating such a site as part of EGI's (European Grid Initiative) pan-european grid infrastructure. In this area the project is currently working to provide non-trivial capabilities like elastic and autonomic management of grid site resources. Although grid computing has been the motivating paradigm, StratusLab's cloud distribution can support a wider range of use cases. Towards this direction, we have developed and currently provide support for setting up general purpose computing solutions like Hadoop, MPI and Torque clusters. For what concerns scientific applications the project is collaborating closely with the Bioinformatics community in order to prepare VM appliances and deploy optimized services for bioinformatics applications. In a similar manner additional scientific disciplines like Earth Science can take advantage of StratusLab cloud solutions. Interested users are welcomed to join StratusLab's user community by getting access to the reference cloud services deployed by the project and offered to the public.
Final Report. Institute for Ultralscale Visualization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, Kwan-Liu; Galli, Giulia; Gygi, Francois
The SciDAC Institute for Ultrascale Visualization brought together leading experts from visualization, high-performance computing, and science application areas to make advanced visualization solutions for SciDAC scientists and the broader community. Over the five-year project, the Institute introduced many new enabling visualization techniques, which have significantly enhanced scientists’ ability to validate their simulations, interpret their data, and communicate with others about their work and findings. This Institute project involved a large number of junior and student researchers, who received the opportunities to work on some of the most challenging science applications and gain access to the most powerful high-performance computing facilitiesmore » in the world. They were readily trained and prepared for facing the greater challenges presented by extreme-scale computing. The Institute’s outreach efforts, through publications, workshops and tutorials, successfully disseminated the new knowledge and technologies to the SciDAC and the broader scientific communities. The scientific findings and experience of the Institute team helped plan the SciDAC3 program.« less
Eppig, Janan T
2017-07-01
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. © The Author 2017. Published by Oxford University Press.
Eppig, Janan T.
2017-01-01
Abstract The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. PMID:28838066
Science preparedness and science response: perspectives on the dynamics of preparedness conference.
Lant, Timothy; Lurie, Nicole
2013-01-01
The ability of the scientific modeling community to meaningfully contribute to postevent response activities during public health emergencies was the direct result of a discrete set of preparedness activities as well as advances in theory and technology. Scientists and decision-makers have recognized the value of developing scientific tools (e.g. models, data sets, communities of practice) to prepare them to be able to respond quickly--in a manner similar to preparedness activities by first-responders and emergency managers. Computational models have matured in their ability to better inform response plans by modeling human behaviors and complex systems. We advocate for further development of science preparedness activities as deliberate actions taken in advance of an unpredicted event (or an event with unknown consequences) to increase the scientific tools and evidence-base available to decision makers and the whole-of-community to limit adverse outcomes.
ISCR Annual Report: Fical Year 2004
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGraw, J R
2005-03-03
Large-scale scientific computation and all of the disciplines that support and help to validate it have been placed at the focus of Lawrence Livermore National Laboratory (LLNL) by the Advanced Simulation and Computing (ASC) program of the National Nuclear Security Administration (NNSA) and the Scientific Discovery through Advanced Computing (SciDAC) initiative of the Office of Science of the Department of Energy (DOE). The maturation of computational simulation as a tool of scientific and engineering research is underscored in the November 2004 statement of the Secretary of Energy that, ''high performance computing is the backbone of the nation's science and technologymore » enterprise''. LLNL operates several of the world's most powerful computers--including today's single most powerful--and has undertaken some of the largest and most compute-intensive simulations ever performed. Ultrascale simulation has been identified as one of the highest priorities in DOE's facilities planning for the next two decades. However, computers at architectural extremes are notoriously difficult to use efficiently. Furthermore, each successful terascale simulation only points out the need for much better ways of interacting with the resulting avalanche of data. Advances in scientific computing research have, therefore, never been more vital to LLNL's core missions than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, LLNL must engage researchers at many academic centers of excellence. In Fiscal Year 2004, the Institute for Scientific Computing Research (ISCR) served as one of LLNL's main bridges to the academic community with a program of collaborative subcontracts, visiting faculty, student internships, workshops, and an active seminar series. The ISCR identifies researchers from the academic community for computer science and computational science collaborations with LLNL and hosts them for short- and long-term visits with the aim of encouraging long-term academic research agendas that address LLNL's research priorities. Through such collaborations, ideas and software flow in both directions, and LLNL cultivates its future workforce. The Institute strives to be LLNL's ''eyes and ears'' in the computer and information sciences, keeping the Laboratory aware of and connected to important external advances. It also attempts to be the ''feet and hands'' that carry those advances into the Laboratory and incorporates them into practice. ISCR research participants are integrated into LLNL's Computing and Applied Research (CAR) Department, especially into its Center for Applied Scientific Computing (CASC). In turn, these organizations address computational challenges arising throughout the rest of the Laboratory. Administratively, the ISCR flourishes under LLNL's University Relations Program (URP). Together with the other five institutes of the URP, it navigates a course that allows LLNL to benefit from academic exchanges while preserving national security. While it is difficult to operate an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and worth the continued effort.« less
Capturing Petascale Application Characteristics with the Sequoia Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vetter, Jeffrey S; Bhatia, Nikhil; Grobelny, Eric M
2005-01-01
Characterization of the computation, communication, memory, and I/O demands of current scientific applications is crucial for identifying which technologies will enable petascale scientific computing. In this paper, we present the Sequoia Toolkit for characterizing HPC applications. The Sequoia Toolkit consists of the Sequoia trace capture library and the Sequoia Event Analysis Library, or SEAL, that facilitates the development of tools for analyzing Sequoia event traces. Using the Sequoia Toolkit, we have characterized the behavior of application runs with up to 2048 application processes. To illustrate the use of the Sequoia Toolkit, we present a preliminary characterization of LAMMPS, a molecularmore » dynamics application of great interest to the computational biology community.« less
Evolution of the Virtualized HPC Infrastructure of Novosibirsk Scientific Center
NASA Astrophysics Data System (ADS)
Adakin, A.; Anisenkov, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Korol, A.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Skovpen, K.; Sukharev, A.; Zaytsev, A.
2012-12-01
Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies, and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of computing farms involved in its research field, currently we've got several computing facilities hosted by NSC institutes, each optimized for a particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. A dedicated optical network with the initial bandwidth of 10 Gb/s connecting these three facilities was built in order to make it possible to share the computing resources among the research communities, thus increasing the efficiency of operating the existing computing facilities and offering a common platform for building the computing infrastructure for future scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technology based on XEN and KVM platforms. This contribution gives a thorough review of the present status and future development prospects for the NSC virtualized computing infrastructure and the experience gained while using it for running production data analysis jobs related to HEP experiments being carried out at BINP, especially the KEDR detector experiment at the VEPP-4M electron-positron collider.
Multicore: Fallout from a Computing Evolution
Yelick, Kathy [Director, NERSC
2017-12-09
July 22, 2008 Berkeley Lab lecture: Parallel computing used to be reserved for big science and engineering projects, but in two years that's all changed. Even laptops and hand-helds use parallel processors. Unfortunately, the software hasn't kept pace. Kathy Yelick, Director of the National Energy Research Scientific Computing Center at Berkeley Lab, describes the resulting chaos and the computing community's efforts to develop exciting applications that take advantage of tens or hundreds of processors on a single chip.
Optimizing Engineering Tools Using Modern Ground Architectures
2017-12-01
Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
ACM TOMS replicated computational results initiative
Heroux, Michael Allen
2015-06-03
In this study, the scientific community relies on the peer review process for assuring the quality of published material, the goal of which is to build a body of work we can trust. Computational journals such as The ACM Transactions on Mathematical Software (TOMS) use this process for rigorously promoting the clarity and completeness of content, and citation of prior work. At the same time, it is unusual to independently confirm computational results.
NASA Astrophysics Data System (ADS)
Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.
2007-12-01
Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but is needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.
Alford, Rebecca F.; Dolan, Erin L.
2017-01-01
Computational biology is an interdisciplinary field, and many computational biology research projects involve distributed teams of scientists. To accomplish their work, these teams must overcome both disciplinary and geographic barriers. Introducing new training paradigms is one way to facilitate research progress in computational biology. Here, we describe a new undergraduate program in biomolecular structure prediction and design in which students conduct research at labs located at geographically-distributed institutions while remaining connected through an online community. This 10-week summer program begins with one week of training on computational biology methods development, transitions to eight weeks of research, and culminates in one week at the Rosetta annual conference. To date, two cohorts of students have participated, tackling research topics including vaccine design, enzyme design, protein-based materials, glycoprotein modeling, crowd-sourced science, RNA processing, hydrogen bond networks, and amyloid formation. Students in the program report outcomes comparable to students who participate in similar in-person programs. These outcomes include the development of a sense of community and increases in their scientific self-efficacy, scientific identity, and science values, all predictors of continuing in a science research career. Furthermore, the program attracted students from diverse backgrounds, which demonstrates the potential of this approach to broaden the participation of young scientists from backgrounds traditionally underrepresented in computational biology. PMID:29216185
Alford, Rebecca F; Leaver-Fay, Andrew; Gonzales, Lynda; Dolan, Erin L; Gray, Jeffrey J
2017-12-01
Computational biology is an interdisciplinary field, and many computational biology research projects involve distributed teams of scientists. To accomplish their work, these teams must overcome both disciplinary and geographic barriers. Introducing new training paradigms is one way to facilitate research progress in computational biology. Here, we describe a new undergraduate program in biomolecular structure prediction and design in which students conduct research at labs located at geographically-distributed institutions while remaining connected through an online community. This 10-week summer program begins with one week of training on computational biology methods development, transitions to eight weeks of research, and culminates in one week at the Rosetta annual conference. To date, two cohorts of students have participated, tackling research topics including vaccine design, enzyme design, protein-based materials, glycoprotein modeling, crowd-sourced science, RNA processing, hydrogen bond networks, and amyloid formation. Students in the program report outcomes comparable to students who participate in similar in-person programs. These outcomes include the development of a sense of community and increases in their scientific self-efficacy, scientific identity, and science values, all predictors of continuing in a science research career. Furthermore, the program attracted students from diverse backgrounds, which demonstrates the potential of this approach to broaden the participation of young scientists from backgrounds traditionally underrepresented in computational biology.
Technologies for Large Data Management in Scientific Computing
NASA Astrophysics Data System (ADS)
Pace, Alberto
2014-01-01
In recent years, intense usage of computing has been the main strategy of investigations in several scientific research projects. The progress in computing technology has opened unprecedented opportunities for systematic collection of experimental data and the associated analysis that were considered impossible only few years ago. This paper focuses on the strategies in use: it reviews the various components that are necessary for an effective solution that ensures the storage, the long term preservation, and the worldwide distribution of large quantities of data that are necessary in a large scientific research project. The paper also mentions several examples of data management solutions used in High Energy Physics for the CERN Large Hadron Collider (LHC) experiments in Geneva, Switzerland which generate more than 30,000 terabytes of data every year that need to be preserved, analyzed, and made available to a community of several tenth of thousands scientists worldwide.
Constraint-based stoichiometric modelling from single organisms to microbial communities
Olivier, Brett G.; Bruggeman, Frank J.; Teusink, Bas
2016-01-01
Microbial communities are ubiquitously found in Nature and have direct implications for the environment, human health and biotechnology. The species composition and overall function of microbial communities are largely shaped by metabolic interactions such as competition for resources and cross-feeding. Although considerable scientific progress has been made towards mapping and modelling species-level metabolism, elucidating the metabolic exchanges between microorganisms and steering the community dynamics remain an enormous scientific challenge. In view of the complexity, computational models of microbial communities are essential to obtain systems-level understanding of ecosystem functioning. This review discusses the applications and limitations of constraint-based stoichiometric modelling tools, and in particular flux balance analysis (FBA). We explain this approach from first principles and identify the challenges one faces when extending it to communities, and discuss the approaches used in the field in view of these challenges. We distinguish between steady-state and dynamic FBA approaches extended to communities. We conclude that much progress has been made, but many of the challenges are still open. PMID:28334697
Schopf, Jennifer M.; Nitzberg, Bill
2002-01-01
The design and implementation of a national computing system and data grid has become a reachable goal from both the computer science and computational science point of view. A distributed infrastructure capable of sophisticated computational functions can bring many benefits to scientific work, but poses many challenges, both technical and socio-political. Technical challenges include having basic software tools, higher-level services, functioning and pervasive security, and standards, while socio-political issues include building a user community, adding incentives for sites to be part of a user-centric environment, and educating funding sources about the needs of this community. This paper details the areasmore » relating to Grid research that we feel still need to be addressed to fully leverage the advantages of the Grid.« less
Data Crosscutting Requirements Review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kleese van Dam, Kerstin; Shoshani, Arie; Plata, Charity
2013-04-01
In April 2013, a diverse group of researchers from the U.S. Department of Energy (DOE) scientific community assembled to assess data requirements associated with DOE-sponsored scientific facilities and large-scale experiments. Participants in the review included facilities staff, program managers, and scientific experts from the offices of Basic Energy Sciences, Biological and Environmental Research, High Energy Physics, and Advanced Scientific Computing Research. As part of the meeting, review participants discussed key issues associated with three distinct aspects of the data challenge: 1) processing, 2) management, and 3) analysis. These discussions identified commonalities and differences among the needs of varied scientific communities.more » They also helped to articulate gaps between current approaches and future needs, as well as the research advances that will be required to close these gaps. Moreover, the review provided a rare opportunity for experts from across the Office of Science to learn about their collective expertise, challenges, and opportunities. The "Data Crosscutting Requirements Review" generated specific findings and recommendations for addressing large-scale data crosscutting requirements.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Childers, L.; Liming, L.; Foster, I.
2008-10-15
This report summarizes the methodology and results of a user perspectives study conducted by the Community Driven Improvement of Globus Software (CDIGS) project. The purpose of the study was to document the work-related goals and challenges facing today's scientific technology users, to record their perspectives on Globus software and the distributed-computing ecosystem, and to provide recommendations to the Globus community based on the observations. Globus is a set of open source software components intended to provide a framework for collaborative computational science activities. Rather than attempting to characterize all users or potential users of Globus software, our strategy has beenmore » to speak in detail with a small group of individuals in the scientific community whose work appears to be the kind that could benefit from Globus software, learn as much as possible about their work goals and the challenges they face, and describe what we found. The result is a set of statements about specific individuals experiences. We do not claim that these are representative of a potential user community, but we do claim to have found commonalities and differences among the interviewees that may be reflected in the user community as a whole. We present these as a series of hypotheses that can be tested by subsequent studies, and we offer recommendations to Globus developers based on the assumption that these hypotheses are representative. Specifically, we conducted interviews with thirty technology users in the scientific community. We included both people who have used Globus software and those who have not. We made a point of including individuals who represent a variety of roles in scientific projects, for example, scientists, software developers, engineers, and infrastructure providers. The following material is included in this report: (1) A summary of the reported work-related goals, significant issues, and points of satisfaction with the use of Globus software; (2) A method for characterizing users according to their technology interactions, and identification of four user types among the interviewees using the method; (3) Four profiles that highlight points of commonality and diversity in each user type; (4) Recommendations for technology developers and future studies; (5) A description of the interview protocol and overall study methodology; (6) An anonymized list of the interviewees; and (7) Interview writeups and summary data. The interview summaries in Section 3 and transcripts in Appendix D illustrate the value of distributed computing software--and Globus in particular--to scientific enterprises. They also document opportunities to make these tools still more useful both to current users and to new communities. We aim our recommendations at developers who intend their software to be used and reused in many applications. (This kind of software is often referred to as 'middleware.') Our two core recommendations are as follows. First, it is essential for middleware developers to understand and explicitly manage the multiple user products in which their software components are used. We must avoid making assumptions about the commonality of these products and, instead, study and account for their diversity. Second, middleware developers should engage in different ways with different kinds of users. Having identified four general user types in Section 4, we provide specific ideas for how to engage them in Section 5.« less
Multicore: Fallout From a Computing Evolution (LBNL Summer Lecture Series)
Yelick, Kathy [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
2018-05-07
Summer Lecture Series 2008: Parallel computing used to be reserved for big science and engineering projects, but in two years that's all changed. Even laptops and hand-helds use parallel processors. Unfortunately, the software hasn't kept pace. Kathy Yelick, Director of the National Energy Research Scientific Computing Center at Berkeley Lab, describes the resulting chaos and the computing community's efforts to develop exciting applications that take advantage of tens or hundreds of processors on a single chip.
A feasibility study on porting the community land model onto accelerators using OpenACC
Wang, Dali; Wu, Wei; Winkler, Frank; ...
2014-01-01
As environmental models (such as Accelerated Climate Model for Energy (ACME), Parallel Reactive Flow and Transport Model (PFLOTRAN), Arctic Terrestrial Simulator (ATS), etc.) became more and more complicated, we are facing enormous challenges regarding to porting those applications onto hybrid computing architecture. OpenACC appears as a very promising technology, therefore, we have conducted a feasibility analysis on porting the Community Land Model (CLM), a terrestrial ecosystem model within the Community Earth System Models (CESM)). Specifically, we used automatic function testing platform to extract a small computing kernel out of CLM, then we apply this kernel into the actually CLM dataflowmore » procedure, and investigate the strategy of data parallelization and the benefit of data movement provided by current implementation of OpenACC. Even it is a non-intensive kernel, on a single 16-core computing node, the performance (based on the actual computation time using one GPU) of OpenACC implementation is 2.3 time faster than that of OpenMP implementation using single OpenMP thread, but it is 2.8 times slower than the performance of OpenMP implementation using 16 threads. On multiple nodes, MPI_OpenACC implementation demonstrated very good scalability on up to 128 GPUs on 128 computing nodes. This study also provides useful information for us to look into the potential benefits of “deep copy” capability and “routine” feature of OpenACC standards. In conclusion, we believe that our experience on the environmental model, CLM, can be beneficial to many other scientific research programs who are interested to porting their large scale scientific code using OpenACC onto high-end computers, empowered by hybrid computing architecture.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keyes, D E; McGraw, J R
2006-02-02
Large-scale scientific computation and all of the disciplines that support and help validate it have been placed at the focus of Lawrence Livermore National Laboratory (LLNL) by the Advanced Simulation and Computing (ASC) program of the National Nuclear Security Administration (NNSA) and the Scientific Discovery through Advanced Computing (SciDAC) initiative of the Office of Science of the Department of Energy (DOE). The maturation of simulation as a fundamental tool of scientific and engineering research is underscored in the President's Information Technology Advisory Committee (PITAC) June 2005 finding that ''computational science has become critical to scientific leadership, economic competitiveness, and nationalmore » security''. LLNL operates several of the world's most powerful computers--including today's single most powerful--and has undertaken some of the largest and most compute-intensive simulations ever performed, most notably the molecular dynamics simulation that sustained more than 100 Teraflop/s and won the 2005 Gordon Bell Prize. Ultrascale simulation has been identified as one of the highest priorities in DOE's facilities planning for the next two decades. However, computers at architectural extremes are notoriously difficult to use in an efficient manner. Furthermore, each successful terascale simulation only points out the need for much better ways of interacting with the resulting avalanche of data. Advances in scientific computing research have, therefore, never been more vital to the core missions of LLNL than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, LLNL must engage researchers at many academic centers of excellence. In FY 2005, the Institute for Scientific Computing Research (ISCR) served as one of LLNL's main bridges to the academic community with a program of collaborative subcontracts, visiting faculty, student internships, workshops, and an active seminar series. The ISCR identifies researchers from the academic community for computer science and computational science collaborations with LLNL and hosts them for both brief and extended visits with the aim of encouraging long-term academic research agendas that address LLNL research priorities. Through these collaborations, ideas and software flow in both directions, and LLNL cultivates its future workforce. The Institute strives to be LLNL's ''eyes and ears'' in the computer and information sciences, keeping the Laboratory aware of and connected to important external advances. It also attempts to be the ''hands and feet'' that carry those advances into the Laboratory and incorporate them into practice. ISCR research participants are integrated into LLNL's Computing Applications and Research (CAR) Department, especially into its Center for Applied Scientific Computing (CASC). In turn, these organizations address computational challenges arising throughout the rest of the Laboratory. Administratively, the ISCR flourishes under LLNL's University Relations Program (URP). Together with the other four institutes of the URP, the ISCR navigates a course that allows LLNL to benefit from academic exchanges while preserving national security. While it is difficult to operate an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and worth the continued effort. The pages of this annual report summarize the activities of the faculty members, postdoctoral researchers, students, and guests from industry and other laboratories who participated in LLNL's computational mission under the auspices of the ISCR during FY 2005.« less
Self-Concept of Computer and Math Ability: Gender Implications across Time and within ICT Studies
ERIC Educational Resources Information Center
Sainz, Milagros; Eccles, Jacquelynne
2012-01-01
The scarcity of women in ICT-related studies has been systematically reported by the scientific community for many years. This paper has three goals: to analyze gender differences in self-concept of computer and math abilities along with math performance in two consecutive academic years; to study the ontogeny of gender differences in self-concept…
Geospatial-enabled Data Exploration and Computation through Data Infrastructure Building Blocks
NASA Astrophysics Data System (ADS)
Song, C. X.; Biehl, L. L.; Merwade, V.; Villoria, N.
2015-12-01
Geospatial data are present everywhere today with the proliferation of location-aware computing devices and sensors. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. The GABBs project aims at enabling broader access to geospatial data exploration and computation by developing spatial data infrastructure building blocks that leverage capabilities of end-to-end application service and virtualized computing framework in HUBzero. Funded by NSF Data Infrastructure Building Blocks (DIBBS) initiative, GABBs provides a geospatial data architecture that integrates spatial data management, mapping and visualization and will make it available as open source. The outcome of the project will enable users to rapidly create tools and share geospatial data and tools on the web for interactive exploration of data without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the development of geospatial data infrastructure building blocks and the scientific use cases that help drive the software development, as well as seek feedback from the user communities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boman, Erik G.; Catalyurek, Umit V.; Chevalier, Cedric
2015-01-16
This final progress report summarizes the work accomplished at the Combinatorial Scientific Computing and Petascale Simulations Institute. We developed Zoltan, a parallel mesh partitioning library that made use of accurate hypergraph models to provide load balancing in mesh-based computations. We developed several graph coloring algorithms for computing Jacobian and Hessian matrices and organized them into a software package called ColPack. We developed parallel algorithms for graph coloring and graph matching problems, and also designed multi-scale graph algorithms. Three PhD students graduated, six more are continuing their PhD studies, and four postdoctoral scholars were advised. Six of these students and Fellowsmore » have joined DOE Labs (Sandia, Berkeley), as staff scientists or as postdoctoral scientists. We also organized the SIAM Workshop on Combinatorial Scientific Computing (CSC) in 2007, 2009, and 2011 to continue to foster the CSC community.« less
Data handling and visualization for NASA's science programs
NASA Technical Reports Server (NTRS)
Bredekamp, Joseph H. (Editor)
1995-01-01
Advanced information systems capabilities are essential to conducting NASA's scientific research mission. Access to these capabilities is no longer a luxury for a select few within the science community, but rather an absolute necessity for carrying out scientific investigations. The dependence on high performance computing and networking, as well as ready and expedient access to science data, metadata, and analysis tools is the fundamental underpinning for the entire research endeavor. At the same time, advances in the whole range of information technologies continues on an almost explosive growth path, reaching beyond the research community to affect the population as a whole. Capitalizing on and exploiting these advances are critical to the continued success of space science investigations. NASA must remain abreast of developments in the field and strike an appropriate balance between being a smart buyer and a direct investor in the technology which serves its unique requirements. Another key theme deals with the need for the space and computer science communities to collaborate as partners to more fully realize the potential of information technology in the space science research environment.
Uses of the Drupal CMS Collaborative Framework in the Woods Hole Scientific Community (Invited)
NASA Astrophysics Data System (ADS)
Maffei, A. R.; Chandler, C. L.; Work, T. T.; Shorthouse, D.; Furfey, J.; Miller, H.
2010-12-01
Organizations that comprise the Woods Hole scientific community (Woods Hole Oceanographic Institution, Marine Biological Laboratory, USGS Woods Hole Coastal and Marine Science Center, Woods Hole Research Center, NOAA NMFS Northeast Fisheries Science Center, SEA Education Association) have a long history of collaborative activity regarding computing, computer network and information technologies that support common, inter-disciplinary science needs. Over the past several years there has been growing interest in the use of the Drupal Content Management System (CMS) playing a variety of roles in support of research projects resident at several of these organizations. Many of these projects are part of science programs that are national and international in scope. Here we survey the current uses of Drupal within the Woods Hole scientific community and examine reasons it has been adopted. The promise of emerging semantic features in the Drupal framework is examined and projections of how pre-existing Drupal-based websites might benefit are made. Closer examination of Drupal software design exposes it as more than simply a content management system. The flexibility of its architecture; the power of its taxonomy module; the care taken in nurturing the open-source developer community that surrounds it (including organized and often well-attended code sprints); the ability to bind emerging software technologies as Drupal modules; the careful selection process used in adopting core functionality; multi-site hosting and cross-site deployment of updates and a recent trend towards development of use-case inspired Drupal distributions casts Drupal as a general-purpose application deployment framework. Recent work in the semantic arena casts Drupal as an emerging RDF framework as well. Examples of roles played by Drupal-based websites within the Woods Hole scientific community that will be discussed include: science data metadata database, organization main website, biological taxonomy development, bibliographic database, physical media data archive inventory manager, disaster-response website development framework, science project task management, science conference planning, and spreadsheet-to-database converter.
Towards Reproducibility in Computational Hydrology
NASA Astrophysics Data System (ADS)
Hutton, Christopher; Wagener, Thorsten; Freer, Jim; Han, Dawei; Duffy, Chris; Arheimer, Berit
2017-04-01
Reproducibility is a foundational principle in scientific research. The ability to independently re-run an experiment helps to verify the legitimacy of individual findings, and evolve (or reject) hypotheses and models of how environmental systems function, and move them from specific circumstances to more general theory. Yet in computational hydrology (and in environmental science more widely) the code and data that produces published results are not regularly made available, and even if they are made available, there remains a multitude of generally unreported choices that an individual scientist may have made that impact the study result. This situation strongly inhibits the ability of our community to reproduce and verify previous findings, as all the information and boundary conditions required to set up a computational experiment simply cannot be reported in an article's text alone. In Hutton et al 2016 [1], we argue that a cultural change is required in the computational hydrological community, in order to advance and make more robust the process of knowledge creation and hypothesis testing. We need to adopt common standards and infrastructures to: (1) make code readable and re-useable; (2) create well-documented workflows that combine re-useable code together with data to enable published scientific findings to be reproduced; (3) make code and workflows available, easy to find, and easy to interpret, using code and code metadata repositories. To create change we argue for improved graduate training in these areas. In this talk we reflect on our progress in achieving reproducible, open science in computational hydrology, which are relevant to the broader computational geoscience community. In particular, we draw on our experience in the Switch-On (EU funded) virtual water science laboratory (http://www.switch-on-vwsl.eu/participate/), which is an open platform for collaboration in hydrological experiments (e.g. [2]). While we use computational hydrology as the example application area, we believe that our conclusions are of value to the wider environmental and geoscience community as far as the use of code and models for scientific advancement is concerned. References: [1] Hutton, C., T. Wagener, J. Freer, D. Han, C. Duffy, and B. Arheimer (2016), Most computational hydrology is not reproducible, so is it really science?, Water Resour. Res., 52, 7548-7555, doi:10.1002/2016WR019285. [2] Ceola, S., et al. (2015), Virtual laboratories: New opportunities for collaborative water science, Hydrol. Earth Syst. Sci. Discuss., 11(12), 13443-13478, doi:10.5194/hessd-11-13443-2014.
Computational Simulations and the Scientific Method
NASA Technical Reports Server (NTRS)
Kleb, Bil; Wood, Bill
2005-01-01
As scientific simulation software becomes more complicated, the scientific-software implementor's need for component tests from new model developers becomes more crucial. The community's ability to follow the basic premise of the Scientific Method requires independently repeatable experiments, and model innovators are in the best position to create these test fixtures. Scientific software developers also need to quickly judge the value of the new model, i.e., its cost-to-benefit ratio in terms of gains provided by the new model and implementation risks such as cost, time, and quality. This paper asks two questions. The first is whether other scientific software developers would find published component tests useful, and the second is whether model innovators think publishing test fixtures is a feasible approach.
ITK: enabling reproducible research and open science
McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis
2014-01-01
Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46. PMID:24600387
ITK: enabling reproducible research and open science.
McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis
2014-01-01
Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46.
INDIGO-DataCloud solutions for Earth Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Fiore, Sandro; Monna, Stephen; Chen, Yin
2017-04-01
INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project aiming to develop a data and computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. The development of INDIGO solutions covers the different layers in cloud computing (IaaS, PaaS, SaaS), and provides tools to exploit resources like HPC or GPGPUs. INDIGO is oriented to support European Scientific research communities, that are well represented in the project. Twelve different Case Studies have been analyzed in detail from different fields: Biological & Medical sciences, Social sciences & Humanities, Environmental and Earth sciences and Physics & Astrophysics. INDIGO-DataCloud provides solutions to emerging challenges in Earth Science like: -Enabling an easy deployment of community services at different cloud sites. Many Earth Science research infrastructures often involve distributed observation stations across countries, and also have distributed data centers to support the corresponding data acquisition and curation. There is a need to easily deploy new data center services while the research infrastructure continuous spans. As an example: LifeWatch (ESFRI, Ecosystems and Biodiversity) uses INDIGO solutions to manage the deployment of services to perform complex hydrodynamics and water quality modelling over a Cloud Computing environment, predicting algae blooms, using the Docker technology: TOSCA requirement description, Docker repository, Orchestrator for deployment, AAI (AuthN, AuthZ) and OneData (Distributed Storage System). -Supporting Big Data Analysis. Nowadays, many Earth Science research communities produce large amounts of data and and are challenged by the difficulties of processing and analysing it. A climate models intercomparison data analysis case study for the European Network for Earth System Modelling (ENES) community has been setup, based on the Ophidia big data analysis framework and the Kepler workflow management system. Such services normally involve a large and distributed set of data and computing resources. In this regard, this case study exploits the INDIGO PaaS for a flexible and dynamic allocation of the resources at the infrastructural level. -Providing Distributed Data Storage Solutions. In order to allow scientific communities to perform heavy computation on huge datasets, INDIGO provides global data access solutions allowing researchers to access data in a distributed environment like fashion regardless of its location, and also to publish and share their research results with public or close communities. INDIGO solutions that support the access to distributed data storage (OneData) are being tested on EMSO infrastructure (Ocean Sciences and Geohazards) data. Another aspect of interest for the EMSO community is in efficient data processing by exploiting INDIGO services like PaaS Orchestrator. Further, for HPC exploitation, a new solution named Udocker has been implemented, enabling users to execute docker containers in supercomputers, without requiring administration privileges. This presentation will overview INDIGO solutions that are interesting and useful for Earth science communities and will show how they can be applied to other Case Studies.
[Computer simulation of thyroid regulatory mechanisms in health and malignancy].
Abduvaliev, A A; Gil'dieva, M S; Khidirov, B N; Saĭdalieva, M; Saatov, T S
2010-07-01
The paper describes a computer model for regulation of the number of thyroid follicular cells in health and malignancy. The authors'computer program for mathematical simulation of the regulatory mechanisms of a thyroid follicular cellular community cannot be now referred to as good commercial products. For commercialization of this product, it is necessary to draw up a direct relation of the introduced corrected values from the actually existing normal values, such as the peripheral blood concentrations of thyroid hormones or the mean values of endocrine tissue mitotic activity. However, the described computer program has been also used in researches by our scientific group in the study of thyroid cancer. The available biological experimental data and theoretical provisions on thyroid structural and functional organization at the cellular level allow one to construct mathematical models for quantitative analysis of the regulation of the size of a cellular community of a thyroid follicle in health and abnormalities, by using the method for simulation of the regulatory mechanisms of living systems and the equations of cellular community regulatory communities.
Curating Big Data Made Simple: Perspectives from Scientific Communities.
Sowe, Sulayman K; Zettsu, Koji
2014-03-01
The digital universe is exponentially producing an unprecedented volume of data that has brought benefits as well as fundamental challenges for enterprises and scientific communities alike. This trend is inherently exciting for the development and deployment of cloud platforms to support scientific communities curating big data. The excitement stems from the fact that scientists can now access and extract value from the big data corpus, establish relationships between bits and pieces of information from many types of data, and collaborate with a diverse community of researchers from various domains. However, despite these perceived benefits, to date, little attention is focused on the people or communities who are both beneficiaries and, at the same time, producers of big data. The technical challenges posed by big data are as big as understanding the dynamics of communities working with big data, whether scientific or otherwise. Furthermore, the big data era also means that big data platforms for data-intensive research must be designed in such a way that research scientists can easily search and find data for their research, upload and download datasets for onsite/offsite use, perform computations and analysis, share their findings and research experience, and seamlessly collaborate with their colleagues. In this article, we present the architecture and design of a cloud platform that meets some of these requirements, and a big data curation model that describes how a community of earth and environmental scientists is using the platform to curate data. Motivation for developing the platform, lessons learnt in overcoming some challenges associated with supporting scientists to curate big data, and future research directions are also presented.
ISEES: an institute for sustainable software to accelerate environmental science
NASA Astrophysics Data System (ADS)
Jones, M. B.; Schildhauer, M.; Fox, P. A.
2013-12-01
Software is essential to the full science lifecycle, spanning data acquisition, processing, quality assessment, data integration, analysis, modeling, and visualization. Software runs our meteorological sensor systems, our data loggers, and our ocean gliders. Every aspect of science is impacted by, and improved by, software. Scientific advances ranging from modeling climate change to the sequencing of the human genome have been rendered possible in the last few decades due to the massive improvements in the capabilities of computers to process data through software. This pivotal role of software in science is broadly acknowledged, while simultaneously being systematically undervalued through minimal investments in maintenance and innovation. As a community, we need to embrace the creation, use, and maintenance of software within science, and address problems such as code complexity, openness,reproducibility, and accessibility. We also need to fully develop new skills and practices in software engineering as a core competency in our earth science disciplines, starting with undergraduate and graduate education and extending into university and agency professional positions. The Institute for Sustainable Earth and Environmental Software (ISEES) is being envisioned as a community-driven activity that can facilitate and galvanize activites around scientific software in an analogous way to synthesis centers such as NCEAS and NESCent that have stimulated massive advances in ecology and evolution. We will describe the results of six workshops (Science Drivers, Software Lifecycles, Software Components, Workforce Development and Training, Sustainability and Governance, and Community Engagement) that have been held in 2013 to envision such an institute. We will present community recommendations from these workshops and our strategic vision for how ISEES will address the technical issues in the software lifecycle, sustainability of the whole software ecosystem, and the critical issue of computational training for the scientific community. Process for envisioning ISEES.
The Center for Nanophase Materials Sciences
NASA Astrophysics Data System (ADS)
Lowndes, Douglas
2005-03-01
The Center for Nanophase Materials Sciences (CNMS) located at Oak Ridge National Laboratory (ORNL) will be the first DOE Nanoscale Science Research Center to begin operation, with construction to be completed in April 2005 and initial operations in October 2005. The CNMS' scientific program has been developed through workshops with the national community, with the goal of creating a highly collaborative research environment to accelerate discovery and drive technological advances. Research at the CNMS is organized under seven Scientific Themes selected to address challenges to understanding and to exploit particular ORNL strengths (see http://cnms.ornl.govhttp://cnms.ornl.gov). These include extensive synthesis and characterization capabilities for soft, hard, nanostructured, magnetic and catalytic materials and their composites; neutron scattering at the Spallation Neutron Source and High Flux Isotope Reactor; computational nanoscience in the CNMS' Nanomaterials Theory Institute and utilizing facilities and expertise of the Center for Computational Sciences and the new Leadership Scientific Computing Facility at ORNL; a new CNMS Nanofabrication Research Laboratory; and a suite of unique and state-of-the-art instruments to be made reliably available to the national community for imaging, manipulation, and properties measurements on nanoscale materials in controlled environments. The new research facilities will be described together with the planned operation of the user research program, the latter illustrated by the current ``jump start'' user program that utilizes existing ORNL/CNMS facilities.
The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC
NASA Astrophysics Data System (ADS)
Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan
2016-04-01
The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (geoscientfic research alliance of the Universities of Aachen, Cologne, Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support also to the wider geoscientific community; and in (iv) the industry and public sectors via e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in future.
NASA Astrophysics Data System (ADS)
Michaelis, A.; Wang, W.; Melton, F. S.; Votava, P.; Milesi, C.; Hashimoto, H.; Nemani, R. R.; Hiatt, S. H.
2009-12-01
As the length and diversity of the global earth observation data records grow, modeling and analyses of biospheric conditions increasingly requires multiple terabytes of data from a diversity of models and sensors. With network bandwidth beginning to flatten, transmission of these data from centralized data archives presents an increasing challenge, and costs associated with local storage and management of data and compute resources are often significant for individual research and application development efforts. Sharing community valued intermediary data sets, results and codes from individual efforts with others that are not in direct funded collaboration can also be a challenge with respect to time, cost and expertise. We purpose a modeling, data and knowledge center that houses NASA satellite data, climate data and ancillary data where a focused community may come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform, named Ecosystem Modeling Center (EMC). With the recent development of new technologies for secure hardware virtualization, an opportunity exists to create specific modeling, analysis and compute environments that are customizable, “archiveable” and transferable. Allowing users to instantiate such environments on large compute infrastructures that are directly connected to large data archives may significantly reduce costs and time associated with scientific efforts by alleviating users from redundantly retrieving and integrating data sets and building modeling analysis codes. The EMC platform also provides the possibility for users receiving indirect assistance from expertise through prefabricated compute environments, potentially reducing study “ramp up” times.
Inconsistencies in Numerical Simulations of Dynamical Systems Using Interval Arithmetic
NASA Astrophysics Data System (ADS)
Nepomuceno, Erivelton G.; Peixoto, Márcia L. C.; Martins, Samir A. M.; Rodrigues, Heitor M.; Perc, Matjaž
Over the past few decades, interval arithmetic has been attracting widespread interest from the scientific community. With the expansion of computing power, scientific computing is encountering a noteworthy shift from floating-point arithmetic toward increased use of interval arithmetic. Notwithstanding the significant reliability of interval arithmetic, this paper presents a theoretical inconsistency in a simulation of dynamical systems using a well-known implementation of arithmetic interval. We have observed that two natural interval extensions present an empty intersection during a finite time range, which is contrary to the fundamental theorem of interval analysis. We have proposed a procedure to at least partially overcome this problem, based on the union of the two generated pseudo-orbits. This paper also shows a successful case of interval arithmetic application in the reduction of interval width size on the simulation of discrete map. The implications of our findings on the reliability of scientific computing using interval arithmetic have been properly addressed using two numerical examples.
Practices in source code sharing in astrophysics
NASA Astrophysics Data System (ADS)
Shamir, Lior; Wallin, John F.; Allen, Alice; Berriman, Bruce; Teuben, Peter; Nemiroff, Robert J.; Mink, Jessica; Hanisch, Robert J.; DuPrie, Kimberly
2013-02-01
While software and algorithms have become increasingly important in astronomy, the majority of authors who publish computational astronomy research do not share the source code they develop, making it difficult to replicate and reuse the work. In this paper we discuss the importance of sharing scientific source code with the entire astrophysics community, and propose that journals require authors to make their code publicly available when a paper is published. That is, we suggest that a paper that involves a computer program not be accepted for publication unless the source code becomes publicly available. The adoption of such a policy by editors, editorial boards, and reviewers will improve the ability to replicate scientific results, and will also make computational astronomy methods more available to other researchers who wish to apply them to their data.
MODELS-3/CMAQ APPLICATIONS WHICH ILLUSTRATE CAPABILITY AND FUNCTIONALITY
The Models-3/CMAQ developed by the U.S. Environmental Protections Agency (USEPA) is a third generation multiscale, multi-pollutant air quality modeling system within a high-level, object-oriented computer framework (Models-3). It has been available to the scientific community ...
Fenwick, Matthew; Sesanker, Colbert; Schiller, Martin R.; Ellis, Heidi JC; Hinman, M. Lee; Vyas, Jay; Gryk, Michael R.
2012-01-01
Scientists are continually faced with the need to express complex mathematical notions in code. The renaissance of functional languages such as LISP and Haskell is often credited to their ability to implement complex data operations and mathematical constructs in an expressive and natural idiom. The slow adoption of functional computing in the scientific community does not, however, reflect the congeniality of these fields. Unfortunately, the learning curve for adoption of functional programming techniques is steeper than that for more traditional languages in the scientific community, such as Python and Java, and this is partially due to the relative sparseness of available learning resources. To fill this gap, we demonstrate and provide applied, scientifically substantial examples of functional programming, We present a multi-language source-code repository for software integration and algorithm development, which generally focuses on the fields of machine learning, data processing, bioinformatics. We encourage scientists who are interested in learning the basics of functional programming to adopt, reuse, and learn from these examples. The source code is available at: https://github.com/CONNJUR/CONNJUR-Sandbox (see also http://www.connjur.org). PMID:25328913
Fenwick, Matthew; Sesanker, Colbert; Schiller, Martin R; Ellis, Heidi Jc; Hinman, M Lee; Vyas, Jay; Gryk, Michael R
2012-01-01
Scientists are continually faced with the need to express complex mathematical notions in code. The renaissance of functional languages such as LISP and Haskell is often credited to their ability to implement complex data operations and mathematical constructs in an expressive and natural idiom. The slow adoption of functional computing in the scientific community does not, however, reflect the congeniality of these fields. Unfortunately, the learning curve for adoption of functional programming techniques is steeper than that for more traditional languages in the scientific community, such as Python and Java, and this is partially due to the relative sparseness of available learning resources. To fill this gap, we demonstrate and provide applied, scientifically substantial examples of functional programming, We present a multi-language source-code repository for software integration and algorithm development, which generally focuses on the fields of machine learning, data processing, bioinformatics. We encourage scientists who are interested in learning the basics of functional programming to adopt, reuse, and learn from these examples. The source code is available at: https://github.com/CONNJUR/CONNJUR-Sandbox (see also http://www.connjur.org).
Network and computing infrastructure for scientific applications in Georgia
NASA Astrophysics Data System (ADS)
Kvatadze, R.; Modebadze, Z.
2016-09-01
Status of network and computing infrastructure and available services for research and education community of Georgia are presented. Research and Educational Networking Association - GRENA provides the following network services: Internet connectivity, network services, cyber security, technical support, etc. Computing resources used by the research teams are located at GRENA and at major state universities. GE-01-GRENA site is included in European Grid infrastructure. Paper also contains information about programs of Learning Center and research and development projects in which GRENA is participating.
Widening the adoption of workflows to include human and human-machine scientific processes
NASA Astrophysics Data System (ADS)
Salayandia, L.; Pinheiro da Silva, P.; Gates, A. Q.
2010-12-01
Scientific workflows capture knowledge in the form of technical recipes to access and manipulate data that help scientists manage and reuse established expertise to conduct their work. Libraries of scientific workflows are being created in particular fields, e.g., Bioinformatics, where combined with cyber-infrastructure environments that provide on-demand access to data and tools, result in powerful workbenches for scientists of those communities. The focus in these particular fields, however, has been more on automating rather than documenting scientific processes. As a result, technical barriers have impeded a wider adoption of scientific workflows by scientific communities that do not rely as heavily on cyber-infrastructure and computing environments. Semantic Abstract Workflows (SAWs) are introduced to widen the applicability of workflows as a tool to document scientific recipes or processes. SAWs intend to capture a scientists’ perspective about the process of how she or he would collect, filter, curate, and manipulate data to create the artifacts that are relevant to her/his work. In contrast, scientific workflows describe the process from the point of view of how technical methods and tools are used to conduct the work. By focusing on a higher level of abstraction that is closer to a scientist’s understanding, SAWs effectively capture the controlled vocabularies that reflect a particular scientific community, as well as the types of datasets and methods used in a particular domain. From there on, SAWs provide the flexibility to adapt to different environments to carry out the recipes or processes. These environments range from manual fieldwork to highly technical cyber-infrastructure environments, i.e., such as those already supported by scientific workflows. Two cases, one from Environmental Science and another from Geophysics, are presented as illustrative examples.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chase Qishi; Zhu, Michelle Mengxia
The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models featuremore » diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100Gpbs Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics including Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects to mature the system for production use.« less
NASA Technical Reports Server (NTRS)
Bush, Drew; Sieber, Renee; Seiler, Gale; Chandler, Mark
2016-01-01
A gap has existed between the tools and processes of scientists working on anthropogenic global climate change (AGCC) and the technologies and curricula available to educators teaching the subject through student inquiry. Designing realistic scientific inquiry into AGCC poses a challenge because research on it relies on complex computer models, globally distributed data sets, and complex laboratory and data collection procedures. Here we examine efforts by the scientific community and educational researchers to design new curricula and technology that close this gap and impart robust AGCC and Earth Science understanding. We find technology-based teaching shows promise in promoting robust AGCC understandings if associated curricula address mitigating factors such as time constraints in incorporating technology and the need to support teachers implementing AGCC and Earth Science inquiry. We recommend the scientific community continue to collaborate with educational researchers to focus on developing those inquiry technologies and curricula that use realistic scientific processes from AGCC research and/or the methods for determining how human society should respond to global change.
Some Thoughts Regarding Practical Quantum Computing
NASA Astrophysics Data System (ADS)
Ghoshal, Debabrata; Gomez, Richard; Lanzagorta, Marco; Uhlmann, Jeffrey
2006-03-01
Quantum computing has become an important area of research in computer science because of its potential to provide more efficient algorithmic solutions to certain problems than are possible with classical computing. The ability of performing parallel operations over an exponentially large computational space has proved to be the main advantage of the quantum computing model. In this regard, we are particularly interested in the potential applications of quantum computers to enhance real software systems of interest to the defense, industrial, scientific and financial communities. However, while much has been written in popular and scientific literature about the benefits of the quantum computational model, several of the problems associated to the practical implementation of real-life complex software systems in quantum computers are often ignored. In this presentation we will argue that practical quantum computation is not as straightforward as commonly advertised, even if the technological problems associated to the manufacturing and engineering of large-scale quantum registers were solved overnight. We will discuss some of the frequently overlooked difficulties that plague quantum computing in the areas of memories, I/O, addressing schemes, compilers, oracles, approximate information copying, logical debugging, error correction and fault-tolerant computing protocols.
Lecueder, Silvia; Manyari, Dante E.
2000-01-01
A new form of scientific medical meeting has emerged in the last few years—the virtual congress. This article describes the general role of computer technologies and the Internet in the development of this new means of scientific communication, by reviewing the history of “cyber sessions” in medical education and the rationale, methods, and initial results of the First Virtual Congress of Cardiology. Instructions on how to participate in this virtual congress, either actively or as an observer, are included. Current advantages and disadvantages of virtual congresses, their impact on the scientific community at large, and future developments and possibilities in this area are discussed. PMID:10641960
Computational Aspects of Data Assimilation and the ESMF
NASA Technical Reports Server (NTRS)
daSilva, A.
2003-01-01
The scientific challenge of developing advanced data assimilation applications is a daunting task. Independently developed components may have incompatible interfaces or may be written in different computer languages. The high-performance computer (HPC) platforms required by numerically intensive Earth system applications are complex, varied, rapidly evolving and multi-part systems themselves. Since the market for high-end platforms is relatively small, there is little robust middleware available to buffer the modeler from the difficulties of HPC programming. To complicate matters further, the collaborations required to develop large Earth system applications often span initiatives, institutions and agencies, involve geoscience, software engineering, and computer science communities, and cross national borders.The Earth System Modeling Framework (ESMF) project is a concerted response to these challenges. Its goal is to increase software reuse, interoperability, ease of use and performance in Earth system models through the use of a common software framework, developed in an open manner by leaders in the modeling community. The ESMF addresses the technical and to some extent the cultural - aspects of Earth system modeling, laying the groundwork for addressing the more difficult scientific aspects, such as the physical compatibility of components, in the future. In this talk we will discuss the general philosophy and architecture of the ESMF, focussing on those capabilities useful for developing advanced data assimilation applications.
Understanding Scientific Ideas: An Honors Course.
ERIC Educational Resources Information Center
Capps, Joan; Schueler, Paul
At Raritan Valley Community College (RVCC) in New Jersey, an honors philosophy course was developed which taught mathematics and science concepts independent of computational skill. The course required that students complete a weekly writing assignment designed as a continuous refinement of logical reasoning development. This refinement was…
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task. both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation, while maintaining high performance across numerous supercomputer and workstation architectures. This document proposes a strawman framework design for the climate community based on the integration of Cactus, from the relativistic physics community, and UCLA/UCB Distributed Data Broker (DDB) from the climate community. This design is the result of an extensive survey of climate models and frameworks in the climate community as well as frameworks from many other scientific communities. The design addresses fundamental development and runtime needs using Cactus, a framework with interfaces for FORTRAN and C-based languages, and high-performance model communication needs using DDB. This document also specifically explores object-oriented design issues in the context of climate modeling as well as climate modeling issues in terms of object-oriented design.
Automated metadata--final project report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schissel, David
This report summarizes the work of the Automated Metadata, Provenance Cataloging, and Navigable Interfaces: Ensuring the Usefulness of Extreme-Scale Data Project (MPO Project) funded by the United States Department of Energy (DOE), Offices of Advanced Scientific Computing Research and Fusion Energy Sciences. Initially funded for three years starting in 2012, it was extended for 6 months with additional funding. The project was a collaboration between scientists at General Atomics, Lawrence Berkley National Laboratory (LBNL), and Massachusetts Institute of Technology (MIT). The group leveraged existing computer science technology where possible, and extended or created new capabilities where required. The MPO projectmore » was able to successfully create a suite of software tools that can be used by a scientific community to automatically document their scientific workflows. These tools were integrated into workflows for fusion energy and climate research illustrating the general applicability of the project’s toolkit. Feedback was very positive on the project’s toolkit and the value of such automatic workflow documentation to the scientific endeavor.« less
Enabling a Scientific Cloud Marketplace: VGL (Invited)
NASA Astrophysics Data System (ADS)
Fraser, R.; Woodcock, R.; Wyborn, L. A.; Vote, J.; Rankine, T.; Cox, S. J.
2013-12-01
The Virtual Geophysics Laboratory (VGL) provides a flexible, web based environment where researchers can browse data and use a variety of scientific software packaged into tool kits that run in the Cloud. Both data and tool kits are published by multiple researchers and registered with the VGL infrastructure forming a data and application marketplace. The VGL provides the basic work flow of Discovery and Access to the disparate data sources and a Library for tool kits and scripting to drive the scientific codes. Computation is then performed on the Research or Commercial Clouds. Provenance information is collected throughout the work flow and can be published alongside the results allowing for experiment comparison and sharing with other researchers. VGL's "mix and match" approach to data, computational resources and scientific codes, enables a dynamic approach to scientific collaboration. VGL allows scientists to publish their specific contribution, be it data, code, compute or work flow, knowing the VGL framework will provide other components needed for a complete application. Other scientists can choose the pieces that suit them best to assemble an experiment. The coarse grain workflow of the VGL framework combined with the flexibility of the scripting library and computational toolkits allows for significant customisation and sharing amongst the community. The VGL utilises the cloud computational and storage resources from the Australian academic research cloud provided by the NeCTAR initiative and a large variety of data accessible from national and state agencies via the Spatial Information Services Stack (SISS - http://siss.auscope.org). VGL v1.2 screenshot - http://vgl.auscope.org
Can the behavioral sciences self-correct? A social epistemic study.
Romero, Felipe
2016-12-01
Advocates of the self-corrective thesis argue that scientific method will refute false theories and find closer approximations to the truth in the long run. I discuss a contemporary interpretation of this thesis in terms of frequentist statistics in the context of the behavioral sciences. First, I identify experimental replications and systematic aggregation of evidence (meta-analysis) as the self-corrective mechanism. Then, I present a computer simulation study of scientific communities that implement this mechanism to argue that frequentist statistics may converge upon a correct estimate or not depending on the social structure of the community that uses it. Based on this study, I argue that methodological explanations of the "replicability crisis" in psychology are limited and propose an alternative explanation in terms of biases. Finally, I conclude suggesting that scientific self-correction should be understood as an interaction effect between inference methods and social structures. Copyright © 2016 Elsevier Ltd. All rights reserved.
QMC Goes BOINC: Using Public Resource Computing to Perform Quantum Monte Carlo Calculations
NASA Astrophysics Data System (ADS)
Rainey, Cameron; Engelhardt, Larry; Schröder, Christian; Hilbig, Thomas
2008-10-01
Theoretical modeling of magnetic molecules traditionally involves the diagonalization of quantum Hamiltonian matrices. However, as the complexity of these molecules increases, the matrices become so large that this process becomes unusable. An additional challenge to this modeling is that many repetitive calculations must be performed, further increasing the need for computing power. Both of these obstacles can be overcome by using a quantum Monte Carlo (QMC) method and a distributed computing project. We have recently implemented a QMC method within the Spinhenge@home project, which is a Public Resource Computing (PRC) project where private citizens allow part-time usage of their PCs for scientific computing. The use of PRC for scientific computing will be described in detail, as well as how you can contribute to the project. See, e.g., L. Engelhardt, et. al., Angew. Chem. Int. Ed. 47, 924 (2008). C. Schröoder, in Distributed & Grid Computing - Science Made Transparent for Everyone. Principles, Applications and Supporting Communities. (Weber, M.H.W., ed., 2008). Project URL: http://spin.fh-bielefeld.de
Accelerating scientific discovery : 2007 annual report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beckman, P.; Dave, P.; Drugan, C.
2008-11-14
As a gateway for scientific discovery, the Argonne Leadership Computing Facility (ALCF) works hand in hand with the world's best computational scientists to advance research in a diverse span of scientific domains, ranging from chemistry, applied mathematics, and materials science to engineering physics and life sciences. Sponsored by the U.S. Department of Energy's (DOE) Office of Science, researchers are using the IBM Blue Gene/L supercomputer at the ALCF to study and explore key scientific problems that underlie important challenges facing our society. For instance, a research team at the University of California-San Diego/ SDSC is studying the molecular basis ofmore » Parkinson's disease. The researchers plan to use the knowledge they gain to discover new drugs to treat the disease and to identify risk factors for other diseases that are equally prevalent. Likewise, scientists from Pratt & Whitney are using the Blue Gene to understand the complex processes within aircraft engines. Expanding our understanding of jet engine combustors is the secret to improved fuel efficiency and reduced emissions. Lessons learned from the scientific simulations of jet engine combustors have already led Pratt & Whitney to newer designs with unprecedented reductions in emissions, noise, and cost of ownership. ALCF staff members provide in-depth expertise and assistance to those using the Blue Gene/L and optimizing user applications. Both the Catalyst and Applications Performance Engineering and Data Analytics (APEDA) teams support the users projects. In addition to working with scientists running experiments on the Blue Gene/L, we have become a nexus for the broader global community. In partnership with the Mathematics and Computer Science Division at Argonne National Laboratory, we have created an environment where the world's most challenging computational science problems can be addressed. Our expertise in high-end scientific computing enables us to provide guidance for applications that are transitioning to petascale as well as to produce software that facilitates their development, such as the MPICH library, which provides a portable and efficient implementation of the MPI standard--the prevalent programming model for large-scale scientific applications--and the PETSc toolkit that provides a programming paradigm that eases the development of many scientific applications on high-end computers.« less
Opportunities for Computational Discovery in Basic Energy Sciences
NASA Astrophysics Data System (ADS)
Pederson, Mark
2011-03-01
An overview of the broad-ranging support of computational physics and computational science within the Department of Energy Office of Science will be provided. Computation as the third branch of physics is supported by all six offices (Advanced Scientific Computing, Basic Energy, Biological and Environmental, Fusion Energy, High-Energy Physics, and Nuclear Physics). Support focuses on hardware, software and applications. Most opportunities within the fields of~condensed-matter physics, chemical-physics and materials sciences are supported by the Officeof Basic Energy Science (BES) or through partnerships between BES and the Office for Advanced Scientific Computing. Activities include radiation sciences, catalysis, combustion, materials in extreme environments, energy-storage materials, light-harvesting and photovoltaics, solid-state lighting and superconductivity.~ A summary of two recent reports by the computational materials and chemical communities on the role of computation during the next decade will be provided. ~In addition to materials and chemistry challenges specific to energy sciences, issues identified~include a focus on the role of the domain scientist in integrating, expanding and sustaining applications-oriented capabilities on evolving high-performance computing platforms and on the role of computation in accelerating the development of innovative technologies. ~~
New Trends in E-Science: Machine Learning and Knowledge Discovery in Databases
NASA Astrophysics Data System (ADS)
Brescia, Massimo
2012-11-01
Data mining, or Knowledge Discovery in Databases (KDD), while being the main methodology to extract the scientific information contained in Massive Data Sets (MDS), needs to tackle crucial problems since it has to orchestrate complex challenges posed by transparent access to different computing environments, scalability of algorithms, reusability of resources. To achieve a leap forward for the progress of e-science in the data avalanche era, the community needs to implement an infrastructure capable of performing data access, processing and mining in a distributed but integrated context. The increasing complexity of modern technologies carried out a huge production of data, whose related warehouse management and the need to optimize analysis and mining procedures lead to a change in concept on modern science. Classical data exploration, based on local user own data storage and limited computing infrastructures, is no more efficient in the case of MDS, worldwide spread over inhomogeneous data centres and requiring teraflop processing power. In this context modern experimental and observational science requires a good understanding of computer science, network infrastructures, Data Mining, etc. i.e. of all those techniques which fall into the domain of the so called e-science (recently assessed also by the Fourth Paradigm of Science). Such understanding is almost completely absent in the older generations of scientists and this reflects in the inadequacy of most academic and research programs. A paradigm shift is needed: statistical pattern recognition, object oriented programming, distributed computing, parallel programming need to become an essential part of scientific background. A possible practical solution is to provide the research community with easy-to understand, easy-to-use tools, based on the Web 2.0 technologies and Machine Learning methodology. Tools where almost all the complexity is hidden to the final user, but which are still flexible and able to produce efficient and reliable scientific results. All these considerations will be described in the detail in the chapter. Moreover, examples of modern applications offering to a wide variety of e-science communities a large spectrum of computational facilities to exploit the wealth of available massive data sets and powerful machine learning and statistical algorithms will be also introduced.
Community detection in complex networks using proximate support vector clustering
NASA Astrophysics Data System (ADS)
Wang, Feifan; Zhang, Baihai; Chai, Senchun; Xia, Yuanqing
2018-03-01
Community structure, one of the most attention attracting properties in complex networks, has been a cornerstone in advances of various scientific branches. A number of tools have been involved in recent studies concentrating on the community detection algorithms. In this paper, we propose a support vector clustering method based on a proximity graph, owing to which the introduced algorithm surpasses the traditional support vector approach both in accuracy and complexity. Results of extensive experiments undertaken on computer generated networks and real world data sets illustrate competent performances in comparison with the other counterparts.
A Component-based Programming Model for Composite, Distributed Applications
NASA Technical Reports Server (NTRS)
Eidson, Thomas M.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
The nature of scientific programming is evolving to larger, composite applications that are composed of smaller element applications. These composite applications are more frequently being targeted for distributed, heterogeneous networks of computers. They are most likely programmed by a group of developers. Software component technology and computational frameworks are being proposed and developed to meet the programming requirements of these new applications. Historically, programming systems have had a hard time being accepted by the scientific programming community. In this paper, a programming model is outlined that attempts to organize the software component concepts and fundamental programming entities into programming abstractions that will be better understood by the application developers. The programming model is designed to support computational frameworks that manage many of the tedious programming details, but also that allow sufficient programmer control to design an accurate, high-performance application.
Key Lessons in Building "Data Commons": The Open Science Data Cloud Ecosystem
NASA Astrophysics Data System (ADS)
Patterson, M.; Grossman, R.; Heath, A.; Murphy, M.; Wells, W.
2015-12-01
Cloud computing technology has created a shift around data and data analysis by allowing researchers to push computation to data as opposed to having to pull data to an individual researcher's computer. Subsequently, cloud-based resources can provide unique opportunities to capture computing environments used both to access raw data in its original form and also to create analysis products which may be the source of data for tables and figures presented in research publications. Since 2008, the Open Cloud Consortium (OCC) has operated the Open Science Data Cloud (OSDC), which provides scientific researchers with computational resources for storing, sharing, and analyzing large (terabyte and petabyte-scale) scientific datasets. OSDC has provided compute and storage services to over 750 researchers in a wide variety of data intensive disciplines. Recently, internal users have logged about 2 million core hours each month. The OSDC also serves the research community by colocating these resources with access to nearly a petabyte of public scientific datasets in a variety of fields also accessible for download externally by the public. In our experience operating these resources, researchers are well served by "data commons," meaning cyberinfrastructure that colocates data archives, computing, and storage infrastructure and supports essential tools and services for working with scientific data. In addition to the OSDC public data commons, the OCC operates a data commons in collaboration with NASA and is developing a data commons for NOAA datasets. As cloud-based infrastructures for distributing and computing over data become more pervasive, we ask, "What does it mean to publish data in a data commons?" Here we present the OSDC perspective and discuss several services that are key in architecting data commons, including digital identifier services.
EVER-EST: a virtual research environment for Earth Sciences
NASA Astrophysics Data System (ADS)
Marelli, Fulvio; Albani, Mirko; Glaves, Helen
2016-04-01
There is an increasing requirement for researchers to work collaboratively using common resources whilst being geographically dispersed. By creating a virtual research environment (VRE) using a service oriented architecture (SOA) tailored to the needs of Earth Science (ES) communities, the EVEREST project will provide a range of both generic and domain specific data management services to support a dynamic approach to collaborative research. EVER-EST will provide the means to overcome existing barriers to sharing of Earth Science data and information allowing research teams to discover, access, share and process heterogeneous data, algorithms, results and experiences within and across their communities, including those domains beyond Earth Science. Researchers will be able to seamlessly manage both the data involved in their computationally intensive disciplines and the scientific methods applied in their observations and modelling, which lead to the specific results that need to be attributable, validated and shared both within the community and more widely e.g. in the form of scholarly communications. Central to the EVEREST approach is the concept of the Research Object (RO) , which provides a semantically rich mechanism to aggregate related resources about a scientific investigation so that they can be shared together using a single unique identifier. Although several e-laboratories are incorporating the research object concept in their infrastructure, the EVER-EST VRE will be the first infrastructure to leverage the concept of Research Objects and their application in observational rather than experimental disciplines. Development of the EVEREST VRE will leverage the results of several previous projects which have produced state-of-the-art technologies for scientific data management and curation as well those which have developed models, techniques and tools for the preservation of scientific methods and their implementation in computational forms such as scientific workflows. The EVER-EST data processing infrastructure will be based on a Cloud Computing approach, in which new applications can be integrated using "virtual machines" that have their own specifications (disk size, processor speed, operating system etc.) and run on shared private (physical deployment over local hardware) or commercial Cloud infrastructures. The EVER-EST e-infrastructure will be validated by four virtual research communities (VRC) covering different multidisciplinary Earth Science domains including: ocean monitoring, natural hazards, land monitoring and risk management (volcanoes and seismicity). Each VRC will use the virtual research environment according to its own specific requirements for data, software, best practice and community engagement. This user-centric approach will allow an assessment to be made of the capability for the proposed solution to satisfy the heterogeneous needs of a variety of Earth Science communities for more effective collaboration, and higher efficiency and creativity in research. EVER-EST is funded by the European Commission's H2020 for three years starting in October 2015. The project is led by the European Space Agency (ESA), involves some of the major European Earth Science data providers/users including NERC, DLR, INGV, CNR and SatCEN.
Colen, Rivka; Foster, Ian; Gatenby, Robert; Giger, Mary Ellen; Gillies, Robert; Gutman, David; Heller, Matthew; Jain, Rajan; Madabhushi, Anant; Madhavan, Subha; Napel, Sandy; Rao, Arvind; Saltz, Joel; Tatum, James; Verhaak, Roeland; Whitman, Gary
2014-10-01
The National Cancer Institute (NCI) Cancer Imaging Program organized two related workshops on June 26-27, 2013, entitled "Correlating Imaging Phenotypes with Genomics Signatures Research" and "Scalable Computational Resources as Required for Imaging-Genomics Decision Support Systems." The first workshop focused on clinical and scientific requirements, exploring our knowledge of phenotypic characteristics of cancer biological properties to determine whether the field is sufficiently advanced to correlate with imaging phenotypes that underpin genomics and clinical outcomes, and exploring new scientific methods to extract phenotypic features from medical images and relate them to genomics analyses. The second workshop focused on computational methods that explore informatics and computational requirements to extract phenotypic features from medical images and relate them to genomics analyses and improve the accessibility and speed of dissemination of existing NIH resources. These workshops linked clinical and scientific requirements of currently known phenotypic and genotypic cancer biology characteristics with imaging phenotypes that underpin genomics and clinical outcomes. The group generated a set of recommendations to NCI leadership and the research community that encourage and support development of the emerging radiogenomics research field to address short-and longer-term goals in cancer research.
An introduction to computer forensics.
Furneaux, Nick
2006-07-01
This paper provides an introduction to the discipline of Computer Forensics. With computers being involved in an increasing number, and type, of crimes the trace data left on electronic media can play a vital part in the legal process. To ensure acceptance by the courts, accepted processes and procedures have to be adopted and demonstrated which are not dissimilar to the issues surrounding traditional forensic investigations. This paper provides a straightforward overview of the three steps involved in the examination of digital media: Acquisition of data. Investigation of evidence. Reporting and presentation of evidence. Although many of the traditional readers of Medicine, Science and the Law are those involved in the biological aspects of forensics, I believe that both disciplines can learn from each other, with electronic evidence being more readily sought and considered by the legal community and the long, tried and tested scientific methods of the forensic community being shared and adopted by the computer forensic world.
A Queue Simulation Tool for a High Performance Scientific Computing Center
NASA Technical Reports Server (NTRS)
Spear, Carrie; McGalliard, James
2007-01-01
The NASA Center for Computational Sciences (NCCS) at the Goddard Space Flight Center provides high performance highly parallel processors, mass storage, and supporting infrastructure to a community of computational Earth and space scientists. Long running (days) and highly parallel (hundreds of CPUs) jobs are common in the workload. NCCS management structures batch queues and allocates resources to optimize system use and prioritize workloads. NCCS technical staff use a locally developed discrete event simulation tool to model the impacts of evolving workloads, potential system upgrades, alternative queue structures and resource allocation policies.
Exploring Two Approaches for an End-to-End Scientific Analysis Workflow
NASA Astrophysics Data System (ADS)
Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; Paterno, Marc; Sehrish, Saba
2015-12-01
The scientific discovery process can be advanced by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally, it is important for scientists to be able to share their workflows with collaborators. We have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC); the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In this paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Allcock, William; Beggio, Chris
2014-10-17
U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at themore » DOE national laboratories. The report contains findings from that review.« less
NASA Astrophysics Data System (ADS)
Cofino, A. S.; Fernández Quiruelas, V.; Blanco Real, J. C.; García Díez, M.; Fernández, J.
2013-12-01
Nowadays Grid Computing is powerful computational tool which is ready to be used for scientific community in different areas (such as biomedicine, astrophysics, climate, etc.). However, the use of this distributed computing infrastructures (DCI) is not yet common practice in climate research, and only a few teams and applications in this area take advantage of this infrastructure. Thus, the WRF4G project objective is to popularize the use of this technology in the atmospheric sciences area. In order to achieve this objective, one of the most used applications has been taken (WRF; a limited- area model, successor of the MM5 model), that has a user community formed by more than 8000 researchers worldwide. This community develop its research activity on different areas and could benefit from the advantages of Grid resources (case study simulations, regional hind-cast/forecast, sensitivity studies, etc.). The WRF model is used by many groups, in the climate research community, to carry on downscaling simulations. Therefore this community will also benefit. However, Grid infrastructures have some drawbacks for the execution of applications that make an intensive use of CPU and memory for a long period of time. This makes necessary to develop a specific framework (middleware). This middleware encapsulates the application and provides appropriate services for the monitoring and management of the simulations and the data. Thus,another objective of theWRF4G project consists on the development of a generic adaptation of WRF to DCIs. It should simplify the access to the DCIs for the researchers, and also to free them from the technical and computational aspects of the use of theses DCI. Finally, in order to demonstrate the ability of WRF4G solving actual scientific challenges with interest and relevance on the climate science (implying a high computational cost) we will shown results from different kind of downscaling experiments, like ERA-Interim re-analysis, CMIP5 models, or seasonal. WRF4G is been used to run WRF simulations which are contributing to the CORDEX initiative and others projects like SPECS and EUPORIAS. This work is been partially funded by the European Regional Development Fund (ERDF) and the Spanish National R&D Plan 2008-2011 (CGL2011-28864)
An infrastructure for the integration of geoscience instruments and sensors on the Grid
NASA Astrophysics Data System (ADS)
Pugliese, R.; Prica, M.; Kourousias, G.; Del Linz, A.; Curri, A.
2009-04-01
The Grid, as a computing paradigm, has long been in the attention of both academia and industry[1]. The distributed and expandable nature of its general architecture result to scalability and more efficient utilisation of the computing infrastructures. The scientific community, including that of geosciences, often handles problems with very high requirements in data processing, transferring, and storing[2,3]. This has raised the interest on Grid technologies but these are often viewed solely as an access gateway to HPC. Suitable Grid infrastructures could provide the geoscience community with additional benefits like those of sharing, remote access and control of scientific systems. These systems can be scientific instruments, sensors, robots, cameras and any other device used in geosciences. The solution for practical, general, and feasible Grid-enabling of such devices requires non-intrusive extensions on core parts of the current Grid architecture. We propose an extended version of an architecture[4] that can serve as the solution to the problem. The solution we propose is called Grid Instrument Element (IE) [5]. It is an addition to the existing core Grid parts; the Computing Element (CE) and the Storage Element (SE) that serve the purposes that their name suggests. The IE that we will be referring to, and the related technologies have been developed in the EU project on the Deployment of Remote Instrumentation Infrastructure (DORII1). In DORII, partners of various scientific communities including those of Earthquake, Environmental science, and Experimental science, have adopted the technology of the Instrument Element in order to integrate to the Grid their devices. The Oceanographic and coastal observation and modelling Mediterranean Ocean Observing Network (OGS2), a DORII partner, is in the process of deploying the above mentioned Grid technologies on two types of observational modules: Argo profiling floats and a novel Autonomous Underwater Vehicle (AUV). In this paper i) we define the need for integration of instrumentation in the Grid, ii) we introduce the solution of the Instrument Element, iii) we demonstrate a suitable end-user web portal for accessing Grid resources, iv) we describe from the Grid-technological point of view the process of the integration to the Grid of two advanced environmental monitoring devices. References [1] M. Surridge, S. Taylor, D. De Roure, and E. Zaluska, "Experiences with GRIA—Industrial Applications on a Web Services Grid," e-Science and Grid Computing, First International Conference on e-Science and Grid Computing, 2005, pp. 98-105. [2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets," Journal of Network and Computer Applications, vol. 23, 2000, pp. 187-200. [3] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, "Data management and transfer in high-performance computational grid environments," Parallel Computing, vol. 28, 2002, pp. 749-771. [4] E. Frizziero, M. Gulmini, F. Lelli, G. Maron, A. Oh, S. Orlando, A. Petrucci, S. Squizzato, and S. Traldi, "Instrument Element: A New Grid component that Enables the Control of Remote Instrumentation," Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)-Volume 00, IEEE Computer Society Washington, DC, USA, 2006. [5] R. Ranon, L. De Marco, A. Senerchia, S. Gabrielli, L. Chittaro, R. Pugliese, L. Del Cano, F. Asnicar, and M. Prica, "A Web-based Tool for Collaborative Access to Scientific Instruments in Cyberinfrastructures." 1 The DORII project is supported by the European Commission within the 7th Framework Programme (FP7/2007-2013) under grant agreement no. RI-213110. URL: http://www.dorii.eu 2 Istituto Nazionale di Oceanografia e di Geofisica Sperimentale. URL: http://www.ogs.trieste.it
The Quake-Catcher Network: An Innovative Community-Based Seismic Network
NASA Astrophysics Data System (ADS)
Saltzman, J.; Cochran, E. S.; Lawrence, J. F.; Christensen, C. M.
2009-12-01
The Quake-Catcher Network (QCN) is a volunteer computing seismic network that engages citizen scientists, teachers, and museums to participate in the detection of earthquakes. In less than two years, the network has grown to over 1000 participants globally and continues to expand. QCN utilizes Micro-Electro-Mechanical System (MEMS) accelerometers, in laptops and external to desktop computers, to detect moderate to large earthquakes. One goal of the network is to involve K-12 classrooms and museums by providing sensors and software to introduce participants to seismology and community-based scientific data collection. The Quake-Catcher Network provides a unique opportunity to engage participants directly in the scientific process, through hands-on activities that link activities and outcomes to their daily lives. Partnerships with teachers and museum staff are critical to growth of the Quake Catcher Network. Each participating institution receives a MEMS accelerometer to connect, via USB, to a computer that can be used for hands-on activities and to record earthquakes through a distributed computing system. We developed interactive software (QCNLive) that allows participants to view sensor readings in real time. Participants can also record earthquakes and download earthquake data that was collected by their sensor or other QCN sensors. The Quake-Catcher Network combines research and outreach to improve seismic networks and increase awareness and participation in science-based research in K-12 schools.
Modeling aspects of human memory for scientific study.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caudell, Thomas P.; Watson, Patrick; McDaniel, Mark A.
Working with leading experts in the field of cognitive neuroscience and computational intelligence, SNL has developed a computational architecture that represents neurocognitive mechanisms associated with how humans remember experiences in their past. The architecture represents how knowledge is organized and updated through information from individual experiences (episodes) via the cortical-hippocampal declarative memory system. We compared the simulated behavioral characteristics with those of humans measured under well established experimental standards, controlling for unmodeled aspects of human processing, such as perception. We used this knowledge to create robust simulations of & human memory behaviors that should help move the scientific community closermore » to understanding how humans remember information. These behaviors were experimentally validated against actual human subjects, which was published. An important outcome of the validation process will be the joining of specific experimental testing procedures from the field of neuroscience with computational representations from the field of cognitive modeling and simulation.« less
Discovery of the Kalman filter as a practical tool for aerospace and industry
NASA Technical Reports Server (NTRS)
Mcgee, L. A.; Schmidt, S. F.
1985-01-01
The sequence of events which led the researchers at Ames Research Center to the early discovery of the Kalman filter shortly after its introduction into the literature is recounted. The scientific breakthroughs and reformulations that were necessary to transform Kalman's work into a useful tool for a specific aerospace application are described. The resulting extended Kalman filter, as it is now known, is often still referred to simply as the Kalman filter. As the filter's use gained in popularity in the scientific community, the problems of implementation on small spaceborne and airborne computers led to a square-root formulation of the filter to overcome numerical difficulties associated with computer word length. The work that led to this new formulation is also discussed, including the first airborne computer implementation and flight test. Since then the applications of the extended and square-root formulations of the Kalman filter have grown rapidly throughout the aerospace industry.
Wölfling, K; Müller, K W
2010-04-01
Behavioral addictions, like pathological gambling and computer game addiction (or internet addiction), have become a growing concern in research and public interest. Currently similarities between behavioral addictions and substance dependency are controversially discussed in the scientific community. Unfortunately a mismatch exists between the large number of people seeking treatment and the small number of scientific studies on pathological gambling and computer game addiction. Prevalence of pathological gambling among the German population is estimated to be 0.2-0.5%. These estimations are comparable to prevalence rates reported for drug dependency. Latest research states that about 3% of German adolescents and young adults are believed to suffer from computer game addiction. Therefore, it is important to enhance investigations regarding the clinical and neuroscientific basis of computer game addiction. This review offers a summary of current results of research regarding pathological gambling and internet addiction. The phenomenological description of these two disorders is meant to allow a deeper understanding of behavioral addictions.
Parallel Computation of the Regional Ocean Modeling System (ROMS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, P; Song, Y T; Chao, Y
2005-04-05
The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds ofmore » processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.« less
XML Based Scientific Data Management Facility
NASA Technical Reports Server (NTRS)
Mehrotra, Piyush; Zubair, M.; Ziebartt, John (Technical Monitor)
2001-01-01
The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of HTML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
XML Based Scientific Data Management Facility
NASA Technical Reports Server (NTRS)
Mehrotra, P.; Zubair, M.; Bushnell, Dennis M. (Technical Monitor)
2002-01-01
The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of XML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management ,facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
Cultural and Technological Issues and Solutions for Geodynamics Software Citation
NASA Astrophysics Data System (ADS)
Heien, E. M.; Hwang, L.; Fish, A. E.; Smith, M.; Dumit, J.; Kellogg, L. H.
2014-12-01
Computational software and custom-written codes play a key role in scientific research and teaching, providing tools to perform data analysis and forward modeling through numerical computation. However, development of these codes is often hampered by the fact that there is no well-defined way for the authors to receive credit or professional recognition for their work through the standard methods of scientific publication and subsequent citation of the work. This in turn may discourage researchers from publishing their codes or making them easier for other scientists to use. We investigate the issues involved in citing software in a scientific context, and introduce features that should be components of a citation infrastructure, particularly oriented towards the codes and scientific culture in the area of geodynamics research. The codes used in geodynamics are primarily specialized numerical modeling codes for continuum mechanics problems; they may be developed by individual researchers, teams of researchers, geophysicists in collaboration with computational scientists and applied mathematicians, or by coordinated community efforts such as the Computational Infrastructure for Geodynamics. Some but not all geodynamics codes are open-source. These characteristics are common to many areas of geophysical software development and use. We provide background on the problem of software citation and discuss some of the barriers preventing adoption of such citations, including social/cultural barriers, insufficient technological support infrastructure, and an overall lack of agreement about what a software citation should consist of. We suggest solutions in an initial effort to create a system to support citation of software and promotion of scientific software development.
Grids, virtualization, and clouds at Fermilab
Timm, S.; Chadwick, K.; Garzoglio, G.; ...
2014-06-11
Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture andmore » the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). Lastly, this work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.« less
Grids, virtualization, and clouds at Fermilab
NASA Astrophysics Data System (ADS)
Timm, S.; Chadwick, K.; Garzoglio, G.; Noh, S.
2014-06-01
Fermilab supports a scientific program that includes experiments and scientists located across the globe. To better serve this community, in 2004, the (then) Computing Division undertook the strategy of placing all of the High Throughput Computing (HTC) resources in a Campus Grid known as FermiGrid, supported by common shared services. In 2007, the FermiGrid Services group deployed a service infrastructure that utilized Xen virtualization, LVS network routing and MySQL circular replication to deliver highly available services that offered significant performance, reliability and serviceability improvements. This deployment was further enhanced through the deployment of a distributed redundant network core architecture and the physical distribution of the systems that host the virtual machines across multiple buildings on the Fermilab Campus. In 2010, building on the experience pioneered by FermiGrid in delivering production services in a virtual infrastructure, the Computing Sector commissioned the FermiCloud, General Physics Computing Facility and Virtual Services projects to serve as platforms for support of scientific computing (FermiCloud 6 GPCF) and core computing (Virtual Services). This work will present the evolution of the Fermilab Campus Grid, Virtualization and Cloud Computing infrastructure together with plans for the future.
Science in the cloud (SIC): A use case in MRI connectomics
Gorgolewski, Krzysztof J.; Kleissas, Dean; Roncal, William Gray; Litt, Brian; Wandell, Brian; Poldrack, Russel A.; Wiener, Martin; Vogelstein, R. Jacob; Burns, Randal
2017-01-01
Abstract Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called ‘science in the cloud’ (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended. PMID:28327935
Science in the cloud (SIC): A use case in MRI connectomics.
Kiar, Gregory; Gorgolewski, Krzysztof J; Kleissas, Dean; Roncal, William Gray; Litt, Brian; Wandell, Brian; Poldrack, Russel A; Wiener, Martin; Vogelstein, R Jacob; Burns, Randal; Vogelstein, Joshua T
2017-05-01
Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called 'science in the cloud' (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended. © The Author 2017. Published by Oxford University Press.
ERIC Educational Resources Information Center
National Science Foundation, Arlington, VA. Directorate for Computer and Information Science and Engineering.
The purpose of this summary of awards is to provide the scientific and engineering communities with a summary of the grants awarded in 1994 by the National Science Foundation's Division of Microelectronic Information Processing Systems. Similar areas of research are grouped together. Grantee institutions and principal investigators are identified…
Computational Prediction of Protein-Protein Interactions
Ehrenberger, Tobias; Cantley, Lewis C.; Yaffe, Michael B.
2015-01-01
The prediction of protein-protein interactions and kinase-specific phosphorylation sites on individual proteins is critical for correctly placing proteins within signaling pathways and networks. The importance of this type of annotation continues to increase with the continued explosion of genomic and proteomic data, particularly with emerging data categorizing posttranslational modifications on a large scale. A variety of computational tools are available for this purpose. In this chapter, we review the general methodologies for these types of computational predictions and present a detailed user-focused tutorial of one such method and computational tool, Scansite, which is freely available to the entire scientific community over the Internet. PMID:25859943
NASA Astrophysics Data System (ADS)
Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.
2017-12-01
As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.
Scientific Discovery through Advanced Computing in Plasma Science
NASA Astrophysics Data System (ADS)
Tang, William
2005-03-01
Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's ``Scientific Discovery through Advanced Computing'' (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of plasma turbulence in magnetically-confined high temperature plasmas. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to the computational science area.
NAS (Numerical Aerodynamic Simulation Program) technical summaries, March 1989 - February 1990
NASA Technical Reports Server (NTRS)
1990-01-01
Given here are selected scientific results from the Numerical Aerodynamic Simulation (NAS) Program's third year of operation. During this year, the scientific community was given access to a Cray-2 and a Cray Y-MP supercomputer. Topics covered include flow field analysis of fighter wing configurations, large-scale ocean modeling, the Space Shuttle flow field, advanced computational fluid dynamics (CFD) codes for rotary-wing airloads and performance prediction, turbulence modeling of separated flows, airloads and acoustics of rotorcraft, vortex-induced nonlinearities on submarines, and standing oblique detonation waves.
Software Attribution for Geoscience Applications in the Computational Infrastructure for Geodynamics
NASA Astrophysics Data System (ADS)
Hwang, L.; Dumit, J.; Fish, A.; Soito, L.; Kellogg, L. H.; Smith, M.
2015-12-01
Scientific software is largely developed by individual scientists and represents a significant intellectual contribution to the field. As the scientific culture and funding agencies move towards an expectation that software be open-source, there is a corresponding need for mechanisms to cite software, both to provide credit and recognition to developers, and to aid in discoverability of software and scientific reproducibility. We assess the geodynamic modeling community's current citation practices by examining more than 300 predominantly self-reported publications utilizing scientific software in the past 5 years that is available through the Computational Infrastructure for Geodynamics (CIG). Preliminary results indicate that authors cite and attribute software either through citing (in rank order) peer-reviewed scientific publications, a user's manual, and/or a paper describing the software code. Attributions maybe found directly in the text, in acknowledgements, in figure captions, or in footnotes. What is considered citable varies widely. Citations predominantly lack software version numbers or persistent identifiers to find the software package. Versioning may be implied through reference to a versioned user manual. Authors sometimes report code features used and whether they have modified the code. As an open-source community, CIG requests that researchers contribute their modifications to the repository. However, such modifications may not be contributed back to a repository code branch, decreasing the chances of discoverability and reproducibility. Survey results through CIG's Software Attribution for Geoscience Applications (SAGA) project suggest that lack of knowledge, tools, and workflows to cite codes are barriers to effectively implement the emerging citation norms. Generated on-demand attributions on software landing pages and a prototype extensible plug-in to automatically generate attributions in codes are the first steps towards reproducibility.
Damsel: A Data Model Storage Library for Exascale Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koziol, Quincey
The goal of this project is to enable exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models. We will accomplish this through three major activities: (1) identifying major data model motifs in computational science applications and developing representative benchmarks; (2) developing a data model storage library, called Damsel, that supports these motifs, provides efficient storage data layouts, incorporates optimizations to enable exascale operation, and is tolerant to failures; and (3) productizing Damsel and working with computational scientists to encourage adoption of this library by the scientific community.
Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N
2017-03-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.
Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.
2016-01-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692
Toward Scientific Numerical Modeling
NASA Technical Reports Server (NTRS)
Kleb, Bil
2007-01-01
Ultimately, scientific numerical models need quantified output uncertainties so that modeling can evolve to better match reality. Documenting model input uncertainties and verifying that numerical models are translated into code correctly, however, are necessary first steps toward that goal. Without known input parameter uncertainties, model sensitivities are all one can determine, and without code verification, output uncertainties are simply not reliable. To address these two shortcomings, two proposals are offered: (1) an unobtrusive mechanism to document input parameter uncertainties in situ and (2) an adaptation of the Scientific Method to numerical model development and deployment. Because these two steps require changes in the computational simulation community to bear fruit, they are presented in terms of the Beckhard-Harris-Gleicher change model.
Astro-WISE: Chaining to the Universe
NASA Astrophysics Data System (ADS)
Valentijn, E. A.; McFarland, J. P.; Snigula, J.; Begeman, K. G.; Boxhoorn, D. R.; Rengelink, R.; Helmich, E.; Heraudeau, P.; Verdoes Kleijn, G.; Vermeij, R.; Vriend, W.-J.; Tempelaar, M. J.; Deul, E.; Kuijken, K.; Capaccioli, M.; Silvotti, R.; Bender, R.; Neeser, M.; Saglia, R.; Bertin, E.; Mellier, Y.
2007-10-01
The recent explosion of recorded digital data and its processed derivatives threatens to overwhelm researchers when analysing their experimental data or looking up data items in archives and file systems. While current hardware developments allow the acquisition, processing and storage of hundreds of terabytes of data at the cost of a modern sports car, the software systems to handle these data are lagging behind. This problem is very general and is well recognized by various scientific communities; several large projects have been initiated, e.g., DATAGRID/EGEE {http://www.eu-egee.org/} federates compute and storage power over the high-energy physical community, while the international astronomical community is building an Internet geared Virtual Observatory {http://www.euro-vo.org/pub/} (Padovani 2006) connecting archival data. These large projects either focus on a specific distribution aspect or aim to connect many sub-communities and have a relatively long trajectory for setting standards and a common layer. Here, we report first light of a very different solution (Valentijn & Kuijken 2004) to the problem initiated by a smaller astronomical IT community. It provides an abstract scientific information layer which integrates distributed scientific analysis with distributed processing and federated archiving and publishing. By designing new abstractions and mixing in old ones, a Science Information System with fully scalable cornerstones has been achieved, transforming data systems into knowledge systems. This break-through is facilitated by the full end-to-end linking of all dependent data items, which allows full backward chaining from the observer/researcher to the experiment. Key is the notion that information is intrinsic in nature and thus is the data acquired by a scientific experiment. The new abstraction is that software systems guide the user to that intrinsic information by forcing full backward and forward chaining in the data modelling.
Zanin, Massimiliano; Chorbev, Ivan; Stres, Blaz; Stalidzans, Egils; Vera, Julio; Tieri, Paolo; Castiglione, Filippo; Groen, Derek; Zheng, Huiru; Baumbach, Jan; Schmid, Johannes A; Basilio, José; Klimek, Peter; Debeljak, Nataša; Rozman, Damjana; Schmidt, Harald H H W
2017-12-05
Systems medicine holds many promises, but has so far provided only a limited number of proofs of principle. To address this road block, possible barriers and challenges of translating systems medicine into clinical practice need to be identified and addressed. The members of the European Cooperation in Science and Technology (COST) Action CA15120 Open Multiscale Systems Medicine (OpenMultiMed) wish to engage the scientific community of systems medicine and multiscale modelling, data science and computing, to provide their feedback in a structured manner. This will result in follow-up white papers and open access resources to accelerate the clinical translation of systems medicine. © The Author 2017. Published by Oxford University Press.
Institute for scientific computing research;fiscal year 1999 annual report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keyes, D
2000-03-28
Large-scale scientific computation, and all of the disciplines that support it and help to validate it, have been placed at the focus of Lawrence Livermore National Laboratory by the Accelerated Strategic Computing Initiative (ASCI). The Laboratory operates the computer with the highest peak performance in the world and has undertaken some of the largest and most compute-intensive simulations ever performed. Computers at the architectural extremes, however, are notoriously difficult to use efficiently. Even such successes as the Laboratory's two Bell Prizes awarded in November 1999 only emphasize the need for much better ways of interacting with the results of large-scalemore » simulations. Advances in scientific computing research have, therefore, never been more vital to the core missions of the Laboratory than at present. Computational science is evolving so rapidly along every one of its research fronts that to remain on the leading edge, the Laboratory must engage researchers at many academic centers of excellence. In FY 1999, the Institute for Scientific Computing Research (ISCR) has expanded the Laboratory's bridge to the academic community in the form of collaborative subcontracts, visiting faculty, student internships, a workshop, and a very active seminar series. ISCR research participants are integrated almost seamlessly with the Laboratory's Center for Applied Scientific Computing (CASC), which, in turn, addresses computational challenges arising throughout the Laboratory. Administratively, the ISCR flourishes under the Laboratory's University Relations Program (URP). Together with the other four Institutes of the URP, it must navigate a course that allows the Laboratory to benefit from academic exchanges while preserving national security. Although FY 1999 brought more than its share of challenges to the operation of an academic-like research enterprise within the context of a national security laboratory, the results declare the challenges well met and well worth the continued effort. A change of administration for the ISCR occurred during FY 1999. Acting Director John Fitzgerald retired from LLNL in August after 35 years of service, including the last two at helm of the ISCR. David Keyes, who has been a regular visitor in conjunction with ASCI scalable algorithms research since October 1997, overlapped with John for three months and serves half-time as the new Acting Director.« less
Exploring Two Approaches for an End-to-End Scientific Analysis Workflow
Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; ...
2015-12-23
The advance of the scientific discovery process is accomplished by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally,more » it is important for scientists to be able to share their workflows with collaborators. Moreover we have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC), the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In our paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.« less
GABBs: Cyberinfrastructure for Self-Service Geospatial Data Exploration, Computation, and Sharing
NASA Astrophysics Data System (ADS)
Song, C. X.; Zhao, L.; Biehl, L. L.; Merwade, V.; Villoria, N.
2016-12-01
Geospatial data are present everywhere today with the proliferation of location-aware computing devices. This is especially true in the scientific community where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example, in modeling, data analysis and visualization, must still overcome the barriers of specialized software and expertise among other challenges. In addressing these needs, the Geospatial data Analysis Building Blocks (GABBs) project aims at building geospatial modeling, data analysis and visualization capabilities in an open source web platform, HUBzero. Funded by NSF's Data Infrastructure Building Blocks initiative, GABBs is creating a geospatial data architecture that integrates spatial data management, mapping and visualization, and interfaces in the HUBzero platform for scientific collaborations. The geo-rendering enabled Rappture toolkit, a generic Python mapping library, geospatial data exploration and publication tools, and an integrated online geospatial data management solution are among the software building blocks from the project. The GABBS software will be available through Amazon's AWS Marketplace VM images and open source. Hosting services are also available to the user community. The outcome of the project will enable researchers and educators to self-manage their scientific data, rapidly create GIS-enable tools, share geospatial data and tools on the web, and build dynamic workflows connecting data and tools, all without requiring significant software development skills, GIS expertise or IT administrative privileges. This presentation will describe the GABBs architecture, toolkits and libraries, and showcase the scientific use cases that utilize GABBs capabilities, as well as the challenges and solutions for GABBs to interoperate with other cyberinfrastructure platforms.
Computer-assisted instruction: a library service for the community teaching hospital.
McCorkel, J; Cook, V
1986-04-01
This paper reports on five years of experience with computer-assisted instruction (CAI) at Winthrop-University Hospital, a major affiliate of the SUNY at Stony Brook School of Medicine. It compares CAI programs available from Ohio State University and Massachusetts General Hospital (accessed by telephone and modem), and software packages purchased from the Health Sciences Consortium (MED-CAPS) and Scientific American (DISCOTEST). The comparison documents one library's experience of the cost of these programs and the use made of them by medical students, house staff, and attending physicians. It describes the space allocated for necessary equipment, as well as the marketing of CAI. Finally, in view of the decision of the National Board of Medical Examiners to administer the Part III examination on computer (the so-called CBX) starting in 1988, the paper speculates on the future importance of CAI in the community teaching hospital.
26 CFR 53.4942(a)-2 - Computation of undistributed income.
Code of Federal Regulations, 2012 CFR
2012-04-01
... is within a larger complex of endeavors which makes available to the scientific community and the... shares of the stock of M Corporation. M stock is regularly traded on the New York Stock Exchange. U consistently follows a practice of valuing its 1,000 shares of M stock on the last trading day of each month...
26 CFR 53.4942(a)-2 - Computation of undistributed income.
Code of Federal Regulations, 2014 CFR
2014-04-01
... is within a larger complex of endeavors which makes available to the scientific community and the... shares of the stock of M Corporation. M stock is regularly traded on the New York Stock Exchange. U consistently follows a practice of valuing its 1,000 shares of M stock on the last trading day of each month...
26 CFR 53.4942(a)-2 - Computation of undistributed income.
Code of Federal Regulations, 2013 CFR
2013-04-01
... is within a larger complex of endeavors which makes available to the scientific community and the... shares of the stock of M Corporation. M stock is regularly traded on the New York Stock Exchange. U consistently follows a practice of valuing its 1,000 shares of M stock on the last trading day of each month...
The International Symposium on Grids and Clouds
NASA Astrophysics Data System (ADS)
The International Symposium on Grids and Clouds (ISGC) 2012 will be held at Academia Sinica in Taipei from 26 February to 2 March 2012, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). 2012 is the decennium anniversary of the ISGC which over the last decade has tracked the convergence, collaboration and innovation of individual researchers across the Asia Pacific region to a coherent community. With the continuous support and dedication from the delegates, ISGC has provided the primary international distributed computing platform where distinguished researchers and collaboration partners from around the world share their knowledge and experiences. The last decade has seen the wide-scale emergence of e-Infrastructure as a critical asset for the modern e-Scientist. The emergence of large-scale research infrastructures and instruments that has produced a torrent of electronic data is forcing a generational change in the scientific process and the mechanisms used to analyse the resulting data deluge. No longer can the processing of these vast amounts of data and production of relevant scientific results be undertaken by a single scientist. Virtual Research Communities that span organisations around the world, through an integrated digital infrastructure that connects the trust and administrative domains of multiple resource providers, have become critical in supporting these analyses. Topics covered in ISGC 2012 include: High Energy Physics, Biomedicine & Life Sciences, Earth Science, Environmental Changes and Natural Disaster Mitigation, Humanities & Social Sciences, Operations & Management, Middleware & Interoperability, Security and Networking, Infrastructure Clouds & Virtualisation, Business Models & Sustainability, Data Management, Distributed Volunteer & Desktop Grid Computing, High Throughput Computing, and High Performance, Manycore & GPU Computing.
Science Gateways, Scientific Workflows and Open Community Software
NASA Astrophysics Data System (ADS)
Pierce, M. E.; Marru, S.
2014-12-01
Science gateways and scientific workflows occupy different ends of the spectrum of user-focused cyberinfrastructure. Gateways, sometimes called science portals, provide a way for enabling large numbers of users to take advantage of advanced computing resources (supercomputers, advanced storage systems, science clouds) by providing Web and desktop interfaces and supporting services. Scientific workflows, at the other end of the spectrum, support advanced usage of cyberinfrastructure that enable "power users" to undertake computational experiments that are not easily done through the usual mechanisms (managing simulations across multiple sites, for example). Despite these different target communities, gateways and workflows share many similarities and can potentially be accommodated by the same software system. For example, pipelines to process InSAR imagery sets or to datamine GPS time series data are workflows. The results and the ability to make downstream products may be made available through a gateway, and power users may want to provide their own custom pipelines. In this abstract, we discuss our efforts to build an open source software system, Apache Airavata, that can accommodate both gateway and workflow use cases. Our approach is general, and we have applied the software to problems in a number of scientific domains. In this talk, we discuss our applications to usage scenarios specific to earth science, focusing on earthquake physics examples drawn from the QuakSim.org and GeoGateway.org efforts. We also examine the role of the Apache Software Foundation's open community model as a way to build up common commmunity codes that do not depend upon a single "owner" to sustain. Pushing beyond open source software, we also see the need to provide gateways and workflow systems as cloud services. These services centralize operations, provide well-defined programming interfaces, scale elastically, and have global-scale fault tolerance. We discuss our work providing Apache Airavata as a hosted service to provide these features.
Application of infrared thermography in computer aided diagnosis
NASA Astrophysics Data System (ADS)
Faust, Oliver; Rajendra Acharya, U.; Ng, E. Y. K.; Hong, Tan Jen; Yu, Wenwei
2014-09-01
The invention of thermography, in the 1950s, posed a formidable problem to the research community: What is the relationship between disease and heat radiation captured with Infrared (IR) cameras? The research community responded with a continuous effort to find this crucial relationship. This effort was aided by advances in processing techniques, improved sensitivity and spatial resolution of thermal sensors. However, despite this progress fundamental issues with this imaging modality still remain. The main problem is that the link between disease and heat radiation is complex and in many cases even non-linear. Furthermore, the change in heat radiation as well as the change in radiation pattern, which indicate disease, is minute. On a technical level, this poses high requirements on image capturing and processing. On a more abstract level, these problems lead to inter-observer variability and on an even more abstract level they lead to a lack of trust in this imaging modality. In this review, we adopt the position that these problems can only be solved through a strict application of scientific principles and objective performance assessment. Computing machinery is inherently objective; this helps us to apply scientific principles in a transparent way and to assess the performance results. As a consequence, we aim to promote thermography based Computer-Aided Diagnosis (CAD) systems. Another benefit of CAD systems comes from the fact that the diagnostic accuracy is linked to the capability of the computing machinery and, in general, computers become ever more potent. We predict that a pervasive application of computers and networking technology in medicine will help us to overcome the shortcomings of any single imaging modality and this will pave the way for integrated health care systems which maximize the quality of patient care.
Agent-based model for the h-index - exact solution
NASA Astrophysics Data System (ADS)
Żogała-Siudem, Barbara; Siudem, Grzegorz; Cena, Anna; Gagolewski, Marek
2016-01-01
Hirsch's h-index is perhaps the most popular citation-based measure of scientific excellence. In 2013, Ionescu and Chopard proposed an agent-based model describing a process for generating publications and citations in an abstract scientific community [G. Ionescu, B. Chopard, Eur. Phys. J. B 86, 426 (2013)]. Within such a framework, one may simulate a scientist's activity, and - by extension - investigate the whole community of researchers. Even though the Ionescu and Chopard model predicts the h-index quite well, the authors provided a solution based solely on simulations. In this paper, we complete their results with exact, analytic formulas. What is more, by considering a simplified version of the Ionescu-Chopard model, we obtained a compact, easy to compute formula for the h-index. The derived approximate and exact solutions are investigated on a simulated and real-world data sets.
Report on Computing and Networking in the Space Science Laboratory by the SSL Computer Committee
NASA Technical Reports Server (NTRS)
Gallagher, D. L. (Editor)
1993-01-01
The Space Science Laboratory (SSL) at Marshall Space Flight Center is a multiprogram facility. Scientific research is conducted in four discipline areas: earth science and applications, solar-terrestrial physics, astrophysics, and microgravity science and applications. Representatives from each of these discipline areas participate in a Laboratory computer requirements committee, which developed this document. The purpose is to establish and discuss Laboratory objectives for computing and networking in support of science. The purpose is also to lay the foundation for a collective, multiprogram approach to providing these services. Special recognition is given to the importance of the national and international efforts of our research communities toward the development of interoperable, network-based computer applications.
Earth System Grid II, Turning Climate Datasets into Community Resources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Middleton, Don
2006-08-01
The Earth System Grid (ESG) II project, funded by the Department of Energy’s Scientific Discovery through Advanced Computing program, has transformed climate data into community resources. ESG II has accomplished this goal by creating a virtual collaborative environment that links climate centers and users around the world to models and data via a computing Grid, which is based on the Department of Energy’s supercomputing resources and the Internet. Our project’s success stems from partnerships between climate researchers and computer scientists to advance basic and applied research in the terrestrial, atmospheric, and oceanic sciences. By interfacing with other climate science projects,more » we have learned that commonly used methods to manage and remotely distribute data among related groups lack infrastructure and under-utilize existing technologies. Knowledge and expertise gained from ESG II have helped the climate community plan strategies to manage a rapidly growing data environment more effectively. Moreover, approaches and technologies developed under the ESG project have impacted datasimulation integration in other disciplines, such as astrophysics, molecular biology and materials science.« less
NASA Astrophysics Data System (ADS)
Lamarque, J. F.
2016-12-01
In this talk, we will discuss the upcoming release of CESM2 and the computational and scientific challenges encountered in the process. We will then discuss upcoming new opportunities in development and applications of Earth System Models; in particular, we will discuss additional ways in which the university community can contribute to CESM.
Distributed storage and cloud computing: a test case
NASA Astrophysics Data System (ADS)
Piano, S.; Delia Ricca, G.
2014-06-01
Since 2003 the computing farm hosted by the INFN Tier3 facility in Trieste supports the activities of many scientific communities. Hundreds of jobs from 45 different VOs, including those of the LHC experiments, are processed simultaneously. Given that normally the requirements of the different computational communities are not synchronized, the probability that at any given time the resources owned by one of the participants are not fully utilized is quite high. A balanced compensation should in principle allocate the free resources to other users, but there are limits to this mechanism. In fact, the Trieste site may not hold the amount of data needed to attract enough analysis jobs, and even in that case there could be a lack of bandwidth for their access. The Trieste ALICE and CMS computing groups, in collaboration with other Italian groups, aim to overcome the limitations of existing solutions using two approaches: sharing the data among all the participants taking full advantage of GARR-X wide area networks (10 GB/s) and integrating the resources dedicated to batch analysis with the ones reserved for dynamic interactive analysis, through modern solutions as cloud computing.
NASA Astrophysics Data System (ADS)
Schildhauer, M.; Jones, M. B.; Bolker, B.; Lenhardt, W. C.; Hampton, S. E.; Idaszak, R.; Rebich Hespanha, S.; Ahalt, S.; Christopherson, L.
2014-12-01
Continuing advances in computational capabilities, access to Big Data, and virtual collaboration technologies are creating exciting new opportunities for accomplishing Earth science research at finer resolutions, with much broader scope, using powerful modeling and analytical approaches that were unachievable just a few years ago. Yet, there is a perceptible lag in the abilities of the research community to capitalize on these new possibilities, due to lacking the relevant skill-sets, especially with regards to multi-disciplinary and integrative investigations that involve active collaboration. UC Santa Barbara's National Center for Ecological Analysis and Synthesis (NCEAS), and the University of North Carolina's Renaissance Computing Institute (RENCI), were recipients of NSF OCI S2I2 "Conceptualization awards", charged with helping define the needs of the research community relative to enabling science and education through "sustained software infrastructure". Over the course of our activities, a consistent request from Earth scientists was for "better training in software that enables more effective, reproducible research." This community-based feedback led to creation of an "Open Science for Synthesis" Institute— a innovative, three-week, bi-coastal training program for early career researchers. We provided a mix of lectures, hands-on exercises, and working group experience on topics including: data discovery and preservation; code creation, management, sharing, and versioning; scientific workflow documentation and reproducibility; statistical and machine modeling techniques; virtual collaboration mechanisms; and methods for communicating scientific results. All technologies and quantitative tools presented were suitable for advancing open, collaborative, and reproducible synthesis research. In this talk, we will report on the lessons learned from running this ambitious training program, that involved coordinating classrooms among two remote sites, and included developing original synthesis research activities as part of the course. We also report on the feedback provided by participants as to the learning approaches and topical issues they found most engaging, and why.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Hack, James; Riley, Katherine
The mission of the U.S. Department of Energy Office of Science (DOE SC) is the delivery of scientific discoveries and major scientific tools to transform our understanding of nature and to advance the energy, economic, and national security missions of the United States. To achieve these goals in today’s world requires investments in not only the traditional scientific endeavors of theory and experiment, but also in computational science and the facilities that support large-scale simulation and data analysis. The Advanced Scientific Computing Research (ASCR) program addresses these challenges in the Office of Science. ASCR’s mission is to discover, develop, andmore » deploy computational and networking capabilities to analyze, model, simulate, and predict complex phenomena important to DOE. ASCR supports research in computational science, three high-performance computing (HPC) facilities — the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory and Leadership Computing Facilities at Argonne (ALCF) and Oak Ridge (OLCF) National Laboratories — and the Energy Sciences Network (ESnet) at Berkeley Lab. ASCR is guided by science needs as it develops research programs, computers, and networks at the leading edge of technologies. As we approach the era of exascale computing, technology changes are creating challenges for science programs in SC for those who need to use high performance computing and data systems effectively. Numerous significant modifications to today’s tools and techniques will be needed to realize the full potential of emerging computing systems and other novel computing architectures. To assess these needs and challenges, ASCR held a series of Exascale Requirements Reviews in 2015–2017, one with each of the six SC program offices,1 and a subsequent Crosscut Review that sought to integrate the findings from each. Participants at the reviews were drawn from the communities of leading domain scientists, experts in computer science and applied mathematics, ASCR facility staff, and DOE program managers in ASCR and the respective program offices. The purpose of these reviews was to identify mission-critical scientific problems within the DOE Office of Science (including experimental facilities) and determine the requirements for the exascale ecosystem that would be needed to address those challenges. The exascale ecosystem includes exascale computing systems, high-end data capabilities, efficient software at scale, libraries, tools, and other capabilities. This effort will contribute to the development of a strategic roadmap for ASCR compute and data facility investments and will help the ASCR Facility Division establish partnerships with Office of Science stakeholders. It will also inform the Office of Science research needs and agenda. The results of the six reviews have been published in reports available on the web at http://exascaleage.org/. This report presents a summary of the individual reports and of common and crosscutting findings, and it identifies opportunities for productive collaborations among the DOE SC program offices.« less
Patient Privacy in the Era of Big Data.
Kayaalp, Mehmet
2018-01-20
Privacy was defined as a fundamental human right in the Universal Declaration of Human Rights at the 1948 United Nations General Assembly. However, there is still no consensus on what constitutes privacy. In this review, we look at the evolution of privacy as a concept from the era of Hippocrates to the era of social media and big data. To appreciate the modern measures of patient privacy protection and correctly interpret the current regulatory framework in the United States, we need to analyze and understand the concepts of individually identifiable information, individually identifiable health information, protected health information, and de-identification. The Privacy Rule of the Health Insurance Portability and Accountability Act defines the regulatory framework and casts a balance between protective measures and access to health information for secondary (scientific) use. The rule defines the conditions when health information is protected by law and how protected health information can be de-identified for secondary use. With the advents of artificial intelligence and computational linguistics, computational text de-identification algorithms produce de-identified results nearly as well as those produced by human experts, but much faster, more consistently and basically for free. Modern clinical text de-identification systems now pave the road to big data and enable scientists to access de-identified clinical information while firmly protecting patient privacy. However, clinical text de-identification is not a perfect process. In order to maximize the protection of patient privacy and to free clinical and scientific information from the confines of electronic healthcare systems, all stakeholders, including patients, health institutions and institutional review boards, scientists and the scientific communities, as well as regulatory and law enforcement agencies must collaborate closely. On the one hand, public health laws and privacy regulations define rules and responsibilities such as requesting and granting only the amount of health information that is necessary for the scientific study. On the other hand, developers of de-identification systems provide guidelines to use different modes of operations to maximize the effectiveness of their tools and the success of de-identification. Institutions with clinical repositories need to follow these rules and guidelines closely to successfully protect patient privacy. To open the gates of big data to scientific communities, healthcare institutions need to be supported in their de-identification and data sharing efforts by the public, scientific communities, and local, state, and federal legislators and government agencies.
Patient Privacy in the Era of Big Data
Kayaalp, Mehmet
2018-01-01
Privacy was defined as a fundamental human right in the Universal Declaration of Human Rights at the 1948 United Nations General Assembly. However, there is still no consensus on what constitutes privacy. In this review, we look at the evolution of privacy as a concept from the era of Hippocrates to the era of social media and big data. To appreciate the modern measures of patient privacy protection and correctly interpret the current regulatory framework in the United States, we need to analyze and understand the concepts of individually identifiable information, individually identifiable health information, protected health information, and de-identification. The Privacy Rule of the Health Insurance Portability and Accountability Act defines the regulatory framework and casts a balance between protective measures and access to health information for secondary (scientific) use. The rule defines the conditions when health information is protected by law and how protected health information can be de-identified for secondary use. With the advents of artificial intelligence and computational linguistics, computational text de-identification algorithms produce de-identified results nearly as well as those produced by human experts, but much faster, more consistently and basically for free. Modern clinical text de-identification systems now pave the road to big data and enable scientists to access de-identified clinical information while firmly protecting patient privacy. However, clinical text de-identification is not a perfect process. In order to maximize the protection of patient privacy and to free clinical and scientific information from the confines of electronic healthcare systems, all stakeholders, including patients, health institutions and institutional review boards, scientists and the scientific communities, as well as regulatory and law enforcement agencies must collaborate closely. On the one hand, public health laws and privacy regulations define rules and responsibilities such as requesting and granting only the amount of health information that is necessary for the scientific study. On the other hand, developers of de-identification systems provide guidelines to use different modes of operations to maximize the effectiveness of their tools and the success of de-identification. Institutions with clinical repositories need to follow these rules and guidelines closely to successfully protect patient privacy. To open the gates of big data to scientific communities, healthcare institutions need to be supported in their de-identification and data sharing efforts by the public, scientific communities, and local, state, and federal legislators and government agencies. PMID:28903886
Measuring mumbo jumbo: A preliminary quantification of the use of jargon in science communication.
Sharon, Aviv J; Baram-Tsabari, Ayelet
2014-07-01
Leaders of the scientific community encourage scientists to learn effective science communication, including honing the skill to discuss science with little professional jargon. However, avoiding jargon is not trivial for scientists for several reasons, and this demands special attention in teaching and evaluation. Despite this, no standard measurement for the use of scientific jargon in speech has been developed to date. Here a standard yardstick for the use of scientific jargon in spoken texts, using a computational linguistics approach, is proposed. Analyzed transcripts included academic speech, scientific TEDTalks, and communication about the discovery of a Higgs-like boson at CERN. Findings suggest that scientists use less jargon in communication with a general audience than in communication with peers, but not always less obscure jargon. These findings may lay the groundwork for evaluating the use of jargon.
A uniform approach for programming distributed heterogeneous computing systems
Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas
2014-01-01
Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater’s performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations. PMID:25844015
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.
2008-05-04
This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroicmore » effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.« less
A uniform approach for programming distributed heterogeneous computing systems.
Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas
2014-12-01
Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater's performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations.
Predicting future discoveries from current scientific literature.
Petrič, Ingrid; Cestnik, Bojan
2014-01-01
Knowledge discovery in biomedicine is a time-consuming process starting from the basic research, through preclinical testing, towards possible clinical applications. Crossing of conceptual boundaries is often needed for groundbreaking biomedical research that generates highly inventive discoveries. We demonstrate the ability of a creative literature mining method to advance valuable new discoveries based on rare ideas from existing literature. When emerging ideas from scientific literature are put together as fragments of knowledge in a systematic way, they may lead to original, sometimes surprising, research findings. If enough scientific evidence is already published for the association of such findings, they can be considered as scientific hypotheses. In this chapter, we describe a method for the computer-aided generation of such hypotheses based on the existing scientific literature. Our literature-based discovery of NF-kappaB with its possible connections to autism was recently approved by scientific community, which confirms the ability of our literature mining methodology to accelerate future discoveries based on rare ideas from existing literature.
A suite of exercises for verifying dynamic earthquake rupture codes
Harris, Ruth A.; Barall, Michael; Aagaard, Brad T.; Ma, Shuo; Roten, Daniel; Olsen, Kim B.; Duan, Benchun; Liu, Dunyu; Luo, Bin; Bai, Kangchen; Ampuero, Jean-Paul; Kaneko, Yoshihiro; Gabriel, Alice-Agnes; Duru, Kenneth; Ulrich, Thomas; Wollherr, Stephanie; Shi, Zheqiang; Dunham, Eric; Bydlon, Sam; Zhang, Zhenguo; Chen, Xiaofei; Somala, Surendra N.; Pelties, Christian; Tago, Josue; Cruz-Atienza, Victor Manuel; Kozdon, Jeremy; Daub, Eric; Aslam, Khurram; Kase, Yuko; Withers, Kyle; Dalguer, Luis
2018-01-01
We describe a set of benchmark exercises that are designed to test if computer codes that simulate dynamic earthquake rupture are working as intended. These types of computer codes are often used to understand how earthquakes operate, and they produce simulation results that include earthquake size, amounts of fault slip, and the patterns of ground shaking and crustal deformation. The benchmark exercises examine a range of features that scientists incorporate in their dynamic earthquake rupture simulations. These include implementations of simple or complex fault geometry, off‐fault rock response to an earthquake, stress conditions, and a variety of formulations for fault friction. Many of the benchmarks were designed to investigate scientific problems at the forefronts of earthquake physics and strong ground motions research. The exercises are freely available on our website for use by the scientific community.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clouse, C. J.; Edwards, M. J.; McCoy, M. G.
2015-07-07
Through its Advanced Scientific Computing (ASC) and Inertial Confinement Fusion (ICF) code development efforts, Lawrence Livermore National Laboratory (LLNL) provides a world leading numerical simulation capability for the National HED/ICF program in support of the Stockpile Stewardship Program (SSP). In addition the ASC effort provides high performance computing platform capabilities upon which these codes are run. LLNL remains committed to, and will work with, the national HED/ICF program community to help insure numerical simulation needs are met and to make those capabilities available, consistent with programmatic priorities and available resources.
Sinogram-based adaptive iterative reconstruction for sparse view x-ray computed tomography
NASA Astrophysics Data System (ADS)
Trinca, D.; Zhong, Y.; Wang, Y.-Z.; Mamyrbayev, T.; Libin, E.
2016-10-01
With the availability of more powerful computing processors, iterative reconstruction algorithms have recently been successfully implemented as an approach to achieving significant dose reduction in X-ray CT. In this paper, we propose an adaptive iterative reconstruction algorithm for X-ray CT, that is shown to provide results comparable to those obtained by proprietary algorithms, both in terms of reconstruction accuracy and execution time. The proposed algorithm is thus provided for free to the scientific community, for regular use, and for possible further optimization.
Havery Mudd 2014-2015 Computer Science Conduit Clinic Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aspesi, G; Bai, J; Deese, R
2015-05-12
Conduit, a new open-source library developed at Lawrence Livermore National Laboratories, provides a C++ application programming interface (API) to describe and access scientific data. Conduit’s primary use is for inmemory data exchange in high performance computing (HPC) applications. Our team tested and improved Conduit to make it more appealing to potential adopters in the HPC community. We extended Conduit’s capabilities by prototyping four libraries: one for parallel communication using MPI, one for I/O functionality, one for aggregating performance data, and one for data visualization.
A Federal Vision for Future Computing: A Nanotechnology-Inspired Grand Challenge
2016-07-29
Science Foundation (NSF), Department of Defense (DOD), National Institute of Standards and Technology (NIST), Intelligence Community (IC) Introduction...multiple Federal agencies: • Intelligent big data sensors that act autonomously and are programmable via the network for increased flexibility, and... intelligence for scientific discovery enabled by rapid extreme-scale data analysis, capable of understanding and making sense of results and thereby
From computer-assisted intervention research to clinical impact: The need for a holistic approach.
Ourselin, Sébastien; Emberton, Mark; Vercauteren, Tom
2016-10-01
The early days of the field of medical image computing (MIC) and computer-assisted intervention (CAI), when publishing a strong self-contained methodological algorithm was enough to produce impact, are over. As a community, we now have substantial responsibility to translate our scientific progresses into improved patient care. In the field of computer-assisted interventions, the emphasis is also shifting from the mere use of well-known established imaging modalities and position trackers to the design and combination of innovative sensing, elaborate computational models and fine-grained clinical workflow analysis to create devices with unprecedented capabilities. The barriers to translating such devices in the complex and understandably heavily regulated surgical and interventional environment can seem daunting. Whether we leave the translation task mostly to our industrial partners or welcome, as researchers, an important share of it is up to us. We argue that embracing the complexity of surgical and interventional sciences is mandatory to the evolution of the field. Being able to do so requires large-scale infrastructure and a critical mass of expertise that very few research centres have. In this paper, we emphasise the need for a holistic approach to computer-assisted interventions where clinical, scientific, engineering and regulatory expertise are combined as a means of moving towards clinical impact. To ensure that the breadth of infrastructure and expertise required for translational computer-assisted intervention research does not lead to a situation where the field advances only thanks to a handful of exceptionally large research centres, we also advocate that solutions need to be designed to lower the barriers to entry. Inspired by fields such as particle physics and astronomy, we claim that centralised very large innovation centres with state of the art technology and health technology assessment capabilities backed by core support staff and open interoperability standards need to be accessible to the wider computer-assisted intervention research community. Copyright © 2016. Published by Elsevier B.V.
Kraemer Diaz, Anne E.; Spears Johnson, Chaya R.; Arcury, Thomas A.
2013-01-01
Community-based participatory research (CBPR) has become essential in health disparities and environmental justice research; however, the scientific integrity of CBPR projects has become a concern. Some concerns, such as appropriate research training, lack of access to resources and finances, have been discussed as possibly limiting the scientific integrity of a project. Prior to understanding what threatens scientific integrity in CBPR, it is vital to understand what scientific integrity means for the professional and community investigators who are involved in CBPR. This analysis explores the interpretation of scientific integrity in CBPR among 74 professional and community research team members from of 25 CBPR projects in nine states in the southeastern United States in 2012. It describes the basic definition for scientific integrity and then explores variations in the interpretation of scientific integrity in CBPR. Variations in the interpretations were associated with team member identity as professional or community investigators. Professional investigators understood scientific integrity in CBPR as either conceptually or logistically flexible, as challenging to balance with community needs, or no different than traditional scientific integrity. Community investigators interpret other factors as important in scientific integrity, such as trust, accountability, and overall benefit to the community. This research demonstrates that the variations in the interpretation of scientific integrity in CBPR call for a new definition of scientific integrity in CBPR that takes into account the understanding and needs of all investigators. PMID:24161098
Integrating multiple scientific computing needs via a Private Cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Brunetti, R.; Lusso, S.; Vallero, S.
2014-06-01
In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud facility eases the management and maintenance of heterogeneous use cases such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and others. It allows to dynamically and efficiently allocate resources to any application and to tailor the virtual machines according to the applications' requirements. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques; for example, rolling updates can be performed easily and minimizing the downtime. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 site and a dynamically expandable PROOF-based Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The Private Cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem (used in two different configurations for worker- and service-class hypervisors) and the OpenWRT Linux distribution (used for network virtualization). A future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and by using mainstream contextualization tools like CloudInit.
Flexible workflow sharing and execution services for e-scientists
NASA Astrophysics Data System (ADS)
Kacsuk, Péter; Terstyanszky, Gábor; Kiss, Tamas; Sipos, Gergely
2013-04-01
The sequence of computational and data manipulation steps required to perform a specific scientific analysis is called a workflow. Workflows that orchestrate data and/or compute intensive applications on Distributed Computing Infrastructures (DCIs) recently became standard tools in e-science. At the same time the broad and fragmented landscape of workflows and DCIs slows down the uptake of workflow-based work. The development, sharing, integration and execution of workflows is still a challenge for many scientists. The FP7 "Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs" (SHIWA) project significantly improved the situation, with a simulation platform that connects different workflow systems, different workflow languages, different DCIs and workflows into a single, interoperable unit. The SHIWA Simulation Platform is a service package, already used by various scientific communities, and used as a tool by the recently started ER-flow FP7 project to expand the use of workflows among European scientists. The presentation will introduce the SHIWA Simulation Platform and the services that ER-flow provides based on the platform to space and earth science researchers. The SHIWA Simulation Platform includes: 1. SHIWA Repository: A database where workflows and meta-data about workflows can be stored. The database is a central repository to discover and share workflows within and among communities . 2. SHIWA Portal: A web portal that is integrated with the SHIWA Repository and includes a workflow executor engine that can orchestrate various types of workflows on various grid and cloud platforms. 3. SHIWA Desktop: A desktop environment that provides similar access capabilities than the SHIWA Portal, however it runs on the users' desktops/laptops instead of a portal server. 4. Workflow engines: the ASKALON, Galaxy, GWES, Kepler, LONI Pipeline, MOTEUR, Pegasus, P-GRADE, ProActive, Triana, Taverna and WS-PGRADE workflow engines are already integrated with the execution engine of the SHIWA Portal. Other engines can be added when required. Through the SHIWA Portal one can define and run simulations on the SHIWA Virtual Organisation, an e-infrastructure that gathers computing and data resources from various DCIs, including the European Grid Infrastructure. The Portal via third party workflow engines provides support for the most widely used academic workflow engines and it can be extended with other engines on demand. Such extensions translate between workflow languages and facilitate the nesting of workflows into larger workflows even when those are written in different languages and require different interpreters for execution. Through the workflow repository and the portal lonely scientists and scientific collaborations can share and offer workflows for reuse and execution. Given the integrated nature of the SHIWA Simulation Platform the shared workflows can be executed online, without installing any special client environment and downloading workflows. The FP7 "Building a European Research Community through Interoperable Workflows and Data" (ER-flow) project disseminates the achievements of the SHIWA project and use these achievements to build workflow user communities across Europe. ER-flow provides application supports to research communities within and beyond the project consortium to develop, share and run workflows with the SHIWA Simulation Platform.
Kraemer Diaz, Anne E.; Spears Johnson, Chaya R.; Arcury, Thomas A.
2015-01-01
Scientific integrity is necessary for strong science; yet many variables can influence scientific integrity. In traditional research, some common threats are the pressure to publish, competition for funds, and career advancement. Community-based participatory research (CBPR) provides a different context for scientific integrity with additional and unique concerns. Understanding the perceptions that promote or discourage scientific integrity in CBPR as identified by professional and community investigators is essential to promoting the value of CBPR. This analysis explores the perceptions that facilitate scientific integrity in CBPR as well as the barriers among a sample of 74 professional and community CBPR investigators from 25 CBPR projects in nine states in the southeastern United States in 2012. There were variations in perceptions associated with team member identity as professional or community investigators. Perceptions identified to promote and discourage scientific integrity in CBPR by professional and community investigators were external pressures, community participation, funding, quality control and supervision, communication, training, and character and trust. Some perceptions such as communication and training promoted scientific integrity whereas other perceptions, such as a lack of funds and lack of trust could discourage scientific integrity. These results demonstrate that one of the most important perceptions in maintaining scientific integrity in CBPR is active community participation, which enables a co-responsibility by scientists and community members to provide oversight for scientific integrity. Credible CBPR science is crucial to empower the vulnerable communities to be heard by those in positions of power and policy making. PMID:25588933
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kerstin
Scientific user facilities—particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more—operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity inmore » the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kersten
Scientific user facilities---particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more---operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists that require real-time data analysis feedback, to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity inmore » the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.« less
NASA Astrophysics Data System (ADS)
Chulaki, A.; Kuznetsova, M. M.; Rastaetter, L.; MacNeice, P. J.; Shim, J. S.; Pulkkinen, A. A.; Taktakishvili, A.; Mays, M. L.; Mendoza, A. M. M.; Zheng, Y.; Mullinix, R.; Collado-Vega, Y. M.; Maddox, M. M.; Pembroke, A. D.; Wiegand, C.
2015-12-01
Community Coordinated Modeling Center (CCMC) is a NASA affiliated interagency partnership with the primary goal of aiding the transition of modern space science models into space weather forecasting while supporting space science research. Additionally, over the past ten years it has established itself as a global space science education resource supporting undergraduate and graduate education and research, and spreading space weather awareness worldwide. A unique combination of assets, capabilities and close ties to the scientific and educational communities enable this small group to serve as a hub for raising generations of young space scientists and engineers. CCMC resources are publicly available online, providing unprecedented global access to the largest collection of modern space science models (developed by the international research community). CCMC has revolutionized the way simulations are utilized in classrooms settings, student projects, and scientific labs and serves hundreds of educators, students and researchers every year. Another major CCMC asset is an expert space weather prototyping team primarily serving NASA's interplanetary space weather needs. Capitalizing on its unrivaled capabilities and experiences, the team provides in-depth space weather training to students and professionals worldwide, and offers an amazing opportunity for undergraduates to engage in real-time space weather monitoring, analysis, forecasting and research. In-house development of state-of-the-art space weather tools and applications provides exciting opportunities to students majoring in computer science and computer engineering fields to intern with the software engineers at the CCMC while also learning about the space weather from the NASA scientists.
Comparing Emerging XML Based Formats from a Multi-discipline Perspective
NASA Astrophysics Data System (ADS)
Sawyer, D. M.; Reich, L. I.; Nikhinson, S.
2002-12-01
This paper analyzes the similarity and differences among several examples of an emerging generation of Scientific Data Formats that are based on XML technologies. Some of the factors evaluated include the goals of these efforts, the data models, and XML technologies used, and the maturity of currently available software. This paper then investigates the practicality of developing a single set of structural data objects and basic scientific concepts, such as units, that could be used across discipline boundaries and extended by disciplines and missions to create Scientific Data Formats for their communities. This analysis is partly based on an effort sponsored by the ESDIS office at GSFC to compare the Earth Science Markup Language (ESML) and the eXtensible Data Format( XDF), two members of this new generation of XML based Data Description Languages that have been developed by NASA funded efforts in recent years. This paper adds FITSML and potentially CDFML to the list of XML based Scientific Data Formats discussed. This paper draws heavily a Formats Evolution Process Committee (http://ssdoo.gsfc.nasa.gov/nost/fep/) draft white paper primarily developed by Lou Reich, Mike Folk and Don Sawyer to assist the Space Science community in understanding Scientific Data Formats. One of primary conclusions of that paper is that a scientific data format object model should be examined along two basic axes. The first is the complexity of the computer/mathematical data types supported and the second is the level of scientific domain specialization incorporated. This paper also discusses several of the issues that affect the decision on whether to implement a discipline or project specific Scientific Data Format as a formal extension of a general purpose Scientific Data Format or to implement the APIs independently.
What can the programming language Rust do for astrophysics?
NASA Astrophysics Data System (ADS)
Blanco-Cuaresma, Sergi; Bolmont, Emeline
2017-06-01
The astrophysics community uses different tools for computational tasks such as complex systems simulations, radiative transfer calculations or big data. Programming languages like Fortran, C or C++ are commonly present in these tools and, generally, the language choice was made based on the need for performance. However, this comes at a cost: safety. For instance, a common source of error is the access to invalid memory regions, which produces random execution behaviors and affects the scientific interpretation of the results. In 2015, Mozilla Research released the first stable version of a new programming language named Rust. Many features make this new language attractive for the scientific community, it is open source and it guarantees memory safety while offering zero-cost abstraction. We explore the advantages and drawbacks of Rust for astrophysics by re-implementing the fundamental parts of Mercury-T, a Fortran code that simulates the dynamical and tidal evolution of multi-planet systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, David H.
The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, althoughmore » the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, LeoDagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.« less
NASA Astrophysics Data System (ADS)
Gil, Y.; Zanzerkia, E. E.; Munoz-Avila, H.
2015-12-01
The National Science Foundation (NSF) Directorate for Geosciences (GEO) and Directorate for Computer and Information Science (CISE) acknowledge the significant scientific challenges required to understand the fundamental processes of the Earth system, within the atmospheric and geospace, Earth, ocean and polar sciences, and across those boundaries. A broad view of the opportunities and directions for GEO are described in the report "Dynamic Earth: GEO imperative and Frontiers 2015-2020." Many of the aspects of geosciences research, highlighted both in this document and other community grand challenges, pose novel problems for researchers in intelligent systems. Geosciences research will require solutions for data-intensive science, advanced computational capabilities, and transformative concepts for visualizing, using, analyzing and understanding geo phenomena and data. Opportunities for the scientific community to engage in addressing these challenges are available and being developed through NSF's portfolio of investments and activities. The NSF-wide initiative, Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21), looks to accelerate research and education through new capabilities in data, computation, software and other aspects of cyberinfrastructure. EarthCube, a joint program between GEO and the Advanced Cyberinfrastructure Division, aims to create a well-connected and facile environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth system. EarthCube's mission opens an opportunity for collaborative research on novel information systems enhancing and supporting geosciences research efforts. NSF encourages true, collaborative partnerships between scientists in computer sciences and the geosciences to meet these challenges.
Towards Test Driven Development for Computational Science with pFUnit
NASA Technical Reports Server (NTRS)
Rilee, Michael L.; Clune, Thomas L.
2014-01-01
Developers working in Computational Science & Engineering (CSE)/High Performance Computing (HPC) must contend with constant change due to advances in computing technology and science. Test Driven Development (TDD) is a methodology that mitigates software development risks due to change at the cost of adding comprehensive and continuous testing to the development process. Testing frameworks tailored for CSE/HPC, like pFUnit, can lower the barriers to such testing, yet CSE software faces unique constraints foreign to the broader software engineering community. Effective testing of numerical software requires a comprehensive suite of oracles, i.e., use cases with known answers, as well as robust estimates for the unavoidable numerical errors associated with implementation with finite-precision arithmetic. At first glance these concerns often seem exceedingly challenging or even insurmountable for real-world scientific applications. However, we argue that this common perception is incorrect and driven by (1) a conflation between model validation and software verification and (2) the general tendency in the scientific community to develop relatively coarse-grained, large procedures that compound numerous algorithmic steps.We believe TDD can be applied routinely to numerical software if developers pursue fine-grained implementations that permit testing, neatly side-stepping concerns about needing nontrivial oracles as well as the accumulation of errors. We present an example of a successful, complex legacy CSE/HPC code whose development process shares some aspects with TDD, which we contrast with current and potential capabilities. A mix of our proposed methodology and framework support should enable everyday use of TDD by CSE-expert developers.
Kraemer Diaz, Anne E; Spears Johnson, Chaya R; Arcury, Thomas A
2015-06-01
Scientific integrity is necessary for strong science; yet many variables can influence scientific integrity. In traditional research, some common threats are the pressure to publish, competition for funds, and career advancement. Community-based participatory research (CBPR) provides a different context for scientific integrity with additional and unique concerns. Understanding the perceptions that promote or discourage scientific integrity in CBPR as identified by professional and community investigators is essential to promoting the value of CBPR. This analysis explores the perceptions that facilitate scientific integrity in CBPR as well as the barriers among a sample of 74 professional and community CBPR investigators from 25 CBPR projects in nine states in the southeastern United States in 2012. There were variations in perceptions associated with team member identity as professional or community investigators. Perceptions identified to promote and discourage scientific integrity in CBPR by professional and community investigators were external pressures, community participation, funding, quality control and supervision, communication, training, and character and trust. Some perceptions such as communication and training promoted scientific integrity whereas other perceptions, such as a lack of funds and lack of trust could discourage scientific integrity. These results demonstrate that one of the most important perceptions in maintaining scientific integrity in CBPR is active community participation, which enables a co-responsibility by scientists and community members to provide oversight for scientific integrity. Credible CBPR science is crucial to empower the vulnerable communities to be heard by those in positions of power and policy making. © 2015 Society for Public Health Education.
Shrager, Jeff; Billman, Dorrit; Convertino, Gregorio; Massar, J P; Pirolli, Peter
2010-01-01
Science is a form of distributed analysis involving both individual work that produces new knowledge and collaborative work to exchange information with the larger community. There are many particular ways in which individual and community can interact in science, and it is difficult to assess how efficient these are, and what the best way might be to support them. This paper reports on a series of experiments in this area and a prototype implementation using a research platform called CACHE. CACHE both supports experimentation with different structures of interaction between individual and community cognition and serves as a prototype for computational support for those structures. We particularly focus on CACHE-BC, the Bayes community version of CACHE, within which the community can break up analytical tasks into "mind-sized" units and use provenance tracking to keep track of the relationship between these units. Copyright © 2009 Cognitive Science Society, Inc.
Android application and REST server system for quasar spectrum presentation and analysis
NASA Astrophysics Data System (ADS)
Wasiewicz, P.; Pietralik, K.; Hryniewicz, K.
2017-08-01
This paper describes the implementation of a system consisting of a mobile application and RESTful architecture server intended for the analysis and presentation of quasars' spectrum. It also depicts the quasar's characteristics and significance to the scientific community, the source for acquiring astronomical objects' spectral data, used software solutions as well as presents the aspect of Cloud Computing and various possible deployment configurations.
The INDIGO-Datacloud Authentication and Authorization Infrastructure
NASA Astrophysics Data System (ADS)
Ceccanti, A.; Hardt, M.; Wegh, B.; Millar, AP; Caberletti, M.; Vianello, E.; Licehammer, S.
2017-10-01
Contemporary distributed computing infrastructures (DCIs) are not easily and securely accessible by scientists. These computing environments are typically hard to integrate due to interoperability problems resulting from the use of different authentication mechanisms, identity negotiation protocols and access control policies. Such limitations have a big impact on the user experience making it hard for user communities to port and run their scientific applications on resources aggregated from multiple providers. The INDIGO-DataCloud project wants to provide the services and tools needed to enable a secure composition of resources from multiple providers in support of scientific applications. In order to do so, a common AAI architecture has to be defined that supports multiple authentication mechanisms, support delegated authorization across services and can be easily integrated in off-the-shelf software. In this contribution we introduce the INDIGO Authentication and Authorization Infrastructure, describing its main components and their status and how authentication, delegation and authorization flows are implemented across services.
SBOL Visual: A Graphical Language for Genetic Designs.
Quinn, Jacqueline Y; Cox, Robert Sidney; Adler, Aaron; Beal, Jacob; Bhatia, Swapnil; Cai, Yizhi; Chen, Joanna; Clancy, Kevin; Galdzicki, Michal; Hillson, Nathan J; Le Novère, Nicolas; Maheshwari, Akshay J; McLaughlin, James Alastair; Myers, Chris J; P, Umesh; Pocock, Matthew; Rodriguez, Cesar; Soldatova, Larisa; Stan, Guy-Bart V; Swainston, Neil; Wipat, Anil; Sauro, Herbert M
2015-12-01
Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. It consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual.
76 FR 41234 - Advanced Scientific Computing Advisory Committee Charter Renewal
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-13
... Secretariat, General Services Administration, notice is hereby given that the Advanced Scientific Computing... advice and recommendations concerning the Advanced Scientific Computing program in response only to... Advanced Scientific Computing Research program and recommendations based thereon; --Advice on the computing...
Zhou, Ji; Applegate, Christopher; Alonso, Albor Dobon; Reynolds, Daniel; Orford, Simon; Mackiewicz, Michal; Griffiths, Simon; Penfield, Steven; Pullen, Nick
2017-01-01
Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch image processing, multiple traits analyses, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic. Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different computing platforms. To facilitate diverse scientific communities, we provide three software versions, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and a well-commented interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat ( Triticum aestivum ) growth studies at the Norwich Research Park (NRP, UK). By quantifying a number of growth phenotypes over time, we have identified diverse plant growth patterns between different genotypes under several experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices (e.g. smartphones and digital cameras) and still produced reliable biological outputs, we therefore believe that our automated analysis workflow and customised computer vision based feature extraction software implementation can facilitate a broader plant research community for their growth and development studies. Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, we believe that our software not only can contribute to biological research, but also demonstrates how to utilise existing open numeric and scientific libraries (e.g. Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, in a efficient and effective way. Leaf-GP is a sophisticated software application that provides three approaches to quantify growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and HPC, with open Python-based scientific libraries preinstalled. Our work presents the advancement of how to integrate computer vision, image analysis, machine learning and software engineering in plant phenomics software implementation. To serve the plant research community, our modulated source code, detailed comments, executables (.exe for Windows; .app for Mac), and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.
Human-computer interfaces applied to numerical solution of the Plateau problem
NASA Astrophysics Data System (ADS)
Elias Fabris, Antonio; Soares Bandeira, Ivana; Ramos Batista, Valério
2015-09-01
In this work we present a code in Matlab to solve the Problem of Plateau numerically, and the code will include human-computer interface. The Problem of Plateau has applications in areas of knowledge like, for instance, Computer Graphics. The solution method will be the same one of the Surface Evolver, but the difference will be a complete graphical interface with the user. This will enable us to implement other kinds of interface like ocular mouse, voice, touch, etc. To date, Evolver does not include any graphical interface, which restricts its use by the scientific community. Specially, its use is practically impossible for most of the Physically Challenged People.
Controlador para un Reloj GPS de Referencia en el Protocolo NTP
NASA Astrophysics Data System (ADS)
Hauscarriaga, F.; Bareilles, F. A.
The synchronization between computers in a local network plays a very important role on enviroments similar to IAR. Calculations for exact time are needed before, during and after an observation. For this purpose the IAR's GNU/Linux Software Development Team implemented a driver inside NTP protocol (an internet standard for time synchronization of computers) for a GPS receiver acquired a few years ago by IAR, which did not have support in such protocol. Today our Institute has a stable and reliable time base synchronized to atomic clocks on board GPS Satellites according to computers's synchronization standard, offering precise time services to all scientific community and particularly to the University of La Plata. FULL TEXT IN SPANISH
Color engineering in the age of digital convergence
NASA Astrophysics Data System (ADS)
MacDonald, Lindsay W.
1998-09-01
Digital color imaging has developed over the past twenty years from specialized scientific applications into the mainstream of computing. In addition to the phenomenal growth of computer processing power and storage capacity, great advances have been made in the capabilities and cost-effectiveness of color imaging peripherals. The majority of imaging applications, including the graphic arts, video and film have made the transition from analogue to digital production methods. Digital convergence of computing, communications and television now heralds new possibilities for multimedia publishing and mobile lifestyles. Color engineering, the application of color science to the design of imaging products, is an emerging discipline that poses exciting challenges to the international color imaging community for training, research and standards.
Low Latency Workflow Scheduling and an Application of Hyperspectral Brightness Temperatures
NASA Astrophysics Data System (ADS)
Nguyen, P. T.; Chapman, D. R.; Halem, M.
2012-12-01
New system analytics for Big Data computing holds the promise of major scientific breakthroughs and discoveries from the exploration and mining of the massive data sets becoming available to the science community. However, such data intensive scientific applications face severe challenges in accessing, managing and analyzing petabytes of data. While the Hadoop MapReduce environment has been successfully applied to data intensive problems arising in business, there are still many scientific problem domains where limitations in the functionality of MapReduce systems prevent its wide adoption by those communities. This is mainly because MapReduce does not readily support the unique science discipline needs such as special science data formats, graphic and computational data analysis tools, maintaining high degrees of computational accuracies, and interfacing with application's existing components across heterogeneous computing processors. We address some of these limitations by exploiting the MapReduce programming model for satellite data intensive scientific problems and address scalability, reliability, scheduling, and data management issues when dealing with climate data records and their complex observational challenges. In addition, we will present techniques to support the unique Earth science discipline needs such as dealing with special science data formats (HDF and NetCDF). We have developed a Hadoop task scheduling algorithm that improves latency by 2x for a scientific workflow including the gridding of the EOS AIRS hyperspectral Brightness Temperatures (BT). This workflow processing algorithm has been tested at the Multicore Computing Center private Hadoop based Intel Nehalem cluster, as well as in a virtual mode under the Open Source Eucalyptus cloud. The 55TB AIRS hyperspectral L1b Brightness Temperature record has been gridded at the resolution of 0.5x1.0 degrees, and we have computed a 0.9 annual anti-correlation to the El Nino Southern oscillation in the Nino 4 region, as well as a 1.9 Kelvin decadal Arctic warming in the 4u and 12u spectral regions. Additionally, we will present the frequency of extreme global warming events by the use of a normalized maximum BT in a grid cell relative to its local standard deviation. A low-latency Hadoop scheduling environment maintains data integrity and fault tolerance in a MapReduce data intensive Cloud environment while improving the "time to solution" metric by 35% when compared to a more traditional parallel processing system for the same dataset. Our next step will be to improve the usability of our Hadoop task scheduling system, to enable rapid prototyping of data intensive experiments by means of processing "kernels". We will report on the performance and experience of implementing these experiments on the NEX testbed, and propose the use of a graphical directed acyclic graph (DAG) interface to help us develop on-demand scientific experiments. Our workflow system works within Hadoop infrastructure as a replacement for the FIFO or FairScheduler, thus the use of Apache "Pig" latin or other Apache tools may also be worth investigating on the NEX system to improve the usability of our workflow scheduling infrastructure for rapid experimentation.
Mohammed, Yassene; Dickmann, Frank; Sax, Ulrich; von Voigt, Gabriele; Smith, Matthew; Rienhoff, Otto
2010-01-01
Natural scientists such as physicists pioneered the sharing of computing resources, which led to the creation of the Grid. The inter domain transfer process of this technology has hitherto been an intuitive process without in depth analysis. Some difficulties facing the life science community in this transfer can be understood using the Bozeman's "Effectiveness Model of Technology Transfer". Bozeman's and classical technology transfer approaches deal with technologies which have achieved certain stability. Grid and Cloud solutions are technologies, which are still in flux. We show how Grid computing creates new difficulties in the transfer process that are not considered in Bozeman's model. We show why the success of healthgrids should be measured by the qualified scientific human capital and the opportunities created, and not primarily by the market impact. We conclude with recommendations that can help improve the adoption of Grid and Cloud solutions into the biomedical community. These results give a more concise explanation of the difficulties many life science IT projects are facing in the late funding periods, and show leveraging steps that can help overcoming the "vale of tears".
NASA Astrophysics Data System (ADS)
Alameda, J. C.
2011-12-01
Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically, command-line based compilers, debuggers, performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as openMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM, to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations as well as ever-changing software stacks and configurations on HPC systems, as well as ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We view that the challenge with the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aurich, Maike K.; Fleming, Ronan M. T.; Thiele, Ines
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorialsmore » explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.« less
NASA Astrophysics Data System (ADS)
Strayer, Michael
2007-09-01
Good morning. Welcome to Boston, the home of the Red Sox, Celtics and Bruins, baked beans, tea parties, Robert Parker, and SciDAC 2007. A year ago I stood before you to share the legacy of the first SciDAC program and identify the challenges that we must address on the road to petascale computing—a road E E Cummins described as `. . . never traveled, gladly beyond any experience.' Today, I want to explore the preparations for the rapidly approaching extreme scale (X-scale) generation. These preparations are the first step propelling us along the road of burgeoning scientific discovery enabled by the application of X- scale computing. We look to petascale computing and beyond to open up a world of discovery that cuts across scientific fields and leads us to a greater understanding of not only our world, but our universe. As part of the President's America Competitiveness Initiative, the ASCR Office has been preparing a ten year vision for computing. As part of this planning the LBNL together with ORNL and ANL hosted three town hall meetings on Simulation and Modeling at the Exascale for Energy, Ecological Sustainability and Global Security (E3). The proposed E3 initiative is organized around four programmatic themes: Engaging our top scientists, engineers, computer scientists and applied mathematicians; investing in pioneering large-scale science; developing scalable analysis algorithms, and storage architectures to accelerate discovery; and accelerating the build-out and future development of the DOE open computing facilities. It is clear that we have only just started down the path to extreme scale computing. Plan to attend Thursday's session on the out-briefing and discussion of these meetings. The road to the petascale has been at best rocky. In FY07, the continuing resolution provided 12% less money for Advanced Scientific Computing than either the President, the Senate, or the House. As a consequence, many of you had to absorb a no cost extension for your SciDAC work. I am pleased that the President's FY08 budget restores the funding for SciDAC. Quoting from Advanced Scientific Computing Research description in the House Energy and Water Development Appropriations Bill for FY08, "Perhaps no other area of research at the Department is so critical to sustaining U.S. leadership in science and technology, revolutionizing the way science is done and improving research productivity." As a society we need to revolutionize our approaches to energy, environmental and global security challenges. As we go forward along the road to the X-scale generation, the use of computation will continue to be a critical tool along with theory and experiment in understanding the behavior of the fundamental components of nature as well as for fundamental discovery and exploration of the behavior of complex systems. The foundation to overcome these societal challenges will build from the experiences and knowledge gained as you, members of our SciDAC research teams, work together to attack problems at the tera- and peta- scale. If SciDAC is viewed as an experiment for revolutionizing scientific methodology, then a strategic goal of ASCR program must be to broaden the intellectual base prepared to address the challenges of the new X-scale generation of computing. We must focus our computational science experiences gained over the past five years on the opportunities introduced with extreme scale computing. Our facilities are on a path to provide the resources needed to undertake the first part of our journey. Using the newly upgraded 119 teraflop Cray XT system at the Leadership Computing Facility, SciDAC research teams have in three days performed a 100-year study of the time evolution of the atmospheric CO2 concentration originating from the land surface. The simulation of the El Nino/Southern Oscillation which was part of this study has been characterized as `the most impressive new result in ten years' gained new insight into the behavior of superheated ionic gas in the ITER reactor as a result of an AORSA run on 22,500 processors that achieved over 87 trillion calculations per second (87 teraflops) which is 74% of the system's theoretical peak. Tomorrow, Argonne and IBM will announce that the first IBM Blue Gene/P, a 100 teraflop system, will be shipped to the Argonne Leadership Computing Facility later this fiscal year. By the end of FY2007 ASCR high performance and leadership computing resources will include the 114 teraflop IBM Blue Gene/P; a 102 teraflop Cray XT4 at NERSC and a 119 teraflop Cray XT system at Oak Ridge. Before ringing in the New Year, Oak Ridge will upgrade to 250 teraflops with the replacement of the dual core processors with quad core processors and Argonne will upgrade to between 250-500 teraflops, and next year, a petascale Cray Baker system is scheduled for delivery at Oak Ridge. The multidisciplinary teams in our SciDAC Centers for Enabling Technologies and our SciDAC Institutes must continue to work with our Scientific Application teams to overcome the barriers that prevent effective use of these new systems. These challenges include: the need for new algorithms as well as operating system and runtime software and tools which scale to parallel systems composed of hundreds of thousands processors; program development environments and tools which scale effectively and provide ease of use for developers and scientific end users; and visualization and data management systems that support moving, storing, analyzing, manipulating and visualizing multi-petabytes of scientific data and objects. The SciDAC Centers, located primarily at our DOE national laboratories will take the lead in ensuring that critical computer science and applied mathematics issues are addressed in a timely and comprehensive fashion and to address issues associated with research software lifecycle. In contrast, the SciDAC Institutes, which are university-led centers of excellence, will have more flexibility to pursue new research topics through a range of research collaborations. The Institutes will also work to broaden the intellectual and researcher base—conducting short courses and summer schools to take advantage of new high performance computing capabilities. The SciDAC Outreach Center at Lawrence Berkeley National Laboratory complements the outreach efforts of the SciDAC Institutes. The Outreach Center is our clearinghouse for SciDAC activities and resources and will communicate with the high performance computing community in part to understand their needs for workshops, summer schools and institutes. SciDAC is not ASCR's only effort to broaden the computational science community needed to meet the challenges of the new X-scale generation. I hope that you were able to attend the Computational Science Graduate Fellowship poster session last night. ASCR developed the fellowship in 1991 to meet the nation's growing need for scientists and technology professionals with advanced computer skills. CSGF, now jointly funded between ASCR and NNSA, is more than a traditional academic fellowship. It has provided more than 200 of the best and brightest graduate students with guidance, support and community in preparing them as computational scientists. Today CSGF alumni are bringing their diverse top-level skills and knowledge to research teams at DOE laboratories and in industries such as Proctor and Gamble, Lockheed Martin and Intel. At universities they are working to train the next generation of computational scientists. To build on this success, we intend to develop a wholly new Early Career Principal Investigator's (ECPI) program. Our objective is to stimulate academic research in scientific areas within ASCR's purview especially among faculty in early stages of their academic careers. Last February, we lost Ken Kennedy, one of the leading lights of our community. As we move forward into the extreme computing generation, his vision and insight will be greatly missed. In memorial to Ken Kennedy, we shall designate the ECPI grants to beginning faculty in Computer Science as the Ken Kennedy Fellowship. Watch the ASCR website for more information about ECPI and other early career programs in the computational sciences. We look to you, our scientists, researchers, and visionaries to take X-scale computing and use it to explode scientific discovery in your fields. We at SciDAC will work to ensure that this tool is the sharpest and most precise and efficient instrument to carve away the unknown and reveal the most exciting secrets and stimulating scientific discoveries of our time. The partnership between research and computing is the marriage that will spur greater discovery, and as Spencer said to Susan in Robert Parker's novel, `Sudden Mischief', `We stick together long enough, and we may get as smart as hell'. Michael Strayer
Separating Added Value from Hype: Some Experiences and Prognostications
NASA Astrophysics Data System (ADS)
Reed, Dan
2004-03-01
These are exciting times for the interplay of science and computing technology. As new data archives, instruments and computing facilities are connected nationally and internationally, a new model of distributed scientific collaboration is emerging. However, any new technology brings both opportunities and challenges -- Grids are no exception. In this talk, we will discuss some of the experiences deploying Grid software in production environments, illustrated with experiences from the NSF PACI Alliance, the NSF Extensible Terascale Facility (ETF) and other Grid projects. From these experiences, we derive some guidelines for deployment and some suggestions for community engagement, software development and infrastructure
Educational and Scientific Applications of Climate Model Diagnostic Analyzer
NASA Astrophysics Data System (ADS)
Lee, S.; Pan, L.; Zhai, C.; Tang, B.; Kubar, T. L.; Zhang, J.; Bao, Q.
2016-12-01
Climate Model Diagnostic Analyzer (CMDA) is a web-based information system designed for the climate modeling and model analysis community to analyze climate data from models and observations. CMDA provides tools to diagnostically analyze climate data for model validation and improvement, and to systematically manage analysis provenance for sharing results with other investigators. CMDA utilizes cloud computing resources, multi-threading computing, machine-learning algorithms, web service technologies, and provenance-supporting technologies to address technical challenges that the Earth science modeling and model analysis community faces in evaluating and diagnosing climate models. As CMDA infrastructure and technology have matured, we have developed the educational and scientific applications of CMDA. Educationally, CMDA supported the summer school of the JPL Center for Climate Sciences for three years since 2014. In the summer school, the students work on group research projects where CMDA provide datasets and analysis tools. Each student is assigned to a virtual machine with CMDA installed in Amazon Web Services. A provenance management system for CMDA is developed to keep track of students' usages of CMDA, and to recommend datasets and analysis tools for their research topic. The provenance system also allows students to revisit their analysis results and share them with their group. Scientifically, we have developed several science use cases of CMDA covering various topics, datasets, and analysis types. Each use case developed is described and listed in terms of a scientific goal, datasets used, the analysis tools used, scientific results discovered from the use case, an analysis result such as output plots and data files, and a link to the exact analysis service call with all the input arguments filled. For example, one science use case is the evaluation of NCAR CAM5 model with MODIS total cloud fraction. The analysis service used is Difference Plot Service of Two Variables, and the datasets used are NCAR CAM total cloud fraction and MODIS total cloud fraction. The scientific highlight of the use case is that the CAM5 model overall does a fairly decent job at simulating total cloud cover, though simulates too few clouds especially near and offshore of the eastern ocean basins where low clouds are dominant.
NASA Astrophysics Data System (ADS)
Doyle, Paul; Mtenzi, Fred; Smith, Niall; Collins, Adrian; O'Shea, Brendan
2012-09-01
The scientific community is in the midst of a data analysis crisis. The increasing capacity of scientific CCD instrumentation and their falling costs is contributing to an explosive generation of raw photometric data. This data must go through a process of cleaning and reduction before it can be used for high precision photometric analysis. Many existing data processing pipelines either assume a relatively small dataset or are batch processed by a High Performance Computing centre. A radical overhaul of these processing pipelines is required to allow reduction and cleaning rates to process terabyte sized datasets at near capture rates using an elastic processing architecture. The ability to access computing resources and to allow them to grow and shrink as demand fluctuates is essential, as is exploiting the parallel nature of the datasets. A distributed data processing pipeline is required. It should incorporate lossless data compression, allow for data segmentation and support processing of data segments in parallel. Academic institutes can collaborate and provide an elastic computing model without the requirement for large centralized high performance computing data centers. This paper demonstrates how a base 10 order of magnitude improvement in overall processing time has been achieved using the "ACN pipeline", a distributed pipeline spanning multiple academic institutes.
NASA Astrophysics Data System (ADS)
Kees, C. E.; Farthing, M. W.; Terrel, A.; Certik, O.; Seljebotn, D.
2013-12-01
This presentation will focus on two barriers to progress in the hydrological modeling community, and research and development conducted to lessen or eliminate them. The first is a barrier to sharing hydrological models among specialized scientists that is caused by intertwining the implementation of numerical methods with the implementation of abstract numerical modeling information. In the Proteus toolkit for computational methods and simulation, we have decoupled these two important parts of computational model through separate "physics" and "numerics" interfaces. More recently we have begun developing the Strong Form Language for easy and direct representation of the mathematical model formulation in a domain specific language embedded in Python. The second major barrier is sharing ANY scientific software tools that have complex library or module dependencies, as most parallel, multi-physics hydrological models must have. In this setting, users and developer are dependent on an entire distribution, possibly depending on multiple compilers and special instructions depending on the environment of the target machine. To solve these problem we have developed, hashdist, a stateless package management tool and a resulting portable, open source scientific software distribution.
AstrodyToolsWeb an e-Science project in Astrodynamics and Celestial Mechanics fields
NASA Astrophysics Data System (ADS)
López, R.; San-Juan, J. F.
2013-05-01
Astrodynamics Web Tools, AstrodyToolsWeb (http://tastrody.unirioja.es), is an ongoing collaborative Web Tools computing infrastructure project which has been specially designed to support scientific computation. AstrodyToolsWeb provides project collaborators with all the technical and human facilities in order to wrap, manage, and use specialized noncommercial software tools in Astrodynamics and Celestial Mechanics fields, with the aim of optimizing the use of resources, both human and material. However, this project is open to collaboration from the whole scientific community in order to create a library of useful tools and their corresponding theoretical backgrounds. AstrodyToolsWeb offers a user-friendly web interface in order to choose applications, introduce data, and select appropriate constraints in an intuitive and easy way for the user. After that, the application is executed in real time, whenever possible; then the critical information about program behavior (errors and logs) and output, including the postprocessing and interpretation of its results (graphical representation of data, statistical analysis or whatever manipulation therein), are shown via the same web interface or can be downloaded to the user's computer.
Computational modelling of oxygenation processes in enzymes and biomimetic model complexes.
de Visser, Sam P; Quesne, Matthew G; Martin, Bodo; Comba, Peter; Ryde, Ulf
2014-01-11
With computational resources becoming more efficient and more powerful and at the same time cheaper, computational methods have become more and more popular for studies on biochemical and biomimetic systems. Although large efforts from the scientific community have gone into exploring the possibilities of computational methods for studies on large biochemical systems, such studies are not without pitfalls and often cannot be routinely done but require expert execution. In this review we summarize and highlight advances in computational methodology and its application to enzymatic and biomimetic model complexes. In particular, we emphasize on topical and state-of-the-art methodologies that are able to either reproduce experimental findings, e.g., spectroscopic parameters and rate constants, accurately or make predictions of short-lived intermediates and fast reaction processes in nature. Moreover, we give examples of processes where certain computational methods dramatically fail.
Riding the Hype Wave: Evaluating new AI Techniques for their Applicability in Earth Science
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Zhang, J.; Maskey, M.; Lee, T. J.
2016-12-01
Every few years a new technology rides the hype wave generated by the computer science community. Converts to this new technology who surface from both the science community and the informatics community promulgate that it can radically improve or even change the existing scientific process. Recent examples of new technology following in the footsteps of "big data" now include deep learning algorithms and knowledge graphs. Deep learning algorithms mimic the human brain and process information through multiple stages of transformation and representation. These algorithms are able to learn complex functions that map pixels directly to outputs without relying on human-crafted features and solve some of the complex classification problems that exist in science. Similarly, knowledge graphs aggregate information around defined topics that enable users to resolve their query without having to navigate and assemble information manually. Knowledge graphs could potentially be used in scientific research to assist in hypothesis formulation, testing, and review. The challenge for the Earth science research community is to evaluate these new technologies by asking the right questions and considering what-if scenarios. What is this new technology enabling/providing that is innovative and different? Can one justify the adoption costs with respect to the research returns? Since nothing comes for free, utilizing a new technology entails adoption costs that may outweigh the benefits. Furthermore, these technologies may require significant computing infrastructure in order to be utilized effectively. Results from two different projects will be presented along with lessons learned from testing these technologies. The first project primarily evaluates deep learning techniques for different applications of image retrieval within Earth science while the second project builds a prototype knowledge graph constructed for Hurricane science.
76 FR 31945 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-02
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... teleconference meeting of the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal [email protected] . FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing...
Intuitive Space Weather Displays to Improve Space Situational Awareness (SSA)
2011-09-01
parsimonious offering. After engaging several mathematicians and space physicists to devise valid computational formulas for aggregating the four hazard... PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Aptima, Inc.,12 Gill Street Ste 200,Woburn,MA... physicists , the operational users find little use in receiving particle fluxes or magnetometer readings collected by the scientific community. Fortunately
75 FR 9887 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-04
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Department of Energy... Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building...
76 FR 9765 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-22
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee AGENCY: Office of Science... Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub. L. 92... INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research, SC-21/Germantown Building...
77 FR 45345 - DOE/Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-31
... Recompetition results for Scientific Discovery through Advanced Computing (SciDAC) applications Co-design Public... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Office of... the Advanced Scientific Computing Advisory Committee (ASCAC). The Federal Advisory Committee Act (Pub...
75 FR 64720 - DOE/Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-20
... DEPARTMENT OF ENERGY DOE/Advanced Scientific Computing Advisory Committee AGENCY: Department of... the Advanced Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L.... FOR FURTHER INFORMATION CONTACT: Melea Baker, Office of Advanced Scientific Computing Research; SC-21...
Computing through Scientific Abstractions in SysBioPS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chin, George; Stephan, Eric G.; Gracio, Deborah K.
2004-10-13
Today, biologists and bioinformaticists have a tremendous amount of computational power at their disposal. With the availability of supercomputers, burgeoning scientific databases and digital libraries such as GenBank and PubMed, and pervasive computational environments such as the Grid, biologists have access to a wealth of computational capabilities and scientific data at hand. Yet, the rapid development of computational technologies has far exceeded the typical biologist’s ability to effectively apply the technology in their research. Computational sciences research and development efforts such as the Biology Workbench, BioSPICE (Biological Simulation Program for Intra-Cellular Evaluation), and BioCoRE (Biological Collaborative Research Environment) are importantmore » in connecting biologists and their scientific problems to computational infrastructures. On the Computational Cell Environment and Heuristic Entity-Relationship Building Environment projects at the Pacific Northwest National Laboratory, we are jointly developing a new breed of scientific problem solving environment called SysBioPSE that will allow biologists to access and apply computational resources in the scientific research context. In contrast to other computational science environments, SysBioPSE operates as an abstraction layer above a computational infrastructure. The goal of SysBioPSE is to allow biologists to apply computational resources in the context of the scientific problems they are addressing and the scientific perspectives from which they conduct their research. More specifically, SysBioPSE allows biologists to capture and represent scientific concepts and theories and experimental processes, and to link these views to scientific applications, data repositories, and computer systems.« less
The JASMIN Cloud: specialised and hybrid to meet the needs of the Environmental Sciences Community
NASA Astrophysics Data System (ADS)
Kershaw, Philip; Lawrence, Bryan; Churchill, Jonathan; Pritchard, Matt
2014-05-01
Cloud computing provides enormous opportunities for the research community. The large public cloud providers provide near-limitless scaling capability. However, adapting Cloud to scientific workloads is not without its problems. The commodity nature of the public cloud infrastructure can be at odds with the specialist requirements of the research community. Issues such as trust, ownership of data, WAN bandwidth and costing models make additional barriers to more widespread adoption. Alongside the application of public cloud for scientific applications, a number of private cloud initiatives are underway in the research community of which the JASMIN Cloud is one example. Here, cloud service models are being effectively super-imposed over more established services such as data centres, compute cluster facilities and Grids. These have the potential to deliver the specialist infrastructure needed for the science community coupled with the benefits of a Cloud service model. The JASMIN facility based at the Rutherford Appleton Laboratory was established in 2012 to support the data analysis requirements of the climate and Earth Observation community. In its first year of operation, the 5PB of available storage capacity was filled and the hosted compute capability used extensively. JASMIN has modelled the concept of a centralised large-volume data analysis facility. Key characteristics have enabled success: peta-scale fast disk connected via low latency networks to compute resources and the use of virtualisation for effective management of the resources for a range of users. A second phase is now underway funded through NERC's (Natural Environment Research Council) Big Data initiative. This will see significant expansion to the resources available with a doubling of disk-based storage to 12PB and an increase of compute capacity by a factor of ten to over 3000 processing cores. This expansion is accompanied by a broadening in the scope for JASMIN, as a service available to the entire UK environmental science community. Experience with the first phase demonstrated the range of user needs. A trade-off is needed between access privileges to resources, flexibility of use and security. This has influenced the form and types of service under development for the new phase. JASMIN will deploy a specialised private cloud organised into "Managed" and "Unmanaged" components. In the Managed Cloud, users have direct access to the storage and compute resources for optimal performance but for reasons of security, via a more restrictive PaaS (Platform-as-a-Service) interface. The Unmanaged Cloud is deployed in an isolated part of the network but co-located with the rest of the infrastructure. This enables greater liberty to tenants - full IaaS (Infrastructure-as-a-Service) capability to provision customised infrastructure - whilst at the same time protecting more sensitive parts of the system from direct access using these elevated privileges. The private cloud will be augmented with cloud-bursting capability so that it can exploit the resources available from public clouds, making it effectively a hybrid solution. A single interface will overlay the functionality of both the private cloud and external interfaces to public cloud providers giving users the flexibility to migrate resources between infrastructures as requirements dictate.
75 FR 43518 - Advanced Scientific Computing Advisory Committee; Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-26
... DEPARTMENT OF ENERGY Advanced Scientific Computing Advisory Committee; Meeting AGENCY: Office of... Scientific Computing Advisory Committee (ASCAC). Federal Advisory Committee Act (Pub. L. 92-463, 86 Stat. 770...: Melea Baker, Office of Advanced Scientific Computing Research; SC-21/Germantown Building; U. S...
Science, Society, and Social Networking
NASA Astrophysics Data System (ADS)
White, K. S.; Lohwater, T.
2009-12-01
The increased use of social networking is changing the way that scientific societies interact with their members and others. The American Association for the Advancement of Science (AAAS) uses a variety of online networks to engage its members and the broader scientific community. AAAS members and non-members can interact with AAAS staff and each other on AAAS sites on Facebook, YouTube, and Twitter, as well as blogs and forums on the AAAS website (www.aaas.org). These tools allow scientists to more readily become engaged in policy by providing information on current science policy topics as well as methods of involvement. For example, members and the public can comment on policy-relevant stories from Science magazine’s ScienceInsider blog, download a weekly policy podcast, receive a weekly email update of policy issues affecting the scientific community, or watch a congressional hearing from their computer. AAAS resource websites and outreach programs, including Communicating Science (www.aaas.org/communicatingscience), Working with Congress (www.aaas.org/spp/cstc/) and Science Careers (http://sciencecareers.sciencemag.org) also provide tools for scientists to become more personally engaged in communicating their findings and involved in the policy process.
NASA Astrophysics Data System (ADS)
Campbell, S. W.; Williams, K.; Marston, L.; Kreutz, K. J.; Osterberg, E. C.; Wake, C. P.
2013-12-01
For the past six years, a multi-institution effort has undertaken a broad glaciological and climate research project in Denali National Park. Most recently, two ~208 m long surface to bedrock ice cores were recovered from the Mt. Hunter plateau with supporting geophysical and weather data collected. Twenty two individuals have participated in the field program providing thousands of person-hours towards completing our research goals. Technical and scientific results have been disseminated to the broader scientific community through dozens of professional presentations and six peer-reviewed publications. In addition, we have pursued the development of interactive computer applications that use our results for educational purposes, publically available fact sheets through Denali National Park, and most recently, with assistance from PolarTREC and other affiliations, the development of a children's book and roll-out of K-8 science curriculum based on this project. The K-8 curriculum will provide students with an opportunity to use real scientific data to meet their educational requirements through alternative, interactive, and exciting methods relative to more standard educational programs. Herein, we present examples of this diverse approach towards incorporating polar research into K-12 STEM classrooms.
Astrolabe: Curating, Linking, and Computing Astronomy’s Dark Data
NASA Astrophysics Data System (ADS)
Heidorn, P. Bryan; Stahlman, Gretchen R.; Steffen, Julie
2018-05-01
Where appropriate repositories are not available to support all relevant astronomical data products, data can fall into darkness: unseen and unavailable for future reference and reuse. Some data in this category are legacy or old data, but newer data sets are also often uncurated and could remain dark. This paper provides a description of the design motivation and development of Astrolabe, a cyberinfrastructure project that addresses a set of community recommendations for locating and ensuring the long-term curation of dark or otherwise at-risk data and integrated computing. This paper also describes the outcomes of the series of community workshops that informed creation of Astrolabe. According to participants in these workshops, much astronomical dark data currently exist that are not curated elsewhere, as well as software that can only be executed by a few individuals and therefore becomes unusable because of changes in computing platforms. Astronomical research questions and challenges would be better addressed with integrated data and computational resources that fall outside the scope of existing observatory and space mission projects. As a solution, the design of the Astrolabe system is aimed at developing new resources for management of astronomical data. The project is based in CyVerse cyberinfrastructure technology and is a collaboration between the University of Arizona and the American Astronomical Society. Overall, the project aims to support open access to research data by leveraging existing cyberinfrastructure resources and promoting scientific discovery by making potentially useful data available to the astronomical community, in a computable format.
Advancing Water Science through Improved Cyberinfrastructure
NASA Astrophysics Data System (ADS)
Koch, B. J.; Miles, B.; Rai, A.; Ahalt, S.; Band, L. E.; Minsker, B.; Palmer, M.; Williams, M. R.; Idaszak, R.; Whitton, M. C.
2012-12-01
Major scientific advances are needed to help address impacts of climate change and increasing human-mediated environmental modification on the water cycle at global and local scales. However, such advances within the water sciences are limited in part by inadequate information infrastructures. For example, cyberinfrastructure (CI) includes the integrated computer hardware, software, networks, sensors, data, and human capital that enable scientific workflows to be carried out within and among individual research efforts and across varied disciplines. A coordinated transformation of existing CI and development of new CI could accelerate the productivity of water science by enabling greater discovery, access, and interoperability of data and models, and by freeing scientists to do science rather than create and manage technological tools. To elucidate specific ways in which improved CI could advance water science, three challenges confronting the water science community were evaluated: 1) How does ecohydrologic patch structure affect nitrogen transport and fate in watersheds?, 2) How can human-modified environments emulate natural water and nutrient cycling to enhance both human and ecosystem well-being?, 3) How do changes in climate affect water availability to support biodiversity and human needs? We assessed the approaches used by researchers to address components of these challenges, identified barriers imposed by limitations of current CI, and interviewed leaders in various water science subdisciplines to determine the most recent CI tools employed. Our preliminary findings revealed four areas where CI improvements are likely to stimulate scientific advances: 1) sensor networks, 2) data quality assurance/quality control, 3) data and modeling standards, 4) high performance computing. In addition, the full potential of a re-envisioned water science CI cannot be realized without a substantial training component. In light of these findings, we suggest that CI industry-proven practices such as open-source community architecture, agile development methodologies, and sound software engineering methods offer a promising pathway to a transformed water science CI capable of meeting the demands of both individual scientists and community-wide research initiatives.
NASA Astrophysics Data System (ADS)
Memon, Shahbaz; Vallot, Dorothée; Zwinger, Thomas; Neukirchen, Helmut
2017-04-01
Scientific communities generate complex simulations through orchestration of semi-structured analysis pipelines which involves execution of large workflows on multiple, distributed and heterogeneous computing and data resources. Modeling ice dynamics of glaciers requires workflows consisting of many non-trivial, computationally expensive processing tasks which are coupled to each other. From this domain, we present an e-Science use case, a workflow, which requires the execution of a continuum ice flow model and a discrete element based calving model in an iterative manner. Apart from the execution, this workflow also contains data format conversion tasks that support the execution of ice flow and calving by means of transition through sequential, nested and iterative steps. Thus, the management and monitoring of all the processing tasks including data management and transfer of the workflow model becomes more complex. From the implementation perspective, this workflow model was initially developed on a set of scripts using static data input and output references. In the course of application usage when more scripts or modifications introduced as per user requirements, the debugging and validation of results were more cumbersome to achieve. To address these problems, we identified a need to have a high-level scientific workflow tool through which all the above mentioned processes can be achieved in an efficient and usable manner. We decided to make use of the e-Science middleware UNICORE (Uniform Interface to Computing Resources) that allows seamless and automated access to different heterogenous and distributed resources which is supported by a scientific workflow engine. Based on this, we developed a high-level scientific workflow model for coupling of massively parallel High-Performance Computing (HPC) jobs: a continuum ice sheet model (Elmer/Ice) and a discrete element calving and crevassing model (HiDEM). In our talk we present how the use of a high-level scientific workflow middleware enables reproducibility of results more convenient and also provides a reusable and portable workflow template that can be deployed across different computing infrastructures. Acknowledgements This work was kindly supported by NordForsk as part of the Nordic Center of Excellence (NCoE) eSTICC (eScience Tools for Investigating Climate Change at High Northern Latitudes) and the Top-level Research Initiative NCoE SVALI (Stability and Variation of Arctic Land Ice).
NASA Astrophysics Data System (ADS)
Puchala, Brian; Tarcea, Glenn; Marquis, Emmanuelle. A.; Hedstrom, Margaret; Jagadish, H. V.; Allison, John E.
2016-08-01
Accelerating the pace of materials discovery and development requires new approaches and means of collaborating and sharing information. To address this need, we are developing the Materials Commons, a collaboration platform and information repository for use by the structural materials community. The Materials Commons has been designed to be a continuous, seamless part of the scientific workflow process. Researchers upload the results of experiments and computations as they are performed, automatically where possible, along with the provenance information describing the experimental and computational processes. The Materials Commons website provides an easy-to-use interface for uploading and downloading data and data provenance, as well as for searching and sharing data. This paper provides an overview of the Materials Commons. Concepts are also outlined for integrating the Materials Commons with the broader Materials Information Infrastructure that is evolving to support the Materials Genome Initiative.
ArrayBridge: Interweaving declarative array processing with high-performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xing, Haoyuan; Floratos, Sofoklis; Blanas, Spyros
Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats, that aimsmore » to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability to the native SciDB storage engine.« less
Computer Aided Enzyme Design and Catalytic Concepts
Frushicheva, Maria P.; Mills, Matthew J. L.; Schopf, Patrick; Singh, Manoj K.; Warshel, Arieh
2014-01-01
Gaining a deeper understanding of enzyme catalysis is of great practical and fundamental importance. Over the years it has become clear that despite advances made in experimental mutational studies, a quantitative understanding of enzyme catalysis will not be possible without the use of computer modeling approaches. While we believe that electrostatic preorganization is by far the most important catalytic factor, convincing the wider scientific community of this may require the demonstration of effective rational enzyme design. Here we make the point that the main current advances in enzyme design are basically advances in directed evolution and that computer aided enzyme design must involve approaches that can reproduce catalysis in well-defined test cases. Such an approach is provided by the empirical valence bond method. PMID:24814389
SBOL Visual: A Graphical Language for Genetic Designs
Quinn, Jacqueline Y.; Cox, Robert Sidney; Adler, Aaron; ...
2015-12-03
Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. We report that it consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual.
SBOL Visual: A Graphical Language for Genetic Designs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Jacqueline Y.; Cox, Robert Sidney; Adler, Aaron
Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. We report that it consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramos, Jaime
2012-12-14
To unlock the potential of micro grids we plan to build, commission and operate a 5 kWDC PV array and integrate it to the UTPA Engineering building low voltage network, as a micro grid; and promote community awareness. Assisted by a solar radiation tracker providing on-line information of its measurements and performing analysis for the use by the scientific and engineering community, we will write, perform and operate a set of Laboratory experiments and computer simulations supporting Electrical Engineering (graduate and undergraduate) courses on Renewable Energy, as well as Senior Design projects.
SBOL Visual: A Graphical Language for Genetic Designs
Adler, Aaron; Beal, Jacob; Bhatia, Swapnil; Cai, Yizhi; Chen, Joanna; Clancy, Kevin; Galdzicki, Michal; Hillson, Nathan J.; Le Novère, Nicolas; Maheshwari, Akshay J.; McLaughlin, James Alastair; Myers, Chris J.; P, Umesh; Pocock, Matthew; Rodriguez, Cesar; Soldatova, Larisa; Stan, Guy-Bart V.; Swainston, Neil; Wipat, Anil; Sauro, Herbert M.
2015-01-01
Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. It consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual. PMID:26633141
Hu, Hao; Hong, Xingchen; Terstriep, Jeff; Liu, Yan; Finn, Michael P.; Rush, Johnathan; Wendel, Jeffrey; Wang, Shaowen
2016-01-01
Geospatial data, often embedded with geographic references, are important to many application and science domains, and represent a major type of big data. The increased volume and diversity of geospatial data have caused serious usability issues for researchers in various scientific domains, which call for innovative cyberGIS solutions. To address these issues, this paper describes a cyberGIS community data service framework to facilitate geospatial big data access, processing, and sharing based on a hybrid supercomputer architecture. Through the collaboration between the CyberGIS Center at the University of Illinois at Urbana-Champaign (UIUC) and the U.S. Geological Survey (USGS), a community data service for accessing, customizing, and sharing digital elevation model (DEM) and its derived datasets from the 10-meter national elevation dataset, namely TopoLens, is created to demonstrate the workflow integration of geospatial big data sources, computation, analysis needed for customizing the original dataset for end user needs, and a friendly online user environment. TopoLens provides online access to precomputed and on-demand computed high-resolution elevation data by exploiting the ROGER supercomputer. The usability of this prototype service has been acknowledged in community evaluation.
Prototyping a 10 Gigabit-Ethernet Event-Builder for the CTA Camera Server
NASA Astrophysics Data System (ADS)
Hoffmann, Dirk; Houles, Julien
2012-12-01
While the Cherenkov Telescope Array will end its Preperatory Phase in 2012 or 2013 with the publication of a Technical Design Report, our lab has undertaken within the french CTA community the design and prototyping of a Camera-Server, which is a PC architecture based computer, used as a switchboard assigned to each of a hundred telescopes to handle a maximum amount of scientific data recorded by each telescope. Our work aims for a data acquisition hardware and software system for the scientific raw data at optimal speed. We have evaluated the maximum performance that can be obtained by choosing standard (COTS) hardware and software (Linux) in conjunction with a 10 Gb/s switch.
Crossing the chasm: how to develop weather and climate models for next generation computers?
NASA Astrophysics Data System (ADS)
Lawrence, Bryan N.; Rezny, Michael; Budich, Reinhard; Bauer, Peter; Behrens, Jörg; Carter, Mick; Deconinck, Willem; Ford, Rupert; Maynard, Christopher; Mullerworth, Steven; Osuna, Carlos; Porter, Andrew; Serradell, Kim; Valcke, Sophie; Wedi, Nils; Wilson, Simon
2018-05-01
Weather and climate models are complex pieces of software which include many individual components, each of which is evolving under pressure to exploit advances in computing to enhance some combination of a range of possible improvements (higher spatio-temporal resolution, increased fidelity in terms of resolved processes, more quantification of uncertainty, etc.). However, after many years of a relatively stable computing environment with little choice in processing architecture or programming paradigm (basically X86 processors using MPI for parallelism), the existing menu of processor choices includes significant diversity, and more is on the horizon. This computational diversity, coupled with ever increasing software complexity, leads to the very real possibility that weather and climate modelling will arrive at a chasm which will separate scientific aspiration from our ability to develop and/or rapidly adapt codes to the available hardware. In this paper we review the hardware and software trends which are leading us towards this chasm, before describing current progress in addressing some of the tools which we may be able to use to bridge the chasm. This brief introduction to current tools and plans is followed by a discussion outlining the scientific requirements for quality model codes which have satisfactory performance and portability, while simultaneously supporting productive scientific evolution. We assert that the existing method of incremental model improvements employing small steps which adjust to the changing hardware environment is likely to be inadequate for crossing the chasm between aspiration and hardware at a satisfactory pace, in part because institutions cannot have all the relevant expertise in house. Instead, we outline a methodology based on large community efforts in engineering and standardisation, which will depend on identifying a taxonomy of key activities - perhaps based on existing efforts to develop domain-specific languages, identify common patterns in weather and climate codes, and develop community approaches to commonly needed tools and libraries - and then collaboratively building up those key components. Such a collaborative approach will depend on institutions, projects, and individuals adopting new interdependencies and ways of working.
The GENIUS Grid Portal and robot certificates: a new tool for e-Science
Barbera, Roberto; Donvito, Giacinto; Falzone, Alberto; La Rocca, Giuseppe; Milanesi, Luciano; Maggi, Giorgio Pietro; Vicario, Saverio
2009-01-01
Background Grid technology is the computing model which allows users to share a wide pletora of distributed computational resources regardless of their geographical location. Up to now, the high security policy requested in order to access distributed computing resources has been a rather big limiting factor when trying to broaden the usage of Grids into a wide community of users. Grid security is indeed based on the Public Key Infrastructure (PKI) of X.509 certificates and the procedure to get and manage those certificates is unfortunately not straightforward. A first step to make Grids more appealing for new users has recently been achieved with the adoption of robot certificates. Methods Robot certificates have recently been introduced to perform automated tasks on Grids on behalf of users. They are extremely useful for instance to automate grid service monitoring, data processing production, distributed data collection systems. Basically these certificates can be used to identify a person responsible for an unattended service or process acting as client and/or server. Robot certificates can be installed on a smart card and used behind a portal by everyone interested in running the related applications in a Grid environment using a user-friendly graphic interface. In this work, the GENIUS Grid Portal, powered by EnginFrame, has been extended in order to support the new authentication based on the adoption of these robot certificates. Results The work carried out and reported in this manuscript is particularly relevant for all users who are not familiar with personal digital certificates and the technical aspects of the Grid Security Infrastructure (GSI). The valuable benefits introduced by robot certificates in e-Science can so be extended to users belonging to several scientific domains, providing an asset in raising Grid awareness to a wide number of potential users. Conclusion The adoption of Grid portals extended with robot certificates, can really contribute to creating transparent access to computational resources of Grid Infrastructures, enhancing the spread of this new paradigm in researchers' working life to address new global scientific challenges. The evaluated solution can of course be extended to other portals, applications and scientific communities. PMID:19534747
The GENIUS Grid Portal and robot certificates: a new tool for e-Science.
Barbera, Roberto; Donvito, Giacinto; Falzone, Alberto; La Rocca, Giuseppe; Milanesi, Luciano; Maggi, Giorgio Pietro; Vicario, Saverio
2009-06-16
Grid technology is the computing model which allows users to share a wide pletora of distributed computational resources regardless of their geographical location. Up to now, the high security policy requested in order to access distributed computing resources has been a rather big limiting factor when trying to broaden the usage of Grids into a wide community of users. Grid security is indeed based on the Public Key Infrastructure (PKI) of X.509 certificates and the procedure to get and manage those certificates is unfortunately not straightforward. A first step to make Grids more appealing for new users has recently been achieved with the adoption of robot certificates. Robot certificates have recently been introduced to perform automated tasks on Grids on behalf of users. They are extremely useful for instance to automate grid service monitoring, data processing production, distributed data collection systems. Basically these certificates can be used to identify a person responsible for an unattended service or process acting as client and/or server. Robot certificates can be installed on a smart card and used behind a portal by everyone interested in running the related applications in a Grid environment using a user-friendly graphic interface. In this work, the GENIUS Grid Portal, powered by EnginFrame, has been extended in order to support the new authentication based on the adoption of these robot certificates. The work carried out and reported in this manuscript is particularly relevant for all users who are not familiar with personal digital certificates and the technical aspects of the Grid Security Infrastructure (GSI). The valuable benefits introduced by robot certificates in e-Science can so be extended to users belonging to several scientific domains, providing an asset in raising Grid awareness to a wide number of potential users. The adoption of Grid portals extended with robot certificates, can really contribute to creating transparent access to computational resources of Grid Infrastructures, enhancing the spread of this new paradigm in researchers' working life to address new global scientific challenges. The evaluated solution can of course be extended to other portals, applications and scientific communities.
Agile parallel bioinformatics workflow management using Pwrake.
Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro
2011-09-08
In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error.Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
Agile parallel bioinformatics workflow management using Pwrake
2011-01-01
Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows. PMID:21899774
OOI CyberInfrastructure - Next Generation Oceanographic Research
NASA Astrophysics Data System (ADS)
Farcas, C.; Fox, P.; Arrott, M.; Farcas, E.; Klacansky, I.; Krueger, I.; Meisinger, M.; Orcutt, J.
2008-12-01
Software has become a key enabling technology for scientific discovery, observation, modeling, and exploitation of natural phenomena. New value emerges from the integration of individual subsystems into networked federations of capabilities exposed to the scientific community. Such data-intensive interoperability networks are crucial for future scientific collaborative research, as they open up new ways of fusing data from different sources and across various domains, and analysis on wide geographic areas. The recently established NSF OOI program, through its CyberInfrastructure component addresses this challenge by providing broad access from sensor networks for data acquisition up to computational grids for massive computations and binding infrastructure facilitating policy management and governance of the emerging system-of-scientific-systems. We provide insight into the integration core of this effort, namely, a hierarchic service-oriented architecture for a robust, performant, and maintainable implementation. We first discuss the relationship between data management and CI crosscutting concerns such as identity management, policy and governance, which define the organizational contexts for data access and usage. Next, we detail critical services including data ingestion, transformation, preservation, inventory, and presentation. To address interoperability issues between data represented in various formats we employ a semantic framework derived from the Earth System Grid technology, a canonical representation for scientific data based on DAP/OPeNDAP, and related data publishers such as ERDDAP. Finally, we briefly present the underlying transport based on a messaging infrastructure over the AMQP protocol, and the preservation based on a distributed file system through SDSC iRODS.
NASA Astrophysics Data System (ADS)
de Groot, R.
2008-12-01
The Southern California Earthquake Center (SCEC) has been developing groundbreaking computer modeling capabilities for studying earthquakes. These visualizations were initially shared within the scientific community but have recently gained visibility via television news coverage in Southern California. Computers have opened up a whole new world for scientists working with large data sets, and students can benefit from the same opportunities (Libarkin & Brick, 2002). For example, The Great Southern California ShakeOut was based on a potential magnitude 7.8 earthquake on the southern San Andreas fault. The visualization created for the ShakeOut was a key scientific and communication tool for the earthquake drill. This presentation will also feature SCEC Virtual Display of Objects visualization software developed by SCEC Undergraduate Studies in Earthquake Information Technology interns. According to Gordin and Pea (1995), theoretically visualization should make science accessible, provide means for authentic inquiry, and lay the groundwork to understand and critique scientific issues. This presentation will discuss how the new SCEC visualizations and other earthquake imagery achieve these results, how they fit within the context of major themes and study areas in science communication, and how the efficacy of these tools can be improved.
Advancing Science through Mining Libraries, Ontologies, and Communities*
Evans, James A.; Rzhetsky, Andrey
2011-01-01
Life scientists today cannot hope to read everything relevant to their research. Emerging text-mining tools can help by identifying topics and distilling statements from books and articles with increased accuracy. Researchers often organize these statements into ontologies, consistent systems of reality claims. Like scientific thinking and interchange, however, text-mined information (even when accurately captured) is complex, redundant, sometimes incoherent, and often contradictory: it is rooted in a mixture of only partially consistent ontologies. We review work that models scientific reason and suggest how computational reasoning across ontologies and the broader distribution of textual statements can assess the certainty of statements and the process by which statements become certain. With the emergence of digitized data regarding networks of scientific authorship, institutions, and resources, we explore the possibility of accounting for social dependences and cultural biases in reasoning models. Computational reasoning is starting to fill out ontologies and flag internal inconsistencies in several areas of bioscience. In the not too distant future, scientists may be able to use statements and rich models of the processes that produced them to identify underexplored areas, resurrect forgotten findings and ideas, deconvolute the spaghetti of underlying ontologies, and synthesize novel knowledge and hypotheses. PMID:21566119
pFlogger: The Parallel Fortran Logging Utility
NASA Technical Reports Server (NTRS)
Clune, Tom; Cruz, Carlos A.
2017-01-01
In the context of high performance computing (HPC), software investments in support of text-based diagnostics, which monitor a running application, are typically limited compared to those for other types of IO. Examples of such diagnostics include reiteration of configuration parameters, progress indicators, simple metrics (e.g., mass conservation, convergence of solvers, etc.), and timers. To some degree, this difference in priority is justifiable as other forms of output are the primary products of a scientific model and, due to their large data volume, much more likely to be a significant performance concern. In contrast, text-based diagnostic content is generally not shared beyond the individual or group running an application and is most often used to troubleshoot when something goes wrong. We suggest that a more systematic approach enabled by a logging facility (or 'logger)' similar to those routinely used by many communities would provide significant value to complex scientific applications. In the context of high-performance computing, an appropriate logger would provide specialized support for distributed and shared-memory parallelism and have low performance overhead. In this paper, we present our prototype implementation of pFlogger - a parallel Fortran-based logging framework, and assess its suitability for use in a complex scientific application.
Java Performance for Scientific Applications on LLNL Computer Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kapfer, C; Wissink, A
2002-05-10
Languages in use for high performance computing at the laboratory--Fortran (f77 and f90), C, and C++--have many years of development behind them and are generally considered the fastest available. However, Fortran and C do not readily extend to object-oriented programming models, limiting their capability for very complex simulation software. C++ facilitates object-oriented programming but is a very complex and error-prone language. Java offers a number of capabilities that these other languages do not. For instance it implements cleaner (i.e., easier to use and less prone to errors) object-oriented models than C++. It also offers networking and security as part ofmore » the language standard, and cross-platform executables that make it architecture neutral, to name a few. These features have made Java very popular for industrial computing applications. The aim of this paper is to explain the trade-offs in using Java for large-scale scientific applications at LLNL. Despite its advantages, the computational science community has been reluctant to write large-scale computationally intensive applications in Java due to concerns over its poor performance. However, considerable progress has been made over the last several years. The Java Grande Forum [1] has been promoting the use of Java for large-scale computing. Members have introduced efficient array libraries, developed fast just-in-time (JIT) compilers, and built links to existing packages used in high performance parallel computing.« less
Unified Planetary Coordinates System: A Searchable Database of Geodetic Information
NASA Technical Reports Server (NTRS)
Becker, K. J.a; Gaddis, L. R.; Soderblom, L. A.; Kirk, R. L.; Archinal, B. A.; Johnson, J. R.; Anderson, J. A.; Bowman-Cisneros, E.; LaVoie, S.; McAuley, M.
2005-01-01
Over the past 40 years, an enormous quantity of orbital remote sensing data has been collected for Mars from many missions and instruments. Unfortunately these datasets currently exist in a wide range of disparate coordinate systems, making it extremely difficult for the scientific community to easily correlate, combine, and compare data from different Mars missions and instruments. As part of our work for the PDS Imaging Node and on behalf of the USGS Astrogeology Team, we are working to solve this problem and to provide the NASA scientific research community with easy access to Mars orbital data in a unified, consistent coordinate system along with a wide variety of other key geometric variables. The Unified Planetary Coordinates (UPC) system is comprised of two main elements: (1) a database containing Mars orbital remote sensing data computed using a uniform coordinate system, and (2) a process by which continual maintainance and updates to the contents of the database are performed.
Operating a production pilot factory serving several scientific domains
NASA Astrophysics Data System (ADS)
Sfiligoi, I.; Würthwein, F.; Andrews, W.; Dost, J. M.; MacNeill, I.; McCrea, A.; Sheripon, E.; Murphy, C. W.
2011-12-01
Pilot infrastructures are becoming prominent players in the Grid environment. One of the major advantages is represented by the reduced effort required by the user communities (also known as Virtual Organizations or VOs) due to the outsourcing of the Grid interfacing services, i.e. the pilot factory, to Grid experts. One such pilot factory, based on the glideinWMS pilot infrastructure, is being operated by the Open Science Grid at University of California San Diego (UCSD). This pilot factory is serving multiple VOs from several scientific domains. Currently the three major clients are the analysis operations of the HEP experiment CMS, the community VO HCC, which serves mostly math, biology and computer science users, and the structural biology VO NEBioGrid. The UCSD glidein factory allows the served VOs to use Grid resources distributed over 150 sites in North and South America, in Europe, and in Asia. This paper presents the steps taken to create a production quality pilot factory, together with the challenges encountered along the road.
Gender equity and equality on Korean student scientists: A life history narrative study
NASA Astrophysics Data System (ADS)
Hur, Changsoo
Much research, including that by Koreans (e.g., Mo, 1999), agrees on two major points relating to the inequitable and unequal condition of women in the scientific community: (1) the fact that the under-representation of women in the scientific community has been taken for granted for years (e.g., Rathgeber, 1998), and (2) documenting women's lives has been largely excluded in women's studies (e.g., Sutton, 1998). The basis for the design of this study relates to the aforementioned observations. This study addresses two major research questions: how do social stereotypes exist in terms of gender equity and equality in the South Korean scientific and educational fields, and how do these stereotypes influence women and men's socializations, in terms of gender equity and equality, in the South Korean scientific and educational fields? To investigate the research questions, this qualitative study utilizes a life history narrative approach in examining various theoretical perspectives, such as critical theory, post-structuralism, and postmodernism. Through the participants' perceptions and experiences in the scientific community and in South Korean society, this study fords gendered stereotypes, practices, and socializations in school, family, and the scientific community. These findings demonstrate asymmetric gendered structures in South Korea. Moreover, with the comparison among male and female participants, this study shows how they perceive and experience differently in school, family, and the scientific community. This study attempts to understand the South Korean scientific community as represented by four student scientists through social structures. Education appears to function significantly as an hegemonic power in conveying legitimating ideologies. This process reproduces man-centered social structures, especially in the scientific community. This suggests that to emancipate women's under-representations in the scientific community, educational administrators and teachers should carefully consider gendered practices, stereotypes, and socialization in science classes. There have been significant findings of educative authenticity criteria (Guba & Lincoln, 1989) that stimulate the needs of future studies on gender, especially women in the scientific community in general and, more specifically, in South Korea. These findings suggest the importance of active involvement by women participants to enhance a more detailed examination, by women's studies, of the scientific community.
NASA Astrophysics Data System (ADS)
Branch, B. D.; Raskin, R. G.; Rock, B.; Gagnon, M.; Lecompte, M. A.; Hayden, L. B.
2009-12-01
With the nation challenged to comply with Executive Order 12906 and its needs to augment the Science, Technology, Engineering and Mathematics (STEM) pipeline, applied focus on geosciences pipelines issue may be at risk. The Geosciences pipeline may require intentional K-12 standard course of study consideration in the form of project based, science based and evidenced based learning. Thus, the K-12 to geosciences to informatics pipeline may benefit from an earth science experience that utilizes a community based “learning by doing” approach. Terms such as Community GIS, Community Remotes Sensing, and Community Based Ontology development are termed Community Informatics. Here, approaches of interdisciplinary work to promote and earth science literacy are affordable, consisting of low cost equipment that renders GIS/remote sensing data processing skills necessary in the workforce. Hence, informal community ontology development may evolve or mature from a local community towards formal scientific community collaboration. Such consideration may become a means to engage educational policy towards earth science paradigms and needs, specifically linking synergy among Math, Computer Science, and Earth Science disciplines.
Introducing Argonne’s Theta Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
Theta, the Argonne Leadership Computing Facility’s (ALCF) new Intel-Cray supercomputer, is officially open to the research community. Theta’s massively parallel, many-core architecture puts the ALCF on the path to Aurora, the facility’s future Intel-Cray system. Capable of nearly 10 quadrillion calculations per second, Theta enables researchers to break new ground in scientific investigations that range from modeling the inner workings of the brain to developing new materials for renewable energy applications.
Modeling Guru: Knowledge Base for NASA Modelers
NASA Astrophysics Data System (ADS)
Seablom, M. S.; Wojcik, G. S.; van Aartsen, B. H.
2009-05-01
Modeling Guru is an on-line knowledge-sharing resource for anyone involved with or interested in NASA's scientific models or High End Computing (HEC) systems. Developed and maintained by the NASA's Software Integration and Visualization Office (SIVO) and the NASA Center for Computational Sciences (NCCS), Modeling Guru's combined forums and knowledge base for research and collaboration is becoming a repository for the accumulated expertise of NASA's scientific modeling and HEC communities. All NASA modelers and associates are encouraged to participate and provide knowledge about the models and systems so that other users may benefit from their experience. Modeling Guru is divided into a hierarchy of communities, each with its own set forums and knowledge base documents. Current modeling communities include those for space science, land and atmospheric dynamics, atmospheric chemistry, and oceanography. In addition, there are communities focused on NCCS systems, HEC tools and libraries, and programming and scripting languages. Anyone may view most of the content on Modeling Guru (available at http://modelingguru.nasa.gov/), but you must log in to post messages and subscribe to community postings. The site offers a full range of "Web 2.0" features, including discussion forums, "wiki" document generation, document uploading, RSS feeds, search tools, blogs, email notification, and "breadcrumb" links. A discussion (a.k.a. forum "thread") is used to post comments, solicit feedback, or ask questions. If marked as a question, SIVO will monitor the thread, and normally respond within a day. Discussions can include embedded images, tables, and formatting through the use of the Rich Text Editor. Also, the user can add "Tags" to their thread to facilitate later searches. The "knowledge base" is comprised of documents that are used to capture and share expertise with others. The default "wiki" document lets users edit within the browser so others can easily collaborate on the same document, even allowing the author to select those who may edit and approve the document. To maintain knowledge integrity, all documents are moderated before they are visible to the public. Modeling Guru, running on Clearspace by Jive Software, has been an active resource to the NASA modeling and HEC communities for more than a year and currently has more than 100 active users. SIVO will soon install live instant messaging support, as well as a user-customizable homepage with social-networking features. In addition, SIVO plans to implement a large dataset/file storage capability so that users can quickly and easily exchange datasets and files with one another. Continued active community participation combined with periodic software updates and improved features will ensure that Modeling Guru remains a vibrant, effective, easy-to-use tool for the NASA scientific community.
INFN-Pisa scientific computation environment (GRID, HPC and Interactive Analysis)
NASA Astrophysics Data System (ADS)
Arezzini, S.; Carboni, A.; Caruso, G.; Ciampa, A.; Coscetti, S.; Mazzoni, E.; Piras, S.
2014-06-01
The INFN-Pisa Tier2 infrastructure is described, optimized not only for GRID CPU and Storage access, but also for a more interactive use of the resources in order to provide good solutions for the final data analysis step. The Data Center, equipped with about 6700 production cores, permits the use of modern analysis techniques realized via advanced statistical tools (like RooFit and RooStat) implemented in multicore systems. In particular a POSIX file storage access integrated with standard SRM access is provided. Therefore the unified storage infrastructure is described, based on GPFS and Xrootd, used both for SRM data repository and interactive POSIX access. Such a common infrastructure allows a transparent access to the Tier2 data to the users for their interactive analysis. The organization of a specialized many cores CPU facility devoted to interactive analysis is also described along with the login mechanism integrated with the INFN-AAI (National INFN Infrastructure) to extend the site access and use to a geographical distributed community. Such infrastructure is used also for a national computing facility in use to the INFN theoretical community, it enables a synergic use of computing and storage resources. Our Center initially developed for the HEP community is now growing and includes also HPC resources fully integrated. In recent years has been installed and managed a cluster facility (1000 cores, parallel use via InfiniBand connection) and we are now updating this facility that will provide resources for all the intermediate level HPC computing needs of the INFN theoretical national community.
Community Intelligence in Knowledge Curation: An Application to Managing Scientific Nomenclature
Zou, Dong; Li, Ang; Liu, Guocheng; Chen, Fei; Wu, Jiayan; Xiao, Jingfa; Wang, Xumin; Yu, Jun; Zhang, Zhang
2013-01-01
Harnessing community intelligence in knowledge curation bears significant promise in dealing with communication and education in the flood of scientific knowledge. As knowledge is accumulated at ever-faster rates, scientific nomenclature, a particular kind of knowledge, is concurrently generated in all kinds of fields. Since nomenclature is a system of terms used to name things in a particular discipline, accurate translation of scientific nomenclature in different languages is of critical importance, not only for communications and collaborations with English-speaking people, but also for knowledge dissemination among people in the non-English-speaking world, particularly young students and researchers. However, it lacks of accuracy and standardization when translating scientific nomenclature from English to other languages, especially for those languages that do not belong to the same language family as English. To address this issue, here we propose for the first time the application of community intelligence in scientific nomenclature management, namely, harnessing collective intelligence for translation of scientific nomenclature from English to other languages. As community intelligence applied to knowledge curation is primarily aided by wiki and Chinese is the native language for about one-fifth of the world’s population, we put the proposed application into practice, by developing a wiki-based English-to-Chinese Scientific Nomenclature Dictionary (ESND; http://esnd.big.ac.cn). ESND is a wiki-based, publicly editable and open-content platform, exploiting the whole power of the scientific community in collectively and collaboratively managing scientific nomenclature. Based on community curation, ESND is capable of achieving accurate, standard, and comprehensive scientific nomenclature, demonstrating a valuable application of community intelligence in knowledge curation. PMID:23451119
Community intelligence in knowledge curation: an application to managing scientific nomenclature.
Dai, Lin; Xu, Chao; Tian, Ming; Sang, Jian; Zou, Dong; Li, Ang; Liu, Guocheng; Chen, Fei; Wu, Jiayan; Xiao, Jingfa; Wang, Xumin; Yu, Jun; Zhang, Zhang
2013-01-01
Harnessing community intelligence in knowledge curation bears significant promise in dealing with communication and education in the flood of scientific knowledge. As knowledge is accumulated at ever-faster rates, scientific nomenclature, a particular kind of knowledge, is concurrently generated in all kinds of fields. Since nomenclature is a system of terms used to name things in a particular discipline, accurate translation of scientific nomenclature in different languages is of critical importance, not only for communications and collaborations with English-speaking people, but also for knowledge dissemination among people in the non-English-speaking world, particularly young students and researchers. However, it lacks of accuracy and standardization when translating scientific nomenclature from English to other languages, especially for those languages that do not belong to the same language family as English. To address this issue, here we propose for the first time the application of community intelligence in scientific nomenclature management, namely, harnessing collective intelligence for translation of scientific nomenclature from English to other languages. As community intelligence applied to knowledge curation is primarily aided by wiki and Chinese is the native language for about one-fifth of the world's population, we put the proposed application into practice, by developing a wiki-based English-to-Chinese Scientific Nomenclature Dictionary (ESND; http://esnd.big.ac.cn). ESND is a wiki-based, publicly editable and open-content platform, exploiting the whole power of the scientific community in collectively and collaboratively managing scientific nomenclature. Based on community curation, ESND is capable of achieving accurate, standard, and comprehensive scientific nomenclature, demonstrating a valuable application of community intelligence in knowledge curation.
Perspectives on Sharing Models and Related Resources in Computational Biomechanics Research.
Erdemir, Ahmet; Hunter, Peter J; Holzapfel, Gerhard A; Loew, Leslie M; Middleton, John; Jacobs, Christopher R; Nithiarasu, Perumal; Löhner, Rainlad; Wei, Guowei; Winkelstein, Beth A; Barocas, Victor H; Guilak, Farshid; Ku, Joy P; Hicks, Jennifer L; Delp, Scott L; Sacks, Michael; Weiss, Jeffrey A; Ateshian, Gerard A; Maas, Steve A; McCulloch, Andrew D; Peng, Grace C Y
2018-02-01
The role of computational modeling for biomechanics research and related clinical care will be increasingly prominent. The biomechanics community has been developing computational models routinely for exploration of the mechanics and mechanobiology of diverse biological structures. As a result, a large array of models, data, and discipline-specific simulation software has emerged to support endeavors in computational biomechanics. Sharing computational models and related data and simulation software has first become a utilitarian interest, and now, it is a necessity. Exchange of models, in support of knowledge exchange provided by scholarly publishing, has important implications. Specifically, model sharing can facilitate assessment of reproducibility in computational biomechanics and can provide an opportunity for repurposing and reuse, and a venue for medical training. The community's desire to investigate biological and biomechanical phenomena crossing multiple systems, scales, and physical domains, also motivates sharing of modeling resources as blending of models developed by domain experts will be a required step for comprehensive simulation studies as well as the enhancement of their rigor and reproducibility. The goal of this paper is to understand current perspectives in the biomechanics community for the sharing of computational models and related resources. Opinions on opportunities, challenges, and pathways to model sharing, particularly as part of the scholarly publishing workflow, were sought. A group of journal editors and a handful of investigators active in computational biomechanics were approached to collect short opinion pieces as a part of a larger effort of the IEEE EMBS Computational Biology and the Physiome Technical Committee to address model reproducibility through publications. A synthesis of these opinion pieces indicates that the community recognizes the necessity and usefulness of model sharing. There is a strong will to facilitate model sharing, and there are corresponding initiatives by the scientific journals. Outside the publishing enterprise, infrastructure to facilitate model sharing in biomechanics exists, and simulation software developers are interested in accommodating the community's needs for sharing of modeling resources. Encouragement for the use of standardized markups, concerns related to quality assurance, acknowledgement of increased burden, and importance of stewardship of resources are noted. In the short-term, it is advisable that the community builds upon recent strategies and experiments with new pathways for continued demonstration of model sharing, its promotion, and its utility. Nonetheless, the need for a long-term strategy to unify approaches in sharing computational models and related resources is acknowledged. Development of a sustainable platform supported by a culture of open model sharing will likely evolve through continued and inclusive discussions bringing all stakeholders at the table, e.g., by possibly establishing a consortium.
xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heroux, Michael A.; Bartlett, Roscoe; Demeshko, Irina
Here, extreme-scale computational science increasingly demands multiscale and multiphysics formulations. Combining software developed by independent groups is imperative: no single team has resources for all predictive science and decision support capabilities. Scientific libraries provide high-quality, reusable software components for constructing applications with improved robustness and portability. However, without coordination, many libraries cannot be easily composed. Namespace collisions, inconsistent arguments, lack of third-party software versioning, and additional difficulties make composition costly. The Extreme-scale Scientific Software Development Kit (xSDK) defines community policies to improve code quality and compatibility across independently developed packages (hypre, PETSc, SuperLU, Trilinos, and Alquimia) and provides a foundationmore » for addressing broader issues in software interoperability, performance portability, and sustainability. The xSDK provides turnkey installation of member software and seamless combination of aggregate capabilities, and it marks first steps toward extreme-scale scientific software ecosystems from which future applications can be composed rapidly with assured quality and scalability.« less
xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit
Heroux, Michael A.; Bartlett, Roscoe; Demeshko, Irina; ...
2017-03-01
Here, extreme-scale computational science increasingly demands multiscale and multiphysics formulations. Combining software developed by independent groups is imperative: no single team has resources for all predictive science and decision support capabilities. Scientific libraries provide high-quality, reusable software components for constructing applications with improved robustness and portability. However, without coordination, many libraries cannot be easily composed. Namespace collisions, inconsistent arguments, lack of third-party software versioning, and additional difficulties make composition costly. The Extreme-scale Scientific Software Development Kit (xSDK) defines community policies to improve code quality and compatibility across independently developed packages (hypre, PETSc, SuperLU, Trilinos, and Alquimia) and provides a foundationmore » for addressing broader issues in software interoperability, performance portability, and sustainability. The xSDK provides turnkey installation of member software and seamless combination of aggregate capabilities, and it marks first steps toward extreme-scale scientific software ecosystems from which future applications can be composed rapidly with assured quality and scalability.« less
Enhancements to VTK enabling Scientific Visualization in Immersive Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick; Jhaveri, Sankhesh; Chaudhary, Aashish
Modern scientific, engineering and medical computational sim- ulations, as well as experimental and observational data sens- ing/measuring devices, produce enormous amounts of data. While statistical analysis provides insight into this data, scientific vi- sualization is tactically important for scientific discovery, prod- uct design and data analysis. These benefits are impeded, how- ever, when scientific visualization algorithms are implemented from scratch—a time-consuming and redundant process in im- mersive application development. This process can greatly ben- efit from leveraging the state-of-the-art open-source Visualization Toolkit (VTK) and its community. Over the past two (almost three) decades, integrating VTK with a virtual reality (VR)more » environment has only been attempted to varying degrees of success. In this pa- per, we demonstrate two new approaches to simplify this amalga- mation of an immersive interface with visualization rendering from VTK. In addition, we cover several enhancements to VTK that pro- vide near real-time updates and efficient interaction. Finally, we demonstrate the combination of VTK with both Vrui and OpenVR immersive environments in example applications.« less
NASA Technical Reports Server (NTRS)
Hope, W. W.; Johnson, L. P.; Obl, W.; Stewart, A.; Harris, W. C.; Craig, R. D.
2000-01-01
Faculty in the Department of Physical, Environmental and Computer Sciences strongly believe in the concept that undergraduate research and research-related activities must be integrated into the fabric of our undergraduate Science and Technology curricula. High level skills, such as problem solving, reasoning, collaboration and the ability to engage in research, are learned for advanced study in graduate school or for competing for well paying positions in the scientific community. One goal of our academic programs is to have a pipeline of research activities from high school to four year college, to graduate school, based on the GISS Institute on Climate and Planets model.
Requirements for migration of NSSD code systems from LTSS to NLTSS
NASA Technical Reports Server (NTRS)
Pratt, M.
1984-01-01
The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of change are examined including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition.
On transferring the grid technology to the biomedical community.
Mohammed, Yassene; Sax, Ulrich; Dickmann, Frank; Lippert, Joerg; Solodenko, Juri; von Voigt, Gabriele; Smith, Matthew; Rienhoff, Otto
2010-01-01
Natural scientists such as physicists pioneered the sharing of computing resources, which resulted in the Grid. The inter domain transfer process of this technology has been an intuitive process. Some difficulties facing the life science community can be understood using the Bozeman's "Effectiveness Model of Technology Transfer". Bozeman's and classical technology transfer approaches deal with technologies that have achieved certain stability. Grid and Cloud solutions are technologies that are still in flux. We illustrate how Grid computing creates new difficulties for the technology transfer process that are not considered in Bozeman's model. We show why the success of health Grids should be measured by the qualified scientific human capital and opportunities created, and not primarily by the market impact. With two examples we show how the Grid technology transfer theory corresponds to the reality. We conclude with recommendations that can help improve the adoption of Grid solutions into the biomedical community. These results give a more concise explanation of the difficulties most life science IT projects are facing in the late funding periods, and show some leveraging steps which can help to overcome the "vale of tears".
Complexity and Innovation: Army Transformation and the Reality of War
2004-05-26
necessary to instill confidence among all members of the interested community that the causal relationships...continues to gain momentum and general acceptance within the scientific community . The topic is addressed in numerous books, studies and scientific journals...scientific community has steadily grown. Since the time of Galileo and Newton, scientific endeavor has been characterized by reductionism (the process
National Laboratory for Advanced Scientific Visualization at UNAM - Mexico
NASA Astrophysics Data System (ADS)
Manea, Marina; Constantin Manea, Vlad; Varela, Alfredo
2016-04-01
In 2015, the National Autonomous University of Mexico (UNAM) joined the family of Universities and Research Centers where advanced visualization and computing plays a key role to promote and advance missions in research, education, community outreach, as well as business-oriented consulting. This initiative provides access to a great variety of advanced hardware and software resources and offers a range of consulting services that spans a variety of areas related to scientific visualization, among which are: neuroanatomy, embryonic development, genome related studies, geosciences, geography, physics and mathematics related disciplines. The National Laboratory for Advanced Scientific Visualization delivers services through three main infrastructure environments: the 3D fully immersive display system Cave, the high resolution parallel visualization system Powerwall, the high resolution spherical displays Earth Simulator. The entire visualization infrastructure is interconnected to a high-performance-computing-cluster (HPCC) called ADA in honor to Ada Lovelace, considered to be the first computer programmer. The Cave is an extra large 3.6m wide room with projected images on the front, left and right, as well as floor walls. Specialized crystal eyes LCD-shutter glasses provide a strong stereo depth perception, and a variety of tracking devices allow software to track the position of a user's hand, head and wand. The Powerwall is designed to bring large amounts of complex data together through parallel computing for team interaction and collaboration. This system is composed by 24 (6x4) high-resolution ultra-thin (2 mm) bezel monitors connected to a high-performance GPU cluster. The Earth Simulator is a large (60") high-resolution spherical display used for global-scale data visualization like geophysical, meteorological, climate and ecology data. The HPCC-ADA, is a 1000+ computing core system, which offers parallel computing resources to applications that requires large quantity of memory as well as large and fast parallel storage systems. The entire system temperature is controlled by an energy and space efficient cooling solution, based on large rear door liquid cooled heat exchangers. This state-of-the-art infrastructure will boost research activities in the region, offer a powerful scientific tool for teaching at undergraduate and graduate levels, and enhance association and cooperation with business-oriented organizations.
MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models
Aurich, Maike K.; Fleming, Ronan M. T.; Thiele, Ines
2016-08-03
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorialsmore » explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.« less
Coastal erosion management in Accra: Combining local knowledge and empirical research
2016-01-01
Coastal erosion along the Accra coast has become a chronic phenomenon that threatens both life and property. The issue has assumed a centre stage of national debate in recent times because of its impact on the coastal communities. Lack of reliable geospatial data hinders effective scientific investigations into the changing trends in the shoreline position. However, knowledge about coastal erosion, by the local people, and how far the shoreline has migrated inland over time is high in the coastal communities in Accra. This opens a new chapter in coastal erosion research to include local knowledge of the local settlers in developing sustainable coastal management. This article adopted a scientific approach to estimate rate of erosion and tested the results against perceived erosion trend by the local settlers. The study used a 1974 digital topographic map and 1996 aerial photographs. The end point rate statistical method in DSAS was used to compute the rates of change. The short-term rate of change for the 22-year period under study was estimated as -0.91 m/annum ± 0.49 m/annum. It was revealed that about 79% of the shoreline is eroding, while the remaining 21% is either stabilised or accreting. It emerged, from semi-structured interviews with inhabitants in the Accra coastal communities, that an average of about 30 m of coastal lands are perceived to have been lost to erosion for a period of about 20 years. This translates to a historic rate of change of about 1.5 m/year, which corroborates the results of the scientific study. Again this study has established that the local knowledge of the inhabitants, about coastal erosion, can serve as reliable information under scarcity of scientific data for coastal erosion analyses in developing countries.
Hu, Ye; Bajorath, Jürgen
2014-01-01
In 2012, we reported 30 compound data sets and/or programs developed in our laboratory in a data article and made them freely available to the scientific community to support chemoinformatics and computational medicinal chemistry applications. These data sets and computational tools were provided for download from our website. Since publication of this data article, we have generated 13 new data sets with which we further extend our collection of publicly available data and tools. Due to changes in web servers and website architectures, data accessibility has recently been limited at times. Therefore, we have also transferred our data sets and tools to a public repository to ensure full and stable accessibility. To aid in data selection, we have classified the data sets according to scientific subject areas. Herein, we describe new data sets, introduce the data organization scheme, summarize the database content and provide detailed access information in ZENODO (doi: 10.5281/zenodo.8451 and doi:10.5281/zenodo.8455).
NASA Astrophysics Data System (ADS)
Kaplinger, Brian Douglas
For the past few decades, both the scientific community and the general public have been becoming more aware that the Earth lives in a shooting gallery of small objects. We classify all of these asteroids and comets, known or unknown, that cross Earth's orbit as near-Earth objects (NEOs). A look at our geologic history tells us that NEOs have collided with Earth in the past, and we expect that they will continue to do so. With thousands of known NEOs crossing the orbit of Earth, there has been significant scientific interest in developing the capability to deflect an NEO from an impacting trajectory. This thesis applies the ideas of Smoothed Particle Hydrodynamics (SPH) theory to the NEO disruption problem. A simulation package was designed that allows efficacy simulation to be integrated into the mission planning and design process. This is done by applying ideas in high-performance computing (HPC) on the computer graphics processing unit (GPU). Rather than prove a concept through large standalone simulations on a supercomputer, a highly parallel structure allows for flexible, target dependent questions to be resolved. Built around nonclassified data and analysis, this computer package will allow academic institutions to better tackle the issue of NEO mitigation effectiveness.
Presenting Numerical Modelling of Explosive Volcanic Eruption to a General Public
NASA Astrophysics Data System (ADS)
Demaria, C.; Todesco, M.; Neri, A.; Blasi, G.
2001-12-01
Numerical modeling of explosive volcanic eruptions has been widely applied, during the last decades, to study pyroclastic flows dispersion along volcano's flanks and to evaluate their impact on urban areas. Results from these transient multi-phase and multi-component simulations are often reproduced in form of computer animations, representing the spatial and temporal evolution of relevant flow variables (such as temperature, or particle concentration). Despite being a sophisticated, technical tool to analyze and share modeling results within the scientific community, these animations truly look like colorful cartoons showing an erupting volcano and are especially suited to be shown to a general public. Thanks to their particular appeal, and to the large interest usually risen by exploding volcanoes, these animations have been presented several times on television and magazines and are currently displayed in a permanent exposition, at the Vesuvius Observatory in Naples. This work represents an effort to produce an accompanying tool for these animations, capable of explaining to a large audience the scientific meaning of what can otherwise look as a graphical exercise. Dealing with research aimed at the study of dangerous, explosive volcanoes, improving the general understanding of these scientific results plays an important role as far as risk perception is concerned. An educated population has better chances to follow an appropriate behavior, i.e.: one that could lead, on the long period, to a reduction of the potential risk. In this sense, a correct divulgation of scientific results, while improving the confidence of the population in the scientific community, should belong to the strategies adopted to mitigate volcanic risk. Due to the relevance of the long term final goal of such divulgation experiment, this work represents an interdisciplinary effort, combining scientific expertise and specific competence from the modern communication science and risk perception studies.
The Symbiotic Relationship between Scientific Workflow and Provenance (Invited)
NASA Astrophysics Data System (ADS)
Stephan, E.
2010-12-01
The purpose of this presentation is to describe the symbiotic nature of scientific workflows and provenance. We will also discuss the current trends and real world challenges facing these two distinct research areas. Although motivated differently, the needs of the international science communities are the glue that binds this relationship together. Understanding and articulating the science drivers to these communities is paramount as these technologies evolve and mature. Originally conceived for managing business processes, workflows are now becoming invaluable assets in both computational and experimental sciences. These reconfigurable, automated systems provide essential technology to perform complex analyses by coupling together geographically distributed disparate data sources and applications. As a result, workflows are capable of higher throughput in a shorter amount of time than performing the steps manually. Today many different workflow products exist; these could include Kepler and Taverna or similar products like MeDICI, developed at PNNL, that are standardized on the Business Process Execution Language (BPEL). Provenance, originating from the French term Provenir “to come from”, is used to describe the curation process of artwork as art is passed from owner to owner. The concept of provenance was adopted by digital libraries as a means to track the lineage of documents while standards such as the DublinCore began to emerge. In recent years the systems science community has increasingly expressed the need to expand the concept of provenance to formally articulate the history of scientific data. Communities such as the International Provenance and Annotation Workshop (IPAW) have formalized a provenance data model. The Open Provenance Model, and the W3C is hosting a provenance incubator group featuring the Proof Markup Language. Although both workflows and provenance have risen from different communities and operate independently, their mutual success is tied together, forming a symbiotic relationship where research and development advances in one effort can provide tremendous benefits to the other. For example, automating provenance extraction within scientific applications is still a relatively new concept; the workflow engine provides the framework to capture application specific operations, inputs, and resulting data. It provides a description of the process history and data flow by wrapping workflow components around the applications and data sources. On the other hand, a lack of cooperation between workflows and provenance can inhibit usefulness of both to science. Blindly tracking the execution history without having a true understanding of what kinds of questions end users may have makes the provenance indecipherable to the target users. Over the past nine years PNNL has been actively involved in provenance research in support of computational chemistry, molecular dynamics, biology, hydrology, and climate. PNNL has also been actively involved in efforts by the international community to develop open standards for provenance and the development of architectures to support provenance capture, storage, and querying. This presentation will provide real world use cases of how provenance and workflow can be leveraged and implemented to meet different needs and the challenges that lie ahead.
The Principles for Successful Scientific Data Management Revisited
NASA Astrophysics Data System (ADS)
Walker, R. J.; King, T. A.; Joy, S. P.
2005-12-01
It has been 23 years since the National Research Council's Committee on Data Management and Computation (CODMAC) published its famous list of principles for successful scientific data management that have provided the framework for modern space science data management. CODMAC outlined seven principles: 1. Scientific Involvement in all aspects of space science missions. 2. Scientific Oversight of all scientific data-management activities. 3. Data Availability - Validated data should be made available to the scientific community in a timely manner. They should include appropriate ancillary data, and complete documentation. 4. Facilities - A proper balance between cost and scientific productivity should be maintained. 5. Software - Transportable well documented software should be available to process and analyze the data. 6. Scientific Data Storage - The data should be preserved in retrievable form. 7. Data System Funding - Adequate data funding should be made available at the outset of missions and protected from overruns. In this paper we will review the lessons learned in trying to apply these principles to space derived data. The Planetary Data System created the concept of data curation to carry out the CODMAC principles. Data curators are scientists and technologists who work directly with the mission scientists to create data products. The efficient application of the CODMAC principles requires that data curators and the mission team start early in a mission to plan for data access and archiving. To build the data products the planetary discipline adopted data access and documentation standards and has adhered to them. The data curators and mission team work together to produce data products and make them available. However even with early planning and agreement on standards the needs of the science community frequently far exceed the available resources. This is especially true for smaller principal investigator run missions. We will argue that one way to make data systems for small missions more effective is for the data curators to provide software tools to help develop the mission data system.
Proceedings: Fourth Workshop on Mining Scientific Datasets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamath, C
Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratorymore » data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is matched only by the opportunities that await a practitioner.« less
Noise characteristics of the Skylab S-193 altimeter altitude measurements
NASA Technical Reports Server (NTRS)
Hatch, W. E.
1975-01-01
The statistical characteristics of the SKYLAB S-193 altimeter altitude noise are considered. These results are reported in a concise format for use and analysis by the scientific community. In most instances the results have been grouped according to satellite pointing so that the effects of pointing on the statistical characteristics can be readily seen. The altimeter measurements and the processing techniques are described. The mathematical descriptions of the computer programs used for these results are included.
dCache on Steroids - Delegated Storage Solutions
Mkrtchyan, Tigran; Adeyemi, F.; Ashish, A.; ...
2017-11-23
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH andmore » others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. As a result, we will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.« less
EVEREST: a virtual research environment for the Earth Sciences
NASA Astrophysics Data System (ADS)
Glaves, H. M.; Marelli, F.; Albani, M.
2015-12-01
There is an increasing requirement for researchers to work collaboratively using common resources whilst being geographically dispersed. By creating a virtual research environment (VRE) using a service oriented architecture (SOA) tailored to the needs of Earth Science (ES) communities, the EVEREST project will provide a range of both generic and domain specific data management services to support a dynamic approach to collaborative research. EVER-EST will provide the means to overcome existing barriers to sharing of Earth Science data and information allowing research teams to discover, access, share and process heterogeneous data, algorithms, results and experiences within and across their communities, including those domains beyond Earth Science. Data providers will be also able to monitor user experiences and collect feedback through the VRE, improving their capacity to adapt to the changing requirements of their end-users. The EVER-EST e-infrastructure will be validated by four virtual research communities (VRC) covering different multidisciplinary ES domains: including ocean monitoring, selected natural hazards (flooding, ground instability and extreme weather events), land monitoring and risk management (volcanoes and seismicity). Each of the VRC represents a different collaborative use case for the VRE according to its own specific requirements for data, software, best practice and community engagement. The diverse use cases will demonstrate how the VRE can be used for a range of activities from straight forward data/software sharing to investigating ways to improve cooperative working. Development of the EVEREST VRE will leverage on the results of several previous projects which have produced state-of-the-art technologies for scientific data management and curation as well those initiatives which have developed models, techniques and tools for the preservation of scientific methods and their implementation in computational forms such as scientific workflows.
dCache on Steroids - Delegated Storage Solutions
NASA Astrophysics Data System (ADS)
Mkrtchyan, T.; Adeyemi, F.; Ashish, A.; Behrmann, G.; Fuhrmann, P.; Litvintsev, D.; Millar, P.; Rossi, A.; Sahakyan, M.; Starek, J.
2017-10-01
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH and others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. We will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.
dCache on Steroids - Delegated Storage Solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mkrtchyan, Tigran; Adeyemi, F.; Ashish, A.
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH andmore » others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. As a result, we will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.« less
Testing Scientific Software: A Systematic Literature Review.
Kanewala, Upulee; Bieman, James M
2014-10-01
Scientific software plays an important role in critical decision making, for example making weather predictions based on climate models, and computation of evidence for research publications. Recently, scientists have had to retract publications due to errors caused by software faults. Systematic testing can identify such faults in code. This study aims to identify specific challenges, proposed solutions, and unsolved problems faced when testing scientific software. We conducted a systematic literature survey to identify and analyze relevant literature. We identified 62 studies that provided relevant information about testing scientific software. We found that challenges faced when testing scientific software fall into two main categories: (1) testing challenges that occur due to characteristics of scientific software such as oracle problems and (2) testing challenges that occur due to cultural differences between scientists and the software engineering community such as viewing the code and the model that it implements as inseparable entities. In addition, we identified methods to potentially overcome these challenges and their limitations. Finally we describe unsolved challenges and how software engineering researchers and practitioners can help to overcome them. Scientific software presents special challenges for testing. Specifically, cultural differences between scientist developers and software engineers, along with the characteristics of the scientific software make testing more difficult. Existing techniques such as code clone detection can help to improve the testing process. Software engineers should consider special challenges posed by scientific software such as oracle problems when developing testing techniques.
Developing science gateways for drug discovery in a grid environment.
Pérez-Sánchez, Horacio; Rezaei, Vahid; Mezhuyev, Vitaliy; Man, Duhu; Peña-García, Jorge; den-Haan, Helena; Gesing, Sandra
2016-01-01
Methods for in silico screening of large databases of molecules increasingly complement and replace experimental techniques to discover novel compounds to combat diseases. As these techniques become more complex and computationally costly we are faced with an increasing problem to provide the research community of life sciences with a convenient tool for high-throughput virtual screening on distributed computing resources. To this end, we recently integrated the biophysics-based drug-screening program FlexScreen into a service, applicable for large-scale parallel screening and reusable in the context of scientific workflows. Our implementation is based on Pipeline Pilot and Simple Object Access Protocol and provides an easy-to-use graphical user interface to construct complex workflows, which can be executed on distributed computing resources, thus accelerating the throughput by several orders of magnitude.
On Stable Marriages and Greedy Matchings
DOE Office of Scientific and Technical Information (OSTI.GOV)
Manne, Fredrik; Naim, Md; Lerring, Hakon
2016-12-11
Research on stable marriage problems has a long and mathematically rigorous history, while that of exploiting greedy matchings in combinatorial scientific computing is a younger and less developed research field. In this paper we consider the relationships between these two areas. In particular we show that several problems related to computing greedy matchings can be formulated as stable marriage problems and as a consequence several recently proposed algorithms for computing greedy matchings are in fact special cases of well known algorithms for the stable marriage problem. However, in terms of implementations and practical scalable solutions on modern hardware, the greedymore » matching community has made considerable progress. We show that due to the strong relationship between these two fields many of these results are also applicable for solving stable marriage problems.« less
Dynamic Extension of a Virtualized Cluster by using Cloud Resources
NASA Astrophysics Data System (ADS)
Oberst, Oliver; Hauth, Thomas; Kernert, David; Riedel, Stephan; Quast, Günter
2012-12-01
The specific requirements concerning the software environment within the HEP community constrain the choice of resource providers for the outsourcing of computing infrastructure. The use of virtualization in HPC clusters and in the context of cloud resources is therefore a subject of recent developments in scientific computing. The dynamic virtualization of worker nodes in common batch systems provided by ViBatch serves each user with a dynamically virtualized subset of worker nodes on a local cluster. Now it can be transparently extended by the use of common open source cloud interfaces like OpenNebula or Eucalyptus, launching a subset of the virtual worker nodes within the cloud. This paper demonstrates how a dynamically virtualized computing cluster is combined with cloud resources by attaching remotely started virtual worker nodes to the local batch system.
Development of a Web Based Simulating System for Earthquake Modeling on the Grid
NASA Astrophysics Data System (ADS)
Seber, D.; Youn, C.; Kaiser, T.
2007-12-01
Existing cyberinfrastructure-based information, data and computational networks now allow development of state- of-the-art, user-friendly simulation environments that democratize access to high-end computational environments and provide new research opportunities for many research and educational communities. Within the Geosciences cyberinfrastructure network, GEON, we have developed the SYNSEIS (SYNthetic SEISmogram) toolkit to enable efficient computations of 2D and 3D seismic waveforms for a variety of research purposes especially for helping to analyze the EarthScope's USArray seismic data in a speedy and efficient environment. The underlying simulation software in SYNSEIS is a finite difference code, E3D, developed by LLNL (S. Larsen). The code is embedded within the SYNSEIS portlet environment and it is used by our toolkit to simulate seismic waveforms of earthquakes at regional distances (<1000km). Architecturally, SYNSEIS uses both Web Service and Grid computing resources in a portal-based work environment and has a built in access mechanism to connect to national supercomputer centers as well as to a dedicated, small-scale compute cluster for its runs. Even though Grid computing is well-established in many computing communities, its use among domain scientists still is not trivial because of multiple levels of complexities encountered. We grid-enabled E3D using our own dialect XML inputs that include geological models that are accessible through standard Web services within the GEON network. The XML inputs for this application contain structural geometries, source parameters, seismic velocity, density, attenuation values, number of time steps to compute, and number of stations. By enabling a portal based access to a such computational environment coupled with its dynamic user interface we enable a large user community to take advantage of such high end calculations in their research and educational activities. Our system can be used to promote an efficient and effective modeling environment to help scientists as well as educators in their daily activities and speed up the scientific discovery process.
Aerothermodynamic environment for a Titan probe with deployable decelerator
NASA Technical Reports Server (NTRS)
Green, M. J.; Swenson, B. L.; Balakrishnan, A.
1985-01-01
It is pointed out that further exploration of Titan, Saturn's largest moon, is of current interest to the scientific community, particularly from the standpoint of the organic chemical evolution of its atmosphere. For a suitable study of this Saturnian satellite, a mission involving a Titan atmospheric entry probe is to be conducted. The probe is to employ a deployable decelerator with the aim to allow scientific measurements in the haze layer. The present investigation is concerned with an assessment of the aerothermodynamic environment for the considered probe during its hypervelocity, low-Reynolds-number entry. Attention is given to the employed computational method, the Titan probe configuration, the Titan probe trajectory, the viscous-layer regime of the aerothermodynamic environment, and the incipient merged-layer regime.
The European perspective for LSST
NASA Astrophysics Data System (ADS)
Gangler, Emmanuel
2017-06-01
LSST is a next generation telescope that will produce an unprecedented data flow. The project goal is to deliver data products such as images and catalogs thus enabling scientific analysis for a wide community of users. As a large scale survey, LSST data will be complementary with other facilities in a wide range of scientific domains, including data from ESA or ESO. European countries have invested in LSST since 2007, in the construction of the camera as well as in the computing effort. This latter will be instrumental in designing the next step: how to distribute LSST data to Europe. Astroinformatics challenges for LSST indeed includes not only the analysis of LSST big data, but also the practical efficiency of the data access.
Preparing Scientists to be Community Partners
NASA Astrophysics Data System (ADS)
Pandya, R. E.
2012-12-01
Many students, especially students from historically under-represented communities, leave science majors or avoid choosing them because scientific careers do not offer enough opportunity to contribute to their communities. Citizen science, or public participation in scientific research, may address these challenges. At its most collaborative, it means inviting communities to partner in every step of the scientific process from defining the research question to applying the results to community priorities. In addition to attracting and retaining students, this level of community engagement will help diversify science, ensure the use and usability of our science, help buttress public support of science, and encourage the application of scientific results to policy. It also offers opportunities to tackle scientific questions that can't be accomplished in other way and it is demonstrably effective at helping people learn scientific concepts and methods. In order to learn how to prepare scientists for this kind of intensive community collaboration, we examined several case studies, including a project on disease and public health in Africa and the professionally evaluated experience of two summer interns in Southern Louisiana. In these and other cases, we learned that scientific expertise in a discipline has to be accompanied by a reservoir of humility and respect for other ways of knowing, the ability to work collaboratively with a broad range of disciplines and people, patience and enough career stability to allow that patience, and a willingness to adapt research to a broader set of scientific and non-scientific priorities. To help students achieve this, we found that direct instruction in participatory methods, mentoring by community members and scientists with participatory experience, in-depth training on scientific ethics and communication, explicit articulation of the goal of working with communities, and ample opportunity for personal reflection were essential. There is much more to learn about preparing students for these collaborative approaches, and the principal goal of sharing these strategies is to spark a conversation about the ways we prepare scientists and the public to work together in an increasingly collaborative scientific enterprise.
Lowering the Barrier to Reproducible Research by Publishing Provenance from Common Analytical Tools
NASA Astrophysics Data System (ADS)
Jones, M. B.; Slaughter, P.; Walker, L.; Jones, C. S.; Missier, P.; Ludäscher, B.; Cao, Y.; McPhillips, T.; Schildhauer, M.
2015-12-01
Scientific provenance describes the authenticity, origin, and processing history of research products and promotes scientific transparency by detailing the steps in computational workflows that produce derived products. These products include papers, findings, input data, software products to perform computations, and derived data and visualizations. The geosciences community values this type of information, and, at least theoretically, strives to base conclusions on computationally replicable findings. In practice, capturing detailed provenance is laborious and thus has been a low priority; beyond a lab notebook describing methods and results, few researchers capture and preserve detailed records of scientific provenance. We have built tools for capturing and publishing provenance that integrate into analytical environments that are in widespread use by geoscientists (R and Matlab). These tools lower the barrier to provenance generation by automating capture of critical information as researchers prepare data for analysis, develop, test, and execute models, and create visualizations. The 'recordr' library in R and the `matlab-dataone` library in Matlab provide shared functions to capture provenance with minimal changes to normal working procedures. Researchers can capture both scripted and interactive sessions, tag and manage these executions as they iterate over analyses, and then prune and publish provenance metadata and derived products to the DataONE federation of archival repositories. Provenance traces conform to the ProvONE model extension of W3C PROV, enabling interoperability across tools and languages. The capture system supports fine-grained versioning of science products and provenance traces. By assigning global identifiers such as DOIs, reseachers can cite the computational processes used to reach findings. And, finally, DataONE has built a web portal to search, browse, and clearly display provenance relationships between input data, the software used to execute analyses and models, and derived data and products that arise from these computations. This provenance is vital to interpretation and understanding of science, and provides an audit trail that researchers can use to understand and replicate computational workflows in the geosciences.
High-throughput neuroimaging-genetics computational infrastructure
Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Hobel, Sam; Vespa, Paul; Woo Moon, Seok; Van Horn, John D.; Franco, Joseph; Toga, Arthur W.
2014-01-01
Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Result interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring validating, and disseminating of complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure1. PMID:24795619
PREFACE: New trends in Computer Simulations in Physics and not only in physics
NASA Astrophysics Data System (ADS)
Shchur, Lev N.; Krashakov, Serge A.
2016-02-01
In this volume we have collected papers based on the presentations given at the International Conference on Computer Simulations in Physics and beyond (CSP2015), held in Moscow, September 6-10, 2015. We hope that this volume will be helpful and scientifically interesting for readers. The Conference was organized for the first time with the common efforts of the Moscow Institute for Electronics and Mathematics (MIEM) of the National Research University Higher School of Economics, the Landau Institute for Theoretical Physics, and the Science Center in Chernogolovka. The name of the Conference emphasizes the multidisciplinary nature of computational physics. Its methods are applied to the broad range of current research in science and society. The choice of venue was motivated by the multidisciplinary character of the MIEM. It is a former independent university, which has recently become the part of the National Research University Higher School of Economics. The Conference Computer Simulations in Physics and beyond (CSP) is planned to be organized biannually. This year's Conference featured 99 presentations, including 21 plenary and invited talks ranging from the analysis of Irish myths with recent methods of statistical physics, to computing with novel quantum computers D-Wave and D-Wave2. This volume covers various areas of computational physics and emerging subjects within the computational physics community. Each section was preceded by invited talks presenting the latest algorithms and methods in computational physics, as well as new scientific results. Both parallel and poster sessions paid special attention to numerical methods, applications and results. For all the abstracts presented at the conference please follow the link http://csp2015.ac.ru/files/book5x.pdf
Lemont B. Kier: a bibliometric exploration of his scientific production and its use.
Restrepo, Guillermo; Llanos, Eugenio J; Silva, Adriana E
2013-12-01
We thought an appropriate way to celebrate the seminal contribution of Kier is to explore his influence on science, looking for the impact of his research through the citation of his scientific production. From a bibliometric approach the impact of Kier's work is addressed as an individual within a community. Reviewing data from his curriculum vitae, as well as from the ISI Web of Knowledge (ISI), his role within the scientific community is established and the way his scientific results circulate is studied. His curriculum vitae is explored emphasising the approaches he used in his research activities and the social ties with other actors of the community. The circulation of Kier's publications in the ISI is studied as a means for spreading and installing his discourse within the community. The citation patterns found not only show the usage of Kier's scientific results, but also open the possibility to identify some characteristics of this discursive community, such as a common vocabulary and common research goals. The results show an interdisciplinary research work that consolidates a scientific community on the topic of drug discovery.
NASA Astrophysics Data System (ADS)
Salvi, S.; Trasatti, E.; Rubbia, G.; Romaniello, V.; Spinetti, C.; Corradini, S.; Merucci, L.
2016-12-01
The EU's H2020 EVER-EST Project is dedicated to the realization of a Virtual Research Environment (VRE) for Earth Science researchers, during 2015-2018. EVER-EST implements state-of-the-art technologies in the area of Earth Science data catalogues, data access/processing and long-term data preservation together with models, techniques and tools for the computational methods, such as scientific workflows. The VRE is designed with the aim of providing the Earth Science user community with an innovative virtual environment to enhance their ability to interoperate and share knowledge and experience, exploiting also the Research Object concept. The GEO Geohazard Supersites is one of the four Research Communities chosen to validate the e-infrastructure. EVER-EST will help the exploitation of the full potential of the GEO Geohazard Supersite and Natural Laboratories (GSNL) initiative demonstrating the use case in the Permanent Supersites of Mt Etna, Campi Flegrei-Vesuvius, and Icelandic volcanoes. Besides providing tools for active volcanoes monitoring and studies, we intend to demonstrate how a more organized and collaborative research environment, such as a VRE, can improve the quality of the scientific research on the Geohazard Supersites, addressing at the same time the problem of the slow uptake of scientific research findings in Disaster Risk Management. Presently, the full exploitation of the in situ and satellite data made available for each Supersite is delayed by the difficult access (especially for researchers in developing countries) to intensive processing and modeling capabilities. EVER-EST is designed to provide these means and also a friendly virtual environment for the easy transfer of scientific knowledge as soon as it is acquired, promoting collaboration among researchers located in distant regions of the world. A further benefit will be to increase the societal impact of the scientific advancements obtained in the Supersites, allowing a more uniform interface towards the different user communities, who will use part of the services provided by EVER-EST during research result uptake. We show a few test cases of use of the Geohazard Supersite VRE at the actual state of development, and its future development.
History of the numerical aerodynamic simulation program
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Ballhaus, William F., Jr.
1987-01-01
The Numerical Aerodynamic Simulation (NAS) program has reached a milestone with the completion of the initial operating configuration of the NAS Processing System Network. This achievement is the first major milestone in the continuing effort to provide a state-of-the-art supercomputer facility for the national aerospace community and to serve as a pathfinder for the development and use of future supercomputer systems. The underlying factors that motivated the initiation of the program are first identified and then discussed. These include the emergence and evolution of computational aerodynamics as a powerful new capability in aerodynamics research and development, the computer power required for advances in the discipline, the complementary nature of computation and wind tunnel testing, and the need for the government to play a pathfinding role in the development and use of large-scale scientific computing systems. Finally, the history of the NAS program is traced from its inception in 1975 to the present time.
GPU-Accelerated Molecular Modeling Coming Of Age
Stone, John E.; Hardy, David J.; Ufimtsev, Ivan S.
2010-01-01
Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. PMID:20675161
A comparative analysis of soft computing techniques for gene prediction.
Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand
2013-07-01
The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important and is currently the focus of many research efforts. Beside its scientific interest in the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques in gene prediction. First, the problem of gene prediction and its challenges are described. These are followed by different soft computing techniques along with their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.
A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions
NASA Astrophysics Data System (ADS)
Exl, Lukas
2017-12-01
An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
GPU-accelerated molecular modeling coming of age.
Stone, John E; Hardy, David J; Ufimtsev, Ivan S; Schulten, Klaus
2010-09-01
Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. (c) 2010 Elsevier Inc. All rights reserved.
The GÉANT network: addressing current and future needs of the HEP community
NASA Astrophysics Data System (ADS)
Capone, Vincenzo; Usman, Mian
2015-12-01
The GÉANT infrastructure is the backbone that serves the scientific communities in Europe for their data movement needs and their access to international research and education networks. Using the extensive fibre footprint and infrastructure in Europe the GÉANT network delivers a portfolio of services aimed to best fit the specific needs of the users, including Authentication and Authorization Infrastructure, end-to-end performance monitoring, advanced network services (dynamic circuits, L2-L3VPN, MD-VPN). This talk will outline the factors that help the GÉANT network to respond to the needs of the High Energy Physics community, both in Europe and worldwide. The Pan-European network provides the connectivity between 40 European national research and education networks. In addition, GÉANT also connects the European NRENs to the R&E networks in other world region and has reach to over 110 NREN worldwide, making GÉANT the best connected Research and Education network, with its multiple intercontinental links to different continents e.g. North and South America, Africa and Asia-Pacific. The High Energy Physics computational needs have always had (and will keep having) a leading role among the scientific user groups of the GÉANT network: the LHCONE overlay network has been built, in collaboration with the other big world REN, specifically to address the peculiar needs of the LHC data movement. Recently, as a result of a series of coordinated efforts, the LHCONE network has been expanded to the Asia-Pacific area, and is going to include some of the main regional R&E network in the area. The LHC community is not the only one that is actively using a distributed computing model (hence the need for a high-performance network); new communities are arising, as BELLE II. GÉANT is deeply involved also with the BELLE II Experiment, to provide full support to their distributed computing model, along with a perfSONAR-based network monitoring system. GÉANT has also coordinated the setup of the network infrastructure to perform the BELLE II Trans-Atlantic Data Challenge, and has been active on helping the BELLE II community to sort out their end-to-end performance issues. In this talk we will provide information about the current GÉANT network architecture and of the international connectivity, along with the upcoming upgrades and the planned and foreseeable improvements. We will also describe the implementation of the solutions provided to support the LHC and BELLE II experiments.
Final Report National Laboratory Professional Development Workshop for Underrepresented Participants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Valerie
The 2013 CMD-IT National Laboratories Professional Development Workshop for Underrepresented Participants (CMD-IT NLPDev 2013) was held at the Oak Ridge National Laboratory campus in Oak Ridge, TN. from June 13 - 14, 2013. Sponsored by the Department of Energy (DOE) Advanced Scientific Computing Research Program, the primary goal of these workshops is to provide information about career opportunities in computational science at the various national laboratories and to mentor the underrepresented participants through community building and expert presentations focused on career success. This second annual workshop offered sessions to facilitate career advancement and, in particular, the strategies and resources neededmore » to be successful at the national laboratories.« less
Visualization Techniques in Space and Atmospheric Sciences
NASA Technical Reports Server (NTRS)
Szuszczewicz, E. P. (Editor); Bredekamp, Joseph H. (Editor)
1995-01-01
Unprecedented volumes of data will be generated by research programs that investigate the Earth as a system and the origin of the universe, which will in turn require analysis and interpretation that will lead to meaningful scientific insight. Providing a widely distributed research community with the ability to access, manipulate, analyze, and visualize these complex, multidimensional data sets depends on a wide range of computer science and technology topics. Data storage and compression, data base management, computational methods and algorithms, artificial intelligence, telecommunications, and high-resolution display are just a few of the topics addressed. A unifying theme throughout the papers with regards to advanced data handling and visualization is the need for interactivity, speed, user-friendliness, and extensibility.
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Mid-year report FY17 Q2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Pugmire, David; Rogers, David
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY17.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Pugmire, David; Rogers, David
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem. Mid-year report FY16 Q2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY15 Q4.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth D.; Sewell, Christopher; Childs, Hank
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressingmore » four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.« less
Position Paper - pFLogger: The Parallel Fortran Logging framework for HPC Applications
NASA Technical Reports Server (NTRS)
Clune, Thomas L.; Cruz, Carlos A.
2017-01-01
In the context of high performance computing (HPC), software investments in support of text-based diagnostics, which monitor a running application, are typically limited compared to those for other types of IO. Examples of such diagnostics include reiteration of configuration parameters, progress indicators, simple metrics (e.g., mass conservation, convergence of solvers, etc.), and timers. To some degree, this difference in priority is justifiable as other forms of output are the primary products of a scientific model and, due to their large data volume, much more likely to be a significant performance concern. In contrast, text-based diagnostic content is generally not shared beyond the individual or group running an application and is most often used to troubleshoot when something goes wrong. We suggest that a more systematic approach enabled by a logging facility (or logger) similar to those routinely used by many communities would provide significant value to complex scientific applications. In the context of high-performance computing, an appropriate logger would provide specialized support for distributed and shared-memory parallelism and have low performance overhead. In this paper, we present our prototype implementation of pFlogger a parallel Fortran-based logging framework, and assess its suitability for use in a complex scientific application.
POSITION PAPER - pFLogger: The Parallel Fortran Logging Framework for HPC Applications
NASA Technical Reports Server (NTRS)
Clune, Thomas L.; Cruz, Carlos A.
2017-01-01
In the context of high performance computing (HPC), software investments in support of text-based diagnostics, which monitor a running application, are typically limited compared to those for other types of IO. Examples of such diagnostics include reiteration of configuration parameters, progress indicators, simple metrics (e.g., mass conservation, convergence of solvers, etc.), and timers. To some degree, this difference in priority is justifiable as other forms of output are the primary products of a scientific model and, due to their large data volume, much more likely to be a significant performance concern. In contrast, text-based diagnostic content is generally not shared beyond the individual or group running an application and is most often used to troubleshoot when something goes wrong. We suggest that a more systematic approach enabled by a logging facility (or 'logger') similar to those routinely used by many communities would provide significant value to complex scientific applications. In the context of high-performance computing, an appropriate logger would provide specialized support for distributed and shared-memory parallelism and have low performance overhead. In this paper, we present our prototype implementation of pFlogger - a parallel Fortran-based logging framework, and assess its suitability for use in a complex scientific application.
Current trends for customized biomedical software tools.
Khan, Haseeb Ahmad
2017-01-01
In the past, biomedical scientists were solely dependent on expensive commercial software packages for various applications. However, the advent of user-friendly programming languages and open source platforms has revolutionized the development of simple and efficient customized software tools for solving specific biomedical problems. Many of these tools are designed and developed by biomedical scientists independently or with the support of computer experts and often made freely available for the benefit of scientific community. The current trends for customized biomedical software tools are highlighted in this short review.
Software Engineering Tools for Scientific Models
NASA Technical Reports Server (NTRS)
Abrams, Marc; Saboo, Pallabi; Sonsini, Mike
2013-01-01
Software tools were constructed to address issues the NASA Fortran development community faces, and they were tested on real models currently in use at NASA. These proof-of-concept tools address the High-End Computing Program and the Modeling, Analysis, and Prediction Program. Two examples are the NASA Goddard Earth Observing System Model, Version 5 (GEOS-5) atmospheric model in Cell Fortran on the Cell Broadband Engine, and the Goddard Institute for Space Studies (GISS) coupled atmosphere- ocean model called ModelE, written in fixed format Fortran.
ERIC Educational Resources Information Center
Kraemer Diaz, Anne E.; Spears Johnson, Chaya R.; Arcury, Thomas A.
2015-01-01
Scientific integrity is necessary for strong science; yet many variables can influence scientific integrity. In traditional research, some common threats are the pressure to publish, competition for funds, and career advancement. Community-based participatory research (CBPR) provides a different context for scientific integrity with additional and…
Data management and analysis for the Earth System Grid
NASA Astrophysics Data System (ADS)
Williams, D. N.; Ananthakrishnan, R.; Bernholdt, D. E.; Bharathi, S.; Brown, D.; Chen, M.; Chervenak, A. L.; Cinquini, L.; Drach, R.; Foster, I. T.; Fox, P.; Hankin, S.; Henson, V. E.; Jones, P.; Middleton, D. E.; Schwidder, J.; Schweitzer, R.; Schuler, R.; Shoshani, A.; Siebenlist, F.; Sim, A.; Strand, W. G.; Wilhelmi, N.; Su, M.
2008-07-01
The international climate community is expected to generate hundreds of petabytes of simulation data within the next five to seven years. This data must be accessed and analyzed by thousands of analysts worldwide in order to provide accurate and timely estimates of the likely impact of climate change on physical, biological, and human systems. Climate change is thus not only a scientific challenge of the first order but also a major technological challenge. In order to address this technological challenge, the Earth System Grid Center for Enabling Technologies (ESG-CET) has been established within the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC)-2 program, with support from the offices of Advanced Scientific Computing Research and Biological and Environmental Research. ESG-CET's mission is to provide climate researchers worldwide with access to the data, information, models, analysis tools, and computational capabilities required to make sense of enormous climate simulation datasets. Its specific goals are to (1) make data more useful to climate researchers by developing Grid technology that enhances data usability; (2) meet specific distributed database, data access, and data movement needs of national and international climate projects; (3) provide a universal and secure web-based data access portal for broad multi-model data collections; and (4) provide a wide-range of Grid-enabled climate data analysis tools and diagnostic methods to international climate centers and U.S. government agencies. Building on the successes of the previous Earth System Grid (ESG) project, which has enabled thousands of researchers to access tens of terabytes of data from a small number of ESG sites, ESG-CET is working to integrate a far larger number of distributed data providers, high-bandwidth wide-area networks, and remote computers in a highly collaborative problem-solving environment.
Web-based analysis and publication of flow cytometry experiments.
Kotecha, Nikesh; Krutzik, Peter O; Irish, Jonathan M
2010-07-01
Cytobank is a Web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a Web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permission, from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at http://www.cytobank.org. (c) 2010 by John Wiley & Sons, Inc.
Web-Based Analysis and Publication of Flow Cytometry Experiments
Kotecha, Nikesh; Krutzik, Peter O.; Irish, Jonathan M.
2014-01-01
Cytobank is a web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permissions from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at www.cytobank.org PMID:20578106
Dingreville, Rémi; Karnesky, Richard A.; Puel, Guillaume; ...
2015-11-16
With the increasing interplay between experimental and computational approaches at multiple length scales, new research directions are emerging in materials science and computational mechanics. Such cooperative interactions find many applications in the development, characterization and design of complex material systems. This manuscript provides a broad and comprehensive overview of recent trends in which predictive modeling capabilities are developed in conjunction with experiments and advanced characterization to gain a greater insight into structure–property relationships and study various physical phenomena and mechanisms. The focus of this review is on the intersections of multiscale materials experiments and modeling relevant to the materials mechanicsmore » community. After a general discussion on the perspective from various communities, the article focuses on the latest experimental and theoretical opportunities. Emphasis is given to the role of experiments in multiscale models, including insights into how computations can be used as discovery tools for materials engineering, rather than to “simply” support experimental work. This is illustrated by examples from several application areas on structural materials. In conclusion this manuscript ends with a discussion on some problems and open scientific questions that are being explored in order to advance this relatively new field of research.« less
Review of An Introduction to Parallel and Vector Scientific Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bailey, David H.; Lefton, Lew
2006-06-30
On one hand, the field of high-performance scientific computing is thriving beyond measure. Performance of leading-edge systems on scientific calculations, as measured say by the Top500 list, has increased by an astounding factor of 8000 during the 15-year period from 1993 to 2008, which is slightly faster even than Moore's Law. Even more importantly, remarkable advances in numerical algorithms, numerical libraries and parallel programming environments have led to improvements in the scope of what can be computed that are entirely on a par with the advances in computing hardware. And these successes have spread far beyond the confines of largemore » government-operated laboratories, many universities, modest-sized research institutes and private firms now operate clusters that differ only in scale from the behemoth systems at the large-scale facilities. In the wake of these recent successes, researchers from fields that heretofore have not been part of the scientific computing world have been drawn into the arena. For example, at the recent SC07 conference, the exhibit hall, which long has hosted displays from leading computer systems vendors and government laboratories, featured some 70 exhibitors who had not previously participated. In spite of all these exciting developments, and in spite of the clear need to present these concepts to a much broader technical audience, there is a perplexing dearth of training material and textbooks in the field, particularly at the introductory level. Only a handful of universities offer coursework in the specific area of highly parallel scientific computing, and instructors of such courses typically rely on custom-assembled material. For example, the present reviewer and Robert F. Lucas relied on materials assembled in a somewhat ad-hoc fashion from colleagues and personal resources when presenting a course on parallel scientific computing at the University of California, Berkeley, a few years ago. Thus it is indeed refreshing to see the publication of the book An Introduction to Parallel and Vector Scientic Computing, written by Ronald W. Shonkwiler and Lew Lefton, both of the Georgia Institute of Technology. They have taken the bull by the horns and produced a book that appears to be entirely satisfactory as an introductory textbook for use in such a course. It is also of interest to the much broader community of researchers who are already in the field, laboring day by day to improve the power and performance of their numerical simulations. The book is organized into 11 chapters, plus an appendix. The first three chapters describe the basics of system architecture including vector, parallel and distributed memory systems, the details of task dependence and synchronization, and the various programming models currently in use - threads, MPI and OpenMP. Chapters four through nine provide a competent introduction to floating-point arithmetic, numerical error and numerical linear algebra. Some of the topics presented include Gaussian elimination, LU decomposition, tridiagonal systems, Givens rotations, QR decompositions, Gauss-Seidel iterations and Householder transformations. Chapters 10 and 11 introduce Monte Carlo methods and schemes for discrete optimization such as genetic algorithms.« less
Water resources scientific information center
Cardin, C. William; Campbell, J.T.
1986-01-01
The Water Resources Scientific Information Center (WRSIC) acquires, abstracts and indexes the major water resources related literature of the world, and makes information available to the water resources community and the public. A component of the Water Resources Division of the US Geological Survey, the Center maintains a searchable computerized bibliographic data base, and publishers a monthly journal of abstracts. Through its services, the Center is able to provide reliable scientific and technical information about the most recent water resources developments, as well as long-term trends and changes. WRSIC was established in 1966 by the Secretary of the Interior to further the objectives of the Water Resources Research Act of 1964--legislation that encouraged research in water resources and the prevention of needless duplication of research efforts. It was determined the WRSIC should be the national center for information on water resources, covering research reports, scientific journals, and other water resources literature of the world. WRSIC would evaluate all water resources literature, catalog selected articles, and make the information available in publications or by computer access. In this way WRSIC would increase the availability and awareness of water related scientific and technical information. (Lantz-PTT)
Software Carpentry and the Hydrological Sciences
NASA Astrophysics Data System (ADS)
Ahmadia, A. J.; Kees, C. E.; Farthing, M. W.
2013-12-01
Scientists are spending an increasing amount of time building and using hydrology software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. As hydrology models increase in capability and enter use by a growing number of scientists and their communities, it is important that the scientific software development practices scale up to meet the challenges posed by increasing software complexity, lengthening software lifecycles, a growing number of stakeholders and contributers, and a broadened developer base that extends from application domains to high performance computing centers. Many of these challenges in complexity, lifecycles, and developer base have been successfully met by the open source community, and there are many lessons to be learned from their experiences and practices. Additionally, there is much wisdom to be found in the results of research studies conducted on software engineering itself. Software Carpentry aims to bridge the gap between the current state of software development and these known best practices for scientific software development, with a focus on hands-on exercises and practical advice based on the following principles: 1. Write programs for people, not computers. 2. Automate repetitive tasks 3. Use the computer to record history 4. Make incremental changes 5. Use version control 6. Don't repeat yourself (or others) 7. Plan for mistakes 8. Optimize software only after it works 9. Document design and purpose, not mechanics 10. Collaborate We discuss how these best practices, arising from solid foundations in research and experience, have been shown to help improve scientist's productivity and the reliability of their software.
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed com- puting models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging soft- ware ecosystems. In thismore » paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifi- cally, we have deployed the KVM hypervisor within Cray's Compute Node Linux on a XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single node performance of our solution using KVM on a Cray is very efficient with near-native performance. However overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, ef- fectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.« less
The iPlant Collaborative: Cyberinfrastructure for Plant Biology.
Goff, Stephen A; Vaughn, Matthew; McKay, Sheldon; Lyons, Eric; Stapleton, Ann E; Gessler, Damian; Matasci, Naim; Wang, Liya; Hanlon, Matthew; Lenards, Andrew; Muir, Andy; Merchant, Nirav; Lowry, Sonya; Mock, Stephen; Helmke, Matthew; Kubach, Adam; Narro, Martha; Hopkins, Nicole; Micklos, David; Hilgert, Uwe; Gonzales, Michael; Jordan, Chris; Skidmore, Edwin; Dooley, Rion; Cazes, John; McLay, Robert; Lu, Zhenyuan; Pasternak, Shiran; Koesterke, Lars; Piel, William H; Grene, Ruth; Noutsos, Christos; Gendler, Karla; Feng, Xin; Tang, Chunlao; Lent, Monica; Kim, Seung-Jin; Kvilekval, Kristian; Manjunath, B S; Tannen, Val; Stamatakis, Alexandros; Sanderson, Michael; Welch, Stephen M; Cranston, Karen A; Soltis, Pamela; Soltis, Doug; O'Meara, Brian; Ane, Cecile; Brutnell, Tom; Kleibenstein, Daniel J; White, Jeffery W; Leebens-Mack, James; Donoghue, Michael J; Spalding, Edgar P; Vision, Todd J; Myers, Christopher R; Lowenthal, David; Enquist, Brian J; Boyle, Brad; Akoglu, Ali; Andrews, Greg; Ram, Sudha; Ware, Doreen; Stein, Lincoln; Stanzione, Dan
2011-01-01
The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services.
The iPlant Collaborative: Cyberinfrastructure for Plant Biology
Goff, Stephen A.; Vaughn, Matthew; McKay, Sheldon; Lyons, Eric; Stapleton, Ann E.; Gessler, Damian; Matasci, Naim; Wang, Liya; Hanlon, Matthew; Lenards, Andrew; Muir, Andy; Merchant, Nirav; Lowry, Sonya; Mock, Stephen; Helmke, Matthew; Kubach, Adam; Narro, Martha; Hopkins, Nicole; Micklos, David; Hilgert, Uwe; Gonzales, Michael; Jordan, Chris; Skidmore, Edwin; Dooley, Rion; Cazes, John; McLay, Robert; Lu, Zhenyuan; Pasternak, Shiran; Koesterke, Lars; Piel, William H.; Grene, Ruth; Noutsos, Christos; Gendler, Karla; Feng, Xin; Tang, Chunlao; Lent, Monica; Kim, Seung-Jin; Kvilekval, Kristian; Manjunath, B. S.; Tannen, Val; Stamatakis, Alexandros; Sanderson, Michael; Welch, Stephen M.; Cranston, Karen A.; Soltis, Pamela; Soltis, Doug; O'Meara, Brian; Ane, Cecile; Brutnell, Tom; Kleibenstein, Daniel J.; White, Jeffery W.; Leebens-Mack, James; Donoghue, Michael J.; Spalding, Edgar P.; Vision, Todd J.; Myers, Christopher R.; Lowenthal, David; Enquist, Brian J.; Boyle, Brad; Akoglu, Ali; Andrews, Greg; Ram, Sudha; Ware, Doreen; Stein, Lincoln; Stanzione, Dan
2011-01-01
The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services. PMID:22645531
Study of Scientific Production of Community Medicines' Department Indexed in ISI Citation Databases.
Khademloo, Mohammad; Khaseh, Ali Akbar; Siamian, Hasan; Aligolbandi, Kobra; Latifi, Mahsoomeh; Yaminfirooz, Mousa
2016-10-01
In the scientometric, the main criterion in determining the scientific position and ranking of the scientific centers, particularly the universities, is the rate of scientific production and innovation, and in all participations in the global scientific development. One of the subjects more involved in repeatedly dealt with science and technology and effective on the improvement of health is medical science fields. In this research using scientometric and citation analysis, we studied the rate of scientific productions in the field of community medicine, which is the numbers of articles published and indexed in ISI database from 2000 to 2010. This study is scientometric using the survey and analytical citation. The study samples included all of the articles in the ISI database from 2000 to 2010. For the data collection, the advance method of searching was used at the ISI database. The ISI analyses software and descriptive statistics were used for data analysis. Results showed that among the five top universities in producing documents, Tehran University of Medical Sciences with 88 (22.22%) documents are allocated to the first rank of scientific products. M. Askarian with 36 (90/9%) published documents; most of the scientific outputs in Community medicine, in the international arena is the most active author in this field. In collaboration with other writers, Iranian departments of Community Medicine with 27 published articles have the greatest participation with scholars of English authors. In the process of scientific outputs, the results showed that the scientific process was in its lowest in the years 2000 to 2004, and while the department of Community medicine in 2009 allocated most of the production process to itself. Iranian Journal of Public Health and Saudi Medical Journal each of them had 16 articles which had most participation rate in the publishing of community medicine's department. On the type of carrier, community medicine's department by presentation of 340(85.86%) articles had presented most of their scientific productions in the format of article, also in the field of community medicine outputs, article entitled: "Iron loading and erythrophagocytosis increase ferroportin 1 (FPN1) expression in J774 macrophages"(1) with 81 citations ranked first in cited articles. Subject areas of occupational health with 70 articles and subject areas of general medicine with 69 articles ranked the most active research areas in the Production of community medicine's department. the obtained data showed the much growth of scientific production. The Tehran University of medical Sciences ranked the first in publishing articles in community medicine's department and with most collaboration with community medicine department of England writers in this field and most writers will present their works in paper format.
Study of Scientific Production of Community Medicines’ Department Indexed in ISI Citation Databases
Khademloo, Mohammad; Khaseh, Ali Akbar; Siamian, Hasan; Aligolbandi, Kobra; Latifi, Mahsoomeh; Yaminfirooz, Mousa
2016-01-01
Background: In the scientometric, the main criterion in determining the scientific position and ranking of the scientific centers, particularly the universities, is the rate of scientific production and innovation, and in all participations in the global scientific development. One of the subjects more involved in repeatedly dealt with science and technology and effective on the improvement of health is medical science fields. In this research using scientometric and citation analysis, we studied the rate of scientific productions in the field of community medicine, which is the numbers of articles published and indexed in ISI database from 2000 to 2010. Methods: This study is scientometric using the survey and analytical citation. The study samples included all of the articles in the ISI database from 2000 to 2010. For the data collection, the advance method of searching was used at the ISI database. The ISI analyses software and descriptive statistics were used for data analysis. Results: Results showed that among the five top universities in producing documents, Tehran University of Medical Sciences with 88 (22.22%) documents are allocated to the first rank of scientific products. M. Askarian with 36 (90/9%) published documents; most of the scientific outputs in Community medicine, in the international arena is the most active author in this field. In collaboration with other writers, Iranian departments of Community Medicine with 27 published articles have the greatest participation with scholars of English authors. In the process of scientific outputs, the results showed that the scientific process was in its lowest in the years 2000 to 2004, and while the department of Community medicine in 2009 allocated most of the production process to itself. Iranian Journal of Public Health and Saudi Medical Journal each of them had 16 articles which had most participation rate in the publishing of community medicine’s department. On the type of carrier, community medicine’s department by presentation of 340(85.86%) articles had presented most of their scientific productions in the format of article, also in the field of community medicine outputs, article entitled: “Iron loading and erythrophagocytosis increase ferroportin 1 (FPN1) expression in J774 macrophages”(1) with 81 citations ranked first in cited articles. Subject areas of occupational health with 70 articles and subject areas of general medicine with 69 articles ranked the most active research areas in the Production of community medicine’s department. Conclusion: the obtained data showed the much growth of scientific production. The Tehran University of medical Sciences ranked the first in publishing articles in community medicine’s department and with most collaboration with community medicine department of England writers in this field and most writers will present their works in paper format. PMID:28077896
CryoSat Ice Processor: Known Processor Anomalies and Potential Future Product Evolutions
NASA Astrophysics Data System (ADS)
Mannan, R.; Webb, E.; Hall, A.; Bouffard, J.; Femenias, P.; Parrinello, T.; Bouffard, J.; Brockley, D.; Baker, S.; Scagliola, M.; Urien, S.
2016-08-01
Launched in 2010, CryoSat was designed to measure changes in polar sea ice thickness and ice sheet elevation. To reach this goal the CryoSat data products have to meet the highest performance standards and are subjected to a continual cycle of improvement achieved through upgrades to the Instrument Processing Facilities (IPFs). Following the switch to the Baseline-C Ice IPFs there are already planned evolutions for the next processing Baseline, based on recommendations from the Scientific Community, Expert Support Laboratory (ESL), Quality Control (QC) Centres and Validation campaigns. Some of the proposed evolutions, to be discussed with the scientific community, include the activation of freeboard computation in SARin mode, the potential operation of SARin mode over flat-to-slope transitory land ice areas, further tuning of the land ice retracker, the switch to NetCDF format and the resolution of anomalies arising in Baseline-C. This paper describes some of the anomalies known to affect Baseline-C in addition to potential evolutions that are planned and foreseen for Baseline-D.
War of Ontology Worlds: Mathematics, Computer Code, or Esperanto?
Rzhetsky, Andrey; Evans, James A.
2011-01-01
The use of structured knowledge representations—ontologies and terminologies—has become standard in biomedicine. Definitions of ontologies vary widely, as do the values and philosophies that underlie them. In seeking to make these views explicit, we conducted and summarized interviews with a dozen leading ontologists. Their views clustered into three broad perspectives that we summarize as mathematics, computer code, and Esperanto. Ontology as mathematics puts the ultimate premium on rigor and logic, symmetry and consistency of representation across scientific subfields, and the inclusion of only established, non-contradictory knowledge. Ontology as computer code focuses on utility and cultivates diversity, fitting ontologies to their purpose. Like computer languages C++, Prolog, and HTML, the code perspective holds that diverse applications warrant custom designed ontologies. Ontology as Esperanto focuses on facilitating cross-disciplinary communication, knowledge cross-referencing, and computation across datasets from diverse communities. We show how these views align with classical divides in science and suggest how a synthesis of their concerns could strengthen the next generation of biomedical ontologies. PMID:21980276
War of ontology worlds: mathematics, computer code, or Esperanto?
Rzhetsky, Andrey; Evans, James A
2011-09-01
The use of structured knowledge representations-ontologies and terminologies-has become standard in biomedicine. Definitions of ontologies vary widely, as do the values and philosophies that underlie them. In seeking to make these views explicit, we conducted and summarized interviews with a dozen leading ontologists. Their views clustered into three broad perspectives that we summarize as mathematics, computer code, and Esperanto. Ontology as mathematics puts the ultimate premium on rigor and logic, symmetry and consistency of representation across scientific subfields, and the inclusion of only established, non-contradictory knowledge. Ontology as computer code focuses on utility and cultivates diversity, fitting ontologies to their purpose. Like computer languages C++, Prolog, and HTML, the code perspective holds that diverse applications warrant custom designed ontologies. Ontology as Esperanto focuses on facilitating cross-disciplinary communication, knowledge cross-referencing, and computation across datasets from diverse communities. We show how these views align with classical divides in science and suggest how a synthesis of their concerns could strengthen the next generation of biomedical ontologies.
McKinney, Bill; Meyer, Peter A.; Crosas, Mercè; Sliz, Piotr
2016-01-01
Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements for structural biology datasets. In this paper, we describe one such extension—functionality supporting preservation of filesystem structure within Dataverse—which is essential for both in-place computation and supporting non-http data transfers. PMID:27862010
Science and Technology Diplomacy with Cuba
NASA Astrophysics Data System (ADS)
Colon, Frances
President Obama's announcement of U. S. policy change toward Cuba and increased freedom of interaction with the Cuban people opens unprecedented and long-awaited opportunities for the scientific and engineering communities in the U. S. and in Cuba to establish and expand collaborative efforts that will greatly advance U.S. and Cuba science and technology agendas. New rules for export of donated-only items for scientific use will bring researchers closer to the level of their professional peers around the world. Increasing Cubans' access to information will result in greater interactions between scientific communities and enable the sharing of ideas and discoveries that can fuel entrepreneurship on the island. The scientific community has expressed an extraordinary level of interest in the wide range of scientific opportunities that the new policy presents, in collaborating with their Cuban counterparts, and in supporting the development of scientific capacity in Cuba. In response to numerous expressions of interest and inquiries from the scientific community, the Office of the Science and Technology Adviser to the Secretary of State (STAS) has engaged in public outreach to inform the U.S. science and technology community of the implications of the new policy for collaborative research, emerging scientific opportunities, and the standing limitations for engagement with the people of Cuba.
NASA Astrophysics Data System (ADS)
Ramamurthy, M.
2005-12-01
A revolution is underway in the role played by cyberinfrastructure and data services in the conduct of research and education. We live in an era of an unprecedented data volume from diverse sources, multidisciplinary analysis and synthesis, and active, learner-centered education emphasis. For example, modern remote-sensing systems like hyperspectral satellite instruments generate terabytes of data each day. Environmental problems such as global change and water cycle transcend disciplinary as well as geographic boundaries, and their solution requires integrated earth system science approaches. Contemporary education strategies recommend adopting an Earth system science approach for teaching the geosciences, employing new pedagogical techniques such as enquiry-based learning and hands-on activities. Needless to add, today's education and research enterprise depends heavily on robust, flexible and scalable cyberinfrastructure, especially on the ready availability of quality data and appropriate tools to manipulate and integrate those data. Fortuitously, rapid advances in computing and communication technologies have also revolutionized how data, tools and services are being incorporated into the teaching and scientific enterprise. The exponential growth in the use of the Internet in education and research, largely due to the advent of the World Wide Web, is by now well documented. On the other hand, how some of the other technological and community trends that have shaped the use of cyberinfrastructure, especially data services, is less well understood. For example, the computing industry is converging on an approach called Web services that enables a standard and yet revolutionary way of building applications and methods to connect and exchange information over the Web. This new approach, based on XML - a widely accepted format for exchanging data and corresponding semantics over the Internet - enables applications, computer systems, and information processes to work together in a fundamentally different way. Likewise, the advent of digital libraries, grid computing platforms, interoperable frameworks, standards and protocols, open-source software, and community atmospheric models have been important drivers in shaping the use of a new generation of end-to-end cyberinfrastructure for solving some of the most challenging scientific and educational problems. In this talk, I will present an overview of the scientific, technological, and educational drivers and discuss recent developments in cyberinfrastructure and Unidata's role and directions in providing robust, end-to-end data services for solving geoscientific problems and advancing student learning.
Computationally efficient statistical differential equation modeling using homogenization
Hooten, Mevin B.; Garlick, Martha J.; Powell, James A.
2013-01-01
Statistical models using partial differential equations (PDEs) to describe dynamically evolving natural systems are appearing in the scientific literature with some regularity in recent years. Often such studies seek to characterize the dynamics of temporal or spatio-temporal phenomena such as invasive species, consumer-resource interactions, community evolution, and resource selection. Specifically, in the spatial setting, data are often available at varying spatial and temporal scales. Additionally, the necessary numerical integration of a PDE may be computationally infeasible over the spatial support of interest. We present an approach to impose computationally advantageous changes of support in statistical implementations of PDE models and demonstrate its utility through simulation using a form of PDE known as “ecological diffusion.” We also apply a statistical ecological diffusion model to a data set involving the spread of mountain pine beetle (Dendroctonus ponderosae) in Idaho, USA.
Social behavioural epistemology and the scientific community.
Watve, Milind
2017-07-01
The progress of science is influenced substantially by social behaviour of and social interactions within the scientific community. Similar to innovations in primate groups, the social acceptance of an innovation depends not only upon the relevance of the innovation but also on the social dominance and connectedness of the innovator. There are a number of parallels between many well-known phenomena in behavioural evolution and various behavioural traits observed in the scientific community. It would be useful, therefore, to use principles of behavioural evolution as hypotheses to study the social behaviour of the scientific community. I argue in this paper that a systematic study of social behavioural epistemology is likely to boost the progress of science by addressing several prevalent biases and other problems in scientific communication and by facilitating appropriate acceptance/rejection of novel concepts.
NASA Astrophysics Data System (ADS)
Ramamurthy, M. K.
2006-05-01
We live in an era of an unprecedented data volumes, multidisciplinary analysis and synthesis, and active, learner-centered education emphasis. For instance, a new generation of satellite instruments is being designed for GOES-R and NPOESS programs to deliver terabytes of data each day. Similarly, high-resolution, coupled models run over a wide range of temporal scales are generating data at unprecedented rates. Complex environmental problems such as El Nino/Southern Oscillation, climate change, and water cycle transcend not only disciplinary but also geographic boundaries, with their impacts and implications touching every region and community of the world. The understanding and solution to these inherently global scientific and social problems requires integrated observations that cover all areas of the globe, international sharing and flow of data, and earth system science approaches. Contemporary education strategies recommend adopting an Earth system science approach for teaching the geosciences, employing new pedagogical techniques such as enquiry-based learning and hands-on activities. Needless to add, today's education and research enterprise depends heavily on easy to use, robust, flexible and scalable cyberinfrastructure, especially on the ready availability of quality data and appropriate tools to manipulate and integrate those data. Fortunately, rapid advances in computing, communication and information technologies have provided solutions that can are being applied to advance teaching, research, and service. The exponential growth in the use of the Internet in education and research, largely due to the advent of the World Wide Web, is well documented. On the other hand, how other technological and community trends have shaped the development and application of cyberinfrastructure, especially in the data services area, is less well understood. For example, the computing industry is converging on an approach called Web services that enables a standard and yet revolutionary way of building applications and methods to connect and exchange information over the Web. This new approach, based on XML - a widely accepted format for exchanging data and corresponding semantics over the Internet - enables applications, computer systems, and information processes to work together in fundamentally different ways. Likewise, the advent of digital libraries, grid computing platforms, interoperable frameworks, standards and protocols, open-source software, and community atmospheric models have been important drivers in shaping the use of a new generation of end-to-end cyberinfrastructure for solving some of the most challenging scientific and educational problems. In this talk, I will present an overview of the scientific, technological, and educational landscape, discuss recent developments in cyberinfrastructure, and Unidata's role in and vision for providing easy-to use, robust, end-to-end data services for solving geoscientific problems and advancing student learning.
NASA Astrophysics Data System (ADS)
Downs, R. R.; Lenhardt, W. C.; Robinson, E.
2014-12-01
Science software is integral to the scientific process and must be developed and managed in a sustainable manner to ensure future access to scientific data and related resources. Organizations that are part of the scientific enterprise, as well as members of the scientific community who work within these entities, can contribute to the sustainability of science software and to practices that improve scientific community capabilities for science software sustainability. As science becomes increasingly digital and therefore, dependent on software, improving community practices for sustainable science software will contribute to the sustainability of science. Members of the Earth science informatics community, including scientific data producers and distributers, end-user scientists, system and application developers, and data center managers, use science software regularly and face the challenges and the opportunities that science software presents for the sustainability of science. To gain insight on practices needed for the sustainability of science software from the science software experiences of the Earth science informatics community, an interdisciplinary group of 300 community members were asked to engage in simultaneous roundtable discussions and report on their answers to questions about the requirements for improving scientific software sustainability. This paper will present an analysis of the issues reported and the conclusions offered by the participants. These results provide perspectives for science software sustainability practices and have implications for actions that organizations and their leadership can initiate to improve the sustainability of science software.
Farrell, Nicholas R; Deacon, Brett J
2016-03-01
Although client preferences are an integral component of evidence-based practice in psychology (American Psychological Association, 2006), relatively little research has examined what potential mental health consumers value in the psychotherapy they may receive. The present study was conducted to examine community members' preferences for the scientific and relational aspects of psychotherapy for different types of presenting problems, and how accurately therapists perceive these preferences. Community members (n = 200) were surveyed about the importance of scientific (e.g., demonstrated efficacy in clinical trials) and relational (e.g., therapist empathy) characteristics of psychotherapy both for anxiety disorders (e.g., obsessive-compulsive disorder) and disorder-nonspecific issues (e.g., relationship difficulties). Therapists (n = 199) completed the same survey and responded how they expected the average mental health consumer would. Results showed that although community members valued relational characteristics significantly more than scientific characteristics, the gap between these two was large for disorder-nonspecific issues (d = 1.24) but small for anxiety disorders (d = .27). Community members rated scientific credibility as important across problem types. Therapists significantly underestimated the importance of scientific characteristics to community members, particularly in the treatment of disorder-nonspecific issues (d = .74). Therapists who valued research less in their own practice were more likely to underestimate the importance of scientific credibility to community members. The implications of the present findings for understanding the nature of client preferences in evidence-based psychological practice are discussed. Copyright © 2015 Elsevier Ltd. All rights reserved.
Testing Scientific Software: A Systematic Literature Review
Kanewala, Upulee; Bieman, James M.
2014-01-01
Context Scientific software plays an important role in critical decision making, for example making weather predictions based on climate models, and computation of evidence for research publications. Recently, scientists have had to retract publications due to errors caused by software faults. Systematic testing can identify such faults in code. Objective This study aims to identify specific challenges, proposed solutions, and unsolved problems faced when testing scientific software. Method We conducted a systematic literature survey to identify and analyze relevant literature. We identified 62 studies that provided relevant information about testing scientific software. Results We found that challenges faced when testing scientific software fall into two main categories: (1) testing challenges that occur due to characteristics of scientific software such as oracle problems and (2) testing challenges that occur due to cultural differences between scientists and the software engineering community such as viewing the code and the model that it implements as inseparable entities. In addition, we identified methods to potentially overcome these challenges and their limitations. Finally we describe unsolved challenges and how software engineering researchers and practitioners can help to overcome them. Conclusions Scientific software presents special challenges for testing. Specifically, cultural differences between scientist developers and software engineers, along with the characteristics of the scientific software make testing more difficult. Existing techniques such as code clone detection can help to improve the testing process. Software engineers should consider special challenges posed by scientific software such as oracle problems when developing testing techniques. PMID:25125798
Ports Primer: 8.1 Using Scientific Data and Research
Communities can demonstrate environmental concerns by providing scientific evidence of environmental impact. Communities may be able to access existing local data and conduct their own analyses or communities may turn to existing studies.
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special- purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allows, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
Computational Science: A Research Methodology for the 21st Century
NASA Astrophysics Data System (ADS)
Orbach, Raymond L.
2004-03-01
Computational simulation - a means of scientific discovery that employs computer systems to simulate a physical system according to laws derived from theory and experiment - has attained peer status with theory and experiment. Important advances in basic science are accomplished by a new "sociology" for ultrascale scientific computing capability (USSCC), a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Expansion of current capabilities by factors of 100 - 1000 open up new vistas for scientific discovery: long term climatic variability and change, macroscopic material design from correlated behavior at the nanoscale, design and optimization of magnetic confinement fusion reactors, strong interactions on a computational lattice through quantum chromodynamics, and stellar explosions and element production. The "virtual prototype," made possible by this expansion, can markedly reduce time-to-market for industrial applications such as jet engines and safer, more fuel efficient cleaner cars. In order to develop USSCC, the National Energy Research Scientific Computing Center (NERSC) announced the competition "Innovative and Novel Computational Impact on Theory and Experiment" (INCITE), with no requirement for current DOE sponsorship. Fifty nine proposals for grand challenge scientific problems were submitted for a small number of awards. The successful grants, and their preliminary progress, will be described.
NASA Astrophysics Data System (ADS)
Haslinger, Florian; Dupont, Aurelien; Michelini, Alberto; Rietbrock, Andreas; Sleeman, Reinoud; Wiemer, Stefan; Basili, Roberto; Bossu, Rémy; Cakti, Eser; Cotton, Fabrice; Crawford, Wayne; Diaz, Jordi; Garth, Tom; Locati, Mario; Luzi, Lucia; Pinho, Rui; Pitilakis, Kyriazis; Strollo, Angelo
2016-04-01
Easy, efficient and comprehensive access to data, data products, scientific services and scientific software is a key ingredient in enabling research at the frontiers of science. Organizing this access across the European Research Infrastructures in the field of seismology, so that it best serves user needs, takes advantage of state-of-the-art ICT solutions, provides cross-domain interoperability, and is organizationally and financially sustainable in the long term, is the core challenge of the implementation phase of the Thematic Core Service (TCS) Seismology within the EPOS-IP project. Building upon the existing European-level infrastructures ORFEUS for seismological waveforms, EMSC for seismological products, and EFEHR for seismological hazard and risk information, and implementing a pilot Computational Earth Science service starting from the results of the VERCE project, the work within the EPOS-IP project focuses on improving and extending the existing services, aligning them with global developments, to at the end produce a well coordinated framework that is technically, organizationally, and financially integrated with the EPOS architecture. This framework needs to respect the roles and responsibilities of the underlying national research infrastructures that are the data owners and main providers of data and products, and allow for active input and feedback from the (scientific) user community. At the same time, it needs to remain flexible enough to cope with unavoidable challenges in the availability of resources and dynamics of contributors. The technical work during the next years is organized in four areas: - constructing the next generation software architecture for the European Integrated (waveform) Data Archive EIDA, developing advanced metadata and station information services, fully integrate strong motion waveforms and derived parametric engineering-domain data, and advancing the integration of mobile (temporary) networks and OBS deployments in EIDA; - further development and expansion of services to access seismological products of scientific interest as provided by the community by implementing a common collection and development (IT) platform, improvements in the earthquake information services e.g. by introducing more robust quality indicators and diversifying collection and dissemination mechanisms, as well as improving historical earthquake data services; - development of a comprehensive suite of earthquake hazard products, tools, and services harmonized on the European level and available through a common access platform, encompassing information on seismic sources, seismogenic faults, ground-motion prediction equations, geotechnical information, and strong-motion recordings in buildings, together with an interface to earthquake risk; - a portal implementation of computational seismology tools and services, specifically for seismic waveform propagation in complex 3D media following the results of the VERCE project, and initiating the inclusion of further suitable codes on that portal in discussion with the community, forming the basis of EPOS computational earth science infrastructure. This will be accompanied by development and implementation of integrated and interoperable metadata structures, adequate and referencable persistent identifiers, and appropriate user access and authorization mechanisms. Here we present further detail on the work plan with the attempt to foster interaction with the target user community on the spectrum of services as well as on feedback mechanisms and governance.
InSAR Scientific Computing Environment
NASA Astrophysics Data System (ADS)
Gurrola, E. M.; Rosen, P. A.; Sacco, G.; Zebker, H. A.; Simons, M.; Sandwell, D. T.
2010-12-01
The InSAR Scientific Computing Environment (ISCE) is a software development effort in its second year within the NASA Advanced Information Systems and Technology program. The ISCE will provide a new computing environment for geodetic image processing for InSAR sensors that will enable scientists to reduce measurements directly from radar satellites and aircraft to new geophysical products without first requiring them to develop detailed expertise in radar processing methods. The environment can serve as the core of a centralized processing center to bring Level-0 raw radar data up to Level-3 data products, but is adaptable to alternative processing approaches for science users interested in new and different ways to exploit mission data. The NRC Decadal Survey-recommended DESDynI mission will deliver data of unprecedented quantity and quality, making possible global-scale studies in climate research, natural hazards, and Earth's ecosystem. The InSAR Scientific Computing Environment is planned to become a key element in processing DESDynI data into higher level data products and it is expected to enable a new class of analyses that take greater advantage of the long time and large spatial scales of these new data, than current approaches. At the core of ISCE is both legacy processing software from the JPL/Caltech ROI_PAC repeat-pass interferometry package as well as a new InSAR processing package containing more efficient and more accurate processing algorithms being developed at Stanford for this project that is based on experience gained in developing processors for missions such as SRTM and UAVSAR. Around the core InSAR processing programs we are building object-oriented wrappers to enable their incorporation into a more modern, flexible, extensible software package that is informed by modern programming methods, including rigorous componentization of processing codes, abstraction and generalization of data models, and a robust, intuitive user interface with graduated exposure to the levels of sophistication, allowing novices to apply it readily for common tasks and experienced users to mine data with great facility and flexibility. The environment is designed to easily allow user contributions, enabling an open source community to extend the framework into the indefinite future. In this paper we briefly describe both the legacy and the new core processing algorithms and their integration into the new computing environment. We describe the ISCE component and application architecture and the features that permit the desired flexibility, extensibility and ease-of-use. We summarize the state of progress of the environment and the plans for completion of the environment and for its future introduction into the radar processing community.
Virtual Geophysics Laboratory: Exploiting the Cloud and Empowering Geophysicsts
NASA Astrophysics Data System (ADS)
Fraser, Ryan; Vote, Josh; Goh, Richard; Cox, Simon
2013-04-01
Over the last five decades geoscientists from Australian state and federal agencies have collected and assembled around 3 Petabytes of geoscience data sets under public funding. As a consequence of technological progress, data is now being acquired at exponential rates and in higher resolution than ever before. Effective use of these big data sets challenges the storage and computational infrastructure of most organizations. The Virtual Geophysics Laboratory (VGL) is a scientific workflow portal addresses some of the resulting issues by providing Australian geophysicists with access to a Web 2.0 or Rich Internet Application (RIA) based integrated environment that exploits eResearch tools and Cloud computing technology, and promotes collaboration between the user community. VGL simplifies and automates large portions of what were previously manually intensive scientific workflow processes, allowing scientists to focus on the natural science problems, rather than computer science and IT. A number of geophysical processing codes are incorporated to support multiple workflows. For example a gravity inversion can be performed by combining the Escript/Finley codes (from the University of Queensland) with the gravity data registered in VGL. Likewise, tectonic processes can also be modeled by combining the Underworld code (from Monash University) with one of the various 3D models available to VGL. Cloud services provide scalable and cost effective compute resources. VGL is built on top of mature standards-compliant information services, many deployed using the Spatial Information Services Stack (SISS), which provides direct access to geophysical data. A large number of data sets from Geoscience Australia assist users in data discovery. GeoNetwork provides a metadata catalog to store workflow results for future use, discovery and provenance tracking. VGL has been developed in collaboration with the research community using incremental software development practices and open source tools. While developed to provide the geophysics research community with a sustainable platform and scalable infrastructure; VGL has also developed a number of concepts, patterns and generic components of which have been reused for cases beyond geophysics, including natural hazards, satellite processing and other areas requiring spatial data discovery and processing. Future plans for VGL include a number of improvements in both functional and non-functional areas in response to its user community needs and advancement in information technologies. In particular, research is underway in the following areas (a) distributed and parallel workflow processing in the cloud, (b) seamless integration with various cloud providers, and (c) integration with virtual laboratories representing other science domains. Acknowledgements: VGL was developed by CSIRO in collaboration with Geoscience Australia, National Computational Infrastructure, Australia National University, Monash University and University of Queensland, and has been supported by the Australian Government's Education Investment Funds through NeCTAR.
Genome annotation in a community college cell biology lab.
Beagley, C Timothy
2013-01-01
The Biology Department at Salt Lake Community College has used the IMG-ACT toolbox to introduce a genome mapping and annotation exercise into the laboratory portion of its Cell Biology course. This project provides students with an authentic inquiry-based learning experience while introducing them to computational biology and contemporary learning skills. Additionally, the project strengthens student understanding of the scientific method and contributes to student learning gains in curricular objectives centered around basic molecular biology, specifically, the Central Dogma. Importantly, inclusion of this project in the laboratory course provides students with a positive learning environment and allows for the use of cooperative learning strategies to increase overall student success. Copyright © 2012 International Union of Biochemistry and Molecular Biology, Inc.
CernVM WebAPI - Controlling Virtual Machines from the Web
NASA Astrophysics Data System (ADS)
Charalampidis, I.; Berzano, D.; Blomer, J.; Buncic, P.; Ganis, G.; Meusel, R.; Segal, B.
2015-12-01
Lately, there is a trend in scientific projects to look for computing resources in the volunteering community. In addition, to reduce the development effort required to port the scientific software stack to all the known platforms, the use of Virtual Machines (VMs)u is becoming increasingly popular. Unfortunately their use further complicates the software installation and operation, restricting the volunteer audience to sufficiently expert people. CernVM WebAPI is a software solution addressing this specific case in a way that opens wide new application opportunities. It offers a very simple API for setting-up, controlling and interfacing with a VM instance in the users computer, while in the same time offloading the user from all the burden of downloading, installing and configuring the hypervisor. WebAPI comes with a lightweight javascript library that guides the user through the application installation process. Malicious usage is prohibited by offering a per-domain PKI validation mechanism. In this contribution we will overview this new technology, discuss its security features and examine some test cases where it is already in use.
IEEE International Symposium on Biomedical Imaging.
2017-01-01
The IEEE International Symposium on Biomedical Imaging (ISBI) is a scientific conference dedicated to mathematical, algorithmic, and computational aspects of biological and biomedical imaging, across all scales of observation. It fosters knowledge transfer among different imaging communities and contributes to an integrative approach to biomedical imaging. ISBI is a joint initiative from the IEEE Signal Processing Society (SPS) and the IEEE Engineering in Medicine and Biology Society (EMBS). The 2018 meeting will include tutorials, and a scientific program composed of plenary talks, invited special sessions, challenges, as well as oral and poster presentations of peer-reviewed papers. High-quality papers are requested containing original contributions to the topics of interest including image formation and reconstruction, computational and statistical image processing and analysis, dynamic imaging, visualization, image quality assessment, and physical, biological, and statistical modeling. Accepted 4-page regular papers will be published in the symposium proceedings published by IEEE and included in IEEE Xplore. To encourage attendance by a broader audience of imaging scientists and offer additional presentation opportunities, ISBI 2018 will continue to have a second track featuring posters selected from 1-page abstract submissions without subsequent archival publication.
Mission critical cloud computing in a week
NASA Astrophysics Data System (ADS)
George, B.; Shams, K.; Knight, D.; Kinney, J.
NASA's vision is to “ reach for new heights and reveal the unknown so that what we do and learn will benefit all humankind.” While our missions provide large volumes of unique and invaluable data to the scientific community, they also serve to inspire and educate the next generation of engineers and scientists. One critical aspect of “ benefiting all humankind” is to make our missions as visible and accessible as possible to facilitate the transfer of scientific knowledge to the public. The recent successful landing of the Curiosity rover on Mars exemplified this vision: we shared the landing event via live video streaming and web experiences with millions of people around the world. The video stream on Curiosity's website was delivered by a highly scalable stack of computing resources in the cloud to cache and distribute the video stream to our viewers. While this work was done in the context of public outreach, it has extensive implications for the development of mission critical, highly available, and elastic applications in the cloud for a diverse set of use cases across NASA.
Alzforum and SWAN: the present and future of scientific web communities.
Clark, Tim; Kinoshita, June
2007-05-01
Scientists drove the early development of the World Wide Web, primarily as a means for rapid communication, document sharing and data access. They have been far slower to adopt the web as a medium for building research communities. Yet, web-based communities hold great potential for accelerating the pace of scientific research. In this article, we will describe the 10-year experience of the Alzheimer Research Forum ('Alzforum'), a unique example of a thriving scientific web community, and explain the features that contributed to its success. We will then outline the SWAN (Semantic Web Applications in Neuromedicine) project, in which Alzforum curators are collaborating with informatics researchers to develop novel approaches that will enable communities to share richly contextualized information about scientific data, claims and hypotheses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geveci, Berk; Maynard, Robert
The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. The XVis project brought together collaborators from predominant DOE projects for visualization on accelerators and combining their respectivemore » features into a new visualization toolkit called VTK-m.« less
MSL: Facilitating automatic and physical analysis of published scientific literature in PDF format.
Ahmed, Zeeshan; Dandekar, Thomas
2015-01-01
Published scientific literature contains millions of figures, including information about the results obtained from different scientific experiments e.g. PCR-ELISA data, microarray analysis, gel electrophoresis, mass spectrometry data, DNA/RNA sequencing, diagnostic imaging (CT/MRI and ultrasound scans), and medicinal imaging like electroencephalography (EEG), magnetoencephalography (MEG), echocardiography (ECG), positron-emission tomography (PET) images. The importance of biomedical figures has been widely recognized in scientific and medicine communities, as they play a vital role in providing major original data, experimental and computational results in concise form. One major challenge for implementing a system for scientific literature analysis is extracting and analyzing text and figures from published PDF files by physical and logical document analysis. Here we present a product line architecture based bioinformatics tool 'Mining Scientific Literature (MSL)', which supports the extraction of text and images by interpreting all kinds of published PDF files using advanced data mining and image processing techniques. It provides modules for the marginalization of extracted text based on different coordinates and keywords, visualization of extracted figures and extraction of embedded text from all kinds of biological and biomedical figures using applied Optimal Character Recognition (OCR). Moreover, for further analysis and usage, it generates the system's output in different formats including text, PDF, XML and images files. Hence, MSL is an easy to install and use analysis tool to interpret published scientific literature in PDF format.
Bringing Legacy Visualization Software to Modern Computing Devices via Application Streaming
NASA Astrophysics Data System (ADS)
Fisher, Ward
2014-05-01
Planning software compatibility across forthcoming generations of computing platforms is a problem commonly encountered in software engineering and development. While this problem can affect any class of software, data analysis and visualization programs are particularly vulnerable. This is due in part to their inherent dependency on specialized hardware and computing environments. A number of strategies and tools have been designed to aid software engineers with this task. While generally embraced by developers at 'traditional' software companies, these methodologies are often dismissed by the scientific software community as unwieldy, inefficient and unnecessary. As a result, many important and storied scientific software packages can struggle to adapt to a new computing environment; for example, one in which much work is carried out on sub-laptop devices (such as tablets and smartphones). Rewriting these packages for a new platform often requires significant investment in terms of development time and developer expertise. In many cases, porting older software to modern devices is neither practical nor possible. As a result, replacement software must be developed from scratch, wasting resources better spent on other projects. Enabled largely by the rapid rise and adoption of cloud computing platforms, 'Application Streaming' technologies allow legacy visualization and analysis software to be operated wholly from a client device (be it laptop, tablet or smartphone) while retaining full functionality and interactivity. It mitigates much of the developer effort required by other more traditional methods while simultaneously reducing the time it takes to bring the software to a new platform. This work will provide an overview of Application Streaming and how it compares against other technologies which allow scientific visualization software to be executed from a remote computer. We will discuss the functionality and limitations of existing application streaming frameworks and how a developer might prepare their software for application streaming. We will also examine the secondary benefits realized by moving legacy software to the cloud. Finally, we will examine the process by which a legacy Java application, the Integrated Data Viewer (IDV), is to be adapted for tablet computing via Application Streaming.
Development of hybrid 3-D hydrological modeling for the NCAR Community Earth System Model (CESM)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zeng, Xubin; Troch, Peter; Pelletier, Jon
2015-11-15
This is the Final Report of our four-year (3-year plus one-year no cost extension) collaborative project between the University of Arizona (UA) and the National Center for Atmospheric Research (NCAR). The overall objective of our project is to develop and evaluate the first hybrid 3-D hydrological model with a horizontal grid spacing of 1 km for the NCAR Community Earth System Model (CESM). We have made substantial progress in model development and evaluation, computational efficiencies and software engineering, and data development and evaluation, as discussed in Sections 2-4. Section 5 presents our success in data dissemination, while Section 6 discussesmore » the scientific impacts of our work. Section 7 discusses education and mentoring success of our project, while Section 8 lists our relevant DOE services. All peer-reviewed papers that acknowledged this project are listed in Section 9. Highlights of our achievements include: • We have finished 20 papers (most published already) on model development and evaluation, computational efficiencies and software engineering, and data development and evaluation • The global datasets developed under this project have been permanently archived and publicly available • Some of our research results have already been implemented in WRF and CLM • Patrick Broxton and Michael Brunke have received their Ph.D. • PI Zeng has served on DOE proposal review panels and DOE lab scientific focus area (SFA) review panels« less
An ontology for factors affecting tuberculosis treatment adherence behavior in sub-Saharan Africa.
Ogundele, Olukunle Ayodeji; Moodley, Deshendran; Pillay, Anban W; Seebregts, Christopher J
2016-01-01
Adherence behavior is a complex phenomenon influenced by diverse personal, cultural, and socioeconomic factors that may vary between communities in different regions. Understanding the factors that influence adherence behavior is essential in predicting which individuals and communities are at risk of nonadherence. This is necessary for supporting resource allocation and intervention planning in disease control programs. Currently, there is no known concrete and unambiguous computational representation of factors that influence tuberculosis (TB) treatment adherence behavior that is useful for prediction. This study developed a computer-based conceptual model for capturing and structuring knowledge about the factors that influence TB treatment adherence behavior in sub-Saharan Africa (SSA). An extensive review of existing categorization systems in the literature was used to develop a conceptual model that captured scientific knowledge about TB adherence behavior in SSA. The model was formalized as an ontology using the web ontology language. The ontology was then evaluated for its comprehensiveness and applicability in building predictive models. The outcome of the study is a novel ontology-based approach for curating and structuring scientific knowledge of adherence behavior in patients with TB in SSA. The ontology takes an evidence-based approach by explicitly linking factors to published clinical studies. Factors are structured around five dimensions: factor type, type of effect, regional variation, cross-dependencies between factors, and treatment phase. The ontology is flexible and extendable and provides new insights into the nature of and interrelationship between factors that influence TB adherence.
An ontology for factors affecting tuberculosis treatment adherence behavior in sub-Saharan Africa
Ogundele, Olukunle Ayodeji; Moodley, Deshendran; Pillay, Anban W; Seebregts, Christopher J
2016-01-01
Purpose Adherence behavior is a complex phenomenon influenced by diverse personal, cultural, and socioeconomic factors that may vary between communities in different regions. Understanding the factors that influence adherence behavior is essential in predicting which individuals and communities are at risk of nonadherence. This is necessary for supporting resource allocation and intervention planning in disease control programs. Currently, there is no known concrete and unambiguous computational representation of factors that influence tuberculosis (TB) treatment adherence behavior that is useful for prediction. This study developed a computer-based conceptual model for capturing and structuring knowledge about the factors that influence TB treatment adherence behavior in sub-Saharan Africa (SSA). Methods An extensive review of existing categorization systems in the literature was used to develop a conceptual model that captured scientific knowledge about TB adherence behavior in SSA. The model was formalized as an ontology using the web ontology language. The ontology was then evaluated for its comprehensiveness and applicability in building predictive models. Conclusion The outcome of the study is a novel ontology-based approach for curating and structuring scientific knowledge of adherence behavior in patients with TB in SSA. The ontology takes an evidence-based approach by explicitly linking factors to published clinical studies. Factors are structured around five dimensions: factor type, type of effect, regional variation, cross-dependencies between factors, and treatment phase. The ontology is flexible and extendable and provides new insights into the nature of and interrelationship between factors that influence TB adherence. PMID:27175067
NASA Astrophysics Data System (ADS)
Miller, Stephen D.; Herwig, Kenneth W.; Ren, Shelly; Vazhkudai, Sudharshan S.; Jemian, Pete R.; Luitz, Steffen; Salnikov, Andrei A.; Gaponenko, Igor; Proffen, Thomas; Lewis, Paul; Green, Mark L.
2009-07-01
The primary mission of user facilities operated by Basic Energy Sciences under the Department of Energy is to produce data for users in support of open science and basic research [1]. We trace back almost 30 years of history across selected user facilities illustrating the evolution of facility data management practices and how these practices have related to performing scientific research. The facilities cover multiple techniques such as X-ray and neutron scattering, imaging and tomography sciences. Over time, detector and data acquisition technologies have dramatically increased the ability to produce prolific volumes of data challenging the traditional paradigm of users taking data home upon completion of their experiments to process and publish their results. During this time, computing capacity has also increased dramatically, though the size of the data has grown significantly faster than the capacity of one's laptop to manage and process this new facility produced data. Trends indicate that this will continue to be the case for yet some time. Thus users face a quandary for how to manage today's data complexity and size as these may exceed the computing resources users have available to themselves. This same quandary can also stifle collaboration and sharing. Realizing this, some facilities are already providing web portal access to data and computing thereby providing users access to resources they need [2]. Portal based computing is now driving researchers to think about how to use the data collected at multiple facilities in an integrated way to perform their research, and also how to collaborate and share data. In the future, inter-facility data management systems will enable next tier cross-instrument-cross facility scientific research fuelled by smart applications residing upon user computer resources. We can learn from the medical imaging community that has been working since the early 1990's to integrate data from across multiple modalities to achieve better diagnoses [3] - similarly, data fusion across BES facilities will lead to new scientific discoveries.
Integrated Exoplanet Modeling with the GSFC Exoplanet Modeling & Analysis Center (EMAC)
NASA Astrophysics Data System (ADS)
Mandell, Avi M.; Hostetter, Carl; Pulkkinen, Antti; Domagal-Goldman, Shawn David
2018-01-01
Our ability to characterize the atmospheres of extrasolar planets will be revolutionized by JWST, WFIRST and future ground- and space-based telescopes. In preparation, the exoplanet community must develop an integrated suite of tools with which we can comprehensively predict and analyze observations of exoplanets, in order to characterize the planetary environments and ultimately search them for signs of habitability and life.The GSFC Exoplanet Modeling and Analysis Center (EMAC) will be a web-accessible high-performance computing platform with science support for modelers and software developers to host and integrate their scientific software tools, with the goal of leveraging the scientific contributions from the entire exoplanet community to improve our interpretations of future exoplanet discoveries. Our suite of models will include stellar models, models for star-planet interactions, atmospheric models, planet system science models, telescope models, instrument models, and finally models for retrieving signals from observational data. By integrating this suite of models, the community will be able to self-consistently calculate the emergent spectra from the planet whether from emission, scattering, or in transmission, and use these simulations to model the performance of current and new telescopes and their instrumentation.The EMAC infrastructure will not only provide a repository for planetary and exoplanetary community models, modeling tools and intermodal comparisons, but it will include a "run-on-demand" portal with each software tool hosted on a separate virtual machine. The EMAC system will eventually include a means of running or “checking in” new model simulations that are in accordance with the community-derived standards. Additionally, the results of intermodal comparisons will be used to produce open source publications that quantify the model comparisons and provide an overview of community consensus on model uncertainties on the climates of various planetary targets.
NASA Astrophysics Data System (ADS)
Bailo, Daniele; Scardaci, Diego; Spinuso, Alessandro; Sterzel, Mariusz; Schwichtenberg, Horst; Gemuend, Andre
2016-04-01
The mission of EGI-Engage project [1] is to accelerate the implementation of the Open Science Commons vision, where researchers from all disciplines have easy and open access to the innovative digital services, data, knowledge and expertise they need for collaborative and excellent research. The Open Science Commons is grounded on three pillars: the e-Infrastructure Commons, an ecosystem of services that constitute the foundation layer of distributed infrastructures; the Open Data Commons, where observations, results and applications are increasingly available for scientific research and for anyone to use and reuse; and the Knowledge Commons, in which communities have shared ownership of knowledge, participate in the co-development of software and are technically supported to exploit state-of-the-art digital services. To develop the Knowledge Commons, EGI-Engage is supporting the work of a set of community-specific Competence Centres, with participants from user communities (scientific institutes), National Grid Initiatives (NGIs), technology and service providers. Competence Centres collect and analyse requirements, integrate community-specific applications into state-of-the-art services, foster interoperability across e-Infrastructures, and evolve services through a user-centric development model. One of these Competence Centres is focussed on the European Plate Observing System (EPOS) [2] as representative of the solid earth science communities. EPOS is a pan-European long-term plan to integrate data, software and services from the distributed (and already existing) Research Infrastructures all over Europe, in the domain of the solid earth science. EPOS will enable innovative multidisciplinary research for a better understanding of the Earth's physical and chemical processes that control earthquakes, volcanic eruptions, ground instability and tsunami as well as the processes driving tectonics and Earth's surface dynamics. EPOS will improve our ability to better manage the use of the subsurface of the Earth. EPOS started its Implementation Phase in October 2015 and is now actively working in order to integrate multidisciplinary data into a single e-infrastructure. Multidisciplinary data are organized and governed by the Thematic Core Services (TCS) - European wide organizations and e-Infrastructure providing community specific data and data products - and are driven by various scientific communities encompassing a wide spectrum of Earth science disciplines. TCS data, data products and services will be integrated into the Integrated Core Services (ICS) system, that will ensure their interoperability and access to these services by the scientific community as well as other users within the society. The EPOS competence center (EPOS CC) goal is to tackle two of the main challenges that the ICS are going to face in the near future, by taking advantage of the technical solutions provided by EGI. In order to do this, we will present the two pilot use cases the EGI-EPOS CC is developing: 1) The AAI pilot, dealing with the provision of transparent and homogeneous access to the ICS infrastructure to users owning different kind of credentials (e.g. eduGain, OpenID Connect, X509 certificates etc.). Here the focus is on the mechanisms which allow the credential delegation. 2) The computational pilot, Improve the back-end services of an existing application in the field of Computational Seismology, developed in the context of the EC funded project VERCE. The application allows the processing and the comparison of data resulting from the simulation of seismic wave propagation following a real earthquake and real measurements recorded by seismographs. While the simulation data is produced directly by the users and stored in a Data Management System, the observations need to be pre-staged from institutional data-services, which are maintained by the community itself. This use case aims at exploiting the EGI FedCloud e-infrastructure for Data Intensive analysis and also explores possible interaction with other Common Data Infrastructure initiatives as EUDAT. In the presentation, the state of the art of the two use cases, together with the open challenges and the future application will be discussed. Also, possible integration of EGI solutions with EPOS and other e-infrastructure providers will be considered. [1] EGI-ENGAGE https://www.egi.eu/about/egi-engage/ [2] EPOS http://www.epos-eu.org/
Advanced computations in plasma physics
NASA Astrophysics Data System (ADS)
Tang, W. M.
2002-05-01
Scientific simulation in tandem with theory and experiment is an essential tool for understanding complex plasma behavior. In this paper we review recent progress and future directions for advanced simulations in magnetically confined plasmas with illustrative examples chosen from magnetic confinement research areas such as microturbulence, magnetohydrodynamics, magnetic reconnection, and others. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales together with access to powerful new computational resources. In particular, the fusion energy science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of turbulence self-regulation by zonal flows. It should be emphasized that these calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to plasma science.
NASA Astrophysics Data System (ADS)
Walton, A. L.
2015-12-01
In 2016, the National Science Foundation (NSF) will support a portfolio of activities and investments focused upon challenges in data access, interoperability, and sustainability. These topics are fundamental to science questions of increasing complexity that require multidisciplinary approaches and expertise. Progress has become tractable because of (and sometimes complicated by) unprecedented growth in data (both simulations and observations) and rapid advances in technology (such as instrumentation in all aspects of the discovery process, together with ubiquitous cyberinfrastructure to connect, compute, visualize, store, and discover). The goal is an evolution of capabilities for the research community based on these investments, scientific priorities, technology advances, and policies. Examples from multiple NSF directorates, including investments by the Advanced Cyberinfrastructure Division, are aimed at these challenges and can provide the geosciences research community with models and opportunities for participation. Implications for the future are highlighted, along with the importance of continued community engagement on key issues.
Grid Computing for Earth Science
NASA Astrophysics Data System (ADS)
Renard, Philippe; Badoux, Vincent; Petitdidier, Monique; Cossu, Roberto
2009-04-01
The fundamental challenges facing humankind at the beginning of the 21st century require an effective response to the massive changes that are putting increasing pressure on the environment and society. The worldwide Earth science community, with its mosaic of disciplines and players (academia, industry, national surveys, international organizations, and so forth), provides a scientific basis for addressing issues such as the development of new energy resources; a secure water supply; safe storage of nuclear waste; the analysis, modeling, and mitigation of climate changes; and the assessment of natural and industrial risks. In addition, the Earth science community provides short- and medium-term prediction of weather and natural hazards in real time, and model simulations of a host of phenomena relating to the Earth and its space environment. These capabilities require that the Earth science community utilize, both in real and remote time, massive amounts of data, which are usually distributed among many different organizations and data centers.
Bridging the Gap between Scientific Data Producers and Consumers: A Provenance Approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stephan, Eric G.; Pinheiro da Silva, Paulo; Kleese van Dam, Kerstin
2013-06-03
Despite the methodical and painstaking efforts made by scientists to record their scientific findings and protocols, a knowledge gap problem continues to persist today between producers of scientific results and consumers because technology is performing the exchange of data as opposed to scientists making direct contact. Provenance is a means to formalize how this knowledge is transferred. However, for it to be meaningful to scientists, the provenance research community needs continued contributions from the scientific community to extend and leverage provenance-based vocabularies and technology from the provenance community. Going forward the provenance community must also be vigilant to meet scalabilitymore » needs of data intensive science« less
Scientific Networks on Data Landscapes: Question Difficulty, Epistemic Success, and Convergence
Grim, Patrick; Singer, Daniel J.; Fisher, Steven; Bramson, Aaron; Berger, William J.; Reade, Christopher; Flocken, Carissa; Sales, Adam
2014-01-01
A scientific community can be modeled as a collection of epistemic agents attempting to answer questions, in part by communicating about their hypotheses and results. We can treat the pathways of scientific communication as a network. When we do, it becomes clear that the interaction between the structure of the network and the nature of the question under investigation affects epistemic desiderata, including accuracy and speed to community consensus. Here we build on previous work, both our own and others’, in order to get a firmer grasp on precisely which features of scientific communities interact with which features of scientific questions in order to influence epistemic outcomes. Here we introduce a measure on the landscape meant to capture some aspects of the difficulty of answering an empirical question. We then investigate both how different communication networks affect whether the community finds the best answer and the time it takes for the community to reach consensus on an answer. We measure these two epistemic desiderata on a continuum of networks sampled from the Watts-Strogatz spectrum. It turns out that finding the best answer and reaching consensus exhibit radically different patterns. The time it takes for a community to reach a consensus in these models roughly tracks mean path length in the network. Whether a scientific community finds the best answer, on the other hand, tracks neither mean path length nor clustering coefficient. PMID:24683416
Scientific Networks on Data Landscapes: Question Difficulty, Epistemic Success, and Convergence.
Grim, Patrick; Singer, Daniel J; Fisher, Steven; Bramson, Aaron; Berger, William J; Reade, Christopher; Flocken, Carissa; Sales, Adam
2013-12-01
A scientific community can be modeled as a collection of epistemic agents attempting to answer questions, in part by communicating about their hypotheses and results. We can treat the pathways of scientific communication as a network. When we do, it becomes clear that the interaction between the structure of the network and the nature of the question under investigation affects epistemic desiderata, including accuracy and speed to community consensus. Here we build on previous work, both our own and others', in order to get a firmer grasp on precisely which features of scientific communities interact with which features of scientific questions in order to influence epistemic outcomes. Here we introduce a measure on the landscape meant to capture some aspects of the difficulty of answering an empirical question. We then investigate both how different communication networks affect whether the community finds the best answer and the time it takes for the community to reach consensus on an answer. We measure these two epistemic desiderata on a continuum of networks sampled from the Watts-Strogatz spectrum. It turns out that finding the best answer and reaching consensus exhibit radically different patterns. The time it takes for a community to reach a consensus in these models roughly tracks mean path length in the network. Whether a scientific community finds the best answer, on the other hand, tracks neither mean path length nor clustering coefficient.
Information dynamics algorithm for detecting communities in networks
NASA Astrophysics Data System (ADS)
Massaro, Emanuele; Bagnoli, Franco; Guazzini, Andrea; Lió, Pietro
2012-11-01
The problem of community detection is relevant in many scientific disciplines, from social science to statistical physics. Given the impact of community detection in many areas, such as psychology and social sciences, we have addressed the issue of modifying existing well performing algorithms by incorporating elements of the domain application fields, i.e. domain-inspired. We have focused on a psychology and social network-inspired approach which may be useful for further strengthening the link between social network studies and mathematics of community detection. Here we introduce a community-detection algorithm derived from the van Dongen's Markov Cluster algorithm (MCL) method [4] by considering networks' nodes as agents capable to take decisions. In this framework we have introduced a memory factor to mimic a typical human behavior such as the oblivion effect. The method is based on information diffusion and it includes a non-linear processing phase. We test our method on two classical community benchmark and on computer generated networks with known community structure. Our approach has three important features: the capacity of detecting overlapping communities, the capability of identifying communities from an individual point of view and the fine tuning the community detectability with respect to prior knowledge of the data. Finally we discuss how to use a Shannon entropy measure for parameter estimation in complex networks.
NASA Astrophysics Data System (ADS)
Baztan, J.; Vanderlinden, J. P.; Cordier, M.; Da Cunha, C.; Gaye, N.; Huctin, J. M.; Kane, A.; Quensiere, J.; Remvikos, Y.; Seck, A.
2016-12-01
The cultural dimensions of climate change impacts and adaptation have been increasingly examined in recent years through various disciplinary lenses, exposing a clear need for mainstream natural sciences to address the question of how to incorporate the values of communities facing global changes into their work. With this in mind, the work presented here addresses three main questions: (i) Do community members consider available scientific data and findings credible? Answering this question provides insight into whether available scientific knowledge expresses causal links that are mobilized by affected communities. (ii) Do community members consider available scientific data and findings salient? Answering this question provides insight into whether available scientific knowledge focuses on phenomena that those in affected communities think should receive attention. (iii) Do community members consider available scientific data and findings legitimate? Answering this question provides insight into whether available scientific knowledge expresses what is good, tolerable, and/or acceptable for affected communities. These three questions delve into the ways in which adaptation requires affected individuals and communities to adopt attitudes by integrating/woven from potentially conflicting evidence, relevance, and/or normative claims. These questions also shed light on the links between mainstream sciences and studied affected communities. The research presented here focuses on 2 communities: (i) Uummannaq, an island of 12km2 in a fjord, located along the middle of Greenland's west coast and (ii) Joal-Fadiouth & M'bour area in the wester African's coast, few Km south of Dakar, Senegal. This communication shares the results from field work experiences from ARTiticc's interdisciplinary approach to identifying the needs, values, and representations of the world of the communities, and how to fit these elements into mainstream sciences in order to bridge gaps between communities and research efforts and by doing so, determine the optimal adaptation strategies through which to engage global changes.
78 FR 41046 - Advanced Scientific Computing Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-09
... Services Administration, notice is hereby given that the Advanced Scientific Computing Advisory Committee will be renewed for a two-year period beginning on July 1, 2013. The Committee will provide advice to the Director, Office of Science (DOE), on the Advanced Scientific Computing Research Program managed...
Eppig, Janan T; Smith, Cynthia L; Blake, Judith A; Ringwald, Martin; Kadin, James A; Richardson, Joel E; Bult, Carol J
2017-01-01
The Mouse Genome Informatics (MGI), resource ( www.informatics.jax.org ) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.
Measuring the Interdisciplinary Impact of Using Geospatial Data with Remote Sensing Data
NASA Astrophysics Data System (ADS)
Downs, R. R.; Chen, R. S.; Schumacher, J.
2017-12-01
Various disciplines offer benefits to society by contributing to the scientific progress that informs the knowledge and decisions that improve the lives, safety, and conditions of people around the globe. In addition to disciplines within the natural sciences, other disciplines, including those in the social, health, and computer sciences, provide benefits to society by collecting, preparing, and analyzing data in the process of conducting research. Preparing geospatial environmental and socioeconomic data together with remote sensing data from satellite-based instruments for wider use by heterogeneous communities of users increases the potential impact of these data by enabling their use in different application areas and sectors of society. Furthermore, enabling wider use of scientific data can bring to bear resources and expertise that will improve reproducibility, quality, methodological transparency, interoperability, and improved understanding by diverse communities of users. In line with its commitment to open data, the NASA Socioeconomic Data and Applications Center (SEDAC), which focuses on human interactions in the environment, curates and disseminates freely and publicly available geospatial data for use across many disciplines and societal benefit areas. We describe efforts to broaden the use of SEDAC data and to publicly document their impact, assess the interdisciplinary impact of the use of SEDAC data with remote sensing data, and characterize these impacts in terms of their influence across disciplines by analyzing citations of geospatial data with remote sensing data within scientific journals.
Succurro, Antonella; Moejes, Fiona Wanjiku; Ebenhöh, Oliver
2017-08-01
The last few years have seen the advancement of high-throughput experimental techniques that have produced an extraordinary amount of data. Bioinformatics and statistical analyses have become instrumental to interpreting the information coming from, e.g., sequencing data and often motivate further targeted experiments. The broad discipline of "computational biology" extends far beyond the well-established field of bioinformatics, but it is our impression that more theoretical methods such as the use of mathematical models are not yet as well integrated into the research studying microbial interactions. The empirical complexity of microbial communities presents challenges that are difficult to address with in vivo / in vitro approaches alone, and with microbiology developing from a qualitative to a quantitative science, we see stronger opportunities arising for interdisciplinary projects integrating theoretical approaches with experiments. Indeed, the addition of in silico experiments, i.e., computational simulations, has a discovery potential that is, unfortunately, still largely underutilized and unrecognized by the scientific community. This minireview provides an overview of mathematical models of natural ecosystems and emphasizes that one critical point in the development of a theoretical description of a microbial community is the choice of problem scale. Since this choice is mostly dictated by the biological question to be addressed, in order to employ theoretical models fully and successfully it is vital to implement an interdisciplinary view at the conceptual stages of the experimental design. Copyright © 2017 Succurro et al.
ENES the European Network for Earth System modelling and its infrastructure projects IS-ENES
NASA Astrophysics Data System (ADS)
Guglielmo, Francesca; Joussaume, Sylvie; Parinet, Marie
2016-04-01
The scientific community working on climate modelling is organized within the European Network for Earth System modelling (ENES). In the past decade, several European university departments, research centres, meteorological services, computer centres, and industrial partners engaged in the creation of ENES with the purpose of working together and cooperating towards the further development of the network, by signing a Memorandum of Understanding. As of 2015, the consortium counts 47 partners. The climate modelling community, and thus ENES, faces challenges which are both science-driven, i.e. analysing of the full complexity of the Earth System to improve our understanding and prediction of climate changes, and have multi-faceted societal implications, as a better representation of climate change on regional scales leads to improved understanding and prediction of impacts and to the development and provision of climate services. ENES, promoting and endorsing projects and initiatives, helps in developing and evaluating of state-of-the-art climate and Earth system models, facilitates model inter-comparison studies, encourages exchanges of software and model results, and fosters the use of high performance computing facilities dedicated to high-resolution multi-model experiments. ENES brings together public and private partners, integrates countries underrepresented in climate modelling studies, and reaches out to different user communities, thus enhancing European expertise and competitiveness. In this need of sophisticated models, world-class, high-performance computers, and state-of-the-art software solutions to make efficient use of models, data and hardware, a key role is played by the constitution and maintenance of a solid infrastructure, developing and providing services to the different user communities. ENES has investigated the infrastructural needs and has received funding from the EU FP7 program for the IS-ENES (InfraStructure for ENES) phase I and II projects. We present here the case study of an existing network of institutions brought together toward common goals by a non-binding agreement, ENES, and of its two IS-ENES projects. These latter will be discussed in their double role as a means to provide and/or maintain the actual infrastructure (hardware, software, skilled human resources, services) to achieve ENES scientific goals -fulfilling the aims set in a strategy document-, but also to inform and provide to the network a structured way of working and of interacting with the extended community. The genesis and evolution of the network and the interaction network/projects will also be analysed in terms of long-term sustainability.
The SCEC/UseIT Intern Program: Creating Open-Source Visualization Software Using Diverse Resources
NASA Astrophysics Data System (ADS)
Francoeur, H.; Callaghan, S.; Perry, S.; Jordan, T.
2004-12-01
The Southern California Earthquake Center undergraduate IT intern program (SCEC UseIT) conducts IT research to benefit collaborative earth science research. Through this program, interns have developed real-time, interactive, 3D visualization software using open-source tools. Dubbed LA3D, a distribution of this software is now in use by the seismic community. LA3D enables the user to interactively view Southern California datasets and models of importance to earthquake scientists, such as faults, earthquakes, fault blocks, digital elevation models, and seismic hazard maps. LA3D is now being extended to support visualizations anywhere on the planet. The new software, called SCEC-VIDEO (Virtual Interactive Display of Earth Objects), makes use of a modular, plugin-based software architecture which supports easy development and integration of new data sets. Currently SCEC-VIDEO is in beta testing, with a full open-source release slated for the future. Both LA3D and SCEC-VIDEO were developed using a wide variety of software technologies. These, which included relational databases, web services, software management technologies, and 3-D graphics in Java, were necessary to integrate the heterogeneous array of data sources which comprise our software. Currently the interns are working to integrate new technologies and larger data sets to increase software functionality and value. In addition, both LA3D and SCEC-VIDEO allow the user to script and create movies. Thus program interns with computer science backgrounds have been writing software while interns with other interests, such as cinema, geology, and education, have been making movies that have proved of great use in scientific talks, media interviews, and education. Thus, SCEC UseIT incorporates a wide variety of scientific and human resources to create products of value to the scientific and outreach communities. The program plans to continue with its interdisciplinary approach, increasing the relevance of the software and expanding its use in the scientific community.
The ImageJ ecosystem: an open platform for biomedical image analysis
Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368
The ImageJ ecosystem: An open platform for biomedical image analysis.
Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.
The Roles of Science in Local Resilience Policy Development: A Case Study of Three U.S. Cities
NASA Astrophysics Data System (ADS)
Clavin, C.; Gupta, N.
2015-12-01
The development and deployment of resilience policies within communities in the United States often respond to the place-based, hazard-specific nature of disasters. Prior to the onset of a disaster, municipal and regional decision makers establish long-term development policies, such as land use planning, infrastructure investment, and economic development policies. Despite the importance of incorporating disaster risk within community decision making, resilience and disaster risk are only one consideration community decision makers weigh when choosing how and whether to establish resilience policy. Using a case study approach, we examine the governance, organizational, management, and policy making processes and the involvement of scientific advice in designing and implementing resilience policy in three U.S. communities: Los Angeles, CA; Norfolk, VA; and Flagstaff, AZ. Disaster mitigation or resilience initiatives were developed and deployed in each community with differing levels and types of scientific engagement. Engagement spanned from providing technical support with traditional risk assessment to direct engagement with community decision makers and design of community resilience outreach. Best practices observed include embedding trusted, independent scientific advisors with strong community credibility within local government agencies, use of interdisciplinary and interdepartmental expert teams with management and technical skillsets, and establishing scientifically-informed disaster and hazard scenarios to enable community outreach. Case study evidence suggest science communication and engagement within and across municipal government agencies and scientifically-informed direct engagement with community stakeholders are effective approaches and roles that disaster risk scientists can fill to support resilience policy development.
ObsPy: Establishing and maintaining an open-source community package
NASA Astrophysics Data System (ADS)
Krischer, L.; Megies, T.; Barsch, R.
2017-12-01
Python's ecosystem evolved into one of the most powerful and productive research environment across disciplines. ObsPy (https://obspy.org) is a fully community driven, open-source project dedicated to provide a bridge for seismology into that ecosystem. It does so by offering Read and write support for essentially every commonly used data format in seismology, Integrated access to the largest data centers, web services, and real-time data streams, A powerful signal processing toolbox tuned to the specific needs of seismologists, and Utility functionality like travel time calculations, geodetic functions, and data visualizations. ObsPy has been in constant unfunded development for more than eight years and is developed and used by scientists around the world with successful applications in all branches of seismology. By now around 70 people directly contributed code to ObsPy and we aim to make it a self-sustaining community project.This contributions focusses on several meta aspects of open-source software in science, in particular how we experienced them. During the panel we would like to discuss obvious questions like long-term sustainability with very limited to no funding, insufficient computer science training in many sciences, and gaining hard scientific credits for software development, but also the following questions: How to best deal with the fact that a lot of scientific software is very specialized thus usually solves a complex problem but at the same time can only ever reach a limited pool of developers and users by virtue of it being so specialized? Therefore the "many eyes on the code" approach to develop and improve open-source software only applies in a limited fashion. An initial publication for a significant new scientific software package is fairly straightforward. How to on-board and motivate potential new contributors when they can no longer be lured by a potential co-authorship? When is spending significant time and effort on reusable scientific open-source development a reasonable choice for young researchers? The effort to go from purpose tailored code for a single application resulting in a scientific publication is significantly less compared to generalising and engineering it well enough so it can be used by others.
Detecting and analyzing research communities in longitudinal scientific networks.
Leone Sciabolazza, Valerio; Vacca, Raffaele; Kennelly Okraku, Therese; McCarty, Christopher
2017-01-01
A growing body of evidence shows that collaborative teams and communities tend to produce the highest-impact scientific work. This paper proposes a new method to (1) Identify collaborative communities in longitudinal scientific networks, and (2) Evaluate the impact of specific research institutes, services or policies on the interdisciplinary collaboration between these communities. First, we apply community-detection algorithms to cross-sectional scientific collaboration networks and analyze different types of co-membership in the resulting subgroups over time. This analysis summarizes large amounts of longitudinal network data to extract sets of research communities whose members have consistently collaborated or shared collaborators over time. Second, we construct networks of cross-community interactions and estimate Exponential Random Graph Models to predict the formation of interdisciplinary collaborations between different communities. The method is applied to longitudinal data on publication and grant collaborations at the University of Florida. Results show that similar institutional affiliation, spatial proximity, transitivity effects, and use of the same research services predict higher degree of interdisciplinary collaboration between research communities. Our application also illustrates how the identification of research communities in longitudinal data and the analysis of cross-community network formation can be used to measure the growth of interdisciplinary team science at a research university, and to evaluate its association with research policies, services or institutes.
Detecting and analyzing research communities in longitudinal scientific networks
Vacca, Raffaele; Kennelly Okraku, Therese; McCarty, Christopher
2017-01-01
A growing body of evidence shows that collaborative teams and communities tend to produce the highest-impact scientific work. This paper proposes a new method to (1) Identify collaborative communities in longitudinal scientific networks, and (2) Evaluate the impact of specific research institutes, services or policies on the interdisciplinary collaboration between these communities. First, we apply community-detection algorithms to cross-sectional scientific collaboration networks and analyze different types of co-membership in the resulting subgroups over time. This analysis summarizes large amounts of longitudinal network data to extract sets of research communities whose members have consistently collaborated or shared collaborators over time. Second, we construct networks of cross-community interactions and estimate Exponential Random Graph Models to predict the formation of interdisciplinary collaborations between different communities. The method is applied to longitudinal data on publication and grant collaborations at the University of Florida. Results show that similar institutional affiliation, spatial proximity, transitivity effects, and use of the same research services predict higher degree of interdisciplinary collaboration between research communities. Our application also illustrates how the identification of research communities in longitudinal data and the analysis of cross-community network formation can be used to measure the growth of interdisciplinary team science at a research university, and to evaluate its association with research policies, services or institutes. PMID:28797047
Recontextualization of Science from Lab to School: Implications for Science Literacy
ERIC Educational Resources Information Center
Sharma, Ajay; Anderson, Charles W.
2009-01-01
Scientists' science differs remarkably from school science. In order to be taught to students, science is recontextualized from scientific research communities to science classrooms. This paper examines scientific discourse in scientific research communities, and discusses its transformation from an internally-persuasive and authoritative…
Blueprint for a microwave trapped ion quantum computer.
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K
2017-02-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
ISMB 2016 offers outstanding science, networking, and celebration
Fogg, Christiana
2016-01-01
The annual international conference on Intelligent Systems for Molecular Biology (ISMB) is the major meeting of the International Society for Computational Biology (ISCB). Over the past 23 years the ISMB conference has grown to become the world's largest bioinformatics/computational biology conference. ISMB 2016 will be the year's most important computational biology event globally. The conferences provide a multidisciplinary forum for disseminating the latest developments in bioinformatics/computational biology. ISMB brings together scientists from computer science, molecular biology, mathematics, statistics and related fields. Its principal focus is on the development and application of advanced computational methods for biological problems. ISMB 2016 offers the strongest scientific program and the broadest scope of any international bioinformatics/computational biology conference. Building on past successes, the conference is designed to cater to variety of disciplines within the bioinformatics/computational biology community. ISMB 2016 takes place July 8 - 12 at the Swan and Dolphin Hotel in Orlando, Florida, United States. For two days preceding the conference, additional opportunities including Satellite Meetings, Student Council Symposium, and a selection of Special Interest Group Meetings and Applied Knowledge Exchange Sessions (AKES) are all offered to enable registered participants to learn more on the latest methods and tools within specialty research areas. PMID:27347392
ISMB 2016 offers outstanding science, networking, and celebration.
Fogg, Christiana
2016-01-01
The annual international conference on Intelligent Systems for Molecular Biology (ISMB) is the major meeting of the International Society for Computational Biology (ISCB). Over the past 23 years the ISMB conference has grown to become the world's largest bioinformatics/computational biology conference. ISMB 2016 will be the year's most important computational biology event globally. The conferences provide a multidisciplinary forum for disseminating the latest developments in bioinformatics/computational biology. ISMB brings together scientists from computer science, molecular biology, mathematics, statistics and related fields. Its principal focus is on the development and application of advanced computational methods for biological problems. ISMB 2016 offers the strongest scientific program and the broadest scope of any international bioinformatics/computational biology conference. Building on past successes, the conference is designed to cater to variety of disciplines within the bioinformatics/computational biology community. ISMB 2016 takes place July 8 - 12 at the Swan and Dolphin Hotel in Orlando, Florida, United States. For two days preceding the conference, additional opportunities including Satellite Meetings, Student Council Symposium, and a selection of Special Interest Group Meetings and Applied Knowledge Exchange Sessions (AKES) are all offered to enable registered participants to learn more on the latest methods and tools within specialty research areas.
Need for evaluative methodologies in land use, regional resource and waste management planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Croke, E. J.
The transfer of planning methodology from the research community to the practitioner very frequently takes the form of analytical and evaluative techniques and procedures. In the end, these become operational in the form of data acquisition, management and display systems, computational schemes that are codified in the form of manuals and handbooks, and computer simulation models. The complexity of the socioeconomic and physical processes that govern environmental resource and waste management have reinforced the need for computer assisted, scientifically sophisticated planning models that are fully operational, dependent on an attainable data base and accessible in terms of the resources normallymore » available to practitioners of regional resource management, waste management, and land use planning. A variety of models and procedures that attempt to meet one or more of the needs of these practitioners are discussed.« less
Computing Properties of Hadrons, Nuclei and Nuclear Matter from Quantum Chromodynamics (LQCD)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Negele, John W.
Building on the success of two preceding generations of Scientific Discovery through Advanced Computing (SciDAC) projects, this grant supported the MIT component (P.I. John Negele) of a multi-institutional SciDAC-3 project that also included Brookhaven National Laboratory, the lead laboratory with P. I. Frithjof Karsch serving as Project Director, Thomas Jefferson National Accelerator Facility with P. I. David Richards serving as Co-director, University of Washington with P. I. Martin Savage, University of North Carolina with P. I. Rob Fowler, and College of William and Mary with P. I. Andreas Stathopoulos. Nationally, this multi-institutional project coordinated the software development effort that themore » nuclear physics lattice QCD community needs to ensure that lattice calculations can make optimal use of forthcoming leadership-class and dedicated hardware, including that at the national laboratories, and to exploit future computational resources in the Exascale era.« less
Publication Bias in Methodological Computational Research.
Boulesteix, Anne-Laure; Stierle, Veronika; Hapfelmeier, Alexander
2015-01-01
The problem of publication bias has long been discussed in research fields such as medicine. There is a consensus that publication bias is a reality and that solutions should be found to reduce it. In methodological computational research, including cancer informatics, publication bias may also be at work. The publication of negative research findings is certainly also a relevant issue, but has attracted very little attention to date. The present paper aims at providing a new formal framework to describe the notion of publication bias in the context of methodological computational research, facilitate and stimulate discussions on this topic, and increase awareness in the scientific community. We report an exemplary pilot study that aims at gaining experiences with the collection and analysis of information on unpublished research efforts with respect to publication bias, and we outline the encountered problems. Based on these experiences, we try to formalize the notion of publication bias.
NASA Astrophysics Data System (ADS)
Lyons, Renee
Educational programs created to provide opportunities for all, in reality often reflect social inequalities. Such is the case for Public Participation in Scientific Research (PPSR) Projects. PPSR projects have been proposed as an effective way to engage more diverse audiences in science, yet the demographics of PPSR participants do not correspond with the demographic makeup of the United States. The field of PPSR as a whole has struggled to recruit low SES and underrepresented populations to participate in project research efforts. This research study explores factors, which may be affecting an underrepresented community's willingness to engage in scientific research and provides advice from PPSR project leaders in the field, who have been able to engage underrepresented communities in scientific research, on how to overcome these barriers. Finally the study investigates the theoretical construct of a Third Space within a PPSR project. The research-based recommendations for PPSR projects desiring to initiate and sustain research partnerships with underrepresented communities well align with the theoretical construct of a Third Space. This study examines a specific scientific research partnership between an underrepresented community and scientific researchers to examine if and to what extent a Third Space was created. Using qualitative methods to understand interactions and processes involved in initiating and sustaining a scientific research partnership, this study provides advice on how PPSR research partnerships can engage underrepresented communities in scientific research. Study results show inequality and mistrust of powerful institutions stood as participation barriers for underrepresented community members. Despite these barriers PPSR project leaders recommend barriers can be confronted by open dialogue with communities about the abuse and alienation they have faced, by signaling respect for the community, and by entering the community through someone the community already trusts. Finally although many of the principles of a Third Space well align with the larger level of activity, which existed in the PPSR project examined in this study, study findings challenge others to critically examine assumptions behind the idea of a Third Space in PPSR and urge other PPSR project leaders towards a transformed view of science.
Georges Lemaître: The Priest Who Invented the Big Bang
NASA Astrophysics Data System (ADS)
Lambert, Dominique
This contribution gives a concise survey of Georges Lemaître works and life, shedding some light on less-known aspects. Lemaître is a Belgian catholic priest who gave for the first time in 1927 the explanation of the Hubble law and who proposed in 1931 the "Primeval Atom Hypothesis", considered as the first step towards the Big Bang cosmology. But the scientific work of Lemaître goes far beyond Physical Cosmology. Indeed, he contributed also to the theory of Cosmis Rays, to the Spinor theory, to Analytical mechanics (regularization of 3- Bodies problem), to Numerical Analysis (Fast Fourier Transform), to Computer Science (he introduced and programmed the first computer of Louvain),… Lemaître took part to the "Science and Faith" debate. He defended a position that has some analogy with the NOMA principle, making a sharp distinction between what he called the "two paths to Truth" (a scientific one and a theological one). In particular, he never made a confusion between the theological concept of "creation" and the scientific notion of "natural beginning" (initial singularity). Lemaître was deeply rooted in his faith and sacerdotal vocation. Remaining a secular priest, he belonged to a community of priests called "The Friends of Jesus", characterized by a deep spirituality and special vows (for example the vow of poverty). He had also an apostolic activity amongst Chinese students.
Multi-level meta-workflows: new concept for regularly occurring tasks in quantum chemistry.
Arshad, Junaid; Hoffmann, Alexander; Gesing, Sandra; Grunzke, Richard; Krüger, Jens; Kiss, Tamas; Herres-Pawlis, Sonja; Terstyanszky, Gabor
2016-01-01
In Quantum Chemistry, many tasks are reoccurring frequently, e.g. geometry optimizations, benchmarking series etc. Here, workflows can help to reduce the time of manual job definition and output extraction. These workflows are executed on computing infrastructures and may require large computing and data resources. Scientific workflows hide these infrastructures and the resources needed to run them. It requires significant efforts and specific expertise to design, implement and test these workflows. Many of these workflows are complex and monolithic entities that can be used for particular scientific experiments. Hence, their modification is not straightforward and it makes almost impossible to share them. To address these issues we propose developing atomic workflows and embedding them in meta-workflows. Atomic workflows deliver a well-defined research domain specific function. Publishing workflows in repositories enables workflow sharing inside and/or among scientific communities. We formally specify atomic and meta-workflows in order to define data structures to be used in repositories for uploading and sharing them. Additionally, we present a formal description focused at orchestration of atomic workflows into meta-workflows. We investigated the operations that represent basic functionalities in Quantum Chemistry, developed the relevant atomic workflows and combined them into meta-workflows. Having these workflows we defined the structure of the Quantum Chemistry workflow library and uploaded these workflows in the SHIWA Workflow Repository.Graphical AbstractMeta-workflows and embedded workflows in the template representation.
GillesPy: A Python Package for Stochastic Model Building and Simulation.
Abel, John H; Drawert, Brian; Hellander, Andreas; Petzold, Linda R
2016-09-01
GillesPy is an open-source Python package for model construction and simulation of stochastic biochemical systems. GillesPy consists of a Python framework for model building and an interface to the StochKit2 suite of efficient simulation algorithms based on the Gillespie stochastic simulation algorithms (SSA). To enable intuitive model construction and seamless integration into the scientific Python stack, we present an easy to understand, action-oriented programming interface. Here, we describe the components of this package and provide a detailed example relevant to the computational biology community.
GillesPy: A Python Package for Stochastic Model Building and Simulation
Abel, John H.; Drawert, Brian; Hellander, Andreas; Petzold, Linda R.
2017-01-01
GillesPy is an open-source Python package for model construction and simulation of stochastic biochemical systems. GillesPy consists of a Python framework for model building and an interface to the StochKit2 suite of efficient simulation algorithms based on the Gillespie stochastic simulation algorithms (SSA). To enable intuitive model construction and seamless integration into the scientific Python stack, we present an easy to understand, action-oriented programming interface. Here, we describe the components of this package and provide a detailed example relevant to the computational biology community. PMID:28630888
Roskams, Jane; Popović, Zoran
2016-11-02
Global neuroscience projects are producing big data at an unprecedented rate that informatic and artificial intelligence (AI) analytics simply cannot handle. Online games, like Foldit, Eterna, and Eyewire-and now a new neuroscience game, Mozak-are fueling a people-powered research science (PPRS) revolution, creating a global community of "new experts" that over time synergize with computational efforts to accelerate scientific progress, empowering us to use our collective cerebral talents to drive our understanding of our brain. Copyright © 2016 Elsevier Inc. All rights reserved.
Integrating Data Base into the Elementary School Science Program.
ERIC Educational Resources Information Center
Schlenker, Richard M.
This document describes seven science activities that combine scientific principles and computers. The objectives for the activities are to show students how the computer can be used as a tool to store and arrange scientific data, provide students with experience using the computer as a tool to manage scientific data, and provide students with…
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Scout: high-performance heterogeneous computing made simple
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jablin, James; Mc Cormick, Patrick; Herlihy, Maurice
2011-01-26
Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of performance-oriented programs do not abstract the software development from the science. Furthermore, the problem is becoming increasingly complex and pressing with the continued development of many-core and heterogeneous (CPU-GPU) architectures. To acbieve high performance, scientists must expertly navigate both software and hardware. Co-design between computer scientists and research scientists can alleviate but not solve this problem. The science community requires better tools for developing, optimizing, and future-proofing codes, allowing scientists to focusmore » on their research while still achieving high computational performance. Scout is a parallel programming language and extensible compiler framework targeting heterogeneous architectures. It provides the abstraction required to buffer scientists from the constantly-shifting details of hardware while still realizing higb-performance by encapsulating software and hardware optimization within a compiler framework.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perahia, Dvora; Grest, Gary S.
Neutron experiments coupled with computational components have resulted in unprecedented understanding of the factors that impact the behavior of ionic structured polymers. Additionally, new computational tools to study macromolecules, were developed. In parallel, this DOE funding have enabled the education of the next generation of material researchers who are able to take the advantage neutron tools offer to the understanding and design of advanced materials. Our research has provided unprecedented insight into one of the major factors that limits the use of ionizable polymers, combining the macroscopic view obtained from the experimental techniques with molecular insight extracted from computational studiesmore » leading to transformative knowledge that will impact the design of nano-structured, materials. With the focus on model systems, of broad interest to the scientific community and to industry, the research addressed challenges that cut across a large number of polymers, independent of the specific chemical structure or the transported species.« less
High performance hybrid functional Petri net simulations of biological pathway models on CUDA.
Chalkidis, Georgios; Nagasaki, Masao; Miyano, Satoru
2011-01-01
Hybrid functional Petri nets are a wide-spread tool for representing and simulating biological models. Due to their potential of providing virtual drug testing environments, biological simulations have a growing impact on pharmaceutical research. Continuous research advancements in biology and medicine lead to exponentially increasing simulation times, thus raising the demand for performance accelerations by efficient and inexpensive parallel computation solutions. Recent developments in the field of general-purpose computation on graphics processing units (GPGPU) enabled the scientific community to port a variety of compute intensive algorithms onto the graphics processing unit (GPU). This work presents the first scheme for mapping biological hybrid functional Petri net models, which can handle both discrete and continuous entities, onto compute unified device architecture (CUDA) enabled GPUs. GPU accelerated simulations are observed to run up to 18 times faster than sequential implementations. Simulating the cell boundary formation by Delta-Notch signaling on a CUDA enabled GPU results in a speedup of approximately 7x for a model containing 1,600 cells.
Models and Simulations as a Service: Exploring the Use of Galaxy for Delivering Computational Models
Walker, Mark A.; Madduri, Ravi; Rodriguez, Alex; Greenstein, Joseph L.; Winslow, Raimond L.
2016-01-01
We describe the ways in which Galaxy, a web-based reproducible research platform, can be used for web-based sharing of complex computational models. Galaxy allows users to seamlessly customize and run simulations on cloud computing resources, a concept we refer to as Models and Simulations as a Service (MaSS). To illustrate this application of Galaxy, we have developed a tool suite for simulating a high spatial-resolution model of the cardiac Ca2+ spark that requires supercomputing resources for execution. We also present tools for simulating models encoded in the SBML and CellML model description languages, thus demonstrating how Galaxy’s reproducible research features can be leveraged by existing technologies. Finally, we demonstrate how the Galaxy workflow editor can be used to compose integrative models from constituent submodules. This work represents an important novel approach, to our knowledge, to making computational simulations more accessible to the broader scientific community. PMID:26958881
Edwards, Roger L; Edwards, Sandra L; Bryner, James; Cunningham, Kelly; Rogers, Amy; Slattery, Martha L
2008-04-01
We describe a computer-assisted data collection system developed for a multicenter cohort study of American Indian and Alaska Native people. The study computer-assisted participant evaluation system or SCAPES is built around a central database server that controls a small private network with touch screen workstations. SCAPES encompasses the self-administered questionnaires, the keyboard-based stations for interviewer-administered questionnaires, a system for inputting medical measurements, and administrative tasks such as data exporting, backup and management. Elements of SCAPES hardware/network design, data storage, programming language, software choices, questionnaire programming including the programming of questionnaires administered using audio computer-assisted self-interviewing (ACASI), and participant identification/data security system are presented. Unique features of SCAPES are that data are promptly made available to participants in the form of health feedback; data can be quickly summarized for tribes for health monitoring and planning at the community level; and data are available to study investigators for analyses and scientific evaluation.
NASA Astrophysics Data System (ADS)
Reading, A. M.; Morse, P. E.; Staal, T.
2017-12-01
Geoscientific inversion outputs, such as seismic tomography contour images, are finding increasing use amongst scientific user communities that have limited knowledge of the impact of output parameter uncertainty on subsequent interpretations made from such images. We make use of a newly written computer application which enables seismic tomography images to be displayed in a performant 3D graphics environment. This facilitates the mapping of colour scales to the human visual sensorium for the interactive interpretation of contoured inversion results incorporating parameter uncertainty. Two case examples of seismic tomography inversions or contoured compilations are compared from the southern hemisphere continents of Australia and Antarctica. The Australian example is based on the AuSREM contoured seismic wavespeed model while the Antarctic example is a valuable but less well constrained result. Through adjusting the multiple colour gradients, layer separations, opacity, illumination, shadowing and background effects, we can optimise the insights obtained from the 3D structure in the inversion compilation or result. Importantly, we can also limit the display to show information in a way that is mapped to the uncertainty in the 3D result. Through this practical application, we demonstrate that the uncertainty in the result can be handled through a well-posed mapping of the parameter values to displayed colours in the knowledge of what is perceived visually by a typical human. We found that this approach maximises the chance of a useful tectonic interpretation by a diverse scientific user community. In general, we develop the idea that quantified inversion uncertainty can be used to tailor the way that the output is presented to the analyst for scientific interpretation.
The HEPCloud Facility: elastic computing for High Energy Physics - The NOvA Use Case
NASA Astrophysics Data System (ADS)
Fuess, S.; Garzoglio, G.; Holzman, B.; Kennedy, R.; Norman, A.; Timm, S.; Tiradani, A.
2017-10-01
The need for computing in the HEP community follows cycles of peaks and valleys mainly driven by conference dates, accelerator shutdown, holiday schedules, and other factors. Because of this, the classical method of provisioning these resources at providing facilities has drawbacks such as potential overprovisioning. As the appetite for computing increases, however, so does the need to maximize cost efficiency by developing a model for dynamically provisioning resources only when needed. To address this issue, the HEPCloud project was launched by the Fermilab Scientific Computing Division in June 2015. Its goal is to develop a facility that provides a common interface to a variety of resources, including local clusters, grids, high performance computers, and community and commercial Clouds. Initially targeted experiments include CMS and NOvA, as well as other Fermilab stakeholders. In its first phase, the project has demonstrated the use of the “elastic” provisioning model offered by commercial clouds, such as Amazon Web Services. In this model, resources are rented and provisioned automatically over the Internet upon request. In January 2016, the project demonstrated the ability to increase the total amount of global CMS resources by 58,000 cores from 150,000 cores - a 38 percent increase - in preparation for the Recontres de Moriond. In March 2016, the NOvA experiment has also demonstrated resource burst capabilities with an additional 7,300 cores, achieving a scale almost four times as large as the local allocated resources and utilizing the local AWS s3 storage to optimize data handling operations and costs. NOvA was using the same familiar services used for local computations, such as data handling and job submission, in preparation for the Neutrino 2016 conference. In both cases, the cost was contained by the use of the Amazon Spot Instance Market and the Decision Engine, a HEPCloud component that aims at minimizing cost and job interruption. This paper describes the Fermilab HEPCloud Facility and the challenges overcome for the CMS and NOvA communities.
iSPHERE - A New Approach to Collaborative Research and Cloud Computing
NASA Astrophysics Data System (ADS)
Al-Ubaidi, T.; Khodachenko, M. L.; Kallio, E. J.; Harry, A.; Alexeev, I. I.; Vázquez-Poletti, J. L.; Enke, H.; Magin, T.; Mair, M.; Scherf, M.; Poedts, S.; De Causmaecker, P.; Heynderickx, D.; Congedo, P.; Manolescu, I.; Esser, B.; Webb, S.; Ruja, C.
2015-10-01
The project iSPHERE (integrated Scientific Platform for HEterogeneous Research and Engineering) that has been proposed for Horizon 2020 (EINFRA-9- 2015, [1]) aims at creating a next generation Virtual Research Environment (VRE) that embraces existing and emerging technologies and standards in order to provide a versatile platform for scientific investigations and collaboration. The presentation will introduce the large project consortium, provide a comprehensive overview of iSPHERE's basic concepts and approaches and outline general user requirements that the VRE will strive to satisfy. An overview of the envisioned architecture will be given, focusing on the adapted Service Bus concept, i.e. the "Scientific Service Bus" as it is called in iSPHERE. The bus will act as a central hub for all communication and user access, and will be implemented in the course of the project. The agile approach [2] that has been chosen for detailed elaboration and documentation of user requirements, as well as for the actual implementation of the system, will be outlined and its motivation and basic structure will be discussed. The presentation will show which user communities will benefit and which concrete problems, scientific investigations are facing today, will be tackled by the system. Another focus of the presentation is iSPHERE's seamless integration of cloud computing resources and how these will benefit scientific modeling teams by providing a reliable and web based environment for cloud based model execution, storage of results, and comparison with measurements, including fully web based tools for data mining, analysis and visualization. Also the envisioned creation of a dedicated data model for experimental plasma physics will be discussed. It will be shown why the Scientific Service Bus provides an ideal basis to integrate a number of data models and communication protocols and to provide mechanisms for data exchange across multiple and even multidisciplinary platforms.
NASA Astrophysics Data System (ADS)
Benveniste, J.; Regner, P.; Desnos, Y. L.
2015-12-01
The Scientific Exploitation of Operational Mission (SEOM) programme element (http://seom.esa.int/) is part of the ESA's Fourth Earth Observation Envelope Programme (2013-2017). The prime objective is to federate, support and expand the international research community that the ERS, ENVISAT and the Envelope programmes have built up over the last 25 years. It aims to further strengthen the leadership of the European Earth Observation research community by enabling them to extensively exploit future European operational EO missions. SEOM is enabling the science community to address new scientific research that are opened by free and open access to data from operational EO missions. The Programme is based on community-wide recommendations for actions on key research issues, gathered through a series of international thematic workshops and scientific user consultation meetings such as the Sentinel-3 for Science Workshop held last June in Venice, Italy (see http://seom.esa.int/S3forScience2015). The 2015 SEOM work plan includes the launch of new R&D studies for scientific exploitation of the Sentinels, the development of open-source multi-mission scientific toolboxes, the organization of advanced international training courses, summer schools and educational materials, as well as activities for promoting the scientific use of EO data, also via the organization of Workshops. This paper will report the recommendations from the International Scientific Community concerning the Sentinel-3 Scientific Exploitation, as expressed in Venice, keeping in mind that Sentinel-3 is an operational mission to provide operational services (see http://www.copernicus.eu).
Advanced Computation in Plasma Physics
NASA Astrophysics Data System (ADS)
Tang, William
2001-10-01
Scientific simulation in tandem with theory and experiment is an essential tool for understanding complex plasma behavior. This talk will review recent progress and future directions for advanced simulations in magnetically-confined plasmas with illustrative examples chosen from areas such as microturbulence, magnetohydrodynamics, magnetic reconnection, and others. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales together with access to powerful new computational resources. In particular, the fusion energy science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop MPP's to produce 3-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of turbulence self-regulation by zonal flows. It should be emphasized that these calculations, which typically utilized billions of particles for tens of thousands time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to plasma science.
NASA Astrophysics Data System (ADS)
Khalili, N.; Valliappan, S.; Li, Q.; Russell, A.
2010-07-01
The use for mathematical models of natural phenomena has underpinned science and engineering for centuries, but until the advent of modern computers and computational methods, the full utility of most of these models remained outside the reach of the engineering communities. Since World War II, advances in computational methods have transformed the way engineering and science is undertaken throughout the world. Today, theories of mechanics of solids and fluids, electromagnetism, heat transfer, plasma physics, and other scientific disciplines are implemented through computational methods in engineering analysis, design, manufacturing, and in studying broad classes of physical phenomena. The discipline concerned with the application of computational methods is now a key area of research, education, and application throughout the world. In the early 1980's, the International Association for Computational Mechanics (IACM) was founded to promote activities related to computational mechanics and has made impressive progress. The most important scientific event of IACM is the World Congress on Computational Mechanics. The first was held in Austin (USA) in 1986 and then in Stuttgart (Germany) in 1990, Chiba (Japan) in 1994, Buenos Aires (Argentina) in 1998, Vienna (Austria) in 2002, Beijing (China) in 2004, Los Angeles (USA) in 2006 and Venice, Italy; in 2008. The 9th World Congress on Computational Mechanics is held in conjunction with the 4th Asian Pacific Congress on Computational Mechanics under the auspices of Australian Association for Computational Mechanics (AACM), Asian Pacific Association for Computational Mechanics (APACM) and International Association for Computational Mechanics (IACM). The 1st Asian Pacific Congress was in Sydney (Australia) in 2001, then in Beijing (China) in 2004 and Kyoto (Japan) in 2007. The WCCM/APCOM 2010 publications consist of a printed book of abstracts given to delegates, along with 247 full length peer reviewed papers published with free access online in IOP Conference Series: Materials Science and Engineering. The editors acknowledge the help of the paper reviewers in maintaining a high standard of assessment and the co-operation of the authors in complying with the requirements of the editors and the reviewers. We also would like to take this opportunity to thank the members of the Local Organising Committee and the International Scientific Committee for helping make WCCM/APCOM 2010 a successful event. We also thank The University of New South Wales, The University of Newcastle, the Centre for Infrastructure Engineering and Safety (CIES), IACM, APCAM, AACM for their financial support, along with the United States Association for Computational Mechanics for the Travel Awards made available. N. Khalili S. Valliappan Q. Li A. Russell 19 July 2010 Sydney, Australia
Constructing Scientific Arguments Using Evidence from Dynamic Computational Climate Models
ERIC Educational Resources Information Center
Pallant, Amy; Lee, Hee-Sun
2015-01-01
Modeling and argumentation are two important scientific practices students need to develop throughout school years. In this paper, we investigated how middle and high school students (N = 512) construct a scientific argument based on evidence from computational models with which they simulated climate change. We designed scientific argumentation…
PyMT: A Python package for model-coupling in the Earth sciences
NASA Astrophysics Data System (ADS)
Hutton, E.
2016-12-01
The current landscape of Earth-system models is not only broad in scientific scope, but also broad in type. On the one hand, the large variety of models is exciting, as it provides fertile ground for extending or linking models together in novel ways to answer new scientific questions. However, the heterogeneity in model type acts to inhibit model coupling, model development, or even model use. Existing models are written in a variety of programming languages, operate on different grids, use their own file formats (both for input and output), have different user interfaces, have their own time steps, etc. Each of these factors become obstructions to scientists wanting to couple, extend - or simply run - existing models. For scientists whose main focus may not be computer science these barriers become even larger and become significant logistical hurdles. And this is all before the scientific difficulties of coupling or running models are addressed. The CSDMS Python Modeling Toolkit (PyMT) was developed to help non-computer scientists deal with these sorts of modeling logistics. PyMT is the fundamental package the Community Surface Dynamics Modeling System uses for the coupling of models that expose the Basic Modeling Interface (BMI). It contains: Tools necessary for coupling models of disparate time and space scales (including grid mappers) Time-steppers that coordinate the sequencing of coupled models Exchange of data between BMI-enabled models Wrappers that automatically load BMI-enabled models into the PyMT framework Utilities that support open-source interfaces (UGRID, SGRID,CSDMS Standard Names, etc.) A collection of community-submitted models, written in a variety of programminglanguages, from a variety of process domains - but all usable from within the Python programming language A plug-in framework for adding additional BMI-enabled models to the framework In this presentation we intoduce the basics of the PyMT as well as provide an example of coupling models of different domains and grid types.
NASA Technical Reports Server (NTRS)
Druyan, Leonard M.
2012-01-01
Climate models is a very broad topic, so a single volume can only offer a small sampling of relevant research activities. This volume of 14 chapters includes descriptions of a variety of modeling studies for a variety of geographic regions by an international roster of authors. The climate research community generally uses the rubric climate models to refer to organized sets of computer instructions that produce simulations of climate evolution. The code is based on physical relationships that describe the shared variability of meteorological parameters such as temperature, humidity, precipitation rate, circulation, radiation fluxes, etc. Three-dimensional climate models are integrated over time in order to compute the temporal and spatial variations of these parameters. Model domains can be global or regional and the horizontal and vertical resolutions of the computational grid vary from model to model. Considering the entire climate system requires accounting for interactions between solar insolation, atmospheric, oceanic and continental processes, the latter including land hydrology and vegetation. Model simulations may concentrate on one or more of these components, but the most sophisticated models will estimate the mutual interactions of all of these environments. Advances in computer technology have prompted investments in more complex model configurations that consider more phenomena interactions than were possible with yesterday s computers. However, not every attempt to add to the computational layers is rewarded by better model performance. Extensive research is required to test and document any advantages gained by greater sophistication in model formulation. One purpose for publishing climate model research results is to present purported advances for evaluation by the scientific community.
Parallel processing for scientific computations
NASA Technical Reports Server (NTRS)
Alkhatib, Hasan S.
1995-01-01
The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.
Finding Our Way through Phenotypes
Deans, Andrew R.; Lewis, Suzanna E.; Huala, Eva; Anzaldo, Salvatore S.; Ashburner, Michael; Balhoff, James P.; Blackburn, David C.; Blake, Judith A.; Burleigh, J. Gordon; Chanet, Bruno; Cooper, Laurel D.; Courtot, Mélanie; Csösz, Sándor; Cui, Hong; Dahdul, Wasila; Das, Sandip; Dececchi, T. Alexander; Dettai, Agnes; Diogo, Rui; Druzinsky, Robert E.; Dumontier, Michel; Franz, Nico M.; Friedrich, Frank; Gkoutos, George V.; Haendel, Melissa; Harmon, Luke J.; Hayamizu, Terry F.; He, Yongqun; Hines, Heather M.; Ibrahim, Nizar; Jackson, Laura M.; Jaiswal, Pankaj; James-Zorn, Christina; Köhler, Sebastian; Lecointre, Guillaume; Lapp, Hilmar; Lawrence, Carolyn J.; Le Novère, Nicolas; Lundberg, John G.; Macklin, James; Mast, Austin R.; Midford, Peter E.; Mikó, István; Mungall, Christopher J.; Oellrich, Anika; Osumi-Sutherland, David; Parkinson, Helen; Ramírez, Martín J.; Richter, Stefan; Robinson, Peter N.; Ruttenberg, Alan; Schulz, Katja S.; Segerdell, Erik; Seltmann, Katja C.; Sharkey, Michael J.; Smith, Aaron D.; Smith, Barry; Specht, Chelsea D.; Squires, R. Burke; Thacker, Robert W.; Thessen, Anne; Fernandez-Triana, Jose; Vihinen, Mauno; Vize, Peter D.; Vogt, Lars; Wall, Christine E.; Walls, Ramona L.; Westerfeld, Monte; Wharton, Robert A.; Wirkner, Christian S.; Woolley, James B.; Yoder, Matthew J.; Zorn, Aaron M.; Mabee, Paula
2015-01-01
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility. PMID:25562316
Finding our way through phenotypes.
Deans, Andrew R; Lewis, Suzanna E; Huala, Eva; Anzaldo, Salvatore S; Ashburner, Michael; Balhoff, James P; Blackburn, David C; Blake, Judith A; Burleigh, J Gordon; Chanet, Bruno; Cooper, Laurel D; Courtot, Mélanie; Csösz, Sándor; Cui, Hong; Dahdul, Wasila; Das, Sandip; Dececchi, T Alexander; Dettai, Agnes; Diogo, Rui; Druzinsky, Robert E; Dumontier, Michel; Franz, Nico M; Friedrich, Frank; Gkoutos, George V; Haendel, Melissa; Harmon, Luke J; Hayamizu, Terry F; He, Yongqun; Hines, Heather M; Ibrahim, Nizar; Jackson, Laura M; Jaiswal, Pankaj; James-Zorn, Christina; Köhler, Sebastian; Lecointre, Guillaume; Lapp, Hilmar; Lawrence, Carolyn J; Le Novère, Nicolas; Lundberg, John G; Macklin, James; Mast, Austin R; Midford, Peter E; Mikó, István; Mungall, Christopher J; Oellrich, Anika; Osumi-Sutherland, David; Parkinson, Helen; Ramírez, Martín J; Richter, Stefan; Robinson, Peter N; Ruttenberg, Alan; Schulz, Katja S; Segerdell, Erik; Seltmann, Katja C; Sharkey, Michael J; Smith, Aaron D; Smith, Barry; Specht, Chelsea D; Squires, R Burke; Thacker, Robert W; Thessen, Anne; Fernandez-Triana, Jose; Vihinen, Mauno; Vize, Peter D; Vogt, Lars; Wall, Christine E; Walls, Ramona L; Westerfeld, Monte; Wharton, Robert A; Wirkner, Christian S; Woolley, James B; Yoder, Matthew J; Zorn, Aaron M; Mabee, Paula
2015-01-01
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.
NASA Astrophysics Data System (ADS)
Cox, S. J.; Wyborn, L. A.; Fraser, R.; Rankine, T.; Woodcock, R.; Vote, J.; Evans, B.
2012-12-01
The Virtual Geophysics Laboratory (VGL) is web portal that provides geoscientists with an integrated online environment that: seamlessly accesses geophysical and geoscience data services from the AuScope national geoscience information infrastructure; loosely couples these data to a variety of gesocience software tools; and provides large scale processing facilities via cloud computing. VGL is a collaboration between CSIRO, Geoscience Australia, National Computational Infrastructure, Monash University, Australian National University and the University of Queensland. The VGL provides a distributed system whereby a user can enter an online virtual laboratory to seamlessly connect to OGC web services for geoscience data. The data is supplied in open standards formats using international standards like GeoSciML. A VGL user uses a web mapping interface to discover and filter the data sources using spatial and attribute filters to define a subset. Once the data is selected the user is not required to download the data. VGL collates the service query information for later in the processing workflow where it will be staged directly to the computing facilities. The combination of deferring data download and access to Cloud computing enables VGL users to access their data at higher resolutions and to undertake larger scale inversions, more complex models and simulations than their own local computing facilities might allow. Inside the Virtual Geophysics Laboratory, the user has access to a library of existing models, complete with exemplar workflows for specific scientific problems based on those models. For example, the user can load a geological model published by Geoscience Australia, apply a basic deformation workflow provided by a CSIRO scientist, and have it run in a scientific code from Monash. Finally the user can publish these results to share with a colleague or cite in a paper. This opens new opportunities for access and collaboration as all the resources (models, code, data, processing) are shared in the one virtual laboratory. VGL provides end users with access to an intuitive, user-centered interface that leverages cloud storage and cloud and cluster processing from both the research communities and commercial suppliers (e.g. Amazon). As the underlying data and information services are agnostic of the scientific domain, they can support many other data types. This fundamental characteristic results in a highly reusable virtual laboratory infrastructure that could also be used for example natural hazards, satellite processing, soil geochemistry, climate modeling, agriculture crop modeling.
McKinney, Bill; Meyer, Peter A; Crosas, Mercè; Sliz, Piotr
2017-01-01
Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements for structural biology datasets. In this paper, we describe one such extension-functionality supporting preservation of file system structure within Dataverse-which is essential for both in-place computation and supporting non-HTTP data transfers. © 2016 New York Academy of Sciences.
Automating CapCom: Pragmatic Operations and Technology Research for Human Exploration of Mars
NASA Technical Reports Server (NTRS)
Clancey, William J.
2003-01-01
During the Apollo program, NASA and the scientific community used terrestrial analog sites for understanding planetary features and for training astronauts to be scientists. More recently, computer scientists and human factors specialists have followed geologists and biologists into the field, learning how science is actually done on expeditions in extreme environments. Research stations have been constructed by the Mars Society in the Arctic and American southwest, providing facilities for hundreds of researchers to investigate how small crews might live and work on Mars. Combining these interests-science, operations, and technology-in Mars analog field expeditions provides tremendous synergy and authenticity to speculations about Mars missions. By relating historical analyses of Apollo and field science, engineers are creating experimental prototypes that provide significant new capabilities, such as a computer system that automates some of the functions of Apollo s CapCom. Thus, analog studies have created a community of practice-a new collaboration between scientists and engineers-so that technology begins with real human needs and works incrementally towards the challenges of the human exploration of Mars.
Maier-Hein, Lena; Mersmann, Sven; Kondermann, Daniel; Bodenstedt, Sebastian; Sanchez, Alexandro; Stock, Christian; Kenngott, Hannes Gotz; Eisenmann, Mathias; Speidel, Stefanie
2014-01-01
Machine learning algorithms are gaining increasing interest in the context of computer-assisted interventions. One of the bottlenecks so far, however, has been the availability of training data, typically generated by medical experts with very limited resources. Crowdsourcing is a new trend that is based on outsourcing cognitive tasks to many anonymous untrained individuals from an online community. In this work, we investigate the potential of crowdsourcing for segmenting medical instruments in endoscopic image data. Our study suggests that (1) segmentations computed from annotations of multiple anonymous non-experts are comparable to those made by medical experts and (2) training data generated by the crowd is of the same quality as that annotated by medical experts. Given the speed of annotation, scalability and low costs, this implies that the scientific community might no longer need to rely on experts to generate reference or training data for certain applications. To trigger further research in endoscopic image processing, the data used in this study will be made publicly available.
Computer Simulation for Emergency Incident Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, D L
2004-12-03
This report describes the findings and recommendations resulting from the Department of Homeland Security (DHS) Incident Management Simulation Workshop held by the DHS Advanced Scientific Computing Program in May 2004. This workshop brought senior representatives of the emergency response and incident-management communities together with modeling and simulation technologists from Department of Energy laboratories. The workshop provided an opportunity for incident responders to describe the nature and substance of the primary personnel roles in an incident response, to identify current and anticipated roles of modeling and simulation in support of incident response, and to begin a dialog between the incident responsemore » and simulation technology communities that will guide and inform planned modeling and simulation development for incident response. This report provides a summary of the discussions at the workshop as well as a summary of simulation capabilities that are relevant to incident-management training, and recommendations for the use of simulation in both incident management and in incident management training, based on the discussions at the workshop. In addition, the report discusses areas where further research and development will be required to support future needs in this area.« less
Novel Scientific Visualization Interfaces for Interactive Information Visualization and Sharing
NASA Astrophysics Data System (ADS)
Demir, I.; Krajewski, W. F.
2012-12-01
As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component to build comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools in the Iowa Flood Information System (IFIS), developed within the light of these challenges. The IFIS is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to and visualization of flood inundation maps, real-time flood conditions, flood forecasts both short-term and seasonal, and other flood-related data for communities in Iowa. The key element of the system's architecture is the notion of community. Locations of the communities, those near streams and rivers, define basin boundaries. The IFIS provides community-centric watershed and river characteristics, weather (rainfall) conditions, and streamflow data and visualization tools. Interactive interfaces allow access to inundation maps for different stage and return period values, and flooding scenarios with contributions from multiple rivers. Real-time and historical data of water levels, gauge heights, and rainfall conditions are available in the IFIS. 2D and 3D interactive visualizations in the IFIS make the data more understandable to general public. Users are able to filter data sources for their communities and selected rivers. The data and information on IFIS is also accessible through web services and mobile applications. The IFIS is optimized for various browsers and screen sizes to provide access through multiple platforms including tablets and mobile devices. Multiple view modes in the IFIS accommodate different user types from general public to researchers and decision makers by providing different level of tools and details. River view mode allows users to visualize data from multiple IFC bridge sensors and USGS stream gauges to follow flooding condition along a river. The IFIS will help communities make better-informed decisions on the occurrence of floods, and will alert communities in advance to help minimize damage of floods.
The configuration of the Brazilian scientific field.
Barata, Rita B; Aragão, Erika; de Sousa, Luis E P Fernandes; Santana, Taris M; Barreto, Mauricio L
2014-03-01
This article describes the configuration of the scientific field in Brazil, characterizing the scientific communities in every major area of knowledge in terms of installed capacity, ability to train new researchers, and capacity for academic production. Empirical data from several sources of information are used to characterize the different communities. Articulating the theoretical contributions of Pierre Bourdieu, Ludwik Fleck, and Thomas Kuhn, the following types of capital are analyzed for each community: social capital (scientific prestige), symbolic capital (dominant paradigm), political capital (leadership in S & T policy), and economic capital (resources). Scientific prestige is analyzed by taking into account the volume of production, activity index, citations, and other indicators. To characterize symbolic capital, the dominant paradigms that distinguish the natural sciences, the humanities, applied sciences, and technology development are analyzed theoretically. Political capital is measured by presidency in one of the main agencies in the S & T national system, and research resources and fellowships define the economic capital. The article discusses the composition of these different types of capital and their correspondence to structural capacities in various communities with the aim of describing the configuration of the Brazilian scientific field.
MSL: Facilitating automatic and physical analysis of published scientific literature in PDF format
Ahmed, Zeeshan; Dandekar, Thomas
2018-01-01
Published scientific literature contains millions of figures, including information about the results obtained from different scientific experiments e.g. PCR-ELISA data, microarray analysis, gel electrophoresis, mass spectrometry data, DNA/RNA sequencing, diagnostic imaging (CT/MRI and ultrasound scans), and medicinal imaging like electroencephalography (EEG), magnetoencephalography (MEG), echocardiography (ECG), positron-emission tomography (PET) images. The importance of biomedical figures has been widely recognized in scientific and medicine communities, as they play a vital role in providing major original data, experimental and computational results in concise form. One major challenge for implementing a system for scientific literature analysis is extracting and analyzing text and figures from published PDF files by physical and logical document analysis. Here we present a product line architecture based bioinformatics tool ‘Mining Scientific Literature (MSL)’, which supports the extraction of text and images by interpreting all kinds of published PDF files using advanced data mining and image processing techniques. It provides modules for the marginalization of extracted text based on different coordinates and keywords, visualization of extracted figures and extraction of embedded text from all kinds of biological and biomedical figures using applied Optimal Character Recognition (OCR). Moreover, for further analysis and usage, it generates the system’s output in different formats including text, PDF, XML and images files. Hence, MSL is an easy to install and use analysis tool to interpret published scientific literature in PDF format. PMID:29721305
CFD Vision 2030 Study: A Path to Revolutionary Computational Aerosciences
NASA Technical Reports Server (NTRS)
Slotnick, Jeffrey; Khodadoust, Abdollah; Alonso, Juan; Darmofal, David; Gropp, William; Lurie, Elizabeth; Mavriplis, Dimitri
2014-01-01
This report documents the results of a study to address the long range, strategic planning required by NASA's Revolutionary Computational Aerosciences (RCA) program in the area of computational fluid dynamics (CFD), including future software and hardware requirements for High Performance Computing (HPC). Specifically, the "Vision 2030" CFD study is to provide a knowledge-based forecast of the future computational capabilities required for turbulent, transitional, and reacting flow simulations across a broad Mach number regime, and to lay the foundation for the development of a future framework and/or environment where physics-based, accurate predictions of complex turbulent flows, including flow separation, can be accomplished routinely and efficiently in cooperation with other physics-based simulations to enable multi-physics analysis and design. Specific technical requirements from the aerospace industrial and scientific communities were obtained to determine critical capability gaps, anticipated technical challenges, and impediments to achieving the target CFD capability in 2030. A preliminary development plan and roadmap were created to help focus investments in technology development to help achieve the CFD vision in 2030.
Timpka, T
2001-08-01
In an analysis departing from the global health situation, the foundation for a change of paradigm in health informatics based on socially embedded information infrastructures and technologies is identified and discussed. It is shown how an increasing computing and data transmitting capacity can be employed for proactive health computing. As a foundation for ubiquitous health promotion and prevention of disease and injury, proactive health systems use data from multiple sources to supply individuals and communities evidence-based information on means to improve their state of health and avoid health risks. The systems are characterised by: (1) being profusely connected to the world around them, using perceptual interfaces, sensors and actuators; (2) responding to external stimuli at faster than human speeds; (3) networked feedback loops; and (4) humans remaining in control, while being left outside the primary computing loop. The extended scientific mission of this new partnership between computer science, electrical engineering and social medicine is suggested to be the investigation of how the dissemination of information and communication technology on democratic grounds can be made even more important for global health than sanitation and urban planning became a century ago.
Graphics processing units in bioinformatics, computational biology and systems biology.
Nobile, Marco S; Cazzaniga, Paolo; Tangherloni, Andrea; Besozzi, Daniela
2017-09-01
Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining an increasing attention by the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools. © The Author 2016. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Casajus, A.; Ciba, K.; Fernandez, V.; Graciani, R.; Hamar, V.; Mendez, V.; Poss, S.; Sapunov, M.; Stagni, F.; Tsaregorodtsev, A.; Ubeda, M.
2012-12-01
The DIRAC Project was initiated to provide a data processing system for the LHCb Experiment at CERN. It provides all the necessary functionality and performance to satisfy the current and projected future requirements of the LHCb Computing Model. A considerable restructuring of the DIRAC software was undertaken in order to turn it into a general purpose framework for building distributed computing systems that can be used by various user communities in High Energy Physics and other scientific application domains. The CLIC and ILC-SID detector projects started to use DIRAC for their data production system. The Belle Collaboration at KEK, Japan, has adopted the Computing Model based on the DIRAC system for its second phase starting in 2015. The CTA Collaboration uses DIRAC for the data analysis tasks. A large number of other experiments are starting to use DIRAC or are evaluating this solution for their data processing tasks. DIRAC services are included as part of the production infrastructure of the GISELA Latin America grid. Similar services are provided for the users of the France-Grilles and IBERGrid National Grid Initiatives in France and Spain respectively. The new communities using DIRAC started to provide important contributions to its functionality. Among recent additions can be mentioned the support of the Amazon EC2 computing resources as well as other Cloud management systems; a versatile File Replica Catalog with File Metadata capabilities; support for running MPI jobs in the pilot based Workload Management System. Integration with existing application Web Portals, like WS-PGRADE, is demonstrated. In this paper we will describe the current status of the DIRAC Project, recent developments of its framework and functionality as well as the status of the rapidly evolving community of the DIRAC users.
Toward mapping the biology of the genome.
Chanock, Stephen
2012-09-01
This issue of Genome Research presents new results, methods, and tools from The ENCODE Project (ENCyclopedia of DNA Elements), which collectively represents an important step in moving beyond a parts list of the genome and promises to shape the future of genomic research. This collection sheds light on basic biological questions and frames the current debate over the optimization of tools and methodological challenges necessary to compare and interpret large complex data sets focused on how the genome is organized and regulated. In a number of instances, the authors have highlighted the strengths and limitations of current computational and technical approaches, providing the community with useful standards, which should stimulate development of new tools. In many ways, these papers will ripple through the scientific community, as those in pursuit of understanding the "regulatory genome" will heavily traverse the maps and tools. Similarly, the work should have a substantive impact on how genetic variation contributes to specific diseases and traits by providing a compendium of functional elements for follow-up study. The success of these papers should not only be measured by the scope of the scientific insights and tools but also by their ability to attract new talent to mine existing and future data.
Methodology capture: discriminating between the "best" and the rest of community practice
Eales, James M; Pinney, John W; Stevens, Robert D; Robertson, David L
2008-01-01
Background The methodologies we use both enable and help define our research. However, as experimental complexity has increased the choice of appropriate methodologies has become an increasingly difficult task. This makes it difficult to keep track of available bioinformatics software, let alone the most suitable protocols in a specific research area. To remedy this we present an approach for capturing methodology from literature in order to identify and, thus, define best practice within a field. Results Our approach is to implement data extraction techniques on the full-text of scientific articles to obtain the set of experimental protocols used by an entire scientific discipline, molecular phylogenetics. Our methodology for identifying methodologies could in principle be applied to any scientific discipline, whether or not computer-based. We find a number of issues related to the nature of best practice, as opposed to community practice. We find that there is much heterogeneity in the use of molecular phylogenetic methods and software, some of which is related to poor specification of protocols. We also find that phylogenetic practice exhibits field-specific tendencies that have increased through time, despite the generic nature of the available software. We used the practice of highly published and widely collaborative researchers ("expert" researchers) to analyse the influence of authority on community practice. We find expert authors exhibit patterns of practice common to their field and therefore act as useful field-specific practice indicators. Conclusion We have identified a structured community of phylogenetic researchers performing analyses that are customary in their own local community and significantly different from those in other areas. Best practice information can help to bridge such subtle differences by increasing communication of protocols to a wider audience. We propose that the practice of expert authors from the field of evolutionary biology is the closest to contemporary best practice in phylogenetic experimental design. Capturing best practice is, however, a complex task and should also acknowledge the differences between fields such as the specific context of the analysis. PMID:18761740
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eric A. Wernert; William R. Sherman; Patrick O'Leary
Immersive visualization makes use of the medium of virtual reality (VR) - it is a subset of virtual reality focused on the application of VR technologies to scientific and information visualization. As the name implies, there is a particular focus on the physically immersive aspect of VR that more fully engages the perceptual and kinesthetic capabilities of the scientist with the goal of producing greater insight. The immersive visualization community is uniquely positioned to address the analysis needs of the wide spectrum of domain scientists who are becoming increasingly overwhelmed by data. The outputs of computational science simulations and high-resolutionmore » sensors are creating a data deluge. Data is coming in faster than it can be analyzed, and there are countless opportunities for discovery that are missed as the data speeds by. By more fully utilizing the scientists visual and other sensory systems, and by offering a more natural user interface with which to interact with computer-generated representations, immersive visualization offers great promise in taming this data torrent. However, increasing the adoption of immersive visualization in scientific research communities can only happen by simultaneously lowering the engagement threshold while raising the measurable benefits of adoption. Scientists time spent immersed with their data will thus be rewarded with higher productivity, deeper insight, and improved creativity. Immersive visualization ties together technologies and methodologies from a variety of related but frequently disjoint areas, including hardware, software and human-computer interaction (HCI) disciplines. In many ways, hardware is a solved problem. There are well established technologies including large walk-in systems such as the CAVE{trademark} and head-based systems such as the Wide-5{trademark}. The advent of new consumer-level technologies now enable an entirely new generation of immersive displays, with smaller footprints and costs, widening the potential consumer base. While one would be hard-pressed to call software a solved problem, we now understand considerably more about best practices for designing and developing sustainable, scalable software systems, and we have useful software examples that illuminate the way to even better implementations. As with any research endeavour, HCI will always be exploring new topics in interface design, but we now have a sizable knowledge base of the strengths and weaknesses of the human perceptual systems and we know how to design effective interfaces for immersive systems. So, in a research landscape with a clear need for better visualization and analysis tools, a methodology in immersive visualization that has been shown to effectively address some of those needs, and vastly improved supporting technologies and knowledge of hardware, software, and HCI, why hasn't immersive visualization 'caught on' more with scientists? What can we do as a community of immersive visualization researchers and practitioners to facilitate greater adoption by scientific communities so as to make the transition from 'the promise of virtual reality' to 'the reality of virtual reality'.« less
Introduction to the LaRC central scientific computing complex
NASA Technical Reports Server (NTRS)
Shoosmith, John N.
1993-01-01
The computers and associated equipment that make up the Central Scientific Computing Complex of the Langley Research Center are briefly described. The electronic networks that provide access to the various components of the complex and a number of areas that can be used by Langley and contractors staff for special applications (scientific visualization, image processing, software engineering, and grid generation) are also described. Flight simulation facilities that use the central computers are described. Management of the complex, procedures for its use, and available services and resources are discussed. This document is intended for new users of the complex, for current users who wish to keep appraised of changes, and for visitors who need to understand the role of central scientific computers at Langley.
[The Durkheim Test. Remarks on Susan Leigh Star's Boundary Objects].
Gießmann, Sebastian
2015-09-01
The article reconstructs Susan Leigh Star's conceptual work on the notion of 'boundary objects'. It traces the emergence of the concept, beginning with her PhD thesis and its publication as Regions of the Mind in 1989. 'Boundary objects' attempt to represent the distributed, multifold nature of scientific work and its mediations between different 'social worlds'. Being addressed to several 'communities of practice', the term responded to questions from Distributed Artificial Intelligence in Computer Science, Workplace Studies and Computer Supported Cooperative Work (CSCW), and microhistorical approaches inside the growing Science and Technology Studies. Yet the interdisciplinary character and interpretive flexibility of Star’s invention has rarely been noticed as a conceptual tool for media theory. I therefore propose to reconsider Star's 'Durkheim test' for sociotechnical media practices.
NASA Technical Reports Server (NTRS)
VanZandt, John
1994-01-01
The usage model of supercomputers for scientific applications, such as computational fluid dynamics (CFD), has changed over the years. Scientific visualization has moved scientists away from looking at numbers to looking at three-dimensional images, which capture the meaning of the data. This change has impacted the system models for computing. This report details the model which is used by scientists at NASA's research centers.
ERIC Educational Resources Information Center
Adams, Stephen T.
2004-01-01
Although one role of computers in science education is to help students learn specific science concepts, computers are especially intriguing as a vehicle for fostering the development of epistemological knowledge about the nature of scientific knowledge--what it means to "know" in a scientific sense (diSessa, 1985). In this vein, the…
Drinking policies and exercise-associated hyponatraemia: is anyone still promoting overdrinking?
Beltrami, F G; Hew-Butler, T; Noakes, T D
2008-10-01
The purpose of this review is to describe the evolution of hydration research and advice on drinking during exercise from published scientific papers, books and non-scientific material (advertisements and magazine contents) and detail how erroneous advice is likely propagated throughout the global sports medicine community. Hydration advice from sports-linked entities, the scientific community, exercise physiology textbooks and non-scientific sources was analysed historically and compared with the most recent scientific evidence. Drinking policies during exercise have changed substantially throughout history. Since the mid-1990s, however, there has been an increase in the promotion of overdrinking by athletes. While the scientific community is slowly moving away from "blanket" hydration advice in which one form of advice fits all and towards more modest, individualised, hydration guidelines in which thirst is recognised as the best physiological indicator of each subject's fluid needs during exercise, marketing departments of the global sports drink industry continue to promote overdrinking.
NASA Astrophysics Data System (ADS)
Christensen, C.; Summa, B.; Scorzelli, G.; Lee, J. W.; Venkat, A.; Bremer, P. T.; Pascucci, V.
2017-12-01
Massive datasets are becoming more common due to increasingly detailed simulations and higher resolution acquisition devices. Yet accessing and processing these huge data collections for scientific analysis is still a significant challenge. Solutions that rely on extensive data transfers are increasingly untenable and often impossible due to lack of sufficient storage at the client side as well as insufficient bandwidth to conduct such large transfers, that in some cases could entail petabytes of data. Large-scale remote computing resources can be useful, but utilizing such systems typically entails some form of offline batch processing with long delays, data replications, and substantial cost for any mistakes. Both types of workflows can severely limit the flexible exploration and rapid evaluation of new hypotheses that are crucial to the scientific process and thereby impede scientific discovery. In order to facilitate interactivity in both analysis and visualization of these massive data ensembles, we introduce a dynamic runtime system suitable for progressive computation and interactive visualization of arbitrarily large, disparately located spatiotemporal datasets. Our system includes an embedded domain-specific language (EDSL) that allows users to express a wide range of data analysis operations in a simple and abstract manner. The underlying runtime system transparently resolves issues such as remote data access and resampling while at the same time maintaining interactivity through progressive and interruptible processing. Computations involving large amounts of data can be performed remotely in an incremental fashion that dramatically reduces data movement, while the client receives updates progressively thereby remaining robust to fluctuating network latency or limited bandwidth. This system facilitates interactive, incremental analysis and visualization of massive remote datasets up to petabytes in size. Our system is now available for general use in the community through both docker and anaconda.
EPA uses high-end scientific computing, geospatial services and remote sensing/imagery analysis to support EPA's mission. The Center for Environmental Computing (CEC) assists the Agency's program offices and regions to meet staff needs in these areas.
NASA Astrophysics Data System (ADS)
Ortega-Rodríguez, M.; Solís-Sánchez, H.; Boza-Oviedo, E.; Chaves-Cruz, K.; Guevara-Bertsch, M.; Quirós-Rojas, M.; Vargas-Hernández, S.; Venegas-Li, A.
2017-04-01
We assess the scientific value of Oppenheimer's research on black holes in order to explain its neglect by the scientific community, and even by Oppenheimer himself. Looking closely at the scientific culture and conceptual belief system of the 1930s, the present article seeks to supplement the existing literature by enriching the explanations and complicating the guiding questions. We suggest a rereading of Oppenheimer as a figure both more intriguing for the history of astrophysics and further ahead of his time than is commonly supposed.
ERIC Educational Resources Information Center
Ledley, Tamara Shapiro; Taber, Michael R.; Lynds, Susan; Domenico, Ben; Dahlman, LuAnn
2012-01-01
Traditionally, there has been a large gap between the scientific and educational communities in terms of communication, which hinders the transfer of new scientific knowledge to teachers and students and the understanding of each other's needs and capabilities. In this paper, we describe a workshop model we have developed to facilitate…
Exploring "The World around Us" in a Community of Scientific Enquiry
ERIC Educational Resources Information Center
Dunlop, Lynda; Compton, Kirsty; Clarke, Linda; McKelvey-Martin, Valerie
2013-01-01
The primary Communities of Scientific Enquiry project is one element of the outreach work in Science in Society in Biomedical Sciences in partnership with the School of Education at the University of Ulster. The project aims to develop scientific understanding and skills at key stage 2 and is a response to several contemporary issues in primary…
ERIC Educational Resources Information Center
Estrada, Mica; Woodcock, Anna; Hernandez, Paul R.; Schultz, P. Wesley
2011-01-01
Students from several ethnic minority groups are underrepresented in the sciences, indicating that minority students more frequently drop out of the scientific career path than nonminority students. Viewed from a perspective of social influence, this pattern suggests that minority students do not integrate into the scientific community at the same…
Computer networking at SLR stations
NASA Technical Reports Server (NTRS)
Novotny, Antonin
1993-01-01
There are several existing communication methods to deliver data from the satellite laser ranging (SLR) station to the SLR data center and back: telephonmodem, telex, and computer networks. The SLR scientific community has been exploiting mainly INTERNET, BITNET/EARN, and SPAN. The total of 56 countries are connected to INTERNET and the number of nodes is exponentially growing. The computer networks mentioned above and others are connected through E-mail protocol. The scientific progress of SLR requires the increase of communication speed and the amount of the transmitted data. The TOPEX/POSEIDON test campaign required to deliver Quick Look data (1.7 kB/pass) from a SLR site to SLR data center within 8 hours and full rate data (up to 500 kB/pass) within 24 hours. We developed networking for the remote SLR station in Helwan, Egypt. The reliable scheme for data delivery consists of: compression of MERIT2 format (up to 89 percent), encoding to ASCII Me (files); and e-mail sending from SLR station--e-mail receiving, decoding, and decompression at the center. We do propose to use the ZIP method for compression/decompression and the UUCODE method for ASCII encoding/decoding. This method will be useful for stations connected via telephonemodems or commercial networks. The electronics delivery could solve the problem of the too late receiving of the FR data by SLR data center.
Computer networking at SLR stations
NASA Astrophysics Data System (ADS)
Novotny, Antonin
1993-06-01
There are several existing communication methods to deliver data from the satellite laser ranging (SLR) station to the SLR data center and back: telephonmodem, telex, and computer networks. The SLR scientific community has been exploiting mainly INTERNET, BITNET/EARN, and SPAN. The total of 56 countries are connected to INTERNET and the number of nodes is exponentially growing. The computer networks mentioned above and others are connected through E-mail protocol. The scientific progress of SLR requires the increase of communication speed and the amount of the transmitted data. The TOPEX/POSEIDON test campaign required to deliver Quick Look data (1.7 kB/pass) from a SLR site to SLR data center within 8 hours and full rate data (up to 500 kB/pass) within 24 hours. We developed networking for the remote SLR station in Helwan, Egypt. The reliable scheme for data delivery consists of: compression of MERIT2 format (up to 89 percent), encoding to ASCII Me (files); and e-mail sending from SLR station--e-mail receiving, decoding, and decompression at the center. We do propose to use the ZIP method for compression/decompression and the UUCODE method for ASCII encoding/decoding. This method will be useful for stations connected via telephonemodems or commercial networks. The electronics delivery could solve the problem of the too late receiving of the FR data by SLR data center.
Developing a Science Commons for Geosciences
NASA Astrophysics Data System (ADS)
Lenhardt, W. C.; Lander, H.
2016-12-01
Many scientific communities, recognizing the research possibilities inherent in data sets, have created domain specific archives such as the Incorporated Research Institutions for Seismology (iris.edu) and ClinicalTrials.gov. Though this is an important step forward, most scientists, including geoscientists, also use a variety of software tools and at least some amount of computation to conduct their research. While the archives make it simpler for scientists to locate the required data, provisioning disk space, compute resources, and network bandwidth can still require significant efforts. This challenge exists despite the wealth of resources available to researchers, namely lab IT resources, institutional IT resources, national compute resources (XSEDE, OSG), private clouds, public clouds, and the development of cyberinfrastructure technologies meant to facilitate use of those resources. Further tasks include obtaining and installing required tools for analysis and visualization. If the research effort is a collaboration or involves certain types of data, then the partners may well have additional non-scientific tasks such as securing the data and developing secure sharing methods for the data. These requirements motivate our investigations into the "Science Commons". This paper will present a working definition of a science commons, compare and contrast examples of existing science commons, and describe a project based at RENCI to implement a science commons for risk analytics. We will then explore what a similar tool might look like for the geosciences.
Using Java for distributed computing in the Gaia satellite data processing
NASA Astrophysics Data System (ADS)
O'Mullane, William; Luri, Xavier; Parsons, Paul; Lammers, Uwe; Hoar, John; Hernandez, Jose
2011-10-01
In recent years Java has matured to a stable easy-to-use language with the flexibility of an interpreter (for reflection etc.) but the performance and type checking of a compiled language. When we started using Java for astronomical applications around 1999 they were the first of their kind in astronomy. Now a great deal of astronomy software is written in Java as are many business applications. We discuss the current environment and trends concerning the language and present an actual example of scientific use of Java for high-performance distributed computing: ESA's mission Gaia. The Gaia scanning satellite will perform a galactic census of about 1,000 million objects in our galaxy. The Gaia community has chosen to write its processing software in Java. We explore the manifold reasons for choosing Java for this large science collaboration. Gaia processing is numerically complex but highly distributable, some parts being embarrassingly parallel. We describe the Gaia processing architecture and its realisation in Java. We delve into the astrometric solution which is the most advanced and most complex part of the processing. The Gaia simulator is also written in Java and is the most mature code in the system. This has been successfully running since about 2005 on the supercomputer "Marenostrum" in Barcelona. We relate experiences of using Java on a large shared machine. Finally we discuss Java, including some of its problems, for scientific computing.
Kretser, Alison; Murphy, Delia; Dwyer, Johanna
2017-01-01
ABSTRACT Scientific integrity is at the forefront of the scientific research enterprise. This paper provides an overview of key existing efforts on scientific integrity by federal agencies, foundations, nonprofit organizations, professional societies, and academia from 1989 to April 2016. It serves as a resource for the scientific community on scientific integrity work and helps to identify areas in which more action is needed. Overall, there is tremendous activity in this area and there are clear linkages among the efforts of the five sectors. All the same, scientific integrity needs to remain visible in the scientific community and evolve along with new research paradigms. High priority in instilling these values falls upon all stakeholders. PMID:27748637
Nektar++: An open-source spectral/ hp element framework
NASA Astrophysics Data System (ADS)
Cantwell, C. D.; Moxey, D.; Comerford, A.; Bolis, A.; Rocco, G.; Mengaldo, G.; De Grazia, D.; Yakovlev, S.; Lombard, J.-E.; Ekelschot, D.; Jordi, B.; Xu, H.; Mohamied, Y.; Eskilsson, C.; Nelson, B.; Vos, P.; Biotto, C.; Kirby, R. M.; Sherwin, S. J.
2015-07-01
Nektar++ is an open-source software framework designed to support the development of high-performance scalable solvers for partial differential equations using the spectral/ hp element method. High-order methods are gaining prominence in several engineering and biomedical applications due to their improved accuracy over low-order techniques at reduced computational cost for a given number of degrees of freedom. However, their proliferation is often limited by their complexity, which makes these methods challenging to implement and use. Nektar++ is an initiative to overcome this limitation by encapsulating the mathematical complexities of the underlying method within an efficient C++ framework, making the techniques more accessible to the broader scientific and industrial communities. The software supports a variety of discretisation techniques and implementation strategies, supporting methods research as well as application-focused computation, and the multi-layered structure of the framework allows the user to embrace as much or as little of the complexity as they need. The libraries capture the mathematical constructs of spectral/ hp element methods, while the associated collection of pre-written PDE solvers provides out-of-the-box application-level functionality and a template for users who wish to develop solutions for addressing questions in their own scientific domains.
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn; Zukor, Dorothy (Technical Monitor)
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation. while maintaining high performance across numerous supercomputer and workstation architectures. This document surveys numerous software frameworks for potential use in Earth science modeling. Several frameworks are evaluated in depth, including Parallel Object-Oriented Methods and Applications (POOMA), Cactus (from (he relativistic physics community), Overture, Goddard Earth Modeling System (GEMS), the National Center for Atmospheric Research Flux Coupler, and UCLA/UCB Distributed Data Broker (DDB). Frameworks evaluated in less detail include ROOT, Parallel Application Workspace (PAWS), and Advanced Large-Scale Integrated Computational Environment (ALICE). A host of other frameworks and related tools are referenced in this context. The frameworks are evaluated individually and also compared with each other.
Using the High-Level Based Program Interface to Facilitate the Large Scale Scientific Computing
Shang, Yizi; Shang, Ling; Gao, Chuanchang; Lu, Guiming; Ye, Yuntao; Jia, Dongdong
2014-01-01
This paper is to make further research on facilitating the large-scale scientific computing on the grid and the desktop grid platform. The related issues include the programming method, the overhead of the high-level program interface based middleware, and the data anticipate migration. The block based Gauss Jordan algorithm as a real example of large-scale scientific computing is used to evaluate those issues presented above. The results show that the high-level based program interface makes the complex scientific applications on large-scale scientific platform easier, though a little overhead is unavoidable. Also, the data anticipation migration mechanism can improve the efficiency of the platform which needs to process big data based scientific applications. PMID:24574931
NASA Astrophysics Data System (ADS)
Hwang, L.; Kellogg, L. H.
2017-12-01
Curation of software promotes discoverability and accessibility and works hand in hand with scholarly citation to ascribe value to, and provide recognition for software development. To meet this challenge, the Computational Infrastructure for Geodynamics (CIG) maintains a community repository built on custom and open tools to promote discovery, access, identification, credit, and provenance of research software for the geodynamics community. CIG (geodynamics.org) originated from recognition of the tremendous effort required to develop sound software and the need to reduce duplication of effort and to sustain community codes. CIG curates software across 6 domains and has developed and follows software best practices that include establishing test cases, documentation, and a citable publication for each software package. CIG software landing web pages provide access to current and past releases; many are also accessible through the CIG community repository on github. CIG has now developed abc - attribution builder for citation to enable software users to give credit to software developers. abc uses zenodo as an archive and as the mechanism to obtain a unique identifier (DOI) for scientific software. To assemble the metadata, we searched the software's documentation and research publications and then requested the primary developers to verify. In this process, we have learned that each development community approaches software attribution differently. The metadata gathered is based on guidelines established by groups such as FORCE11 and OntoSoft. The rollout of abc is gradual as developers are forward-looking, rarely willing to go back and archive prior releases in zenodo. Going forward all actively developed packages will utilize the zenodo and github integration to automate the archival process when a new release is issued. How to handle legacy software, multi-authored libraries, and assigning roles to software remain open issues.
NASA Astrophysics Data System (ADS)
Stevens, Rick
2008-07-01
The fourth annual Scientific Discovery through Advanced Computing (SciDAC) Conference was held June 13-18, 2008, in Seattle, Washington. The SciDAC conference series is the premier communitywide venue for presentation of results from the DOE Office of Science's interdisciplinary computational science program. Started in 2001 and renewed in 2006, the DOE SciDAC program is the country's - and arguably the world's - most significant interdisciplinary research program supporting the development of advanced scientific computing methods and their application to fundamental and applied areas of science. SciDAC supports computational science across many disciplines, including astrophysics, biology, chemistry, fusion sciences, and nuclear physics. Moreover, the program actively encourages the creation of long-term partnerships among scientists focused on challenging problems and computer scientists and applied mathematicians developing the technology and tools needed to address those problems. The SciDAC program has played an increasingly important role in scientific research by allowing scientists to create more accurate models of complex processes, simulate problems once thought to be impossible, and analyze the growing amount of data generated by experiments. To help further the research community's ability to tap into the capabilities of current and future supercomputers, Under Secretary for Science, Raymond Orbach, launched the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program in 2003. The INCITE program was conceived specifically to seek out computationally intensive, large-scale research projects with the potential to significantly advance key areas in science and engineering. The program encourages proposals from universities, other research institutions, and industry. During the first two years of the INCITE program, 10 percent of the resources at NERSC were allocated to INCITE awardees. However, demand for supercomputing resources far exceeded available systems; and in 2003, the Office of Science identified increasing computing capability by a factor of 100 as the second priority on its Facilities of the Future list. The goal was to establish leadership-class computing resources to support open science. As a result of a peer reviewed competition, the first leadership computing facility was established at Oak Ridge National Laboratory in 2004. A second leadership computing facility was established at Argonne National Laboratory in 2006. This expansion of computational resources led to a corresponding expansion of the INCITE program. In 2008, Argonne, Lawrence Berkeley, Oak Ridge, and Pacific Northwest national laboratories all provided resources for INCITE. By awarding large blocks of computer time on the DOE leadership computing facilities, the INCITE program enables the largest-scale computations to be pursued. In 2009, INCITE will award over half a billion node-hours of time. The SciDAC conference celebrates progress in advancing science through large-scale modeling and simulation. Over 350 participants attended this year's talks, poster sessions, and tutorials, spanning the disciplines supported by DOE. While the principal focus was on SciDAC accomplishments, this year's conference also included invited presentations and posters from DOE INCITE awardees. Another new feature in the SciDAC conference series was an electronic theater and video poster session, which provided an opportunity for the community to see over 50 scientific visualizations in a venue equipped with many high-resolution large-format displays. To highlight the growing international interest in petascale computing, this year's SciDAC conference included a keynote presentation by Herman Lederer from the Max Planck Institut, one of the leaders of DEISA (Distributed European Infrastructure for Supercomputing Applications) project and a member of the PRACE consortium, Europe's main petascale project. We also heard excellent talks from several European groups, including Laurent Gicquel of CERFACS, who spoke on `Large-Eddy Simulations of Turbulent Reacting Flows of Real Burners: Status and Challenges', and Jean-Francois Hamelin from EDF, who presented a talk on `Getting Ready for Petaflop Capacities and Beyond: A Utility Perspective'. Two other compelling addresses gave attendees a glimpse into the future. Tomas Diaz de la Rubia of Lawrence Livermore National Laboratory spoke on a vision for a fusion/fission hybrid reactor known as the `LIFE Engine' and discussed some of the materials and modeling challenges that need to be overcome to realize the vision for a 1000-year greenhouse-gas-free power source. Dan Reed from Microsoft gave a capstone talk on the convergence of technology, architecture, and infrastructure for cloud computing, data-intensive computing, and exascale computing (1018 flops/sec). High-performance computing is making rapid strides. The SciDAC community's computational resources are expanding dramatically. In the summer of 2008 the first general purpose petascale system (IBM Cell-based RoadRunner at Los Alamos National Laboratory) was recognized in the top 500 list of fastest machines heralding in the dawning of the petascale era. The DOE's leadership computing facility at Argonne reached number three on the Top 500 and is at the moment the most capable open science machine based on an IBM BG/P system with a peak performance of over 550 teraflops/sec. Later this year Oak Ridge is expected to deploy a 1 petaflops/sec Cray XT system. And even before the scientific community has had an opportunity to make significant use of petascale systems, the computer science research community is forging ahead with ideas and strategies for development of systems that may by the end of the next decade sustain exascale performance. Several talks addressed barriers to, and strategies for, achieving exascale capabilities. The last day of the conference was devoted to tutorials hosted by Microsoft Research at a new conference facility in Redmond, Washington. Over 90 people attended the tutorials, which covered topics ranging from an introduction to BG/P programming to advanced numerical libraries. The SciDAC and INCITE programs and the DOE Office of Advanced Scientific Computing Research core program investments in applied mathematics, computer science, and computational and networking facilities provide a nearly optimum framework for advancing computational science for DOE's Office of Science. At a broader level this framework also is benefiting the entire American scientific enterprise. As we look forward, it is clear that computational approaches will play an increasingly significant role in addressing challenging problems in basic science, energy, and environmental research. It takes many people to organize and support the SciDAC conference, and I would like to thank as many of them as possible. The backbone of the conference is the technical program; and the task of selecting, vetting, and recruiting speakers is the job of the organizing committee. I thank the members of this committee for all the hard work and the many tens of conference calls that enabled a wonderful program to be assembled. This year the following people served on the organizing committee: Jim Ahrens, LANL; David Bader, LLNL; Bryan Barnett, Microsoft; Peter Beckman, ANL; Vincent Chan, GA; Jackie Chen, SNL; Lori Diachin, LLNL; Dan Fay, Microsoft; Ian Foster, ANL; Mark Gordon, Ames; Mohammad Khaleel, PNNL; David Keyes, Columbia University; Bob Lucas, University of Southern California; Tony Mezzacappa, ORNL; Jeff Nichols, ORNL; David Nowak, ANL; Michael Papka, ANL; Thomas Schultess, ORNL; Horst Simon, LBNL; David Skinner, LBNL; Panagiotis Spentzouris, Fermilab; Bob Sugar, UCSB; and Kathy Yelick, LBNL. I owe a special thanks to Mike Papka and Jim Ahrens for handling the electronic theater. I also thank all those who submitted videos. It was a highly successful experiment. Behind the scenes an enormous amount of work is required to make a large conference go smoothly. First I thank Cheryl Zidel for her tireless efforts as organizing committee liaison and posters chair and, in general, handling all of my end of the program and keeping me calm. I also thank Gail Pieper for her work in editing the proceedings, Beth Cerny Patino for her work on the Organizing Committee website and electronic theater, and Ken Raffenetti for his work in keeping that website working. Jon Bashor and John Hules did an excellent job in handling conference communications. I thank Caitlin Youngquist for the striking graphic design; Dan Fay for tutorials arrangements; and Lynn Dory, Suzanne Stevenson, Sarah Pebelske and Sarah Zidel for on-site registration and conference support. We all owe Yeen Mankin an extra-special thanks for choosing the hotel, handling contracts, arranging menus, securing venues, and reassuring the chair that everything was under control. We are pleased to have obtained corporate sponsorship from Cray, IBM, Intel, HP, and SiCortex. I thank all the speakers and panel presenters. I also thank the former conference chairs Tony Metzzacappa, Bill Tang, and David Keyes, who were never far away for advice and encouragement. Finally, I offer my thanks to Michael Strayer, without whose leadership, vision, and persistence the SciDAC program would not have come into being and flourished. I am honored to be part of his program and his friend. Rick Stevens Seattle, Washington July 18, 2008
Defining Computational Thinking for Mathematics and Science Classrooms
ERIC Educational Resources Information Center
Weintrop, David; Beheshti, Elham; Horn, Michael; Orton, Kai; Jona, Kemi; Trouille, Laura; Wilensky, Uri
2016-01-01
Science and mathematics are becoming computational endeavors. This fact is reflected in the recently released Next Generation Science Standards and the decision to include "computational thinking" as a core scientific practice. With this addition, and the increased presence of computation in mathematics and scientific contexts, a new…
Semantic Web Compatible Names and Descriptions for Organisms
NASA Astrophysics Data System (ADS)
Wang, H.; Wilson, N.; McGuinness, D. L.
2012-12-01
Modern scientific names are critical for understanding the biological literature and provide a valuable way to understand evolutionary relationships. To validly publish a name, a description is required to separate the described group of organisms from those described by other names at the same level of the taxonomic hierarchy. The frequent revision of descriptions due to new evolutionary evidence has lead to situations where a single given scientific name may over time have multiple descriptions associated with it and a given published description may apply to multiple scientific names. Because of these many-to-many relationships between scientific names and descriptions, the usage of scientific names as a proxy for descriptions is inevitably ambiguous. Another issue lies in the fact that the precise application of scientific names often requires careful microscopic work, or increasingly, genetic sequencing, as scientific names are focused on the evolutionary relatedness between and within named groups such as species, genera, families, etc. This is problematic to many audiences, especially field biologists, who often do not have access to the instruments and tools required to make identifications on a microscopic or genetic basis. To better connect scientific names to descriptions and find a more convenient way to support computer assisted identification, we proposed the Semantic Vernacular System, a novel naming system that creates named, machine-interpretable descriptions for groups of organisms, and is compatible with the Semantic Web. Unlike the evolutionary relationship based scientific naming system, it emphasizes the observable features of organisms. By independently naming the descriptions composed of sets of observational features, as well as maintaining connections to scientific names, it preserves the observational data used to identify organisms. The system is designed to support a peer-review mechanism for creating new names, and uses a controlled vocabulary encoded in the Web Ontology Language to represent the observational features. A prototype of the system is currently under development in collaboration with the Mushroom Observer website. It allows users to propose new names and descriptions for fungi, provide feedback on those proposals, and ultimately have them formally approved. It relies on SPARQL queries and semantic reasoning for data management. This effort will offer the mycology community a knowledge base of fungal observational features and a tool for identifying fungal observations. It will also serve as an operational specification of how the Semantic Vernacular System can be used in practice in one scientific community (in this case mycology).
FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.
Wilson, Carolyn A; Simonyan, Vahan
2014-01-01
Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and bioinformatics tools to be able to store and analyze large amounts of data. To address this need, we have developed the High-performance Integrated Virtual Environment, or HIVE. HIVE uses a combination of distributed cloud storage and distributed cloud computations to provide a platform that is both rapid and responsive to support the growing and increasingly diverse scientific and regulatory needs of FDA scientists in their evaluation of NGS in research and ultimately for evaluation of NGS data in regulatory submissions. © PDA, Inc. 2014.
Challenges in integrating multidisciplinary data into a single e-infrastructure
NASA Astrophysics Data System (ADS)
Atakan, Kuvvet; Jeffery, Keith G.; Bailo, Daniele; Harrison, Matthew
2015-04-01
The European Plate Observing System (EPOS) aims to create a pan-European infrastructure for solid Earth science to support a safe and sustainable society. The mission of EPOS is to monitor and understand the dynamic and complex Earth system by relying on new e-science opportunities and integrating diverse and advanced Research Infrastructures in Europe for solid Earth Science. EPOS will enable innovative multidisciplinary research for a better understanding of the Earth's physical and chemical processes that control earthquakes, volcanic eruptions, ground instability and tsunami as well as the processes driving tectonics and Earth's surface dynamics. EPOS will improve our ability to better manage the use of the subsurface of the Earth. Through integration of data, models and facilities EPOS will allow the Earth Science community to make a step change in developing new concepts and tools for key answers to scientific and socio-economic questions concerning geo-hazards and geo-resources as well as Earth sciences applications to the environment and to human welfare. EPOS is now getting into its Implementation Phase (EPOS-IP). One of the main challenges during the implementation phase is the integration of multidisciplinary data into a single e-infrastructure. Multidisciplinary data are organized and governed by the Thematic Core Services (TCS) and are driven by various scientific communities encompassing a wide spectrum of Earth science disciplines. TCS data, data products and services will be integrated into a platform "the ICS system" that will ensure their interoperability and access to these services by the scientific community as well as other users within the society. This requires dedicated tasks for interactions with the various TCS-WPs, as well as the various distributed ICS (ICS-Ds), such as High Performance Computing (HPC) facilities, large scale data storage facilities, complex processing and visualization tools etc. Computational Earth Science (CES) services are identified as a transversal activity and as such need to be harmonized and provided within the ICS. In order to develop a metadata catalogue and the ICS system, the content from the entire spectrum of services included in TCS, ICS-Ds as well as CES activities, need to be organized in a systematic manner taking into account global and European IT-standards, while complying with the user needs and data provider requirements.
ERIC Educational Resources Information Center
Halbauer, Siegfried
1976-01-01
It was considered that students of intensive scientific Russian courses could learn vocabulary more efficiently if they were taught word stems and how to combine them with prefixes and suffixes to form scientific words. The computer programs developed to identify the most important stems is discussed. (Text is in German.) (FB)
Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division and Scientific Visualization Group
2018-05-07
Summer Lecture Series 2008: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in both experimental and computational sciences. Wes Bethel, who heads the Scientific Visualization Group in the Computational Research Division, presents an overview of visualization and computer graphics, current research challenges, and future directions for the field.
Partners in Science: A Suggested Framework for Inclusive Research
NASA Astrophysics Data System (ADS)
Pandya, R. E.
2012-12-01
Public participation in scientific research, also known as citizen science, is effective on many levels: it produces sound, publishable science and data, helps participants gain scientific knowledge and learn about the methods and practices of modern science, and can help communities advance their own priorities. Unfortunately, the demographics of citizen science programs do not reflect the demographics of the US; in general people of color and less affluent members of society are under-represented. To understand the reasons for this disparity, it is useful to look to the broader research about participation in science in a variety of informal and formal settings. From this research, the causes for unequal participation in science can be grouped into three broad categories: accessibility challenges, cultural differences, and a gap between scientific goals and community priorities. Many of these challenges are addressed in working with communities to develop an integrated program of scientific research, education, and community action that addresses community priorities and invites community participation at every stage of the process from defining the question to applying the results. In the spectrum of ways to engage the public in scientific research, this approach of "co-creation" is the most intensive. This talk will explore several examples of co-creation of science, including collaborations with tribal communities around climate change adaptation, work in the Louisiana Delta concerning land loss, and the link between weather and disease in Africa. We will articulate some of the challenges of working this intensively with communities, and suggest a general framework for guiding this kind of work with communities. This model of intensive collaboration at every stage is a promising one for adding to the diversity of citizen science efforts. It also provides a powerful strategy for science more generally, and may help us diversify our field, ensure the use and usability of our science, and help strengthen public support for and acceptance of scientific results.
Scientific Visualization, Seeing the Unseeable
LBNL
2017-12-09
June 24, 2008 Berkeley Lab lecture: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in bo... June 24, 2008 Berkeley Lab lecture: Scientific visualization transforms abstract data into readily comprehensible images, provide a vehicle for "seeing the unseeable," and play a central role in both experimental and computational sciences. Wes Bethel, who heads the Scientific Visualization Group in the Computational Research Division, presents an overview of visualization and computer graphics, current research challenges, and future directions for the field.
NCI's Transdisciplinary High Performance Scientific Data Platform
NASA Astrophysics Data System (ADS)
Evans, Ben; Antony, Joseph; Bastrakova, Irina; Car, Nicholas; Cox, Simon; Druken, Kelsey; Evans, Bradley; Fraser, Ryan; Ip, Alex; Kemp, Carina; King, Edward; Minchin, Stuart; Larraondo, Pablo; Pugh, Tim; Richards, Clare; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley
2016-04-01
The Australian National Computational Infrastructure (NCI) manages Earth Systems data collections sourced from several domains and organisations onto a single High Performance Data (HPD) Node to further Australia's national priority research and innovation agenda. The NCI HPD Node has rapidly established its value, currently managing over 10 PBytes of datasets from collections that span a wide range of disciplines including climate, weather, environment, geoscience, geophysics, water resources and social sciences. Importantly, in order to facilitate broad user uptake, maximise reuse and enable transdisciplinary access through software and standardised interfaces, the datasets, associated information systems and processes have been incorporated into the design and operation of a unified platform that NCI has called, the National Environmental Research Data Interoperability Platform (NERDIP). The key goal of the NERDIP is to regularise data access so that it is easily discoverable, interoperable for different domains and enabled for high performance methods. It adopts and implements international standards and data conventions, and promotes scientific integrity within a high performance computing and data analysis environment. NCI has established a rich and flexible computing environment to access to this data, through the NCI supercomputer; a private cloud that supports both domain focused virtual laboratories and in-common interactive analysis interfaces; as well as remotely through scalable data services. Data collections of this importance must be managed with careful consideration of both their current use and the needs of the end-communities, as well as its future potential use, such as transitioning to more advanced software and improved methods. It is therefore critical that the data platform is both well-managed and trusted for stable production use (including transparency and reproducibility), agile enough to incorporate new technological advances and additional communities practices, and a foundation for new exploratory developments. To that end, NCI is already participating in numerous current and emerging collaborations internationally including the Earth System Grid Federation (ESGF); Climate and Weather Data from International agencies such as NASA, NOAA, and UK Met Office; Remotely Sensed Satellite Earth Imaging through collaborations through GEOS and CEOS; EU-led Ocean Data Interoperability Platform (ODIP) and Horizon2020 Earth Server2 project; as well as broader data infrastructure community activities such as Research Data Alliance (RDA). Each research community is heavily engaged in international standards such as ISO, OGC and W3C, adopting community-led conventions for data, supporting improved data organisation such as controlled vocabularies, and creating workflows that use mature APIs and data services. NCI is engaging with these communities on NERDIP to ensure that such standards are applied uniformly and tested in practice by working with the variety of data and technologies. This includes benchmarking exemplar cases from individual communities, documenting their use of standards, and evaluating their practical use of the different technologies. Such a process fully establishes the functionality and performance, and is required to safely transition when improvements or rationalisation is required. Work is now underway to extend the NERDIP platform for better utilisation in the subsurface geophysical community, including maximizing national uptake, as well as better integration with international science platforms.
plasmaFoam: An OpenFOAM framework for computational plasma physics and chemistry
NASA Astrophysics Data System (ADS)
Venkattraman, Ayyaswamy; Verma, Abhishek Kumar
2016-09-01
As emphasized in the 2012 Roadmap for low temperature plasmas (LTP), scientific computing has emerged as an essential tool for the investigation and prediction of the fundamental physical and chemical processes associated with these systems. While several in-house and commercial codes exist, with each having its own advantages and disadvantages, a common framework that can be developed by researchers from all over the world will likely accelerate the impact of computational studies on advances in low-temperature plasma physics and chemistry. In this regard, we present a finite volume computational toolbox to perform high-fidelity simulations of LTP systems. This framework, primarily based on the OpenFOAM solver suite, allows us to enhance our understanding of multiscale plasma phenomenon by performing massively parallel, three-dimensional simulations on unstructured meshes using well-established high performance computing tools that are widely used in the computational fluid dynamics community. In this talk, we will present preliminary results obtained using the OpenFOAM-based solver suite with benchmark three-dimensional simulations of microplasma devices including both dielectric and plasma regions. We will also discuss the future outlook for the solver suite.
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.
2014-12-01
The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platformThe data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that software is both upgradable and maintainable, and can be readily reused with complexly integrated systems and become part of the growing global trusted community tools for cross-disciplinary research.
STREAM2016: Streaming Requirements, Experience, Applications and Middleware Workshop
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fox, Geoffrey; Jha, Shantenu; Ramakrishnan, Lavanya
The Department of Energy (DOE) Office of Science (SC) facilities including accelerators, light sources and neutron sources and sensors that study, the environment, and the atmosphere, are producing streaming data that needs to be analyzed for next-generation scientific discoveries. There has been an explosion of new research and technologies for stream analytics arising from the academic and private sectors. However, there has been no corresponding effort in either documenting the critical research opportunities or building a community that can create and foster productive collaborations. The two-part workshop series, STREAM: Streaming Requirements, Experience, Applications and Middleware Workshop (STREAM2015 and STREAM2016), weremore » conducted to bring the community together and identify gaps and future efforts needed by both NSF and DOE. This report describes the discussions, outcomes and conclusions from STREAM2016: Streaming Requirements, Experience, Applications and Middleware Workshop, the second of these workshops held on March 22-23, 2016 in Tysons, VA. STREAM2016 focused on the Department of Energy (DOE) applications, computational and experimental facilities, as well software systems. Thus, the role of “streaming and steering” as a critical mode of connecting the experimental and computing facilities was pervasive through the workshop. Given the overlap in interests and challenges with industry, the workshop had significant presence from several innovative companies and major contributors. The requirements that drive the proposed research directions, identified in this report, show an important opportunity for building competitive research and development program around streaming data. These findings and recommendations are consistent with vision outlined in NRC Frontiers of Data and National Strategic Computing Initiative (NCSI) [1, 2]. The discussions from the workshop are captured as topic areas covered in this report's sections. The report discusses four research directions driven by current and future application requirements reflecting the areas identified as important by STREAM2016. These include (i) Algorithms, (ii) Programming Models, Languages and Runtime Systems (iii) Human-in-the-loop and Steering in Scientific Workflow and (iv) Facilities.« less
NASA Astrophysics Data System (ADS)
DeLong, S.; Troch, P. A.; Barron-Gafford, G. A.; Huxman, T. E.; Pelletier, J. D.; Dontsova, K.; Niu, G.; Chorover, J.; Zeng, X.
2012-12-01
To meet the challenge of predicting landscape-scale changes in Earth system behavior, the University of Arizona has designed and constructed a new large-scale and community-oriented scientific facility - the Landscape Evolution Observatory (LEO). The primary scientific objectives are to quantify interactions among hydrologic partitioning, geochemical weathering, ecology, microbiology, atmospheric processes, and geomorphic change associated with incipient hillslope development. LEO consists of three identical, sloping, 333 m2 convergent landscapes inside a 5,000 m2 environmentally controlled facility. These engineered landscapes contain 1 meter of basaltic tephra ground to homogenous loamy sand and contains a spatially dense sensor and sampler network capable of resolving meter-scale lateral heterogeneity and sub-meter scale vertical heterogeneity in moisture, energy and carbon states and fluxes. Each ~1000 metric ton landscape has load cells embedded into the structure to measure changes in total system mass with 0.05% full-scale repeatability (equivalent to less than 1 cm of precipitation), to facilitate better quantification of evapotraspiration. Each landscape has an engineered rain system that allows application of precipitation at rates between3 and 45 mm/hr. These landscapes are being studied in replicate as "bare soil" for an initial period of several years. After this initial phase, heat- and drought-tolerant vascular plant communities will be introduced. Introduction of vascular plants is expected to change how water, carbon, and energy cycle through the landscapes, with potentially dramatic effects on co-evolution of the physical and biological systems. LEO also provides a physical comparison to computer models that are designed to predict interactions among hydrological, geochemical, atmospheric, ecological and geomorphic processes in changing climates. These computer models will be improved by comparing their predictions to physical measurements made in LEO. The main focus of our iterative modeling and measurement discovery cycle is to use rapid data assimilation to facilitate validation of newly coupled open-source Earth systems models. LEO will be a community resource for Earth system science research, education, and outreach. The LEO project operational philosophy includes 1) open and real-time availability of sensor network data, 2) a framework for community collaboration and facility access that includes integration of new or comparative measurement capabilities into existing facility cyberinfrastructure, 3) community-guided science planning and 4) development of novel education and outreach programs.Artistic rendering of the University of Arizona Landscape Evolution Observatory
How Does the Scientific Community Contribute to Gene Ontology?
Lovering, Ruth C
2017-01-01
Collaborations between the scientific community and members of the Gene Ontology (GO) Consortium have led to an increase in the number and specificity of GO terms, as well as increasing the number of GO annotations. A variety of approaches have been taken to encourage research scientists to contribute to the GO, but the success of these approaches has been variable. This chapter reviews both the successes and failures of engaging the scientific community in GO development and annotation, as well as, providing motivation and advice to encourage individual researchers to contribute to GO.
Steponas Kolupaila's contribution to hydrological science development
NASA Astrophysics Data System (ADS)
Valiuškevičius, Gintaras
2017-08-01
Steponas Kolupaila (1892-1964) was an important figure in 20th century hydrology and one of the pioneers of scientific water gauging in Europe. His research on the reliability of hydrological data and measurement methods was particularly important and contributed to the development of empirical hydrological calculation methods. Kolupaila was one of the first who standardised water-gauging methods internationally. He created several original hydrological and hydraulic calculation methods (his discharge assessment method for winter period was particularly significant). His innate abilities and frequent travel made Kolupaila a universal specialist in various fields and an active public figure. He revealed his multilayered scientific and cultural experiences in his most famous book, Bibliography of Hydrometry. This book introduced the unique European hydrological-measurement and computation methods to the community of world hydrologists at that time and allowed the development and adaptation of these methods across the world.
NASA Astrophysics Data System (ADS)
Wilson, B. D.; McGibbney, L. J.; Mattmann, C. A.; Ramirez, P.; Joyce, M.; Whitehall, K. D.
2015-12-01
Quantifying scientific relevancy is of increasing importance to NASA and the research community. Scientific relevancy may be defined by mapping the impacts of a particular NASA mission, instrument, and/or retrieved variables to disciplines such as climate predictions, natural hazards detection and mitigation processes, education, and scientific discoveries. Related to relevancy, is the ability to expose data with similar attributes. This in turn depends upon the ability for us to extract latent, implicit document features from scientific data and resources and make them explicit, accessible and useable for search activities amongst others. This paper presents MemexGATE; a server side application, command line interface and computing environment for running large scale metadata extraction, general architecture text engineering, document classification and indexing tasks over document resources such as social media streams, scientific literature archives, legal documentation, etc. This work builds on existing experiences using MemexGATE (funded, developed and validated through the DARPA Memex Progrjam PI Mattmann) for extracting and leveraging latent content features from document resources within the Materials Research domain. We extend the software functionality capability to the domain of scientific literature with emphasis on the expansion of gazetteer lists, named entity rules, natural language construct labeling (e.g. synonym, antonym, hyponym, etc.) efforts to enable extraction of latent content features from data hosted by wide variety of scientific literature vendors (AGU Meeting Abstract Database, Springer, Wiley Online, Elsevier, etc.) hosting earth science literature. Such literature makes both implicit and explicit references to NASA datasets and relationships between such concepts stored across EOSDIS DAAC's hence we envisage that a significant part of this effort will also include development and understanding of relevancy signals which can ultimately be utilized for improved search and relevancy ranking across scientific literature.
Near Real-time Scientific Data Analysis and Visualization with the ArcGIS Platform
NASA Astrophysics Data System (ADS)
Shrestha, S. R.; Viswambharan, V.; Doshi, A.
2017-12-01
Scientific multidimensional data are generated from a variety of sources and platforms. These datasets are mostly produced by earth observation and/or modeling systems. Agencies like NASA, NOAA, USGS, and ESA produce large volumes of near real-time observation, forecast, and historical data that drives fundamental research and its applications in larger aspects of humanity from basic decision making to disaster response. A common big data challenge for organizations working with multidimensional scientific data and imagery collections is the time and resources required to manage and process such large volumes and varieties of data. The challenge of adopting data driven real-time visualization and analysis, as well as the need to share these large datasets, workflows, and information products to wider and more diverse communities, brings an opportunity to use the ArcGIS platform to handle such demand. In recent years, a significant effort has put in expanding the capabilities of ArcGIS to support multidimensional scientific data across the platform. New capabilities in ArcGIS to support scientific data management, processing, and analysis as well as creating information products from large volumes of data using the image server technology are becoming widely used in earth science and across other domains. We will discuss and share the challenges associated with big data by the geospatial science community and how we have addressed these challenges in the ArcGIS platform. We will share few use cases, such as NOAA High Resolution Refresh Radar (HRRR) data, that demonstrate how we access large collections of near real-time data (that are stored on-premise or on the cloud), disseminate them dynamically, process and analyze them on-the-fly, and serve them to a variety of geospatial applications. We will also share how on-the-fly processing using raster functions capabilities, can be extended to create persisted data and information products using raster analytics capabilities that exploit distributed computing in an enterprise environment.
NASA Astrophysics Data System (ADS)
Foglini, F.
2016-12-01
The EVER-EST project aims to develop a generic Virtual Research Environment (VRE) tailored to the needs and validated by the Earth Science domain. To achieve this the EVER-EST VRE provides earth scientists with the means to seamlessly manage both the data involved in their computationally intensive disciplines and the scientific methods applied in their observations and modellings, which lead to the specific results that need to be attributable, validated and shared within the community e.g. in the form of scholarly communications. Central to this approach is the concept of Research Objects (ROs) as semantically rich aggregations of resources that bring together data, methods and people in scientific investigations. ROs enable the creation of digital artifacts that can encapsulate scientific knowledge and provide a mechanism for sharing and discovering assets of reusable research and scientific assets as first-class citizens. The EVER-EST VRE is the first RO-centric native infrastructure leveraging the notion of ROs and their application in observational rather than experimental disciplines and particularly in Earth Science. The Institute of MARine Science (ISMAR-CNR) is a scientific partner of the EVER-EST project providing useful and applicable contributions to the identification and definition of variables indicated by the European Commission in the Marine Strategy Framework Directive (MSFD) to achieve the Good Environment Status (GES). The VRC is willing to deliver practical methods, procedures and protocols to support coherent and widely accepted interpretation of the MSFD. The use case deal with 1. the Posidonia meadows along the Apulian coast, 2. the deep-sea corals along the Apulian continenatal slope and 3. the jellyfish abundance in the Italian water. The SeaMonitoring VRC created specific RO for asesing deep sea corals suitabilty, Posidonia meadows occurrences and for detecting jelly fish density aloing the italian coast. The VRC developed specific RO for bathymetric data implementing a data preservation plan and a specific vocabulary for metadata.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of manymore » computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.« less
Technical support for Life Sciences communities on a production grid infrastructure.
Michel, Franck; Montagnat, Johan; Glatard, Tristan
2012-01-01
Production operation of large distributed computing infrastructures (DCI) still requires a lot of human intervention to reach acceptable quality of service. This may be achievable for scientific communities with solid IT support, but it remains a show-stopper for others. Some application execution environments are used to hide runtime technical issues from end users. But they mostly aim at fault-tolerance rather than incident resolution, and their operation still requires substantial manpower. A longer-term support activity is thus needed to ensure sustained quality of service for Virtual Organisations (VO). This paper describes how the biomed VO has addressed this challenge by setting up a technical support team. Its organisation, tooling, daily tasks, and procedures are described. Results are shown in terms of resource usage by end users, amount of reported incidents, and developed software tools. Based on our experience, we suggest ways to measure the impact of the technical support, perspectives to decrease its human cost and make it more community-specific.
agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update
Tian, Tian; Liu, Yue; Yan, Hengyu; You, Qi; Yi, Xin; Du, Zhou
2017-01-01
Abstract The agriGO platform, which has been serving the scientific community for >10 years, specifically focuses on gene ontology (GO) enrichment analyses of plant and agricultural species. We continuously maintain and update the databases and accommodate the various requests of our global users. Here, we present our updated agriGO that has a largely expanded number of supporting species (394) and datatypes (865). In addition, a larger number of species have been classified into groups covering crops, vegetables, fish, birds and insects closely related to the agricultural community. We further improved the computational efficiency, including the batch analysis and P-value distribution (PVD), and the user-friendliness of the web pages. More visualization features were added to the platform, including SEACOMPARE (cross comparison of singular enrichment analysis), direct acyclic graph (DAG) and Scatter Plots, which can be merged by choosing any significant GO term. The updated platform agriGO v2.0 is now publicly accessible at http://systemsbiology.cau.edu.cn/agriGOv2/. PMID:28472432
Blueprint for a microwave trapped ion quantum computer
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.
2017-01-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Exploring Cloud Computing for Large-scale Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Guang; Han, Binh; Yin, Jian
This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on-demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often just requires a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amount of data in the terabyte or even petabyte range and require special high performance hardware with low latency connections to complete computation in a reasonable amount of time. To address thesemore » challenges, we build an infrastructure that can dynamically select high performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a system biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.« less
Scientific American Inventions From Outer Space: Everyday Uses For NASA Technology
NASA Technical Reports Server (NTRS)
Baker, David
2000-01-01
The purpose of this book is to present some of the inventions highlighted in the yearly publication of the National Aeronautics and Space Administration (NASA) Spinoff. These inventions cover a wide range, some of which include improvements in health, medicine, public safety, energy, environment, resource management, computer technology, automation, construction, transportation, and manufacturing technology. NASA technology has brought forth thousands of commercial products which include athletic shoes, portable x-ray machines, and scratch-resistant sunglasses, guidance systems, lasers, solar power, robotics and prosthetic devices. These products are examples of NASA research innovations which have positively impacted the community.
NASA Astrophysics Data System (ADS)
Siontorou, Christina G.
2012-12-01
Herbal products have gained increasing popularity in the last decades, and are now broadly used to treat illness and improve health. Notwithstanding the public opinion, both, safety and efficacy, are major sources of dispute among the scientific community, mainly due to lack of (or scarcity or scattered) conclusive data linking a herbal constituent to pharmacological action in vivo, in a way that benefit overrides risk. This paper presents a methodological framework for addressing natural medicine in a systematic and holistic way with a view to providing medicinal products based on interactive chemical/herbal ingredients.
Deserts in the Deluge: TerraPopulus and Big Human-Environment Data.
Manson, S M; Kugler, T A; Haynes, D
2016-01-01
Terra Populus, or TerraPop, is a cyberinfrastructure project that integrates, preserves, and disseminates massive data collections describing characteristics of the human population and environment over the last six decades. TerraPop has made a number of GIScience advances in the handling of big spatial data to make information interoperable between formats and across scientific communities. In this paper, we describe challenges of these data, or 'deserts in the deluge' of data, that are common to spatial big data more broadly, and explore computational solutions specific to microdata, raster, and vector data models.
Graded Alternating-Time Temporal Logic
NASA Astrophysics Data System (ADS)
Faella, Marco; Napoli, Margherita; Parente, Mimmo
Graded modalities enrich the universal and existential quantifiers with the capability to express the concept of at least k or all but k, for a non-negative integer k. Recently, temporal logics such as μ-calculus and Computational Tree Logic, Ctl, augmented with graded modalities have received attention from the scientific community, both from a theoretical side and from an applicative perspective. Both μ-calculus and Ctl naturally apply as specification languages for closed systems: in this paper, we add graded modalities to the Alternating-time Temporal Logic (Atl) introduced by Alur et al., to study how these modalities may affect specification languages for open systems.
NASA Astrophysics Data System (ADS)
Hock, Emily; Sharp, Zoe
2016-03-01
Aspiring teachers and current teachers can gain insight about the scientific community through hands-on experience. As America's standards for elementary school and middle school become more advanced, future and current teachers must gain hands-on experience in the scientific community. For a teacher to be fully capable of teaching all subjects, they must be comfortable in the content areas, equipped to answer questions, and able to pass on their knowledge. Hands-on research experiences, like the Summer Astronomy Research Experience at California Polytechnic University, pair liberal studies students with a cooperative group of science students and instructors with the goal of doing research that benefits the scientific community and deepens the team members' perception of the scientific community. Teachers are then able to apply the basic research process in their classrooms, inspire students to do real life science, and understand the processes scientists' undergo in their workplace.
Discovering Network Structure Beyond Communities
NASA Astrophysics Data System (ADS)
Nishikawa, Takashi; Motter, Adilson E.
2011-11-01
To understand the formation, evolution, and function of complex systems, it is crucial to understand the internal organization of their interaction networks. Partly due to the impossibility of visualizing large complex networks, resolving network structure remains a challenging problem. Here we overcome this difficulty by combining the visual pattern recognition ability of humans with the high processing speed of computers to develop an exploratory method for discovering groups of nodes characterized by common network properties, including but not limited to communities of densely connected nodes. Without any prior information about the nature of the groups, the method simultaneously identifies the number of groups, the group assignment, and the properties that define these groups. The results of applying our method to real networks suggest the possibility that most group structures lurk undiscovered in the fast-growing inventory of social, biological, and technological networks of scientific interest.
An Overview of the Computational Physics and Methods Group at Los Alamos National Laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
CCS Division was formed to strengthen the visibility and impact of computer science and computational physics research on strategic directions for the Laboratory. Both computer science and computational science are now central to scientific discovery and innovation. They have become indispensable tools for all other scientific missions at the Laboratory. CCS Division forms a bridge between external partners and Laboratory programs, bringing new ideas and technologies to bear on today’s important problems and attracting high-quality technical staff members to the Laboratory. The Computational Physics and Methods Group CCS-2 conducts methods research and develops scientific software aimed at the latest andmore » emerging HPC systems.« less
Grid today, clouds on the horizon
NASA Astrophysics Data System (ADS)
Shiers, Jamie
2009-04-01
By the time of CCP 2008, the largest scientific machine in the world - the Large Hadron Collider - had been cooled down as scheduled to its operational temperature of below 2 degrees Kelvin and injection tests were starting. Collisions of proton beams at 5+5 TeV were expected within one to two months of the initial tests, with data taking at design energy ( 7+7 TeV) foreseen for 2009. In order to process the data from this world machine, we have put our "Higgs in one basket" - that of Grid computing [The Worldwide LHC Computing Grid (WLCG), in: Proceedings of the Conference on Computational Physics 2006 (CCP 2006), vol. 177, 2007, pp. 219-223]. After many years of preparation, 2008 saw a final "Common Computing Readiness Challenge" (CCRC'08) - aimed at demonstrating full readiness for 2008 data taking, processing and analysis. By definition, this relied on a world-wide production Grid infrastructure. But change - as always - is on the horizon. The current funding model for Grids - which in Europe has been through 3 generations of EGEE projects, together with related projects in other parts of the world, including South America - is evolving towards a long-term, sustainable e-infrastructure, like the European Grid Initiative (EGI) [The European Grid Initiative Design Study, website at http://web.eu-egi.eu/]. At the same time, potentially new paradigms, such as that of "Cloud Computing" are emerging. This paper summarizes the results of CCRC'08 and discusses the potential impact of future Grid funding on both regional and international application communities. It contrasts Grid and Cloud computing models from both technical and sociological points of view. Finally, it discusses the requirements from production application communities, in terms of stability and continuity in the medium to long term.
Computations and interpretations: The growth of quantum chemistry, 1927-1967
NASA Astrophysics Data System (ADS)
Park, Buhm Soon
1999-10-01
This dissertation is a contribution to the historical study of scientific disciplines in the twentieth century. It seeks to examine the development of quantum chemistry during the four decades after its inception in 1927. This development was manifest in theories, tools, scientists, and institutions, all of which constituted the disciplinary identity of quantum chemistry. To characterize its identity, I deal with the origins of key ideas and concepts; the change of computational tools from desk calculators to digital computers; the formation of a network among research groups and individuals; and the institutionalization of annual meetings. The dissertation's thesis is three-fold. First, in the pre- World War II years, there were individual contributions to the development of theories in quantum chemistry, but the founding fathers worked in their disciplinary contexts of physics or chemistry with little interest in building a quantum chemistry community. Second, the introduction of electronic digital computers in the postwar years affected the resurgence of the ab initio approach-the attempt to solve the Schrödinger equation without recourse to empirical data-and also the emergence of a community of quantum chemists. But the use of computers did not give rise to a consensus over the aims, methods, or content of the discipline. Third, quantum chemistry exerted a significant influence upon the transformation of chemical education and research in general, thanks to ``chemical translators,'' who sought to explain the gist of quantum chemistry in a language that chemists could understand. In sum, quantum chemistry has been a discipline characterized by diverse traditions, and the whole of chemistry has been under the influence of computations and interpretations made by quantum chemists.
On the design of computer-based models for integrated environmental science.
McIntosh, Brian S; Jeffrey, Paul; Lemon, Mark; Winder, Nick
2005-06-01
The current research agenda in environmental science is dominated by calls to integrate science and policy to better understand and manage links between social (human) and natural (nonhuman) processes. Freshwater resource management is one area where such calls can be heard. Designing computer-based models for integrated environmental science poses special challenges to the research community. At present it is not clear whether such tools, or their outputs, receive much practical policy or planning application. It is argued that this is a result of (1) a lack of appreciation within the research modeling community of the characteristics of different decision-making processes including policy, planning, and (2) participation, (3) a lack of appreciation of the characteristics of different decision-making contexts, (4) the technical difficulties in implementing the necessary support tool functionality, and (5) the socio-technical demands of designing tools to be of practical use. This article presents a critical synthesis of ideas from each of these areas and interprets them in terms of design requirements for computer-based models being developed to provide scientific information support for policy and planning. Illustrative examples are given from the field of freshwater resources management. Although computer-based diagramming and modeling tools can facilitate processes of dialogue, they lack adequate simulation capabilities. Component-based models and modeling frameworks provide such functionality and may be suited to supporting problematic or messy decision contexts. However, significant technical (implementation) and socio-technical (use) challenges need to be addressed before such ambition can be realized.
Computers and Computation. Readings from Scientific American.
ERIC Educational Resources Information Center
Fenichel, Robert R.; Weizenbaum, Joseph
A collection of articles from "Scientific American" magazine has been put together at this time because the current period in computer science is one of consolidation rather than innovation. A few years ago, computer science was moving so swiftly that even the professional journals were more archival than informative; but today it is…
NASA Astrophysics Data System (ADS)
Anantharaj, Valentine; Norman, Matthew; Evans, Katherine; Taylor, Mark; Worley, Patrick; Hack, James; Mayer, Benjamin
2014-05-01
During 2013, high-resolution climate model simulations accounted for over 100 million "core hours" using Titan at the Oak Ridge Leadership Computing Facility (OLCF). The suite of climate modeling experiments, primarily using the Community Earth System Model (CESM) at nearly 0.25 degree horizontal resolution, generated over a petabyte of data and nearly 100,000 files, ranging in sizes from 20 MB to over 100 GB. Effective utilization of leadership class resources requires careful planning and preparation. The application software, such as CESM, need to be ported, optimized and benchmarked for the target platform in order to meet the computational readiness requirements. The model configuration needs to be "tuned and balanced" for the experiments. This can be a complicated and resource intensive process, especially for high-resolution configurations using complex physics. The volume of I/O also increases with resolution; and new strategies may be required to manage I/O especially for large checkpoint and restart files that may require more frequent output for resiliency. It is also essential to monitor the application performance during the course of the simulation exercises. Finally, the large volume of data needs to be analyzed to derive the scientific results; and appropriate data and information delivered to the stakeholders. Titan is currently the largest supercomputer available for open science. The computational resources, in terms of "titan core hours" are allocated primarily via the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) and ASCR Leadership Computing Challenge (ALCC) programs, both sponsored by the U.S. Department of Energy (DOE) Office of Science. Titan is a Cray XK7 system, capable of a theoretical peak performance of over 27 PFlop/s, consists of 18,688 compute nodes, with a NVIDIA Kepler K20 GPU and a 16-core AMD Opteron CPU in every node, for a total of 299,008 Opteron cores and 18,688 GPUs offering a cumulative 560,640 equivalent cores. Scientific applications, such as CESM, are also required to demonstrate a "computational readiness capability" to efficiently scale across and utilize 20% of the entire system. The 0,25 deg configuration of the spectral element dynamical core of the Community Atmosphere Model (CAM-SE), the atmospheric component of CESM, has been demonstrated to scale efficiently across more than 5,000 nodes (80,000 CPU cores) on Titan. The tracer transport routines of CAM-SE have also been ported to take advantage of the hybrid many-core architecture of Titan using GPUs [see EGU2014-4233], yielding over 2X speedup when transporting over 100 tracers. The high throughput I/O in CESM, based on the Parallel IO Library (PIO), is being further augmented to support even higher resolutions and enhance resiliency. The application performance of the individual runs are archived in a database and routinely analyzed to identify and rectify performance degradation during the course of the experiments. The various resources available at the OLCF now support a scientific workflow to facilitate high-resolution climate modelling. A high-speed center-wide parallel file system, called ATLAS, capable of 1 TB/s, is available on Titan as well as on the clusters used for analysis (Rhea) and visualization (Lens/EVEREST). Long-term archive is facilitated by the HPSS storage system. The Earth System Grid (ESG), featuring search & discovery, is also used to deliver data. The end-to-end workflow allows OLCF users to efficiently share data and publish results in a timely manner.
The contamination of scientific literature: looking for an antidote
NASA Astrophysics Data System (ADS)
Liotta, Marcello
2017-04-01
Science may have very strong implications for society. The knowledge of the processes occurring around the society represents a good opportunity to take responsible decisions. This is particularly true in the field of geosciences. Earthquakes, volcanic eruptions, landslides, climate changes and many other natural phenomena still need to be further investigated. The role of the scientific community is to increase the knowledge. Each member can share his own ideas and data thus allowing the entire scientific community to receive a precious contribution. The latter one often derives from research activities, which are expensive in terms of consumed time and resources. Nowadays the sharing of scientific results occurs through the publication on scientific journals. The reading of available scientific literature thus represents a unique opportunity to define the state of the art on a specific topic and to address research activities towards something new. When published results are obtained through a rigorous scientific process, they constitute a solid background where each member can add his ideas and evidences. Differently, published results may be affected by scientific misconduct; they constitute a labyrinth where the scientists lose their time in the attempt of truly understanding the natural processes. The normal scientific dialectic should unmask such results, thus avoiding literature contamination and making the scientific framework more stimulating. The scientific community should look for the best practice to reduce the risk of literature contamination.
What are Scientific Leaders? The Introduction of a Normalized Impact Factor
NASA Astrophysics Data System (ADS)
Matsas, George E. A.
2012-12-01
We define a normalized impact factor (NIF) suitable for assessing in a simple way both the strength of scientific communities and the research influence of individuals. We define those with NIF ≥ 1 as scientific leaders because they influence their peers at least as much as they are influenced by them. The NIF has two outstanding characteristics: (a) it has a clear and universal meaning and (b) it is robust against self-citation misuse. We show how a single lognormal function obtained from a simplified version of the NIF leads to a clear "radiography" of the corresponding scientific community. An illustrative application analyzes a community derived from the list of outstanding referees recognized by the American Physical Society in 2008.
Virtual Exploitation Environment Demonstration for Atmospheric Missions
NASA Astrophysics Data System (ADS)
Natali, Stefano; Mantovani, Simone; Hirtl, Marcus; Santillan, Daniel; Triebnig, Gerhard; Fehr, Thorsten; Lopes, Cristiano
2017-04-01
The scientific and industrial communities are being confronted with a strong increase of Earth Observation (EO) satellite missions and related data. This is in particular the case for the Atmospheric Sciences communities, with the upcoming Copernicus Sentinel-5 Precursor, Sentinel-4, -5 and -3, and ESA's Earth Explorers scientific satellites ADM-Aeolus and EarthCARE. The challenge is not only to manage the large volume of data generated by each mission / sensor, but to process and analyze the data streams. Creating synergies among the different datasets will be key to exploit the full potential of the available information. As a preparation activity supporting scientific data exploitation for Earth Explorer and Sentinel atmospheric missions, ESA funded the "Technology and Atmospheric Mission Platform" (TAMP) [1] [2] project; a scientific and technological forum (STF) has been set-up involving relevant European entities from different scientific and operational fields to define the platforḿs requirements. Data access, visualization, processing and download services have been developed to satisfy useŕs needs; use cases defined with the STF, such as study of the SO2 emissions for the Holuhraun eruption (2014) by means of two numerical models, two satellite platforms and ground measurements, global Aerosol analyses from long time series of satellite data, and local Aerosol analysis using satellite and LIDAR, have been implemented to ensure acceptance of TAMP by the atmospheric sciences community. The platform pursues the "virtual workspace" concept: all resources (data, processing, visualization, collaboration tools) are provided as "remote services", accessible through a standard web browser, to avoid the download of big data volumes and for allowing utilization of provided infrastructure for computation, analysis and sharing of results. Data access and processing are achieved through standardized protocols (WCS, WPS). As evolution toward a pre-operational environment, the "Virtual Exploitation Environment Demonstration for Atmospheric Missions" (VEEDAM) aims at maintaining, running and evolving the platform, demonstrating e.g. the possibility to perform massive processing over heterogeneous data sources. This work presents the VEEDAM concepts, provides pre-operational examples, stressing on the interoperability achievable exposing standardized data access and processing services (e.g. making accessible data and processing resources from different VREs). [1] TAMP platform landing page http://vtpip.zamg.ac.at/ [2] TAMP introductory video https://www.youtube.com/watch?v=xWiy8h1oXQY
OPENING REMARKS: SciDAC: Scientific Discovery through Advanced Computing
NASA Astrophysics Data System (ADS)
Strayer, Michael
2005-01-01
Good morning. Welcome to SciDAC 2005 and San Francisco. SciDAC is all about computational science and scientific discovery. In a large sense, computational science characterizes SciDAC and its intent is change. It transforms both our approach and our understanding of science. It opens new doors and crosses traditional boundaries while seeking discovery. In terms of twentieth century methodologies, computational science may be said to be transformational. There are a number of examples to this point. First are the sciences that encompass climate modeling. The application of computational science has in essence created the field of climate modeling. This community is now international in scope and has provided precision results that are challenging our understanding of our environment. A second example is that of lattice quantum chromodynamics. Lattice QCD, while adding precision and insight to our fundamental understanding of strong interaction dynamics, has transformed our approach to particle and nuclear science. The individual investigator approach has evolved to teams of scientists from different disciplines working side-by-side towards a common goal. SciDAC is also undergoing a transformation. This meeting is a prime example. Last year it was a small programmatic meeting tracking progress in SciDAC. This year, we have a major computational science meeting with a variety of disciplines and enabling technologies represented. SciDAC 2005 should position itself as a new corner stone for Computational Science and its impact on science. As we look to the immediate future, FY2006 will bring a new cycle to SciDAC. Most of the program elements of SciDAC will be re-competed in FY2006. The re-competition will involve new instruments for computational science, new approaches for collaboration, as well as new disciplines. There will be new opportunities for virtual experiments in carbon sequestration, fusion, and nuclear power and nuclear waste, as well as collaborations with industry and virtual prototyping. New instruments of collaboration will include institutes and centers while summer schools, workshops and outreach will invite new talent and expertise. Computational science adds new dimensions to science and its practice. Disciplines of fusion, accelerator science, and combustion are poised to blur the boundaries between pure and applied science. As we open the door into FY2006 we shall see a landscape of new scientific challenges: in biology, chemistry, materials, and astrophysics to name a few. The enabling technologies of SciDAC have been transformational as drivers of change. Planning for major new software systems assumes a base line employing Common Component Architectures and this has become a household word for new software projects. While grid algorithms and mesh refinement software have transformed applications software, data management and visualization have transformed our understanding of science from data. The Gordon Bell prize now seems to be dominated by computational science and solvers developed by TOPS ISIC. The priorities of the Office of Science in the Department of Energy are clear. The 20 year facilities plan is driven by new science. High performance computing is placed amongst the two highest priorities. Moore's law says that by the end of the next cycle of SciDAC we shall have peta-flop computers. The challenges of petascale computing are enormous. These and the associated computational science are the highest priorities for computing within the Office of Science. Our effort in Leadership Class computing is just a first step towards this goal. Clearly, computational science at this scale will face enormous challenges and possibilities. Performance evaluation and prediction will be critical to unraveling the needed software technologies. We must not lose sight of our overarching goal—that of scientific discovery. Science does not stand still and the landscape of science discovery and computing holds immense promise. In this environment, I believe it is necessary to institute a system of science based performance metrics to help quantify our progress towards science goals and scientific computing. As a final comment I would like to reaffirm that the shifting landscapes of science will force changes to our computational sciences, and leave you with the quote from Richard Hamming, 'The purpose of computing is insight, not numbers'.
Ramalho, Teodorico C.; DeCastro, Alexandre A.; Silva, Daniela R.; ...
2015-08-26
The re-emergence of chemical weapons as a global threat in hands of terrorist groups, together with an increasing number of pesticides intoxications and environmental contaminations worldwide, has called the attention of the scientific community for the need of improvement in the technologies for detoxification of organophosphorus (OP) compounds. A compelling strategy is the use of bioremediation by enzymes that are able to hydrolyze these molecules to harmless chemical species. Several enzymes have been studied and engineered for this purpose. However, their mechanisms of action are not well understood. Theoretical investigations may help elucidate important aspects of these mechanisms and helpmore » in the development of more efficient bio-remediators. In this review, we point out the major contributions of computational methodologies applied to enzyme based detoxification of OPs. Furthermore, we highlight the use of PTE, PON, DFP, and BuChE as enzymes used in OP detoxification process and how computational tools such as molecular docking, molecular dynamics simulations and combined quantum mechanical/molecular mechanics have and will continue to contribute to this very important area of research.The re-emergence of chemical weapons as a global threat in hands of terrorist groups, together with an increasing number of pesticides intoxications and environmental contaminations worldwide, has called the attention of the scientific community for the need of improvement in the technologies for detoxification of organophosphorus (OP) compounds. A compelling strategy is the use of bioremediation by enzymes that are able to hydrolyze these molecules to harmless chemical species. Several enzymes have been studied and engineered for this purpose. However, their mechanisms of action are not well understood. Theoretical investigations may help elucidate important aspects of these mechanisms and help in the development of more efficient bio-remediators. In this review, we point out the major contributions of computational methodologies applied to enzyme based detoxification of OPs. Furthermore, we highlight the use of PTE, PON, DFP, and BuChE as enzymes used in OP detoxification process and how computational tools such as molecular docking, molecular dynamics simulations and combined quantum mechanical/molecular mechanics have and will continue to contribute to this very important area of research.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramalho, Teodorico C.; DeCastro, Alexandre A.; Silva, Daniela R.
The re-emergence of chemical weapons as a global threat in hands of terrorist groups, together with an increasing number of pesticides intoxications and environmental contaminations worldwide, has called the attention of the scientific community for the need of improvement in the technologies for detoxification of organophosphorus (OP) compounds. A compelling strategy is the use of bioremediation by enzymes that are able to hydrolyze these molecules to harmless chemical species. Several enzymes have been studied and engineered for this purpose. However, their mechanisms of action are not well understood. Theoretical investigations may help elucidate important aspects of these mechanisms and helpmore » in the development of more efficient bio-remediators. In this review, we point out the major contributions of computational methodologies applied to enzyme based detoxification of OPs. Furthermore, we highlight the use of PTE, PON, DFP, and BuChE as enzymes used in OP detoxification process and how computational tools such as molecular docking, molecular dynamics simulations and combined quantum mechanical/molecular mechanics have and will continue to contribute to this very important area of research.The re-emergence of chemical weapons as a global threat in hands of terrorist groups, together with an increasing number of pesticides intoxications and environmental contaminations worldwide, has called the attention of the scientific community for the need of improvement in the technologies for detoxification of organophosphorus (OP) compounds. A compelling strategy is the use of bioremediation by enzymes that are able to hydrolyze these molecules to harmless chemical species. Several enzymes have been studied and engineered for this purpose. However, their mechanisms of action are not well understood. Theoretical investigations may help elucidate important aspects of these mechanisms and help in the development of more efficient bio-remediators. In this review, we point out the major contributions of computational methodologies applied to enzyme based detoxification of OPs. Furthermore, we highlight the use of PTE, PON, DFP, and BuChE as enzymes used in OP detoxification process and how computational tools such as molecular docking, molecular dynamics simulations and combined quantum mechanical/molecular mechanics have and will continue to contribute to this very important area of research.« less
Mangado, Nerea; Piella, Gemma; Noailly, Jérôme; Pons-Prats, Jordi; Ballester, Miguel Ángel González
2016-01-01
Computational modeling has become a powerful tool in biomedical engineering thanks to its potential to simulate coupled systems. However, real parameters are usually not accurately known, and variability is inherent in living organisms. To cope with this, probabilistic tools, statistical analysis and stochastic approaches have been used. This article aims to review the analysis of uncertainty and variability in the context of finite element modeling in biomedical engineering. Characterization techniques and propagation methods are presented, as well as examples of their applications in biomedical finite element simulations. Uncertainty propagation methods, both non-intrusive and intrusive, are described. Finally, pros and cons of the different approaches and their use in the scientific community are presented. This leads us to identify future directions for research and methodological development of uncertainty modeling in biomedical engineering. PMID:27872840
Mangado, Nerea; Piella, Gemma; Noailly, Jérôme; Pons-Prats, Jordi; Ballester, Miguel Ángel González
2016-01-01
Computational modeling has become a powerful tool in biomedical engineering thanks to its potential to simulate coupled systems. However, real parameters are usually not accurately known, and variability is inherent in living organisms. To cope with this, probabilistic tools, statistical analysis and stochastic approaches have been used. This article aims to review the analysis of uncertainty and variability in the context of finite element modeling in biomedical engineering. Characterization techniques and propagation methods are presented, as well as examples of their applications in biomedical finite element simulations. Uncertainty propagation methods, both non-intrusive and intrusive, are described. Finally, pros and cons of the different approaches and their use in the scientific community are presented. This leads us to identify future directions for research and methodological development of uncertainty modeling in biomedical engineering.
NASA Astrophysics Data System (ADS)
Pordes, Ruth; OSG Consortium; Petravick, Don; Kramer, Bill; Olson, Doug; Livny, Miron; Roy, Alain; Avery, Paul; Blackburn, Kent; Wenaus, Torre; Würthwein, Frank; Foster, Ian; Gardner, Rob; Wilde, Mike; Blatecky, Alan; McGee, John; Quick, Rob
2007-07-01
The Open Science Grid (OSG) provides a distributed facility where the Consortium members provide guaranteed and opportunistic access to shared computing and storage resources. OSG provides support for and evolution of the infrastructure through activities that cover operations, security, software, troubleshooting, addition of new capabilities, and support for existing and engagement with new communities. The OSG SciDAC-2 project provides specific activities to manage and evolve the distributed infrastructure and support it's use. The innovative aspects of the project are the maintenance and performance of a collaborative (shared & common) petascale national facility over tens of autonomous computing sites, for many hundreds of users, transferring terabytes of data a day, executing tens of thousands of jobs a day, and providing robust and usable resources for scientific groups of all types and sizes. More information can be found at the OSG web site: www.opensciencegrid.org.
Practical Use of Computationally Frugal Model Analysis Methods
Hill, Mary C.; Kavetski, Dmitri; Clark, Martyn; ...
2015-03-21
Computationally frugal methods of model analysis can provide substantial benefits when developing models of groundwater and other environmental systems. Model analysis includes ways to evaluate model adequacy and to perform sensitivity and uncertainty analysis. Frugal methods typically require 10s of parallelizable model runs; their convenience allows for other uses of the computational effort. We suggest that model analysis be posed as a set of questions used to organize methods that range from frugal to expensive (requiring 10,000 model runs or more). This encourages focus on method utility, even when methods have starkly different theoretical backgrounds. We note that many frugalmore » methods are more useful when unrealistic process-model nonlinearities are reduced. Inexpensive diagnostics are identified for determining when frugal methods are advantageous. Examples from the literature are used to demonstrate local methods and the diagnostics. We suggest that the greater use of computationally frugal model analysis methods would allow questions such as those posed in this work to be addressed more routinely, allowing the environmental sciences community to obtain greater scientific insight from the many ongoing and future modeling efforts« less
Computational biology and bioinformatics in Nigeria.
Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi
2014-04-01
Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.
Computational Biology and Bioinformatics in Nigeria
Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi
2014-01-01
Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310
A computational model of cerebral cortex folding.
Nie, Jingxin; Guo, Lei; Li, Gang; Faraco, Carlos; Stephen Miller, L; Liu, Tianming
2010-05-21
The geometric complexity and variability of the human cerebral cortex have long intrigued the scientific community. As a result, quantitative description of cortical folding patterns and the understanding of underlying folding mechanisms have emerged as important research goals. This paper presents a computational 3D geometric model of cerebral cortex folding initialized by MRI data of a human fetal brain and deformed under the governance of a partial differential equation modeling cortical growth. By applying different simulation parameters, our model is able to generate folding convolutions and shape dynamics of the cerebral cortex. The simulations of this 3D geometric model provide computational experimental support to the following hypotheses: (1) Mechanical constraints of the skull regulate the cortical folding process. (2) The cortical folding pattern is dependent on the global cell growth rate of the whole cortex. (3) The cortical folding pattern is dependent on relative rates of cell growth in different cortical areas. (4) The cortical folding pattern is dependent on the initial geometry of the cortex. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Investigating a holobiont: Microbiota perturbations and transkingdom networks.
Greer, Renee; Dong, Xiaoxi; Morgun, Andrey; Shulzhenko, Natalia
2016-01-01
The scientific community has recently come to appreciate that, rather than existing as independent organisms, multicellular hosts and their microbiota comprise a complex evolving superorganism or metaorganism, termed a holobiont. This point of view leads to a re-evaluation of our understanding of different physiological processes and diseases. In this paper we focus on experimental and computational approaches which, when combined in one study, allowed us to dissect mechanisms (traditionally named host-microbiota interactions) regulating holobiont physiology. Specifically, we discuss several approaches for microbiota perturbation, such as use of antibiotics and germ-free animals, including advantages and potential caveats of their usage. We briefly review computational approaches to characterize the microbiota and, more importantly, methods to infer specific components of microbiota (such as microbes or their genes) affecting host functions. One such approach called transkingdom network analysis has been recently developed and applied in our study. (1) Finally, we also discuss common methods used to validate the computational predictions of host-microbiota interactions using in vitro and in vivo experimental systems.
Quantum game theory and open access publishing
NASA Astrophysics Data System (ADS)
Hanauske, Matthias; Bernius, Steffen; Dugall, Berndt
2007-08-01
The digital revolution of the information age and in particular the sweeping changes of scientific communication brought about by computing and novel communication technology, potentiate global, high grade scientific information for free. The arXiv, for example, is the leading scientific communication platform, mainly for mathematics and physics, where everyone in the world has free access on. While in some scientific disciplines the open access way is successfully realized, other disciplines (e.g. humanities and social sciences) dwell on the traditional path, even though many scientists belonging to these communities approve the open access principle. In this paper we try to explain these different publication patterns by using a game theoretical approach. Based on the assumption, that the main goal of scientists is the maximization of their reputation, we model different possible game settings, namely a zero sum game, the prisoners’ dilemma case and a version of the stag hunt game, that show the dilemma of scientists belonging to “non-open access communities”. From an individual perspective, they have no incentive to deviate from the Nash equilibrium of traditional publishing. By extending the model using the quantum game theory approach it can be shown, that if the strength of entanglement exceeds a certain value, the scientists will overcome the dilemma and terminate to publish only traditionally in all three settings.
NASA Astrophysics Data System (ADS)
Bergey, Bradley W.; Ketelhut, Diane Jass; Liang, Senfeng; Natarajan, Uma; Karakus, Melissa
2015-10-01
The primary aim of the study was to examine whether performance on a science assessment in an immersive virtual environment was associated with changes in scientific inquiry self-efficacy. A secondary aim of the study was to examine whether performance on the science assessment was equitable for students with different levels of computer game self-efficacy, including whether gender differences were observed. We examined 407 middle school students' scientific inquiry self-efficacy and computer game self-efficacy before and after completing a computer game-like assessment about a science mystery. Results from path analyses indicated that prior scientific inquiry self-efficacy predicted achievement on end-of-module questions, which in turn predicted change in scientific inquiry self-efficacy. By contrast, computer game self-efficacy was neither predictive of nor predicted by performance on the science assessment. While boys had higher computer game self-efficacy compared to girls, multi-group analyses suggested only minor gender differences in how efficacy beliefs related to performance. Implications for assessments with virtual environments and future design and research are discussed.
Memon, Aamir Raoof
2016-12-01
ResearchGate has been regarded as one of the most attractive academic social networking site for scientific community. It has been trying to improve user-centered interfaces to gain more attractiveness to scientists around the world. Display of journal related scietometric measures (such as impact factor, 5-year impact, cited half-life, eigenfactor) is an important feature in ResearchGate. Open access publishing has added more to increased visibility of research work and easy access to information related to research. Moreover, scientific community has been much interested in promoting their work and exhibiting its impact to others through reliable scientometric measures. However, with the growing market of publications and improvements in the field of research, this community has been victimized by the cybercrime in the form of ghost journals, fake publishers and magical impact measures. Particularly, ResearchGate more recently, has been lenient in its policies against this dark side of academic writing. Therefore, this communication aims to discuss concerns associated with leniency in ResearchGate policies and its impact of scientific community.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hules, John
This 1998 annual report from the National Scientific Energy Research Computing Center (NERSC) presents the year in review of the following categories: Computational Science; Computer Science and Applied Mathematics; and Systems and Services. Also presented are science highlights in the following categories: Basic Energy Sciences; Biological and Environmental Research; Fusion Energy Sciences; High Energy and Nuclear Physics; and Advanced Scientific Computing Research and Other Projects.
Hypergraph-Based Combinatorial Optimization of Matrix-Vector Multiplication
ERIC Educational Resources Information Center
Wolf, Michael Maclean
2009-01-01
Combinatorial scientific computing plays an important enabling role in computational science, particularly in high performance scientific computing. In this thesis, we will describe our work on optimizing matrix-vector multiplication using combinatorial techniques. Our research has focused on two different problems in combinatorial scientific…
ERIC Educational Resources Information Center
Evans, C. D.
This paper describes the experiences of the industrial research laboratory of Kodak Ltd. in finding and providing a computer terminal most suited to its very varied requirements. These requirements include bibliographic and scientific data searching and access to a number of worldwide computing services for scientific computing work. The provision…
Hybrid 2-D and 3-D Immersive and Interactive User Interface for Scientific Data Visualization
2017-08-01
visualization, 3-D interactive visualization, scientific visualization, virtual reality, real -time ray tracing 16. SECURITY CLASSIFICATION OF: 17...scientists to employ in the real world. Other than user-friendly software and hardware setup, scientists also need to be able to perform their usual...and scientific visualization communities mostly have different research priorities. For the VR community, the ability to support real -time user