Sample records for exploring scientific datasets

  1. The Planetary Science Archive (PSA): Exploration and discovery of scientific datasets from ESA's planetary missions

    NASA Astrophysics Data System (ADS)

    Vallat, C.; Besse, S.; Barbarisi, I.; Arviset, C.; De Marchi, G.; Barthelemy, M.; Coia, D.; Costa, M.; Docasal, R.; Fraga, D.; Heather, D. J.; Lim, T.; Macfarlane, A.; Martinez, S.; Rios, C.; Vallejo, F.; Said, J.

    2017-09-01

    The Planetary Science Archive (PSA) is the European Space Agency's (ESA) repository of science data from all planetary science and exploration missions. The PSA provides access to scientific datasets through various interfaces at http://psa.esa.int. All datasets are scientifically peer-reviewed by independent scientists, and are compliant with the Planetary Data System (PDS) standards. The PSA has started to implement a number of significant improvements, mostly driven by the evolution of the PDS standards, and the growing need for better interfaces and advanced applications to support science exploitation.

  2. The New Planetary Science Archive (PSA): Exploration and Discovery of Scientific Datasets from ESA's Planetary Missions

    NASA Astrophysics Data System (ADS)

    Heather, David; Besse, Sebastien; Vallat, Claire; Barbarisi, Isa; Arviset, Christophe; De Marchi, Guido; Barthelemy, Maud; Coia, Daniela; Costa, Marc; Docasal, Ruben; Fraga, Diego; Grotheer, Emmanuel; Lim, Tanya; MacFarlane, Alan; Martinez, Santa; Rios, Carlos; Vallejo, Fran; Saiz, Jaime

    2017-04-01

    The Planetary Science Archive (PSA) is the European Space Agency's (ESA) repository of science data from all planetary science and exploration missions. The PSA provides access to scientific datasets through various interfaces at http://psa.esa.int. All datasets are scientifically peer-reviewed by independent scientists, and are compliant with the Planetary Data System (PDS) standards. The PSA is currently implementing a number of significant improvements, mostly driven by the evolution of the PDS standard, and the growing need for better interfaces and advanced applications to support science exploitation. As of the end of 2016, the PSA is hosting data from all of ESA's planetary missions. This includes ESA's first planetary mission, Giotto, which encountered comet 1P/Halley in 1986 with a flyby at 800 km. Science data from Venus Express, Mars Express, Huygens and the SMART-1 mission are also all available at the PSA. The PSA also contains all science data from Rosetta, which explored comet 67P/Churyumov-Gerasimenko and asteroids Steins and Lutetia. The year 2016 has seen the arrival of the ExoMars 2016 data in the archive. In the upcoming years, at least three new projects are foreseen to be fully archived at the PSA. The BepiColombo mission is scheduled for launch in 2018. Following that, the ExoMars Rover and Surface Platform (RSP) in 2020, and then the JUpiter ICy moons Explorer (JUICE). All of these will archive their data in the PSA. In addition, a few ground-based support programmes are also available, especially for the Venus Express and Rosetta missions. The newly designed PSA will enhance the user experience and will significantly reduce the complexity for users to find their data, promoting one-click access to the scientific datasets with more customized views when needed. This includes a better integration with Planetary GIS analysis tools and Planetary interoperability services (search and retrieve data, supporting e.g. PDAP, EPN-TAP). It will also be up

  3. The new Planetary Science Archive (PSA): Exploration and discovery of scientific datasets from ESA's planetary missions

    NASA Astrophysics Data System (ADS)

    Martinez, Santa; Besse, Sebastien; Heather, Dave; Barbarisi, Isa; Arviset, Christophe; De Marchi, Guido; Barthelemy, Maud; Docasal, Ruben; Fraga, Diego; Grotheer, Emmanuel; Lim, Tanya; Macfarlane, Alan; Rios, Carlos; Vallejo, Fran; Saiz, Jaime; ESDC (European Space Data Centre) Team

    2016-10-01

    The Planetary Science Archive (PSA) is the European Space Agency's (ESA) repository of science data from all planetary science and exploration missions. The PSA provides access to scientific datasets through various interfaces at http://archives.esac.esa.int/psa. All datasets are scientifically peer-reviewed by independent scientists, and are compliant with the Planetary Data System (PDS) standards. The PSA is currently implementing a number of significant improvements, mostly driven by the evolution of the PDS standard, and the growing need for better interfaces and advanced applications to support science exploitation. The newly designed PSA will enhance the user experience and will significantly reduce the complexity for users to find their data, promoting one-click access to the scientific datasets with more specialised views when needed. This includes a better integration with Planetary GIS analysis tools and Planetary interoperability services (search and retrieve data, supporting e.g. PDAP, EPN-TAP). It will also be up-to-date with versions 3 and 4 of the PDS standards, as PDS4 will be used for ESA's ExoMars and upcoming BepiColombo missions. Users will have direct access to documentation, information and tools that are relevant to the scientific use of the dataset, including ancillary datasets, Software Interface Specification (SIS) documents, and any tools/help that the PSA team can provide. A login mechanism will provide additional functionalities to the users to aid / ease their searches (e.g. saving queries, managing default views). This contribution will introduce the new PSA, its key features and access interfaces.
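
    The EPN-TAP interoperability service mentioned in this record can be queried programmatically. The sketch below is illustrative only: the endpoint URL and table name are assumptions rather than confirmed PSA values, and only standard EPNCore column names are used.

```python
# A minimal sketch of querying an EPN-TAP service with pyvo.
# The endpoint URL and table name are assumptions for illustration.
import pyvo

PSA_EPNTAP_URL = "https://archives.esac.esa.int/psa/epn-tap/tap"  # assumed endpoint

service = pyvo.dal.TAPService(PSA_EPNTAP_URL)
query = """
    SELECT TOP 10 granule_uid, target_name, instrument_name, time_min, time_max
    FROM psa.epn_core
    WHERE target_name = 'Mars'
"""  # psa.epn_core is an assumed EPNCore table name

results = service.search(query)
for row in results:
    print(row["granule_uid"], row["instrument_name"])
```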

  4. The new Planetary Science Archive: A tool for exploration and discovery of scientific datasets from ESA's planetary missions

    NASA Astrophysics Data System (ADS)

    Heather, David

    2016-07-01

    Introduction: The Planetary Science Archive (PSA) is the European Space Agency's (ESA) repository of science data from all planetary science and exploration missions. The PSA provides access to scientific datasets through various interfaces (e.g. FTP browser, Map based, Advanced search, and Machine interface): http://archives.esac.esa.int/psa. All datasets are scientifically peer-reviewed by independent scientists, and are compliant with the Planetary Data System (PDS) standards. Updating the PSA: The PSA is currently implementing a number of significant changes, both to its web-based interface to the scientific community, and to its database structure. The new PSA will be up-to-date with versions 3 and 4 of the PDS standards, as PDS4 will be used for ESA's upcoming ExoMars and BepiColombo missions. The newly designed PSA homepage will provide direct access to scientific datasets via a text search for targets or missions. This will significantly reduce the complexity for users to find their data and will promote one-click access to the datasets. Additionally, the homepage will provide direct access to advanced views and searches of the datasets. Users will have direct access to documentation, information and tools that are relevant to the scientific use of the dataset, including ancillary datasets, Software Interface Specification (SIS) documents, and any tools/help that the PSA team can provide. A login mechanism will provide additional functionalities to the users to aid / ease their searches (e.g. saving queries, managing default views). Queries to the PSA database will be possible either via the homepage (for simple searches of missions or targets), or through a filter menu for more tailored queries. The filter menu will offer multiple options to search for a particular dataset or product, and will manage queries for both in-situ and remote sensing instruments. Parameters such as start-time, phase angle, and heliocentric distance will be emphasized. A further

  5. The new Planetary Science Archive: A tool for exploration and discovery of scientific datasets from ESA's planetary missions.

    NASA Astrophysics Data System (ADS)

    Heather, David; Besse, Sebastien; Barbarisi, Isa; Arviset, Christophe; de Marchi, Guido; Barthelemy, Maud; Docasal, Ruben; Fraga, Diego; Grotheer, Emmanuel; Lim, Tanya; Macfarlane, Alan; Martinez, Santa; Rios, Carlos

    2016-04-01

    Introduction: The Planetary Science Archive (PSA) is the European Space Agency's (ESA) repository of science data from all planetary science and exploration missions. The PSA provides access to scientific datasets through various interfaces (e.g. FTP browser, Map based, Advanced search, and Machine interface): http://archives.esac.esa.int/psa. All datasets are scientifically peer-reviewed by independent scientists, and are compliant with the Planetary Data System (PDS) standards. Updating the PSA: The PSA is currently implementing a number of significant changes, both to its web-based interface to the scientific community, and to its database structure. The new PSA will be up-to-date with versions 3 and 4 of the PDS standards, as PDS4 will be used for ESA's upcoming ExoMars and BepiColombo missions. The newly designed PSA homepage will provide direct access to scientific datasets via a text search for targets or missions. This will significantly reduce the complexity for users to find their data and will promote one-click access to the datasets. Additionally, the homepage will provide direct access to advanced views and searches of the datasets. Users will have direct access to documentation, information and tools that are relevant to the scientific use of the dataset, including ancillary datasets, Software Interface Specification (SIS) documents, and any tools/help that the PSA team can provide. A login mechanism will provide additional functionalities to the users to aid / ease their searches (e.g. saving queries, managing default views). Queries to the PSA database will be possible either via the homepage (for simple searches of missions or targets), or through a filter menu for more tailored queries. The filter menu will offer multiple options to search for a particular dataset or product, and will manage queries for both in-situ and remote sensing instruments. Parameters such as start-time, phase angle, and heliocentric distance will be emphasized. A further

  6. Dataset of Scientific Inquiry Learning Environment

    ERIC Educational Resources Information Center

    Ting, Choo-Yee; Ho, Chiung Ching

    2015-01-01

    This paper presents the dataset collected from student interactions with INQPRO, a computer-based scientific inquiry learning environment. The dataset contains records of 100 students and is divided into two portions. The first portion comprises (1) "raw log data", capturing the student's name, interfaces visited, the interface…

  7. Unsupervised learning on scientific ocean drilling datasets from the South China Sea

    NASA Astrophysics Data System (ADS)

    Tse, Kevin C.; Chiu, Hon-Chim; Tsang, Man-Yin; Li, Yiliang; Lam, Edmund Y.

    2018-06-01

    Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean floor sediment core samples coming from scientific ocean drilling in the South China Sea. Compared to studies on similar datasets, but using supervised learning methods which are designed to make predictions based on sample training data, unsupervised learning methods require no a priori information and focus only on the input data. In this study, popular unsupervised learning methods including K-means, self-organizing maps, hierarchical clustering and random forest were coupled with different distance metrics to form exploratory data clusters. The resulting data clusters were externally validated with lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected data clusters displayed varying degrees of correspondence with existing classification by lithologic units and geologic time scales. K-means and self-organizing maps were observed to perform better with lithologic units while random forest corresponded best with geologic time scales. This study sets a pioneering example of how unsupervised machine learning methods can be used as an automatic processing tool for the increasingly high volume of scientific ocean drilling data.
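
    As a minimal illustration of the workflow described in this abstract (not the authors' actual pipeline or data), the sketch below clusters placeholder multivariate measurements with K-means and externally validates the clusters against an existing classification using the adjusted Rand index.

```python
# Sketch: unsupervised clustering of core-sample measurements with external validation.
# The data and labels below are random placeholders, not drilling data.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))                # placeholder geophysical measurements
lithologic_units = rng.integers(0, 4, 300)   # placeholder external classification

X_scaled = StandardScaler().fit_transform(X)  # Euclidean distance on z-scored features
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# External validation against the conventional classification.
print("adjusted Rand index vs. lithologic units:",
      adjusted_rand_score(lithologic_units, labels))
```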

  8. Determining similarity of scientific entities in annotation datasets

    PubMed Central

    Palma, Guillermo; Vidal, Maria-Esther; Haag, Eric; Raschid, Louiqa; Thor, Andreas

    2015-01-01

    Linked Open Data initiatives have made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms from ontologies. Annotations encode scientific knowledge, which is captured in annotation datasets. Determining relatedness between annotated entities becomes a building block for pattern mining, e.g. identifying drug–drug relationships may depend on the similarity of the targets that interact with each drug. A diversity of similarity measures has been proposed in the literature to compute relatedness between a pair of entities. Each measure exploits some knowledge including the name, function, relationships with other entities, taxonomic neighborhood and semantic knowledge. We propose a novel general-purpose annotation similarity measure called ‘AnnSim’ that measures the relatedness between two entities based on the similarity of their annotations. We model AnnSim as a 1–1 maximum weight bipartite match and exploit properties of existing solvers to provide an efficient solution. We empirically study the performance of AnnSim on real-world datasets of drugs and disease associations from clinical trials and relationships between drugs and (genomic) targets. Using baselines that include a variety of measures, we identify where AnnSim can provide a deeper understanding of the semantics underlying the relatedness of a pair of entities or where it could lead to predicting new links or identifying potential novel patterns. Although AnnSim does not exploit knowledge or properties of a particular domain, its performance compares well with a variety of state-of-the-art domain-specific measures. Database URL: http://www.yeastgenome.org/ PMID:25725057
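
    The 1-1 maximum-weight bipartite matching at the core of AnnSim can be posed as an assignment problem and handed to a generic solver. In the sketch below the pairwise term similarities are placeholders; the ontology-based similarities and normalization used in the paper are not reproduced.

```python
# Sketch of a 1-1 maximum-weight bipartite match between two annotation sets,
# using the Hungarian algorithm as the "existing solver". Similarities are toy values.
import numpy as np
from scipy.optimize import linear_sum_assignment

# sim[i, j] = similarity between annotation i of entity A and annotation j of entity B
sim = np.array([
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
    [0.1, 0.5, 0.7],
])

# linear_sum_assignment minimises cost, so negate similarities to maximise them.
rows, cols = linear_sum_assignment(-sim)
ann_sim = sim[rows, cols].sum() / max(sim.shape)  # normalise by the larger annotation set
print("matched pairs:", list(zip(rows, cols)), "similarity ≈", round(ann_sim, 3))
```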

  9. Determining similarity of scientific entities in annotation datasets.

    PubMed

    Palma, Guillermo; Vidal, Maria-Esther; Haag, Eric; Raschid, Louiqa; Thor, Andreas

    2015-01-01

    Linked Open Data initiatives have made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms from ontologies. Annotations encode scientific knowledge, which is captured in annotation datasets. Determining relatedness between annotated entities becomes a building block for pattern mining, e.g. identifying drug-drug relationships may depend on the similarity of the targets that interact with each drug. A diversity of similarity measures has been proposed in the literature to compute relatedness between a pair of entities. Each measure exploits some knowledge including the name, function, relationships with other entities, taxonomic neighborhood and semantic knowledge. We propose a novel general-purpose annotation similarity measure called 'AnnSim' that measures the relatedness between two entities based on the similarity of their annotations. We model AnnSim as a 1-1 maximum weight bipartite match and exploit properties of existing solvers to provide an efficient solution. We empirically study the performance of AnnSim on real-world datasets of drugs and disease associations from clinical trials and relationships between drugs and (genomic) targets. Using baselines that include a variety of measures, we identify where AnnSim can provide a deeper understanding of the semantics underlying the relatedness of a pair of entities or where it could lead to predicting new links or identifying potential novel patterns. Although AnnSim does not exploit knowledge or properties of a particular domain, its performance compares well with a variety of state-of-the-art domain-specific measures. Database URL: http://www.yeastgenome.org/ © The Author(s) 2015. Published by Oxford University Press.

  10. Exploring patterns enriched in a dataset with contrastive principal component analysis.

    PubMed

    Abid, Abubakar; Zhang, Martin J; Bagaria, Vivek K; Zou, James

    2018-05-30

    Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.
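
    Under its usual formulation, cPCA takes the top eigenvectors of the difference of covariance matrices, C_target - alpha * C_background. The sketch below illustrates that idea on synthetic data; it is not the authors' released implementation, and the contrast parameter alpha is chosen arbitrarily.

```python
# A compact sketch of contrastive PCA on synthetic data.
import numpy as np

def cpca(target, background, alpha=1.0, n_components=2):
    """Return directions enriched in `target` relative to `background`."""
    Ct = np.cov(target, rowvar=False)
    Cb = np.cov(background, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(Ct - alpha * Cb)
    order = np.argsort(eigvals)[::-1]          # largest contrastive variance first
    return eigvecs[:, order[:n_components]]

rng = np.random.default_rng(0)
background = rng.normal(size=(500, 10))
target = background[:300] + rng.normal(scale=0.1, size=(300, 10))
target[:, 0] += rng.choice([0, 3], size=300)   # structure present only in the target data

W = cpca(target, background, alpha=2.0)
projected = target @ W                          # 2-D embedding for visual exploration
print(projected.shape)
```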

  11. DNAism: exploring genomic datasets on the web with Horizon Charts.

    PubMed

    Rio Deiros, David; Gibbs, Richard A; Rogers, Jeffrey

    2016-01-27

    Computational biologists daily face the need to explore massive amounts of genomic data. New visualization techniques can help researchers navigate and understand these big data. Horizon Charts are a relatively new visualization method that, under the right circumstances, maximizes data density without losing graphical perception. Horizon Charts have been successfully applied to understand multi-metric time series data. We have adapted an existing JavaScript library (Cubism) that implements Horizon Charts for the time series domain so that it works effectively with genomic datasets. We call this new library DNAism. Horizon Charts can be an effective visual tool to explore complex and large genomic datasets. Researchers can use our library to leverage these techniques to extract additional insights from their own datasets.
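
    For readers unfamiliar with the technique, the sketch below renders a simplified, one-sided horizon chart of a synthetic signal with matplotlib. It is only a static illustration of the banding idea, not the DNAism/Cubism JavaScript library itself.

```python
# Simplified horizon chart: split the value range into bands and overlay them,
# so larger magnitudes appear as darker layers in the same vertical space.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.arange(2000)                                       # hypothetical genomic positions
y = np.convolve(rng.normal(0, 1, x.size), np.ones(50) / 50, mode="same")

n_bands = 3
band = (np.abs(y).max() + 1e-9) / n_bands

fig, ax = plt.subplots(figsize=(10, 1.5))
for i in range(n_bands):
    layer = np.clip(np.abs(y) - i * band, 0, band)        # clip each band into [0, band]
    ax.fill_between(x, 0, layer, color="steelblue", alpha=0.35)
ax.set_yticks([])
ax.set_xlabel("position (bp)")
plt.tight_layout()
plt.show()
```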

  12. Scientific Resource EXplorer

    NASA Astrophysics Data System (ADS)

    Xing, Z.; Wormuth, A.; Smith, A.; Arca, J.; Lu, Y.; Sayfi, E.

    2014-12-01

    Inquisitive minds in our society are never satisfied with curated images released by a typical public affairs office. They always want to look deeper and play directly on original data. However, most scientific data products are notoriously hard to use. They are immensely large, highly distributed and diverse in format. In this presentation, we will demonstrate Resource EXplorer (REX), a novel webtop application that allows anyone to conveniently explore and visualize rich scientific data repositories, using only a standard web browser. This tool leverages the power of Webification Science (w10n-sci), a powerful enabling technology that simplifies the use of scientific data on the web platform. W10n-sci is now being deployed at an increasing number of NASA data centers, some of which are the largest digital treasure troves in our nation. With REX, these wonderful scientific resources are open for teachers and students to learn and play.

  13. Tree-based approach for exploring marine spatial patterns with raster datasets.

    PubMed

    Liao, Xiaohan; Xue, Cunjin; Su, Fenzhen

    2017-01-01

    From multiple raster datasets to spatial association patterns, the data-mining task divides into three subtasks: raster dataset pretreatment, mining algorithm design, and exploration of spatial patterns from the mining results. Compared with the first two subtasks, the third remains unresolved. Confronted with the interrelated marine environmental parameters, we propose a Tree-based Approach for eXploring Marine Spatial Patterns with multiple raster datasets called TAXMarSP, which includes two models. One is the Tree-based Cascading Organization Model (TCOM), and the other is the Spatial Neighborhood-based CAlculation Model (SNCAM). TCOM designs the "Spatial node→Pattern node" structure from top to bottom layers to store the table-formatted frequent patterns. Together with TCOM, SNCAM considers the spatial neighborhood contributions to calculate the pattern-matching degree between the specified marine parameters and the table-formatted frequent patterns and then explores the marine spatial patterns. Using the prevalent quantification Apriori algorithm and a real remote sensing dataset from January 1998 to December 2014, a successful application of TAXMarSP to marine spatial patterns in the Pacific Ocean is described, and the obtained marine spatial patterns present not only the well-known but also new patterns to Earth scientists.
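
    As a generic stand-in for the mining step (not the TAXMarSP models themselves), the sketch below discretises a few co-located raster parameters into boolean per-pixel transactions and mines frequent co-occurrence patterns with an off-the-shelf Apriori implementation. All variable names and thresholds are hypothetical.

```python
# Toy frequent-pattern mining on raster-derived boolean "transactions".
import numpy as np
import pandas as pd
from mlxtend.frequent_patterns import apriori

rng = np.random.default_rng(1)
sst = rng.normal(27, 1, 1000)          # placeholder sea surface temperature per pixel
chl = rng.lognormal(0, 0.5, 1000)      # placeholder chlorophyll-a per pixel

transactions = pd.DataFrame({
    "SST_high": sst > np.percentile(sst, 75),
    "SST_low":  sst < np.percentile(sst, 25),
    "CHL_high": chl > np.percentile(chl, 75),
})
patterns = apriori(transactions, min_support=0.05, use_colnames=True)
print(patterns.sort_values("support", ascending=False).head())
```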

  14. Omicseq: a web-based search engine for exploring omics datasets

    PubMed Central

    Sun, Xiaobo; Pittard, William S.; Xu, Tianlei; Chen, Li; Zwick, Michael E.; Jiang, Xiaoqian; Wang, Fusheng

    2017-01-01

    Abstract The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve ‘findability’ of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. PMID:28402462
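
    The trackRank algorithm itself is not reproduced here; the toy sketch below only illustrates the underlying idea of ranking datasets by their numerical signal over a query region rather than by metadata text. All names and values are hypothetical.

```python
# Toy content-based ranking of "omics tracks" over a query gene's coordinates.
import numpy as np

rng = np.random.default_rng(7)
# Each "dataset" is a coverage vector over the same 1 kb query region (placeholder data).
datasets = {f"track_{i}": rng.poisson(lam, size=1000)
            for i, lam in enumerate([1, 5, 2, 20, 3])}

def signal_score(track, background_lambda=2.0):
    """Score a dataset by how far its signal in the query region exceeds background."""
    return (track.mean() - background_lambda) / (track.std() + 1e-9)

ranking = sorted(datasets, key=lambda name: signal_score(datasets[name]), reverse=True)
print("datasets ranked by relevance to the query region:", ranking)
```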

  15. Advanced Aerobots for Scientific Exploration

    NASA Technical Reports Server (NTRS)

    Behar, Alberto; Raymond, Carol A.; Matthews, Janet B.; Nicaise, Fabien; Jones, Jack A.

    2010-01-01

    The Picosat and Uninhabited Aerial Vehicle Systems Engineering (PAUSE) project is developing balloon-borne instrumentation systems as aerobots for scientific exploration of remote planets and for diverse terrestrial purposes that can include scientific exploration, mapping, and military surveillance. The underlying concept of balloon-borne gondolas housing outer-space-qualified scientific instruments and associated data-processing and radio-communication equipment is not new. Instead, the novelty lies in numerous design details that, taken together, make a PAUSE aerobot smaller, less expensive, and less massive, relative to prior aerobots developed for similar purposes: Whereas the gondola (including the instrumentation system housed in it) of a typical prior aerobot has a mass of hundreds of kilograms, the mass of the gondola (with instrumentation system) of a PAUSE aerobot is a few kilograms.

  16. Omicseq: a web-based search engine for exploring omics datasets.

    PubMed

    Sun, Xiaobo; Pittard, William S; Xu, Tianlei; Chen, Li; Zwick, Michael E; Jiang, Xiaoqian; Wang, Fusheng; Qin, Zhaohui S

    2017-07-03

    The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve 'findability' of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Scientific Datasets: Discovery and Aggregation for Semantic Interpretation.

    NASA Astrophysics Data System (ADS)

    Lopez, L. A.; Scott, S.; Khalsa, S. J. S.; Duerr, R.

    2015-12-01

    One of the biggest challenges that interdisciplinary researchers face is finding suitable datasets in order to advance their science; this problem remains consistent across multiple disciplines. A surprising number of scientists, when asked what tool they use for data discovery, reply "Google", which is an acceptable solution in some cases, but not even Google can find (or cares to compile) all the data that is relevant for science, and particularly the geosciences. If a dataset is not discoverable through a well-known search provider, it will remain dark data to the scientific world. For the past year, BCube, an EarthCube Building Block project, has been developing, testing and deploying a technology stack capable of data discovery at web scale using the ultimate dataset: the Internet. This stack has two principal components, a web-scale crawling infrastructure and a semantic aggregator. The web crawler is a modified version of Apache Nutch (the originator of Hadoop and other big data technologies) that has been improved and tailored for data and data-service discovery. The second component is semantic aggregation, carried out by a Python-based workflow that extracts valuable metadata and stores it in the form of triples through the use of semantic technologies. While implementing the BCube stack we have run into several challenges, such as a) scaling the project to cover big portions of the Internet at a reasonable cost, b) making sense of very diverse and non-homogeneous data, and lastly, c) extracting facts about these datasets using semantic technologies in order to make them usable for the geosciences community. Despite all these challenges we have proven that we can discover and characterize data that otherwise would have remained in the dark corners of the Internet. Having all this data indexed and 'triplelized' will enable scientists to access a trove of information relevant to their work in a more natural way. An important characteristic of the BCube stack is that all
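
    A minimal sketch of the "semantic aggregation" step described above: extracted dataset metadata is stored as RDF triples. The vocabulary choices and URIs below are illustrative assumptions, not BCube's actual schema.

```python
# Store extracted dataset metadata as RDF triples with rdflib.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

DCAT = Namespace("http://www.w3.org/ns/dcat#")
DCT = Namespace("http://purl.org/dc/terms/")

g = Graph()
dataset = URIRef("http://example.org/crawled/seaice-extent-2014")  # hypothetical crawled resource
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCT.title, Literal("Sea ice extent, 2014 (crawled landing page)")))
g.add((dataset, DCAT.accessURL, URIRef("http://example.org/data/seaice.nc")))

print(g.serialize(format="turtle"))
```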

  18. SciSpark's SRDD : A Scientific Resilient Distributed Dataset for Multidimensional Data

    NASA Astrophysics Data System (ADS)

    Palamuttam, R. S.; Wilson, B. D.; Mogrovejo, R. M.; Whitehall, K. D.; Mattmann, C. A.; McGibbney, L. J.; Ramirez, P.

    2015-12-01

    Remote sensing data and climate model output are multi-dimensional arrays of massive sizes locked away in heterogeneous file formats (HDF5/4, NetCDF 3/4) and metadata models (HDF-EOS, CF) making it difficult to perform multi-stage, iterative science processing since each stage requires writing and reading data to and from disk. We have developed SciSpark, a robust Big Data framework, that extends Apache™ Spark for scaling scientific computations. Apache Spark improves the map-reduce implementation in Apache™ Hadoop for parallel computing on a cluster, by emphasizing in-memory computation, "spilling" to disk only as needed, and relying on lazy evaluation. Central to Spark is the Resilient Distributed Dataset (RDD), an in-memory distributed data structure that extends the functional paradigm provided by the Scala programming language. However, RDDs are ideal for tabular or unstructured data, and not for highly dimensional data. The SciSpark project introduces the Scientific Resilient Distributed Dataset (sRDD), a distributed-computing array structure which supports iterative scientific algorithms for multidimensional data. SciSpark processes data stored in NetCDF and HDF files by partitioning them across time or space and distributing the partitions among a cluster of compute nodes. We show usability and extensibility of SciSpark by implementing distributed algorithms for geospatial operations on large collections of multi-dimensional grids. In particular we address the problem of scaling an automated method for finding Mesoscale Convective Complexes. SciSpark provides a tensor interface to support the pluggability of different matrix libraries. We evaluate performance of the various matrix libraries in distributed pipelines, such as Nd4j™ and Breeze™. We detail the architecture and design of SciSpark, our efforts to integrate climate science algorithms, parallel ingest and partitioning (sharding) of A-Train satellite observations from model grids. These
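
    The example below is not SciSpark's sRDD; it only shows the general partitioning idea with stock PySpark and netCDF4: shard a NetCDF variable along its time axis and reduce a per-step statistic across the cluster. The file path and variable name are hypothetical.

```python
# Sketch: partition a NetCDF variable by time step and reduce per-step means with PySpark.
from pyspark.sql import SparkSession
from netCDF4 import Dataset

NC_PATH = "/data/precip_monthly.nc"   # hypothetical NetCDF file on shared storage
VAR = "precip"                        # hypothetical variable name

def mean_of_timestep(t):
    """Open the file on the worker and average one time slice."""
    with Dataset(NC_PATH) as nc:
        return t, float(nc.variables[VAR][t, :, :].mean())

spark = SparkSession.builder.appName("netcdf-partition-demo").getOrCreate()
with Dataset(NC_PATH) as nc:
    n_steps = nc.variables[VAR].shape[0]

means = (spark.sparkContext
         .parallelize(range(n_steps), numSlices=8)   # shard the time axis
         .map(mean_of_timestep)
         .collect())
print(means[:5])
spark.stop()
```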

  19. Exploring the influence of social activity on scientific career

    NASA Astrophysics Data System (ADS)

    Xie, Zonglin; Xie, Zheng; Li, Jianping; Yang, Qian

    2018-06-01

    For researchers, does activity in academic society influence their careers? In scientometrics, the activity can be expressed through the number of collaborators and scientific careers through the number of publications and citations of authors. We provide empirical evidence from four datasets of representative journals and explore the correlations between each two of the three indices. By using a hypothetical extraction method, we divide authors into patterns which can reflect the different extent of preference for social activity, according to their contributions to the correlation between the number of collaborators and that of papers. Furthermore, we choose two of the patterns as a sociable one and an unsociable one and then compare both of the expected value and the distribution of publications and citations for authors between sociable pattern and unsociable pattern. Finally, we draw a conclusion that social activity could be favorable for authors to promote academic outcomes and obtain recognition.

  20. Principles of Dataset Versioning: Exploring the Recreation/Storage Tradeoff

    PubMed Central

    Bhattacherjee, Souvik; Chavan, Amit; Huang, Silu; Deshpande, Amol; Parameswaran, Aditya

    2015-01-01

    The relative ease of collaborative data science and analysis has led to a proliferation of many thousands or millions of versions of the same datasets in many scientific and commercial domains, acquired or constructed at various stages of data analysis across many users, and often over long periods of time. Managing, storing, and recreating these dataset versions is a non-trivial task. The fundamental challenge here is the storage-recreation trade-off: the more storage we use, the faster it is to recreate or retrieve versions, while the less storage we use, the slower it is to recreate or retrieve versions. Despite the fundamental nature of this problem, there has been surprisingly little work on it. In this paper, we study this trade-off in a principled manner: we formulate six problems under various settings, trading off these quantities in various ways, demonstrate that most of the problems are intractable, and propose a suite of inexpensive heuristics drawing from techniques in the delay-constrained scheduling and spanning-tree literature to solve these problems. We have built a prototype version management system that aims to serve as a foundation to our DataHub system for facilitating collaborative data science. We demonstrate, via extensive experiments, that our proposed heuristics provide efficient solutions in practical dataset versioning scenarios. PMID:28752014
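
    As a toy illustration of the storage side of the storage-recreation trade-off (not the paper's six formulations or heuristics), the sketch below picks which versions to store as deltas by taking a minimum spanning tree of a version graph weighted by delta sizes. Note that this minimises storage while ignoring recreation cost, which is exactly the tension the paper's heuristics balance.

```python
# Toy version graph: nodes are dataset versions, edge weights are delta sizes (MB).
import networkx as nx

G = nx.Graph()
full_sizes = {"v1": 100, "v2": 110, "v3": 120, "v4": 125}   # placeholder full sizes in MB
G.add_weighted_edges_from([
    ("v1", "v2", 12),   # size of the delta between v1 and v2, etc.
    ("v2", "v3", 15),
    ("v1", "v3", 30),
    ("v3", "v4", 8),
    ("v2", "v4", 25),
])

# Minimum spanning tree = cheapest set of deltas that still reaches every version.
mst = nx.minimum_spanning_tree(G)
storage = full_sizes["v1"] + sum(d["weight"] for _, _, d in mst.edges(data=True))
print("store v1 in full plus deltas:", sorted(mst.edges()), f"~{storage} MB total")
```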

  1. The Visual Geophysical Exploration Environment: A Multi-dimensional Scientific Visualization

    NASA Astrophysics Data System (ADS)

    Pandya, R. E.; Domenico, B.; Murray, D.; Marlino, M. R.

    2003-12-01

    The Visual Geophysical Exploration Environment (VGEE) is an online learning environment designed to help undergraduate students understand fundamental Earth system science concepts. The guiding principle of the VGEE is the importance of hands-on interaction with scientific visualization and data. The VGEE consists of four elements: 1) an online, inquiry-based curriculum for guiding student exploration; 2) a suite of El Nino-related data sets adapted for student use; 3) a learner-centered interface to a scientific visualization tool; and 4) a set of concept models (interactive tools that help students understand fundamental scientific concepts). There are two key innovations featured in this interactive poster session. One is the integration of concept models and the visualization tool. Concept models are simple, interactive, Java-based illustrations of fundamental physical principles. We developed eight concept models and integrated them into the visualization tool to enable students to probe data. The ability to probe data using a concept model addresses the common problem of transfer: the difficulty students have in applying theoretical knowledge to everyday phenomenon. The other innovation is a visualization environment and data that are discoverable in digital libraries, and installed, configured, and used for investigations over the web. By collaborating with the Integrated Data Viewer developers, we were able to embed a web-launchable visualization tool and access to distributed data sets into the online curricula. The Thematic Real-time Environmental Data Distributed Services (THREDDS) project is working to provide catalogs of datasets that can be used in new VGEE curricula under development. By cataloging this curricula in the Digital Library for Earth System Education (DLESE), learners and educators can discover the data and visualization tool within a framework that guides their use.

  2. The Role of Datasets on Scientific Influence within Conflict Research.

    PubMed

    Van Holt, Tracy; Johnson, Jeffery C; Moates, Shiloh; Carley, Kathleen M

    2016-01-01

    We inductively tested if a coherent field of inquiry in human conflict research emerged in an analysis of published research involving "conflict" in the Web of Science (WoS) over a 66-year period (1945-2011). We created a citation network that linked the 62,504 WoS records and their cited literature. We performed a critical path analysis (CPA), a specialized social network analysis on this citation network (~1.5 million works), to highlight the main contributions in conflict research and to test if research on conflict has in fact evolved to represent a coherent field of inquiry. Out of this vast dataset, 49 academic works were highlighted by the CPA, suggesting a coherent field of inquiry, which means that researchers in the field acknowledge seminal contributions and share a common knowledge base. Other conflict concepts that were also analyzed, such as interpersonal conflict or conflict among pharmaceuticals, did not form their own CP. A single path formed, meaning that there was a cohesive set of ideas that built upon previous research. This is in contrast to a main path analysis of conflict from 1957-1971, where ideas did not persist: multiple paths existed and died or emerged, reflecting a lack of scientific coherence (Carley, Hummon, and Harty, 1993). The critical path consisted of a number of key features: 1) Concepts that built throughout include the notion that resource availability drives conflict, which emerged in the 1960s-1990s and continued on until 2011. More recent intrastate studies that focused on inequalities emerged from interstate studies on the democracy of peace earlier on the path. 2) Recent research on the path focused on forecasting conflict, which depends on well-developed metrics and theories to model. 3) We used keyword analysis to independently show how the CP was topically linked (i.e., through democracy, modeling, resources, and geography). Publicly available conflict datasets developed early on helped shape the
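
    A simplified flavour of main-path analysis on a toy citation DAG is sketched below: edges are weighted by search path counts and the heaviest chain is followed greedily. This only illustrates the general technique, not the authors' CPA implementation or dataset.

```python
# Toy citation DAG: an edge u -> v means "u is cited by v" (knowledge flows u -> v).
import networkx as nx

G = nx.DiGraph([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("C", "E")])
order = list(nx.topological_sort(G))

# Number of paths reaching each node from any source (in-degree 0).
paths_from_source = {n: 1 if G.in_degree(n) == 0 else 0 for n in G}
for n in order:
    for succ in G.successors(n):
        paths_from_source[succ] += paths_from_source[n]

# Number of paths from each node to any sink (out-degree 0).
paths_to_sink = {n: 1 if G.out_degree(n) == 0 else 0 for n in G}
for n in reversed(order):
    for pred in G.predecessors(n):
        paths_to_sink[pred] += paths_to_sink[n]

# Search path count (SPC) weight of every citation edge.
spc = {(u, v): paths_from_source[u] * paths_to_sink[v] for u, v in G.edges}

# Greedy main path: start at the heaviest source edge, keep taking the heaviest outgoing edge.
u, v = max(((u, v) for u, v in spc if G.in_degree(u) == 0), key=spc.get)
main_path = [u, v]
while G.out_degree(v) > 0:
    v = max(G.successors(v), key=lambda w: spc[(main_path[-1], w)])
    main_path.append(v)
print("main path:", main_path)
```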

  3. Nanocubes for real-time exploration of spatiotemporal datasets.

    PubMed

    Lins, Lauro; Klosowski, James T; Scheidegger, Carlos

    2013-12-01

    Consider real-time exploration of large multidimensional spatiotemporal datasets with billions of entries, each defined by a location, a time, and other attributes. Are certain attributes correlated spatially or temporally? Are there trends or outliers in the data? Answering these questions requires aggregation over arbitrary regions of the domain and attributes of the data. Many relational databases implement the well-known data cube aggregation operation, which in a sense precomputes every possible aggregate query over the database. Data cubes are sometimes assumed to take a prohibitively large amount of space, and to consequently require disk storage. In contrast, we show how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries; we call this data structure a nanocube. We present algorithms to compute and query a nanocube, and show how it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots. When compared to exact visualizations created by scanning an entire dataset, nanocube plots have bounded screen error across a variety of scales, thanks to a hierarchical structure in space and time. We demonstrate the effectiveness of our technique on a variety of real-world datasets, and present memory, timing, and network bandwidth measurements. We find that the timings for the queries in our examples are dominated by network and user-interaction latencies.
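
    The sketch below is not the nanocube data structure itself; it only shows the kind of aggregation a nanocube accelerates, using pandas to pre-aggregate synthetic events into a coarse (space, time, attribute) cube and answer a region/time-range query from it. All column names and values are hypothetical.

```python
# Pre-aggregate synthetic events into a coarse cube and answer a range query from it.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
events = pd.DataFrame({
    "lat": rng.uniform(30, 50, 100_000),
    "lon": rng.uniform(-120, -70, 100_000),
    "hour": rng.integers(0, 24, 100_000),
    "device": rng.choice(["ios", "android"], 100_000),
})

# Cube keyed by (1-degree lat bin, 1-degree lon bin, hour, device).
events["lat_bin"] = (events["lat"] // 1).astype(int)
events["lon_bin"] = (events["lon"] // 1).astype(int)
cube = events.groupby(["lat_bin", "lon_bin", "hour", "device"]).size()

# Example query: android events between 40-45N, 100-91W, 18:00-23:00.
idx = pd.IndexSlice
answer = cube.loc[idx[40:44, -100:-91, 18:23, "android"]].sum()
print(int(answer))
```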

  4. ConTour: Data-Driven Exploration of Multi-Relational Datasets for Drug Discovery.

    PubMed

    Partl, Christian; Lex, Alexander; Streit, Marc; Strobelt, Hendrik; Wassermann, Anne-Mai; Pfister, Hanspeter; Schmalstieg, Dieter

    2014-12-01

    Large scale data analysis is nowadays a crucial part of drug discovery. Biologists and chemists need to quickly explore and evaluate potentially effective yet safe compounds based on many datasets that are in relationship with each other. However, there is a lack of tools that support them in these processes. To remedy this, we developed ConTour, an interactive visual analytics technique that enables the exploration of these complex, multi-relational datasets. At its core ConTour lists all items of each dataset in a column. Relationships between the columns are revealed through interaction: selecting one or multiple items in one column highlights and re-sorts the items in other columns. Filters based on relationships enable drilling down into the large data space. To identify interesting items in the first place, ConTour employs advanced sorting strategies, including strategies based on connectivity strength and uniqueness, as well as sorting based on item attributes. ConTour also introduces interactive nesting of columns, a powerful method to show the related items of a child column for each item in the parent column. Within the columns, ConTour shows rich attribute data about the items as well as information about the connection strengths to other datasets. Finally, ConTour provides a number of detail views, which can show items from multiple datasets and their associated data at the same time. We demonstrate the utility of our system in case studies conducted with a team of chemical biologists, who investigate the effects of chemical compounds on cells and need to understand the underlying mechanisms.

  5. Handwritten mathematical symbols dataset.

    PubMed

    Chajri, Yassine; Bouikhalene, Belaid

    2016-06-01

    Due to the technological advances in recent years, paper scientific documents are used less and less. Thus, the trend in the scientific community to use digital documents has increased considerably. Among these documents, there are scientific documents and more specifically mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set-symbols, comparison symbols, delimiters, etc.

  6. The Role of Datasets on Scientific Influence within Conflict Research

    PubMed Central

    Van Holt, Tracy; Johnson, Jeffery C.; Moates, Shiloh; Carley, Kathleen M.

    2016-01-01

    We inductively tested if a coherent field of inquiry in human conflict research emerged in an analysis of published research involving “conflict” in the Web of Science (WoS) over a 66-year period (1945–2011). We created a citation network that linked the 62,504 WoS records and their cited literature. We performed a critical path analysis (CPA), a specialized social network analysis on this citation network (~1.5 million works), to highlight the main contributions in conflict research and to test if research on conflict has in fact evolved to represent a coherent field of inquiry. Out of this vast dataset, 49 academic works were highlighted by the CPA, suggesting a coherent field of inquiry, which means that researchers in the field acknowledge seminal contributions and share a common knowledge base. Other conflict concepts that were also analyzed, such as interpersonal conflict or conflict among pharmaceuticals, did not form their own CP. A single path formed, meaning that there was a cohesive set of ideas that built upon previous research. This is in contrast to a main path analysis of conflict from 1957–1971, where ideas did not persist: multiple paths existed and died or emerged, reflecting a lack of scientific coherence (Carley, Hummon, and Harty, 1993). The critical path consisted of a number of key features: 1) Concepts that built throughout include the notion that resource availability drives conflict, which emerged in the 1960s-1990s and continued on until 2011. More recent intrastate studies that focused on inequalities emerged from interstate studies on the democracy of peace earlier on the path. 2) Recent research on the path focused on forecasting conflict, which depends on well-developed metrics and theories to model. 3) We used keyword analysis to independently show how the CP was topically linked (i.e., through democracy, modeling, resources, and geography). Publicly available conflict datasets developed early on helped

  7. Gathering and Exploring Scientific Knowledge in Pharmacovigilance

    PubMed Central

    Lopes, Pedro; Nunes, Tiago; Campos, David; Furlong, Laura Ines; Bauer-Mehren, Anna; Sanz, Ferran; Carrascosa, Maria Carmen; Mestres, Jordi; Kors, Jan; Singh, Bharat; van Mulligen, Erik; Van der Lei, Johan; Diallo, Gayo; Avillach, Paul; Ahlberg, Ernst; Boyer, Scott; Diaz, Carlos; Oliveira, José Luís

    2013-01-01

    Pharmacovigilance plays a key role in the healthcare domain through the assessment, monitoring and discovery of interactions amongst drugs and their effects in the human organism. However, technological advances in this field have been slowing down over the last decade due to miscellaneous legal, ethical and methodological constraints. Pharmaceutical companies started to realize that collaborative and integrative approaches boost current drug research and development processes. Hence, new strategies are required to connect researchers, datasets, biomedical knowledge and analysis algorithms, allowing them to fully exploit the true value behind state-of-the-art pharmacovigilance efforts. This manuscript introduces a new platform directed towards pharmacovigilance knowledge providers. This system, based on a service-oriented architecture, adopts a plugin-based approach to solve fundamental pharmacovigilance software challenges. With the wealth of collected clinical and pharmaceutical data, it is now possible to connect knowledge providers’ analysis and exploration algorithms with real data. As a result, new strategies allow a faster identification of high-risk interactions between marketed drugs and adverse events, and enable the automated uncovering of scientific evidence behind them. With this architecture, the pharmacovigilance field has a new platform to coordinate large-scale drug evaluation efforts in a unique ecosystem, publicly available at http://bioinformatics.ua.pt/euadr/. PMID:24349421

  8. Handwritten mathematical symbols dataset

    PubMed Central

    Chajri, Yassine; Bouikhalene, Belaid

    2016-01-01

    Due to the technological advances in recent years, paper scientific documents are used less and less. Thus, the trend in the scientific community to use digital documents has increased considerably. Among these documents, there are scientific documents and more specifically mathematics documents. In this context, we present our own dataset of handwritten mathematical symbols composed of 10,379 images. This dataset gathers Arabic characters, Latin characters, Arabic numerals, Latin numerals, arithmetic operators, set-symbols, comparison symbols, delimiters, etc. PMID:27006975

  9. SPICE: exploration and analysis of post-cytometric complex multivariate datasets.

    PubMed

    Roederer, Mario; Nozzi, Joshua L; Nason, Martha C

    2011-02-01

    Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.
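
    As a generic illustration of a nonparametric comparison of multi-component measurements between groups (not the specific statistic defined in this paper), the sketch below runs a permutation test on synthetic component-fraction profiles.

```python
# Permutation test comparing two groups of multi-component profiles (synthetic data).
import numpy as np

rng = np.random.default_rng(5)
# Each row: fractions of 4 functional categories for one specimen (rows sum to 1).
group_a = rng.dirichlet([5, 3, 1, 1], size=12)
group_b = rng.dirichlet([3, 3, 2, 2], size=12)

def stat(a, b):
    """Distance between groups: summed absolute difference of mean component fractions."""
    return np.abs(a.mean(axis=0) - b.mean(axis=0)).sum()

observed = stat(group_a, group_b)
pooled = np.vstack([group_a, group_b])
n_a = len(group_a)

perm_stats = []
for _ in range(10_000):
    rng.shuffle(pooled)                       # permute specimen labels
    perm_stats.append(stat(pooled[:n_a], pooled[n_a:]))

p_value = (np.sum(np.array(perm_stats) >= observed) + 1) / (len(perm_stats) + 1)
print(f"observed = {observed:.3f}, permutation p = {p_value:.4f}")
```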

  10. Scientific exploration of the moon

    NASA Technical Reports Server (NTRS)

    El-Baz, F.

    1979-01-01

    The paper reviews efforts undertaken to explore the moon and the results obtained, noting that such efforts have involved a successful interdisciplinary approach to solving a number of scientific problems. Attention is given to the interactions of astronomers, cartographers, geologists, geochemists, geophysicists, physicists, mathematicians and engineers. Earth based remote sensing and unmanned spacecraft such as the Ranger and Surveyor programs are discussed. Emphasis is given to the manned Apollo missions and the results obtained. Finally, the information gathered by these missions is reviewed with regards to how it has increased understanding of the moon, and future exploration is considered.

  11. Earth Exploration Toolbook Workshops: Helping Teachers and Students Analyze Web-based Scientific Data

    NASA Astrophysics Data System (ADS)

    McAuliffe, C.; Ledley, T.; Dahlman, L.; Haddad, N.

    2007-12-01

    One of the challenges faced by Earth science teachers, particularly in K-12 settings, is that of connecting scientific research to classroom experiences. Helping teachers and students analyze Web-based scientific data is one way to bring scientific research to the classroom. The Earth Exploration Toolbook (EET) was developed as an online resource to accomplish precisely that. The EET consists of chapters containing step-by-step instructions for accessing Web-based scientific data and for using a software analysis tool to explore issues or concepts in science, technology, and mathematics. For example, in one EET chapter, users download Earthquake data from the USGS and bring it into a geographic information system (GIS), analyzing factors affecting the distribution of earthquakes. The goal of the EET Workshops project is to provide professional development that enables teachers to incorporate Web-based scientific data and analysis tools in ways that meet their curricular needs. In the EET Workshops project, Earth science teachers participate in a pair of workshops that are conducted in a combined teleconference and Web-conference format. In the first workshop, the EET Data Analysis Workshop, participants are introduced to the National Science Digital Library (NSDL) and the Digital Library for Earth System Education (DLESE). They also walk through an Earth Exploration Toolbook (EET) chapter and discuss ways to use Earth science datasets and tools with their students. In a follow-up second workshop, the EET Implementation Workshop, teachers share how they used these materials in the classroom by describing the projects and activities that they carried out with students. The EET Workshops project offers unique and effective professional development. Participants work at their own Internet-connected computers, and dial into a toll-free group teleconference for step-by-step facilitation and interaction. They also receive support via Elluminate, a Web

  12. Redesigning the DOE Data Explorer to embed dataset relationships at the point of search and to reflect landing page organization

    DOE PAGES

    Studwell, Sara; Robinson, Carly; Elliott, Jannean

    2017-04-04

    Scientific research is producing ever-increasing amounts of data. Organizing and reflecting relationships across data collections, datasets, publications, and other research objects are essential functionalities of the modern science environment, yet challenging to implement. Landing pages are often used for providing ‘big picture’ contextual frameworks for datasets and data collections, and many large-volume data holders are utilizing them in thoughtful, creative ways. The benefits of their organizational efforts, however, are not realized unless the user eventually sees the landing page at the end point of their search. What if that organization and ‘big picture’ context could benefit the user at the beginning of the search? That is a challenging approach, but the Department of Energy’s (DOE) Office of Scientific and Technical Information (OSTI) is redesigning the database functionality of the DOE Data Explorer (DDE) with that goal in mind. Phase I is focused on redesigning the DDE database to leverage relationships between two existing distinct populations in DDE, data Projects and individual Datasets, and then adding a third intermediate population, data Collections. Mapped, structured linkages, designed to show user relationships, will allow users to make informed search choices. These linkages will be sustainable and scalable, created automatically with the use of new metadata fields and existing authorities. Phase II will study selected DOE Data ID Service clients, analyzing how their landing pages are organized, and how that organization might be used to improve DDE search capabilities. At the heart of both phases is the realization that adding more metadata information for cross-referencing may require additional effort for data scientists. Finally, OSTI’s approach seeks to leverage existing metadata and landing page intelligence without imposing an additional burden on the data creators.

  13. Redesigning the DOE Data Explorer to embed dataset relationships at the point of search and to reflect landing page organization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Studwell, Sara; Robinson, Carly; Elliott, Jannean

    Scientific research is producing ever-increasing amounts of data. Organizing and reflecting relationships across data collections, datasets, publications, and other research objects are essential functionalities of the modern science environment, yet challenging to implement. Landing pages are often used for providing ‘big picture’ contextual frameworks for datasets and data collections, and many large-volume data holders are utilizing them in thoughtful, creative ways. The benefits of their organizational efforts, however, are not realized unless the user eventually sees the landing page at the end point of their search. What if that organization and ‘big picture’ context could benefit the user at the beginning of the search? That is a challenging approach, but the Department of Energy’s (DOE) Office of Scientific and Technical Information (OSTI) is redesigning the database functionality of the DOE Data Explorer (DDE) with that goal in mind. Phase I is focused on redesigning the DDE database to leverage relationships between two existing distinct populations in DDE, data Projects and individual Datasets, and then adding a third intermediate population, data Collections. Mapped, structured linkages, designed to show user relationships, will allow users to make informed search choices. These linkages will be sustainable and scalable, created automatically with the use of new metadata fields and existing authorities. Phase II will study selected DOE Data ID Service clients, analyzing how their landing pages are organized, and how that organization might be used to improve DDE search capabilities. At the heart of both phases is the realization that adding more metadata information for cross-referencing may require additional effort for data scientists. Finally, OSTI’s approach seeks to leverage existing metadata and landing page intelligence without imposing an additional burden on the data creators.

  14. VisIVO: A Library and Integrated Tools for Large Astrophysical Dataset Exploration

    NASA Astrophysics Data System (ADS)

    Becciani, U.; Costa, A.; Ersotelos, N.; Krokos, M.; Massimino, P.; Petta, C.; Vitello, F.

    2012-09-01

    VisIVO provides an integrated suite of tools and services that can be used in many scientific fields. VisIVO development started in the Virtual Observatory framework. VisIVO allows users to meaningfully visualize highly complex, large-scale datasets and create movies of these visualizations based on distributed infrastructures. VisIVO supports high-performance, multi-dimensional visualization of large-scale astrophysical datasets. Users can rapidly obtain meaningful visualizations while preserving full and intuitive control of the relevant parameters. VisIVO consists of VisIVO Desktop - a stand-alone application for interactive visualization on standard PCs, VisIVO Server - a platform for high-performance visualization, VisIVO Web - a custom-designed web portal, VisIVO Smartphone - an application to exploit the VisIVO Server functionality, and the latest VisIVO feature: VisIVO Library allows a job running on a computational system (grid, HPC, etc.) to produce movies directly with the code internal data arrays without the need to produce intermediate files. This is particularly important when running on large computational facilities, where the user wants to have a look at the results during the data production phase. For example, in grid computing facilities, images can be produced directly in the grid catalogue while the user code is running in a system that cannot be directly accessed by the user (a worker node). The deployment of VisIVO on the DG and gLite is carried out with the support of the EDGI and EGI-Inspire projects. Depending on the structure and size of datasets under consideration, the data exploration process could take several hours of CPU for creating customized views, and the production of movies could potentially last several days. For this reason an MPI parallel version of VisIVO could play a fundamental role in increasing performance, e.g. it could be automatically deployed on nodes that are MPI aware. A central concept in our development is thus to

  15. Benchmark Dataset for Whole Genome Sequence Compression.

    PubMed

    C L, Biji; S Nair, Achuthsankar

    2017-01-01

    Research in DNA data compression lacks a standard dataset for testing compression tools specific to DNA. This paper argues that the current state of achievement in DNA compression cannot be benchmarked in the absence of a scientifically compiled whole-genome sequence dataset, and it proposes such a benchmark dataset compiled using a multistage sampling procedure. Taking the genome sequences of organisms available in the National Center for Biotechnology Information (NCBI) as the universe, the proposed dataset selects 1,105 prokaryotes, 200 plasmids, 164 viruses, and 65 eukaryotes. The paper reports the results of running three established tools on the newly compiled dataset and shows that their strengths and weaknesses become evident only when the comparison is based on the scientifically compiled benchmark dataset. The sample dataset and the respective links are available at https://sourceforge.net/projects/benchmarkdnacompressiondataset/.
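
    As a simplified illustration of one stage of such a sampling procedure (the accession IDs and group sizes of the universe below are invented; only the per-group sample sizes mirror the benchmark's stated composition), a stratified draw per organism group might look like this:

      # Sketch of a stratified sampling step (a simplification of a multistage
      # procedure): draw a fixed-size sample per organism group.
      import random

      random.seed(42)
      # Hypothetical universe: accession IDs grouped by organism type.
      universe = {
          "prokaryotes": [f"PROK_{i}" for i in range(20000)],
          "plasmids":    [f"PLAS_{i}" for i in range(5000)],
          "viruses":     [f"VIR_{i}"  for i in range(8000)],
          "eukaryotes":  [f"EUK_{i}"  for i in range(1500)],
      }
      # Per-group sample sizes mirroring the benchmark's composition.
      sizes = {"prokaryotes": 1105, "plasmids": 200, "viruses": 164, "eukaryotes": 65}

      benchmark = {group: random.sample(ids, sizes[group]) for group, ids in universe.items()}
      print({group: len(sample) for group, sample in benchmark.items()})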

  16. Drilling informatics: data-driven challenges of scientific drilling

    NASA Astrophysics Data System (ADS)

    Yamada, Yasuhiro; Kyaw, Moe; Saito, Sanny

    2017-04-01

    The primary aim of scientific drilling is to understand precisely the dynamic nature of the Earth. This is why we investigate subsurface materials (rock and fluid, including the microbial community) that exist under particular environmental conditions. Doing so requires sample collection, the production of analytical data from those samples, and in-situ measurements at boreholes. Currently available data come from cores, cuttings, mud logging, geophysical logging, and exploration geophysics, but these datasets are difficult to integrate because of their differing types and scales. We are now producing additional datasets to fill the gaps between the existing data, extracting more information from these datasets, and finally integrating that information. In particular, drilling parameters are very useful datasets for deriving geomechanical properties. We believe such an approach, 'drilling informatics', is the most appropriate way to obtain a comprehensive and dynamic picture of our scientific targets, such as seismogenic fault zones and the Moho discontinuity. This presentation introduces our initiative and the current achievements of drilling informatics.

  17. Adventures in supercomputing: Scientific exploration in an era of change

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gentry, E.; Helland, B.; Summers, B.

    1997-11-01

    Students deserve the opportunity to explore the world of science surrounding them. Therefore it is important that scientific exploration and investigation be a part of each student's educational career. The Department of Energy's Adventures in Supercomputing (AiS) program takes students beyond mere scientific literacy to a rich embodiment of scientific exploration. AiS provides today's science and math students with a greater opportunity to investigate science problems, propose solutions, explore different methods of solving a problem, organize their work into a technical paper, and present their results. Students learn at different rates and in different ways. Science classes with students having varying learning styles and levels of achievement have always been a challenge for teachers. The AiS "hands-on, minds-on" project-based method of teaching science meets the challenge of this diversity head on. AiS uses the development of student-chosen projects as the means of achieving a lifelong enthusiasm for scientific proficiency. One goal of AiS is to emulate the research that takes place in the everyday environment of scientists. Students work in teams and often collaborate with students nationwide. With the help of mentors from the academic and scientific community, students pose a problem in science, investigate possible solutions, design a mathematical and computational model for the problem, exercise the model to achieve results, and evaluate the implications of the results. The students then have the opportunity to present the project to their peers, teachers, and scientists. Using this inquiry-based technique, students learn more than science skills; they learn to reason and think, going well beyond the National Science Education Standards. The teacher becomes a resource person, actively working together with the students in their quest for scientific knowledge.

  18. Interactive Exploration on Large Genomic Datasets.

    PubMed

    Tu, Eric

    2016-01-01

    The prevalence of large genomics datasets has made the need to explore this data more important. Large sequencing projects like the 1000 Genomes Project [1], which reconstructed the genomes of 2,504 individuals sampled from 26 populations, have produced over 200TB of publicly available data. Meanwhile, existing genomic visualization tools have been unable to scale with the growing amount of larger, more complex data. This difficulty is acute when viewing large regions (over 1 megabase, or 1,000,000 bases of DNA), or when concurrently viewing multiple samples of data. While genomic processing pipelines have shifted towards using distributed computing techniques, such as with ADAM [4], genomic visualization tools have not. In this work we present Mango, a scalable genome browser built on top of ADAM that can run both locally and on a cluster. Mango combines different optimizations in a single application to drive novel genomic visualization techniques over terabytes of genomic data. By building visualization on top of a distributed processing pipeline, we can perform visualization queries over large regions that are not possible with current tools and decrease the time needed to view large datasets. Mango is part of the Big Data Genomics project at the University of California, Berkeley [25] and is published under the Apache 2 license. Mango is available at https://github.com/bigdatagenomics/mango.

  19. The 3D widgets for exploratory scientific visualization

    NASA Technical Reports Server (NTRS)

    Herndon, Kenneth P.; Meyer, Tom

    1995-01-01

    Computational fluid dynamics (CFD) techniques are used to simulate flows of fluids like air or water around such objects as airplanes and automobiles. These techniques usually generate very large amounts of numerical data which are difficult to understand without using graphical scientific visualization techniques. There are a number of commercial scientific visualization applications available today which allow scientists to control visualization tools via textual and/or 2D user interfaces. However, these user interfaces are often difficult to use. We believe that 3D direct-manipulation techniques for interactively controlling visualization tools will provide opportunities for powerful and useful interfaces with which scientists can more effectively explore their datasets. A few systems have been developed which use these techniques. In this paper, we will present a variety of 3D interaction techniques for manipulating parameters of visualization tools used to explore CFD datasets, and discuss in detail various techniques for positioning tools in a 3D scene.

  20. Scientific Opportunities with ispace, a Lunar Exploration Company

    NASA Astrophysics Data System (ADS)

    Acierno, K. T.

    2016-11-01

    This presentation introduces ispace, a Tokyo-based lunar exploration company. Technology applied to the Team Hakuto Google Lunar XPRIZE mission will be described. Finally, it will discuss how developing low cost and mass efficient rovers can support scientific opportunities.

  1. Proceedings: Fourth Workshop on Mining Scientific Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, C

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text mining, and web mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, and computational fluid dynamics. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of mining scientific datasets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of

  2. Towards AN Integrated Scientific and Social Case for Human Space Exploration

    NASA Astrophysics Data System (ADS)

    Crawford, I. A.

    2004-06-01

    I will argue that an ambitious programme of human space exploration, involving a return to the Moon, and eventually human missions to Mars, will add greatly to human knowledge. Gathering such knowledge is the primary aim of science, but science’s compartmentalisation into isolated academic disciplines tends to obscure the overall strength of the scientific case. Any consideration of the scientific arguments for human space exploration must therefore take a holistic view, and integrate the potential benefits over the entire spectrum of human knowledge. Moreover, science is only one thread in a much larger overall case for human space exploration. Other threads include economic, industrial, educational, geopolitical and cultural benefits. Any responsibly formulated public space policy must weigh all of these factors before deciding whether or not an investment in human space activities is scientifically and socially desirable.

  3. The Role of the Spacecraft Operator in Scientific Exploration

    NASA Astrophysics Data System (ADS)

    Love, S. G.

    2011-03-01

    Pilot and flight engineer crew members can improve scientific exploration missions and effectively support field work that they may not understand by contributing leadership, teamwork, communication, and operational thinking skills.

  4. Space Exploration as a Human Enterprise: The Scientific Interest

    ERIC Educational Resources Information Center

    Sagan, Carl

    1973-01-01

    Presents examples which illustrate the importance of space exploration in diverse aspects of scientific knowledge. Indicates that human beings are today not wise enough to anticipate the practical benefits of planetary studies. (CC)

  5. Scientific Objectives of China-Russia Joint Mars Exploration Program YH-1

    NASA Astrophysics Data System (ADS)

    Wu, Ji; Zhu, Guang-Wu; Zhao, Hua; Wang, Chi; Li, Lei; Sun, Yue-Qiang; Guo, Wei; Huang, Cheng-Li

    2010-04-01

    Compared with the other planets, Mars is the most similar to the Earth and the most likely place to find extraterrestrial life, and it is therefore of special interest to human beings. In recent years, several countries have launched Mars probes and announced manned Mars exploration programs. China has become the fifth country in the world to independently launch artificial satellites, and the third country able to carry out an independent manned space program. However, China is just at the beginning of deep space exploration. In 2007, China and Russia signed an agreement on a joint Mars exploration program that will send the Chinese micro-satellite Yinghuo-1 (YH-1) into Mars orbit. Once YH-1 enters its orbit, it will carry out its own exploration as well as joint exploration with the Russian Phobos-Grunt probe. This paper summarizes the scientific background and objectives of YH-1 and briefly describes the payloads intended to realize these scientific objectives. In addition, the main exploration tasks of YH-1 and a preliminary outlook on its exploration results are also given.

  6. Data Recommender: An Alternative Way to Discover Open Scientific Datasets

    NASA Astrophysics Data System (ADS)

    Klump, J. F.; Devaraju, A.; Williams, G.; Hogan, D.; Davy, R.; Page, J.; Singh, D.; Peterson, N.

    2017-12-01

    Over the past few years, institutions and government agencies have adopted policies to openly release their data, which has resulted in huge amounts of open data becoming available on the web. When trying to discover these data, users face two challenges: an overload of choice and the limitations of existing data search tools. On the one hand, there are too many datasets to choose from, so users need to spend considerable effort to find the datasets most relevant to their research. On the other hand, data portals commonly offer keyword and faceted search, which depend fully on the user's query to search and rank relevant datasets. Consequently, keyword and faceted search may return loosely related or irrelevant results, even though those results contain the query terms. They may also return highly specific results that depend more on how well the metadata was authored, and they do not account well for variance in metadata due to differences in author styles and preferences. The top-ranked results may also come from the same data collection, so users are unlikely to discover new and interesting datasets. These search modes mainly suit users who can express their information needs in terms of the structure and terminology of the data portals, but may pose a challenge otherwise. These challenges show that we need a solution that delivers the most relevant (i.e., similar and serendipitous) datasets to users, beyond the existing search functionalities of the portals. A recommender system is an information filtering system that presents users with relevant and interesting content based on their context and preferences. Delivering data recommendations to users can make data discovery easier and, as a result, may enhance user engagement with the portal. We developed a hybrid data recommendation approach for the CSIRO Data Access Portal. The approach leverages existing recommendation techniques (e.g., content-based filtering and item co-occurrence) to produce
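
    One ingredient of such a hybrid recommender, item co-occurrence, can be sketched as follows (the session data, dataset names, and helper function are invented for illustration and are not the CSIRO portal's implementation):

      # Minimal sketch of item co-occurrence recommendation: datasets that are
      # often accessed together are suggested together.
      from collections import defaultdict
      from itertools import combinations

      # Hypothetical access logs: each entry is the set of datasets one user viewed.
      sessions = [
          {"rainfall_2016", "soil_moisture", "river_gauges"},
          {"rainfall_2016", "soil_moisture"},
          {"ocean_colour", "sst_daily"},
          {"rainfall_2016", "river_gauges"},
      ]

      cooccur = defaultdict(int)
      for s in sessions:
          for a, b in combinations(sorted(s), 2):
              cooccur[(a, b)] += 1
              cooccur[(b, a)] += 1

      def recommend(dataset, top_n=3):
          """Rank other datasets by how often they co-occur with `dataset`."""
          scores = {b: n for (a, b), n in cooccur.items() if a == dataset}
          return sorted(scores, key=scores.get, reverse=True)[:top_n]

      print(recommend("rainfall_2016"))   # e.g. ['river_gauges', 'soil_moisture']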

  7. Design of FastQuery: How to Generalize Indexing and Querying System for Scientific Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Jerry; Wu, Kesheng

    2011-04-18

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies such as FastBit are critical for facilitating interactive exploration of large datasets. These technologies rely on adding auxiliary information to existing datasets to accelerate query processing. To use these indices, we need to match the relational data model used by the indexing systems with the array data model used by most scientific data, and to provide an efficient input and output layer for reading and writing the indices. In this work, we present a flexible design that can be easily applied to most scientific data formats. We demonstrate this flexibility by applying it to two of the most commonly used scientific data formats, HDF5 and NetCDF. We present two case studies using simulation data from the particle accelerator and climate simulation communities. To demonstrate the effectiveness of the new design, we also present a detailed performance study using both synthetic and real scientific workloads.
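
    The underlying idea of answering range queries over array data via auxiliary index information, rather than a full scan, can be sketched as follows (this is illustrative only and does not use the FastQuery/FastBit API; the file and variable names are invented, and a sort-order index stands in for the bitmap indices those systems build):

      # Sketch of value indexing over HDF5 array data (not the FastQuery API).
      import h5py
      import numpy as np

      # Write a small HDF5 array standing in for simulation output.
      with h5py.File("demo.h5", "w") as f:
          f.create_dataset("energy", data=np.random.default_rng(1).random(1_000_000))

      # Build auxiliary index information: a sort order over the array values.
      with h5py.File("demo.h5", "r") as f:
          energy = f["energy"][...]
      order = np.argsort(energy)          # the "index" stored alongside the data

      # Range query "energy > 0.999" answered via the index instead of a full scan.
      lo = np.searchsorted(energy[order], 0.999, side="right")
      hits = order[lo:]                   # array coordinates of matching elements
      print(len(hits), "matching cells")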

  8. Technologies Enabling Scientific Exploration of Asteroids and Moons

    NASA Astrophysics Data System (ADS)

    Shaw, A.; Fulford, P.; Chappell, L.

    2016-12-01

    Scientific exploration of moons and asteroids is enabled by several key technologies that yield topographic information, allow excavation of subsurface materials, and allow delivery of higher-mass scientific payloads to moons and asteroids. These key technologies include lidar systems, robotics, and solar-electric propulsion spacecraft buses. Many of these technologies have applications for a variety of planetary targets. Lidar systems yield high-resolution shape models of asteroids and moons. These shape models can then be combined with radio science information to yield insight into density and internal structure. Further, lidar systems allow investigation of topographic surface features, large and small, which yields information on regolith properties. Robotic arms can be used for a variety of purposes, especially to support excavation, revealing subsurface material and acquiring material from depth for either in situ analysis or sample return. Robotic arms with built-in force sensors can also be used to gauge the strength of materials as a function of depth, yielding insight into regolith physical properties. Mobility systems allow scientific exploration of multiple sites, and also yield insight into regolith physical properties due to the interaction of wheels with regolith. High-power solar electric propulsion (SEP) spacecraft bus systems allow more science instruments to be included on missions given their ability to support greater payload mass. In addition, leveraging a cost-effective commercially-built SEP spacecraft bus can significantly reduce mission cost.

  9. Segmentation of Unstructured Datasets

    NASA Technical Reports Server (NTRS)

    Bhat, Smitha

    1996-01-01

    Datasets generated by computer simulations and experiments in Computational Fluid Dynamics tend to be extremely large and complex. It is difficult to visualize these datasets using standard techniques like Volume Rendering and Ray Casting. Object Segmentation provides a technique to extract and quantify regions of interest within these massive datasets. This thesis explores basic algorithms to extract coherent amorphous regions from two-dimensional and three-dimensional scalar unstructured grids. The techniques are applied to datasets from Computational Fluid Dynamics and from Finite Element Analysis.
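
    The basic region-extraction step, thresholding a scalar field and grouping connected cells into coherent regions, can be sketched on a regular grid (a simplification: the thesis targets unstructured grids, which require mesh connectivity rather than the regular-grid labelling used here, and the synthetic field below is invented):

      # Sketch of region extraction by thresholding plus connected-component labelling.
      import numpy as np
      from scipy import ndimage

      rng = np.random.default_rng(0)
      field = ndimage.gaussian_filter(rng.random((64, 64, 64)), sigma=3)  # smooth scalar field

      mask = field > field.mean() + field.std()      # cells belonging to "interesting" regions
      labels, n_regions = ndimage.label(mask)        # group connected cells into regions

      sizes = np.bincount(labels.ravel())[1:]        # cell counts per labelled region
      print(f"{n_regions} coherent regions; largest has {sizes.max()} cells")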

  10. PEGASO - Polar Explorer for Geomagnetic And other Scientific Observation

    NASA Astrophysics Data System (ADS)

    Romeo, G.; Di Stefano, G.; Di Felice, F.; Caprara, F.; Iarocci, A.; Peterzen, S.; Masi, S.; Spoto, D.; Ibba, R.; Musso, I.; Dragoy, P.

    The PEGASO (Polar Explorer for Geomagnetic And other Scientific Observation) program was created to conduct small experiments across many disciplines on board small stratospheric balloons. PEGASO uses very inexpensive pathfinder balloons. Stratospheric pathfinders are small balloons commonly used to explore the circumpolar upper winds and to predict the trajectories of large LDBs (Long Duration Balloons). By installing scientific instruments on pathfinders and using solar energy to power the system, we have the opportunity to explore the polar regions during the polar summer, following circular trajectories. These small stratospheric payloads have flown for 14 to 40 days, measuring the magnetic field of the polar regions by means of a 3-axis fluxgate magnetometer. The PEGASO payload uses IRIDIUM satellite telemetry (TM). A ground station communicates with one or more payloads to download scientific and housekeeping data and to send commands for ballast release, system resets, and operation of the separation system at the end of the flight. The PEGASO missions have been performed from the Svalbard islands, with the logistic collaboration of the Andoya Rocket Range, and from the Italian Antarctic base. Continuous trajectory predictions, elaborated by the Institute of Information Science and Technology (ISTI-CNR), were necessary to meet the flight safety requirements in the northern hemisphere. These light payloads (<10 kg) are realized through the cooperation of INGV and the Physics Department of "La Sapienza" University, and they have operated five times in polar areas with the sponsorship of the Italian Antarctic Program (PNRA) and the Italian Space Agency (ASI). This paper summarizes important results from these stratospheric missions.

  11. Scientific field training for human planetary exploration

    NASA Astrophysics Data System (ADS)

    Lim, D. S. S.; Warman, G. L.; Gernhardt, M. L.; McKay, C. P.; Fong, T.; Marinova, M. M.; Davila, A. F.; Andersen, D.; Brady, A. L.; Cardman, Z.; Cowie, B.; Delaney, M. D.; Fairén, A. G.; Forrest, A. L.; Heaton, J.; Laval, B. E.; Arnold, R.; Nuytten, P.; Osinski, G.; Reay, M.; Reid, D.; Schulze-Makuch, D.; Shepard, R.; Slater, G. F.; Williams, D.

    2010-05-01

    Forthcoming human planetary exploration will require increased scientific return (both in real time and post-mission), longer surface stays, greater geographical coverage, longer and more frequent EVAs, and more operational complexities than during the Apollo missions. As such, there is a need to shift the nature of astronauts' scientific capabilities to something akin to an experienced terrestrial field scientist. To achieve this aim, the authors present a case that astronaut training should include an Apollo-style curriculum based on traditional field school experiences, as well as full immersion in field science programs. Herein we propose four Learning Design Principles (LDPs) focused on optimizing astronaut learning in field science settings. The LDPs are as follows: LDP#1: Provide multiple experiences: varied field science activities will hone astronauts' abilities to adapt to novel scientific opportunities LDP#2: Focus on the learner: fostering intrinsic motivation will orient astronauts towards continuous informal learning and a quest for mastery LDP#3: Provide a relevant experience - the field site: field sites that share features with future planetary missions will increase the likelihood that astronauts will successfully transfer learning LDP#4: Provide a social learning experience - the field science team and their activities: ensuring the field team includes members of varying levels of experience engaged in opportunities for discourse and joint problem solving will facilitate astronauts' abilities to think and perform like a field scientist. The proposed training program focuses on the intellectual and technical aspects of field science, as well as the cognitive manner in which field scientists experience, observe and synthesize their environment. The goal of the latter is to help astronauts develop the thought patterns and mechanics of an effective field scientist, thereby providing a broader base of experience and expertise than could be achieved

  12. The Need for Analogue Missions in Scientific Human and Robotic Planetary Exploration

    NASA Technical Reports Server (NTRS)

    Snook, K. J.; Mendell, W. W.

    2004-01-01

    With the increasing challenges of planetary missions, and especially with the prospect of human exploration of the moon and Mars, the need for earth-based mission simulations has never been greater. The current focus on science as a major driver for planetary exploration introduces new constraints in mission design, planning, operations, and technology development. Analogue missions can be designed to address critical new integration issues arising from the new science-driven exploration paradigm. This next step builds on existing field studies and technology development at analogue sites, providing engineering, programmatic, and scientific lessons-learned in relatively low-cost and low-risk environments. One of the most important outstanding questions in planetary exploration is how to optimize the human and robotic interaction to achieve maximum science return with minimum cost and risk. To answer this question, researchers are faced with the task of defining scientific return and devising ways of measuring the benefit of scientific planetary exploration to humanity. Earth-based and spacebased analogue missions are uniquely suited to answer this question. Moreover, they represent the only means for integrating science operations, mission operations, crew training, technology development, psychology and human factors, and all other mission elements prior to final mission design and launch. Eventually, success in future planetary exploration will depend on our ability to prepare adequately for missions, requiring improved quality and quantity of analogue activities. This effort demands more than simply developing new technologies needed for future missions and increasing our scientific understanding of our destinations. It requires a systematic approach to the identification and evaluation of the categories of analogue activities. This paper presents one possible approach to the classification and design of analogue missions based on their degree of fidelity in ten

  13. Exploring Antarctic Land Surface Temperature Extremes Using Condensed Anomaly Databases

    NASA Astrophysics Data System (ADS)

    Grant, Glenn Edwin

    Satellite observations have revolutionized the Earth Sciences and climate studies. However, data and imagery continue to accumulate at an accelerating rate, and efficient tools for data discovery, analysis, and quality checking lag behind. In particular, studies of long-term, continental-scale processes at high spatiotemporal resolutions are especially problematic. The traditional technique of downloading an entire dataset and using customized analysis code is often impractical or consumes too many resources. The Condensate Database Project was envisioned as an alternative method for data exploration and quality checking. The project's premise was that much of the data in any satellite dataset is unneeded and can be eliminated, compacting massive datasets into more manageable sizes. Dataset sizes are further reduced by retaining only anomalous data of high interest. Hosting the resulting "condensed" datasets in high-speed databases enables immediate availability for queries and exploration. Proof of the project's success relied on demonstrating that the anomaly database methods can enhance and accelerate scientific investigations. The hypothesis of this dissertation is that the condensed datasets are effective tools for exploring many scientific questions, spurring further investigations and revealing important information that might otherwise remain undetected. This dissertation uses condensed databases containing 17 years of Antarctic land surface temperature anomalies as its primary data. The study demonstrates the utility of the condensate database methods by discovering new information. In particular, the process revealed critical quality problems in the source satellite data. The results are used as the starting point for four case studies, investigating Antarctic temperature extremes, cloud detection errors, and the teleconnections between Antarctic temperature anomalies and climate indices. The results confirm the hypothesis that the condensate databases
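
    The condensation idea, retaining only large departures from a per-pixel climatology and storing them in a queryable database, can be sketched as follows (the array shapes, threshold, and table schema are illustrative assumptions, not the project's actual design):

      # Sketch of a "condensed anomaly database": keep only large departures from
      # a per-pixel climatology and store them in a queryable database.
      import sqlite3
      import numpy as np

      rng = np.random.default_rng(0)
      obs = rng.normal(size=(365, 100, 100))          # stand-in for a year of LST fields
      climatology = obs.mean(axis=0)
      std = obs.std(axis=0)

      anomalies = obs - climatology
      extreme = np.abs(anomalies) > 3 * std           # retain only anomalous data

      conn = sqlite3.connect(":memory:")
      conn.execute("CREATE TABLE anomaly (day INT, row INT, col INT, value REAL)")
      days, rows, cols = np.nonzero(extreme)
      conn.executemany(
          "INSERT INTO anomaly VALUES (?, ?, ?, ?)",
          zip(days.tolist(), rows.tolist(), cols.tolist(), anomalies[extreme].tolist()),
      )
      kept = conn.execute("SELECT COUNT(*) FROM anomaly").fetchone()[0]
      print(f"kept {kept} of {obs.size} values ({100 * kept / obs.size:.2f}%)")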

  14. Network effects on scientific collaborations.

    PubMed

    Uddin, Shahadat; Hossain, Liaquat; Rasmussen, Kim

    2013-01-01

    The analysis of co-authorship networks aims at exploring the impact of network structure on the outcome of scientific collaborations and research publications. However, little is known about which network properties are associated with authors who have an increased number of joint publications and are highly cited. Measures of social network analysis, for example network centrality and tie strength, have been utilized extensively in the current co-authorship literature to explore different behavioural patterns of co-authorship networks. Using three SNA measures (i.e., degree centrality, closeness centrality and betweenness centrality), we explore scientific collaboration networks to understand the factors influencing the performance (i.e., citation count) and formation (tie strength between authors) of such networks. A citation count is the number of times an article is cited by other articles. We use the co-authorship dataset of the research field of 'steel structure' for the years 2005 to 2009. To measure the strength of scientific collaboration between two authors, we consider the number of articles co-authored by them. In this study, we examine how the citation count of a scientific publication is influenced by different centrality measures of its co-author(s) in a co-authorship network. We further analyze the impact of the network positions of authors on the strength of their scientific collaborations. We use both correlation and regression methods for data analysis, leading to statistical validation. We identify that the citation count of a research article is positively correlated with the degree centrality and betweenness centrality values of its co-author(s). Also, we reveal that the degree centrality and betweenness centrality values of authors in a co-authorship network are positively correlated with the strength of their scientific collaborations. Authors' network positions in co-authorship networks influence the performance (i.e., citation count) and formation (i.e., tie strength
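
    A minimal sketch of the kind of analysis described, computing centrality measures on a co-authorship graph and correlating them with citation counts, is shown below (the graph, edge weights, and citation counts are toy data, not the 'steel structure' corpus used in the study):

      # Sketch of relating co-authorship centrality to citation counts (toy data).
      import networkx as nx
      from scipy.stats import pearsonr

      # Hypothetical co-authorship edges weighted by number of joint papers.
      G = nx.Graph()
      G.add_weighted_edges_from([
          ("A", "B", 3), ("A", "C", 1), ("B", "C", 2),
          ("C", "D", 1), ("D", "E", 4), ("A", "E", 1),
      ])
      degree = nx.degree_centrality(G)
      betweenness = nx.betweenness_centrality(G)

      # Hypothetical citation counts per author.
      citations = {"A": 120, "B": 80, "C": 95, "D": 30, "E": 60}

      authors = sorted(G.nodes())
      r_deg, _ = pearsonr([degree[a] for a in authors], [citations[a] for a in authors])
      r_bet, _ = pearsonr([betweenness[a] for a in authors], [citations[a] for a in authors])
      print(f"degree vs citations r={r_deg:.2f}, betweenness vs citations r={r_bet:.2f}")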

  15. Network Effects on Scientific Collaborations

    PubMed Central

    Uddin, Shahadat; Hossain, Liaquat; Rasmussen, Kim

    2013-01-01

    Background The analysis of co-authorship networks aims at exploring the impact of network structure on the outcome of scientific collaborations and research publications. However, little is known about which network properties are associated with authors who have an increased number of joint publications and are highly cited. Methodology/Principal Findings Measures of social network analysis, for example network centrality and tie strength, have been utilized extensively in the current co-authorship literature to explore different behavioural patterns of co-authorship networks. Using three SNA measures (i.e., degree centrality, closeness centrality and betweenness centrality), we explore scientific collaboration networks to understand the factors influencing the performance (i.e., citation count) and formation (tie strength between authors) of such networks. A citation count is the number of times an article is cited by other articles. We use the co-authorship dataset of the research field of ‘steel structure’ for the years 2005 to 2009. To measure the strength of scientific collaboration between two authors, we consider the number of articles co-authored by them. In this study, we examine how the citation count of a scientific publication is influenced by different centrality measures of its co-author(s) in a co-authorship network. We further analyze the impact of the network positions of authors on the strength of their scientific collaborations. We use both correlation and regression methods for data analysis, leading to statistical validation. We identify that the citation count of a research article is positively correlated with the degree centrality and betweenness centrality values of its co-author(s). Also, we reveal that the degree centrality and betweenness centrality values of authors in a co-authorship network are positively correlated with the strength of their scientific collaborations. Conclusions/Significance Authors’ network positions in co-authorship networks influence

  16. I Wonder…Scientific Exploration and Experimentation as a Practice of Christian Faith

    ERIC Educational Resources Information Center

    Shaver, Ruth E.

    2016-01-01

    "I Wonder...Gaining Wisdom and Growing Faith Through Scientific Exploration" is an intergenerational science curriculum designed to be used in congregations. The goal of this curriculum and the theoretical work underpinning it is to counter the perception that people of faith cannot also be people who possess a scientific understanding…

  17. Exploring Cloud Computing for Large-scale Scientific Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Guang; Han, Binh; Yin, Jian

    This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often require only a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amounts of data in the terabyte or even petabyte range and require special high-performance hardware with low-latency connections to complete computation in a reasonable amount of time. To address these challenges, we build an infrastructure that can dynamically select high-performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a systems biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.

  18. Internet Activities Using Scientific Data. A Self-Guided Exploration.

    ERIC Educational Resources Information Center

    Froseth, Stan; Poppe, Barbara

    This guide is intended for the secondary school teacher (especially math or science) or the student who wants to access and learn about scientific data on the Internet. It is organized as a self-guided exploration. Nine exercises enable the user to access and analyze on-line information from the National Oceanic and Atmospheric Administration…

  19. Lemont B. Kier: a bibliometric exploration of his scientific production and its use.

    PubMed

    Restrepo, Guillermo; Llanos, Eugenio J; Silva, Adriana E

    2013-12-01

    We thought that an appropriate way to celebrate the seminal contributions of Kier would be to explore his influence on science, looking for the impact of his research through the citation of his scientific production. From a bibliometric approach, the impact of Kier's work is addressed as that of an individual within a community. Reviewing data from his curriculum vitae, as well as from the ISI Web of Knowledge (ISI), his role within the scientific community is established and the way his scientific results circulate is studied. His curriculum vitae is explored emphasising the approaches he used in his research activities and his social ties with other actors in the community. The circulation of Kier's publications in the ISI is studied as a means for spreading and installing his discourse within the community. The citation patterns found not only show the usage of Kier's scientific results, but also open the possibility of identifying some characteristics of this discursive community, such as a common vocabulary and common research goals. The results show an interdisciplinary research effort that consolidates a scientific community on the topic of drug discovery.

  20. Explore Earth Science Datasets for STEM with the NASA GES DISC Online Visualization and Analysis Tool, GIOVANNI

    NASA Astrophysics Data System (ADS)

    Liu, Z.; Acker, J. G.; Kempler, S. J.

    2016-12-01

    The NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) is one of twelve NASA Science Mission Directorate (SMD) Data Centers that provide Earth science data, information, and services to research scientists, applications scientists, applications users, and students around the world. The GES DISC is the home (archive) of NASA Precipitation and Hydrology, as well as Atmospheric Composition and Dynamics, remote sensing data and information. To facilitate Earth science data access, the GES DISC has been developing user-friendly data services for users at different levels. Among them, the Geospatial Interactive Online Visualization ANd aNalysis Infrastructure (GIOVANNI, http://giovanni.gsfc.nasa.gov/) allows users to explore satellite-based data using sophisticated analyses and visualizations without downloading data and software, which makes it particularly suitable for novices using NASA datasets in STEM activities. In this presentation, we will briefly introduce GIOVANNI and recommend datasets for STEM. Examples of using these datasets in STEM activities will be presented as well.

  1. Explore Earth Science Datasets for STEM with the NASA GES DISC Online Visualization and Analysis Tool, Giovanni

    NASA Technical Reports Server (NTRS)

    Liu, Z.; Acker, J.; Kempler, S.

    2016-01-01

    The NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) is one of twelve NASA Science Mission Directorate (SMD) Data Centers that provide Earth science data, information, and services to users around the world, including research and application scientists, students, citizen scientists, etc. The GES DISC is the home (archive) of remote sensing datasets for NASA Precipitation and Hydrology, Atmospheric Composition and Dynamics, etc. To facilitate Earth science data access, the GES DISC has been developing user-friendly data services for users at different levels in different countries. Among them, the Geospatial Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni, http://giovanni.gsfc.nasa.gov) allows users to explore satellite-based datasets using sophisticated analyses and visualization without downloading data and software, which makes it particularly suitable for novices (such as students) to use NASA datasets in STEM (science, technology, engineering and mathematics) activities. In this presentation, we will briefly introduce Giovanni along with examples for STEM activities.

  2. Lost in space: design of experiments and scientific exploration in a Hogarth Universe.

    PubMed

    Lendrem, Dennis W; Lendrem, B Clare; Woods, David; Rowland-Jones, Ruth; Burke, Matthew; Chatfield, Marion; Isaacs, John D; Owen, Martin R

    2015-11-01

    A Hogarth, or 'wicked', universe is an irregular environment generating data to support erroneous beliefs. Here, we argue that development scientists often work in such a universe. We demonstrate that exploring these multidimensional spaces using small experiments guided by scientific intuition alone, gives rise to an illusion of validity and a misplaced confidence in that scientific intuition. By contrast, design of experiments (DOE) permits the efficient mapping of such complex, multidimensional spaces. We describe simulation tools that enable research scientists to explore these spaces in relative safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
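
    The contrast the authors draw, between small one-factor-at-a-time experiments guided by intuition and a designed experiment that maps the whole space, can be illustrated with a simple full-factorial sketch (the factors, levels, and run counts below are invented for illustration):

      # Sketch contrasting one-factor-at-a-time (OFAT) with a full factorial design:
      # the factorial design covers the corners of the multidimensional space.
      from itertools import product

      factors = {
          "temperature": [20, 40],      # hypothetical process factors and levels
          "pH":          [6.0, 8.0],
          "stir_rate":   [100, 400],
      }

      # One-factor-at-a-time: vary each factor from a baseline -> 1 + 3 runs.
      baseline = {k: v[0] for k, v in factors.items()}
      ofat_runs = [baseline] + [
          {**baseline, name: levels[1]} for name, levels in factors.items()
      ]

      # Full factorial: every combination of levels -> 2**3 = 8 runs, including
      # the corners needed to detect interactions between factors.
      factorial_runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

      print(len(ofat_runs), "OFAT runs vs", len(factorial_runs), "factorial runs")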

  3. Exploring multiliteracies, student voice, and scientific practices in two elementary classrooms

    NASA Astrophysics Data System (ADS)

    Allison, Elizabeth Rowland

    This study explored the voices of children in a changing world with evolving needs and new opportunities. The workplaces of rapidly moving capitalist societies value creativity, collaboration, and critical thinking skills, which are of growing importance and are manifesting themselves in modern K-12 science classroom cultures (Gee, 2000; New London Group, 2000). This study explored issues of multiliteracies and student voice set within the context of teaching and learning in 4th and 5th grade science classrooms. The purpose of the study was to ascertain what multiliteracies and scientific practices (NGSS Lead States, 2013c) are implemented and how, to explore how multiliteracies influence students' voices, and to investigate teacher and student perceptions of multiliteracies, student voice, and scientific practices. Grounded in a constructivist framework, a multiple case study was employed in two elementary classrooms. Through observations, student focus groups and interviews, and teacher interviews, a detailed narrative was created to describe the range of multiliteracies, student voice, and scientific practices that occurred within the science classroom context. Using grounded theory analysis, data were coded and analyzed to reveal emergent themes. Data analysis revealed that these two classrooms were enriched with multiliteracies that serve metaphorically as breeding grounds for student voice. In the modern classroom, defined as a space where information is instantly accessible through the Internet, multiliteracies can be developed through inquiry-based, collaborative, and technology-rich experiences. Scientific literacy, cultivated through student communication and collaboration, is arguably a multiliteracy that has not been considered in the literature, and should be, as an integral component of overall individual literacy in the 21st century. Findings revealed four themes. Three themes suggest that teachers address several modes of multiliteracies in science, but identify

  4. A new dataset validation system for the Planetary Science Archive

    NASA Astrophysics Data System (ADS)

    Manaud, N.; Zender, J.; Heather, D.; Martinez, S.

    2007-08-01

    The Planetary Science Archive is the official archive for the Mars Express mission. It received its first data by the end of 2004. These data are delivered by the PI teams to the PSA team as datasets, which are formatted in conformance with the Planetary Data System (PDS) standards. The PI teams are responsible for analyzing and calibrating the instrument data as well as for the production of reduced and calibrated data. They are also responsible for the scientific validation of these data. ESA is responsible for the long-term data archiving and distribution to the scientific community and must ensure, in this regard, that all archived products meet quality standards. To do so, an archive peer review is used to control the quality of the Mars Express science data archiving process. However, a full validation of its content is missing. An independent review board recently recommended that the completeness of the archive as well as the consistency of the delivered data should be validated following well-defined procedures. A new validation software tool is being developed to complete the overall data quality control system functionality. This new tool aims to improve the quality of data and services provided to the scientific community through the PSA, and shall allow anomalies to be tracked and the completeness of datasets to be controlled. It shall ensure that PSA end-users: (1) can rely on the results of their queries, (2) will get data products that are suitable for scientific analysis, and (3) can find all science data acquired during a mission. We define dataset validation as the verification and assessment process that checks the dataset content against pre-defined top-level criteria, which represent the general characteristics of good-quality datasets. The dataset content that is checked includes the data and all types of information that are essential in the process of deriving scientific results, as well as those interfacing with the PSA database. The validation software tool is a multi-mission tool that

  5. Decibel: The Relational Dataset Branching System

    PubMed Central

    Maddox, Michael; Goehring, David; Elmore, Aaron J.; Madden, Samuel; Parameswaran, Aditya; Deshpande, Amol

    2017-01-01

    As scientific endeavors and data analysis become increasingly collaborative, there is a need for data management systems that natively support the versioning or branching of datasets to enable concurrent analysis, cleaning, integration, manipulation, or curation of data across teams of individuals. Common practice for sharing and collaborating on datasets involves creating or storing multiple copies of the dataset, one for each stage of analysis, with no provenance information tracking the relationships between these datasets. This results not only in wasted storage, but also makes it challenging to track and integrate modifications made by different users to the same dataset. In this paper, we introduce the Relational Dataset Branching System, Decibel, a new relational storage system with built-in version control designed to address these shortcomings. We present our initial design for Decibel and provide a thorough evaluation of three versioned storage engine designs that focus on efficient query processing with minimal storage overhead. We also develop an exhaustive benchmark to enable the rigorous testing of these and future versioned storage engine designs. PMID:28149668
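
    The versioning idea Decibel supports natively can be illustrated with a toy copy-based example (this is emphatically not Decibel's storage engine, which avoids full copies; it only shows the git-like branching behaviour a versioned dataset store exposes to users):

      # Toy illustration of dataset branching: edits on a branch stay isolated.
      import copy

      versions = {"master": [{"id": 1, "value": 10}, {"id": 2, "value": 20}]}

      def branch(src, dst):
          """Create a new branch as an independent copy of an existing version."""
          versions[dst] = copy.deepcopy(versions[src])

      branch("master", "cleaning")
      versions["cleaning"][0]["value"] = 11          # edits stay isolated on the branch

      print(versions["master"][0], versions["cleaning"][0])
      # {'id': 1, 'value': 10} {'id': 1, 'value': 11}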

  6. What Scientific Objectives Have Been Defined by the French Scientific Community for Mars Exploration?

    NASA Astrophysics Data System (ADS)

    Sotin, Christophe

    2000-07-01

    Every four or five years, the French scientific community is invited by the French space agency (CNES) to define the scientific priorities for the forthcoming years. The last workshop took place in March 1998 in Arcachon, France. During this three-day workshop, it was clear that the study of Mars was very attractive to everyone because it is a planet very close to the Earth, and its study should allow us to better understand the chemical and physical processes that drive the evolution of a planet by comparing the evolution of the two planets. For example, the study of Mars should help us understand the relationship between mantle convection and plate tectonics, the way a magnetic dynamo works, and which conditions allowed life to emerge and evolve on Earth. The Southern Hemisphere of Mars is very old and should have recorded some clues about planetary evolution during the first billion years, a period for which very little is known for the Earth because both plate tectonics and weathering have erased the geological record. The international scientific community defined the architecture of the Mars exploration program more than ten years ago. After the scientific discoveries made (and to come) with orbiters and landers, it appeared obvious that the next steps to prepare are the delivery of networks to the surface and the study of samples returned from Mars. Scientific objectives related to network science include the determination of the different shells that compose the planet, the search for water in the subsurface, and the recording of atmospheric parameters in both time and space. Those related to the study of samples include understanding the differentiation of the planet and the fate of volatiles (including H2O) thanks to very accurate isotopic measurements that can be performed in laboratories, the search for minerals that can prove that life once existed on Mars, and the search for present life on Mars (bacteria). Viking landers successfully landed on

  7. Language, Space, Time: Anthropological Tools and Scientific Exploration on Mars

    NASA Technical Reports Server (NTRS)

    Wales, Roxana

    2005-01-01

    This viewgraph presentation reviews the importance of social science disciplines in the scientific exploration of Mars. The importance of language, workspace, and time differences are reviewed. It would appear that the social scientist perspective in developing a completely new workspace, keeping track of new vocabulary and the different time zones (i.e., terrestrial and Martian) was useful.

  8. Environmental Resilience: Exploring Scientific Concepts for ...

    EPA Pesticide Factsheets

    This report summarizes two Community Environmental Resilience Index workshops held at EPA in May and July of 2014. The workshops explored scientific concepts for building an index of indicators of community environmental resilience to natural or human-caused disasters. The index could be used to support disaster decision-making. Key workshop outcomes include: a working definition of environmental resilience and insight into how it relates to EPA's mission and Strategic Goals, a call for an inventory of EPA resiliency tools, a preliminary list of indicators and the CERI structure, identification of next steps for index development, and the emergence of a network of collaborators. The report can be used to support EPA's work in resilience under PPD-8, PPD-21, and the national response and disaster recovery frameworks. It can also feed into interagency efforts on building community resilience.

  9. Scalable Machine Learning for Massive Astronomical Datasets

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Gray, A.

    2014-04-01

    We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors. This is likely of particular interest to the radio astronomy community given, for example, that survey projects contain groups dedicated to this topic. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex

  10. Scalable Machine Learning for Massive Astronomical Datasets

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Astronomy Data Centre, Canadian

    2014-01-01

    We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors, and the local outlier factor. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex datasets that wishes to extract the full scientific value from its data.
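
    The nearest-neighbour and local outlier factor approach to finding unusual objects can be sketched with a small example (the synthetic "photometry" below is invented; this uses scikit-learn rather than the Skytree Server software described in the record):

      # Sketch of nearest-neighbour based outlier detection with the local outlier
      # factor, on synthetic data standing in for catalogue photometry.
      import numpy as np
      from sklearn.neighbors import LocalOutlierFactor

      rng = np.random.default_rng(0)
      colours = rng.normal(size=(10000, 3))                    # bulk of the catalogue
      outliers = rng.normal(loc=6.0, size=(10, 3))             # a few unusual objects
      X = np.vstack([colours, outliers])

      lof = LocalOutlierFactor(n_neighbors=20)
      flags = lof.fit_predict(X)                               # -1 marks outliers
      print("flagged", int((flags == -1).sum()), "candidate outliers")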

  11. Exploration of Korean Students' Scientific Imagination Using the Scientific Imagination Inventory

    NASA Astrophysics Data System (ADS)

    Mun, Jiyeong; Mun, Kongju; Kim, Sung-Won

    2015-09-01

    This article reports on a study of the components of scientific imagination and describes the scales used to measure scientific imagination in Korean elementary and secondary students. In this study, we developed an inventory, which we call the Scientific Imagination Inventory (SII), in order to examine aspects of scientific imagination. We identified three conceptual components of scientific imagination: (1) scientific sensitivity, (2) scientific creativity, and (3) scientific productivity. We administered the SII to 662 students (4th-8th grades) and confirmed its validity and reliability using exploratory factor analysis and the Cronbach α coefficient. The characteristics of Korean elementary and secondary students' overall scientific imagination and differences across gender and grade level are discussed in the results section.
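
    The reliability check mentioned above, the Cronbach α coefficient, has a simple closed form, α = k/(k-1) * (1 - Σ item variances / variance of the total score); a short sketch on simulated Likert responses (the data below are invented, not the SII responses) shows the computation:

      # Sketch of a Cronbach's alpha computation for an inventory of Likert items
      # (toy responses; rows = respondents, columns = items).
      import numpy as np

      rng = np.random.default_rng(0)
      base = rng.integers(1, 6, size=(662, 1))                   # shared trait component
      items = np.clip(base + rng.integers(-1, 2, size=(662, 12)), 1, 5)

      k = items.shape[1]
      item_var = items.var(axis=0, ddof=1).sum()
      total_var = items.sum(axis=1).var(ddof=1)
      alpha = (k / (k - 1)) * (1 - item_var / total_var)
      print(f"Cronbach's alpha = {alpha:.2f}")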

  12. An interactive, multi-touch videowall for scientific data exploration

    NASA Astrophysics Data System (ADS)

    Blower, Jon; Griffiths, Guy; van Meersbergen, Maarten; Lusher, Scott; Styles, Jon

    2014-05-01

    The use of videowalls for scientific data exploration is rising as hardware becomes cheaper and the availability of software and multimedia content grows. Most videowalls are used primarily for outreach and communication purposes, but there is increasing interest in using large display screens to support exploratory visualization as an integral part of scientific research. In this PICO presentation we will present a brief overview of a new videowall system at the University of Reading, which is designed specifically to support interactive, exploratory visualization activities in climate science and Earth Observation. The videowall consists of eight 42-inch full-HD screens (in 4x2 formation), giving a total resolution of about 16 megapixels. The display is managed by a videowall controller, which can direct video to the screen from up to four external laptops, a purpose-built graphics workstation, or any combination thereof. A multi-touch overlay provides the capability for the user to interact directly with the data. There are many ways to use the videowall, and a key technical challenge is to make the most of the touch capabilities - touch has the potential to greatly reduce the learning curve in interactive data exploration, but most software is not yet designed for this purpose. In the PICO we will present an overview of some ways in which the wall can be employed in science, seeking feedback and discussion from the community. The system was inspired by an existing and highly-successful system (known as the "Collaboratorium") at the Netherlands e-Science Center (NLeSC). We will demonstrate how we have adapted NLeSC's visualization software to our system for touch-enabled multi-screen climate data exploration.

  13. The ISECG Science White Paper - A Scientific Perspective on the Global Exploration Roadmap

    NASA Astrophysics Data System (ADS)

    Bussey, David B.; Worms, Jean-Claude; Spiero, Francois; Schlutz, Juergen; Ehrenfreund, Pascale

    2016-07-01

    Future space exploration goals call for sending humans and robots beyond low Earth orbit and establishing sustained access to destinations such as the Moon, asteroids and Mars. Space agencies participating in the International Space Exploration Coordination Group (ISECG) are discussing an international approach for achieving these goals, documented in ISECG's Global Exploration Roadmap (GER). The GER reference scenario reflects a step-wise evolution of critical capabilities from ISS to missions in the lunar vicinity in preparation for the journey of humans to Mars. As an element of this continued road mapping effort, the ISECG agencies are therefore soliciting input and coordinated discussion with the scientific community to better articulate and promote the scientific opportunities of the proposed mission themes. An improved understanding of the scientific drivers and the requirements to address priority science questions associated with the exploration destinations (Moon, Near Earth Asteroids, Mars and its moons) as well as the preparatory activities in cis-lunar space is beneficial to optimize the partnership of robotic assets and human presence beyond low Earth orbit. The interaction has resulted in the development of a Science White Paper to: • Identify and highlight the scientific opportunities in early exploration missions as the GER reference architecture matures, • Communicate overarching science themes and their relevance in the GER destinations, • Ensure international science communities' perspectives inform the future evolution of mission concepts considered in the GER The paper aims to capture the opportunities offered by the missions in the GER for a broad range of scientific disciplines. These include planetary and space sciences, astrobiology, life sciences, physical sciences, astronomy and Earth science. The paper is structured around grand science themes that draw together and connect research in the various disciplines, and it will focus on

  14. The Path from Large Earth Science Datasets to Information

    NASA Astrophysics Data System (ADS)

    Vicente, G. A.

    2013-12-01

    The NASA Goddard Earth Sciences Data (GES) and Information Services Center (DISC) is one of the major Science Mission Directorate (SMD) data centers for archiving and distribution of Earth Science remote sensing data, products and services. This virtual portal provides convenient access to Atmospheric Composition and Dynamics, Hydrology, Precipitation, Ozone, and model-derived datasets (generated by GSFC's Global Modeling and Assimilation Office), the North American Land Data Assimilation System (NLDAS) and the Global Land Data Assimilation System (GLDAS) data products (both generated by GSFC's Hydrological Sciences Branch). This presentation demonstrates various tools and computational technologies developed at the GES DISC to manage the huge volume of data and products acquired from various missions and programs over the years. It explores approaches to archive, document, distribute, access and analyze Earth Science data and information, and addresses the technical and scientific issues, governance, and user support problems faced by scientists in need of multi-disciplinary datasets. It also discusses data and product metrics, user distribution profiles and lessons learned through interactions with the science communities around the world. Finally, it demonstrates some of the most used data and product visualization and analysis tools developed and maintained by the GES DISC.

  15. Exploration of Korean Students' Scientific Imagination Using the Scientific Imagination Inventory

    ERIC Educational Resources Information Center

    Mun, Jiyeong; Mun, Kongju; Kim, Sung-Won

    2015-01-01

    This article reports on the study of the components of scientific imagination and describes the scales used to measure scientific imagination in Korean elementary and secondary students. In this study, we developed an inventory, which we call the Scientific Imagination Inventory (SII), in order to examine aspects of scientific imagination. We…

  16. Scientific Goals and Objectives for the Human Exploration of Mars: 1. Biology and Atmosphere/Climate

    NASA Technical Reports Server (NTRS)

    Levine, Joel S.; Garvin, J. B.; Anbar, A. D.; Beaty, D. W.; Bell, M. S.; Clancy, R. T.; Cockell, C. S.; Connerney, J. E.; Doran, P. T.; Delory, G.; hide

    2008-01-01

    To prepare for the exploration of Mars by humans, as outlined in the new national Vision for Space Exploration (VSE), the Mars Exploration Program Analysis Group (MEPAG), chartered by NASA's Mars Exploration Program (MEP), formed a Human Exploration of Mars Science Analysis Group (HEM-SAG) in March 2007. HEM-SAG was chartered to develop the scientific goals and objectives for the human exploration of Mars based on the Mars Scientific Goals, Objectives, Investigations, and Priorities. The HEM-SAG is one of several humans-to-Mars scientific, engineering and mission architecture studies chartered in 2007 to support NASA's plans for the human exploration of Mars. The HEM-SAG is composed of about 30 Mars scientists representing the disciplines of Mars biology, climate/atmosphere, geology and geophysics from the U.S., Canada, England, France, Italy and Spain. MEPAG selected Drs. James B. Garvin (NASA Goddard Space Flight Center) and Joel S. Levine (NASA Langley Research Center) to serve as HEM-SAG co-chairs. The HEM-SAG team conducted 20 telecons and convened three face-to-face meetings from March through October 2007. The management of MEP and MEPAG were briefed on the HEM-SAG interim findings in May. The HEM-SAG final report was presented on-line to the full MEPAG membership and at the MEPAG meeting on February 20-21, 2008. This presentation will outline the HEM-SAG biology and climate/atmosphere goals and objectives. A companion paper will outline the HEM-SAG geology and geophysics goals and objectives.

  17. Scientific Exploration of Induced SeisMicity and Stress (SEISMS)

    NASA Astrophysics Data System (ADS)

    Savage, Heather M.; Kirkpatrick, James D.; Mori, James J.; Brodsky, Emily E.; Ellsworth, William L.; Carpenter, Brett M.; Chen, Xiaowei; Cappa, Frédéric; Kano, Yasuyuki

    2017-11-01

    Several major fault-drilling projects have captured the interseismic and postseismic periods of earthquakes. However, near-field observations of faults immediately before and during an earthquake remain elusive due to the unpredictable nature of seismicity. The Scientific Exploration of Induced SeisMicity and Stress (SEISMS) workshop met in March 2017 to discuss the value of a drilling experiment where a fault is instrumented in advance of an earthquake induced through controlled fluid injection. The workshop participants articulated three key issues that could most effectively be addressed by such an experiment: (1) predictive understanding of the propensity for seismicity in reaction to human forcing, (2) identification of earthquake nucleation processes, and (3) constraints on the factors controlling earthquake size. A systematic review of previous injection experiments exposed important observational gaps in all of these areas. The participants discussed the instrumentation and technological needs as well as faults and tectonic areas that are feasible from both a societal and scientific standpoint.

  18. Scientific Objectives of China Chang E 4 CE-4 Lunar Far-side Exploration Mission

    NASA Astrophysics Data System (ADS)

    Zhang, Hongbo; Zeng, Xingguo; Chen, Wangli

    2017-10-01

    China has achieved great success in the recent CE-1~CE-3 lunar missions, and in 2018 the China Lunar Exploration Program (CLEP) is going to launch the CE-4 mission. The CE-4 spacecraft is the backup of CE-3, so it also consists of a Lander and a Rover. However, CE-4 is the first mission in the history of lunar exploration designed to explore the far side of the Moon. The biggest difference between CE-4 and CE-3 is that CE-4 will rely on a relay satellite at the Earth-Moon L2 point for Earth-Moon communication. The scientific payloads carried on the Lander and Rover will also be different. The Chinese government has announced that the CE-4 mission will carry several new internationally contributed scientific payloads, such as the Low Frequency Radio Detector from the Netherlands, the Lunar Neutron and Radiation Dose Detector from Germany, the Neutral Atom Detector from Sweden, and the Lunar Miniature Optical Imaging Sounder from Saudi Arabia. The main scientific objective of CE-4 is to provide scientific data for lunar far-side research, including: 1) general study of the space environment of the lunar far side; 2) general research on the surface, shallow layer and deep layer of the far side; and 3) low-frequency radio observations on the far side using the Low Frequency Radio Detector, which would be the first use of this frequency band in the history of lunar exploration.

  19. Exploring the Changes in Students' Understanding of the Scientific Method Using Word Associations

    NASA Astrophysics Data System (ADS)

    Gulacar, Ozcan; Sinan, Olcay; Bowman, Charles R.; Yildirim, Yetkin

    2015-10-01

    A study is presented that explores how students' knowledge structures, as related to the scientific method, compare at different student ages. A word association test comprised of ten total stimulus words, among them experiment, science fair, and hypothesis, is used to probe the students' knowledge structures. Students from grades four, five, and eight, as well as first-year college students were tested to reveal their knowledge structures relating to the scientific method. Younger students were found to have a naïve view of the science process with little understanding of how science relates to the real world. However, students' conceptions about the scientific process appear to be malleable, with science fairs a potentially strong influencer. The strength of associations between words is observed to change from grade to grade, with younger students placing science fair near the center of their knowledge structure regarding the scientific method, whereas older students conceptualize the scientific method around experiment.

  20. Exploring homogeneity of correlation structures of gene expression datasets within and between etiological disease categories.

    PubMed

    Jong, Victor L; Novianti, Putri W; Roes, Kit C B; Eijkemans, Marinus J C

    2014-12-01

    The literature shows that classifiers perform differently across datasets and that correlations within datasets affect the performance of classifiers. The question that arises is whether the correlation structure within datasets differs significantly across diseases. In this study, we evaluated the homogeneity of correlation structures within and between datasets of six etiological disease categories: inflammatory, immune, infectious, degenerative, hereditary and acute myeloid leukemia (AML). We also assessed the effect of two filtering methods, detection call filtering and variance filtering, on correlation structures. We downloaded microarray datasets from ArrayExpress for experiments meeting predefined criteria and ended up with 12 datasets for non-cancerous diseases and six for AML. The datasets were preprocessed by a common procedure incorporating platform-specific recommendations and the two filtering methods mentioned above. Homogeneity of correlation matrices between and within datasets of etiological diseases was assessed using Box's M statistic on permuted samples. We found that correlation structures significantly differ between datasets of the same and/or different etiological disease categories and that variance filtering eliminates more uncorrelated probesets than detection call filtering and thus renders the data highly correlated.
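
    The permutation-based homogeneity test described above can be illustrated with a short Python sketch. This is a minimal example, not the authors' pipeline: the array shapes, group sizes and number of permutations are arbitrary assumptions.

      # Minimal sketch: Box's M statistic for testing homogeneity of covariance
      # (or correlation) matrices between two datasets, with significance judged
      # against a permutation null, as in the abstract.
      import numpy as np

      def boxs_m(groups):
          """groups: list of (n_samples, n_features) arrays."""
          k = len(groups)
          ns = np.array([g.shape[0] for g in groups])
          covs = [np.cov(g, rowvar=False) for g in groups]
          pooled = sum((n - 1) * s for n, s in zip(ns, covs)) / (ns.sum() - k)
          logdet = lambda s: np.linalg.slogdet(s)[1]
          return (ns.sum() - k) * logdet(pooled) - sum(
              (n - 1) * logdet(s) for n, s in zip(ns, covs))

      rng = np.random.default_rng(0)
      a = rng.normal(size=(60, 20))     # e.g. 20 probesets, 60 samples, dataset A
      b = rng.normal(size=(80, 20))     # dataset B
      observed = boxs_m([a, b])

      # Permutation null: shuffle the sample labels and recompute M.
      pooled_samples = np.vstack([a, b])
      perm = []
      for _ in range(200):
          idx = rng.permutation(pooled_samples.shape[0])
          perm.append(boxs_m([pooled_samples[idx[:60]], pooled_samples[idx[60:]]]))
      p_value = np.mean(np.array(perm) >= observed)
      print(observed, p_value)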

  1. Informal Formative Assessment and Scientific Inquiry: Exploring Teachers' Practices and Student Learning

    ERIC Educational Resources Information Center

    Ruiz-Primo, Maria Araceli; Furtak, Erin Marie

    2006-01-01

    What does informal formative assessment look like in the context of scientific inquiry teaching? Is it possible to identify different levels of informal assessment practices? Can different levels of informal assessment practices be related to levels of student learning? This study addresses these issues by exploring how 4 middle school science…

  2. FLUXNET2015 Dataset: Batteries included

    NASA Astrophysics Data System (ADS)

    Pastorello, G.; Papale, D.; Agarwal, D.; Trotta, C.; Chu, H.; Canfora, E.; Torn, M. S.; Baldocchi, D. D.

    2016-12-01

    The synthesis datasets have become one of the signature products of the FLUXNET global network. They are composed of contributions from individual site teams to regional networks, which are then compiled into uniform data products - now used in a wide variety of research efforts: from plant-scale microbiology to global-scale climate change. The FLUXNET Marconi Dataset in 2000 was the first in the series, followed by the FLUXNET LaThuile Dataset in 2007, with significant additions of data products and coverage, solidifying the adoption of the datasets as a research tool. The FLUXNET2015 Dataset brings another round of substantial improvements, including extended quality control processes and checks, use of downscaled reanalysis data for filling long gaps in micrometeorological variables, multiple methods for USTAR threshold estimation and flux partitioning, and uncertainty estimates - all of which are accompanied by auxiliary flags. This "batteries included" approach provides a lot of information for someone who wants to explore the data (and the processing methods) in detail. This inevitably leads to a large number of data variables. Although dealing with all these variables might seem overwhelming at first, especially to someone looking at eddy covariance data for the first time, there is method to our madness. In this work we describe the data products and variables that are part of the FLUXNET2015 Dataset, and the rationale behind the organization of the dataset, covering the simplified version (labeled SUBSET), the complete version (labeled FULLSET), and the auxiliary products in the dataset.
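
    As a small illustration of how such a flag-rich product might be screened, the following hedged pandas sketch filters a half-hourly file by its quality-control flag. The file name and the variable names (TIMESTAMP_START, NEE_VUT_REF and its _QC companion) are assumptions based on the dataset's documented naming pattern, not an official recipe.

      # Illustrative sketch of screening a FLUXNET2015-style half-hourly file.
      import pandas as pd

      df = pd.read_csv("FLX_XX-Xxx_FLUXNET2015_SUBSET_HH.csv",
                       na_values=[-9999])             # -9999 marks missing values
      df["time"] = pd.to_datetime(df["TIMESTAMP_START"].astype(str),
                                  format="%Y%m%d%H%M")

      # Keep measured or good-quality gap-filled values (assumed QC flags 0 or 1).
      good = df[df["NEE_VUT_REF_QC"].isin([0, 1])]
      daily_nee = good.set_index("time")["NEE_VUT_REF"].resample("D").mean()
      print(daily_nee.head())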

  3. Ontology for Transforming Geo-Spatial Data for Discovery and Integration of Scientific Data

    NASA Astrophysics Data System (ADS)

    Nguyen, L.; Chee, T.; Minnis, P.

    2013-12-01

    Discovery and access to geo-spatial scientific data across heterogeneous repositories and multi-discipline datasets can present challenges for scientists. We propose to build a workflow for transforming geo-spatial datasets into a semantic environment, using relationships to describe each resource with the OWL Web Ontology Language, RDF, and a proposed geo-spatial vocabulary. We will present methods for transforming traditional scientific datasets, the use of a semantic repository, and querying with SPARQL to integrate and access datasets. This unique repository will enable discovery of scientific data by geospatial bound or other criteria.
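
    A hedged sketch of the kind of transformation and query described above, using Python's rdflib: a dataset is described with RDF triples and then retrieved with a SPARQL query. The ex: vocabulary and its property names are hypothetical stand-ins, not the authors' proposed geo-spatial vocabulary.

      # Sketch: describe a dataset as RDF triples, then query it with SPARQL.
      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import RDF

      EX = Namespace("http://example.org/vocab#")
      g = Graph()
      ds = URIRef("http://example.org/dataset/sw-flux")
      g.add((ds, RDF.type, EX.Dataset))
      g.add((ds, EX.parameter, Literal("shortwave flux")))
      g.add((ds, EX.minLatitude, Literal(25.0)))
      g.add((ds, EX.maxLatitude, Literal(50.0)))

      # Find datasets whose latitude range overlaps a band of interest.
      q = """
      PREFIX ex: <http://example.org/vocab#>
      SELECT ?d ?p WHERE {
        ?d a ex:Dataset ;
           ex:parameter ?p ;
           ex:minLatitude ?lo ;
           ex:maxLatitude ?hi .
        FILTER (?lo <= 40.0 && ?hi >= 30.0)
      }
      """
      for row in g.query(q):
          print(row.d, row.p)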

  4. Management and assimilation of diverse, distributed watershed datasets

    NASA Astrophysics Data System (ADS)

    Varadharajan, C.; Faybishenko, B.; Versteeg, R.; Agarwal, D.; Hubbard, S. S.; Hendrix, V.

    2016-12-01

    The U.S. Department of Energy's (DOE) Watershed Function Scientific Focus Area (SFA) seeks to determine how perturbations to mountainous watersheds (e.g., floods, drought, early snowmelt) impact the downstream delivery of water, nutrients, carbon, and metals over seasonal to decadal timescales. We are building a software platform that enables integration of diverse and disparate field, laboratory, and simulation datasets, of various types including hydrological, geological, meteorological, geophysical, geochemical, ecological and genomic datasets, across a range of spatial and temporal scales within the Rifle floodplain and the East River watershed, Colorado. We are using agile data management and assimilation approaches to enable web-based integration of heterogeneous, multi-scale data. Sensor-based observations of water level, vadose zone and groundwater temperature, water quality, and meteorology, as well as biogeochemical analyses of soil and groundwater samples, have been curated and archived in federated databases. Quality Assurance and Quality Control (QA/QC) are performed on priority datasets needed for ongoing scientific analyses and for hydrological and geochemical modeling. Automated QA/QC methods are used to identify and flag issues in the datasets. Data integration is achieved via a brokering service that dynamically integrates data from distributed databases via web services, based on user queries. The integrated results are presented to users in a portal that enables intuitive search, interactive visualization and download of integrated datasets. The concepts, approaches and codes being used are shared across the data science components of several large DOE-funded projects such as the Watershed Function SFA, Next Generation Ecosystem Experiment (NGEE) Tropics, Ameriflux/FLUXNET, and Advanced Simulation Capability for Environmental Management (ASCEM), and together contribute towards DOE's cyberinfrastructure for data management and model-data integration.
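
    The automated QA/QC flagging mentioned above might look roughly like the following pandas sketch, which applies simple range and spike checks to a sensor time series. The thresholds, column names and flag labels are illustrative assumptions, not the SFA's actual procedures.

      # Hypothetical sketch of automated QA/QC flagging for a sensor time series.
      import pandas as pd

      def qaqc_flags(series, valid_min, valid_max, spike_sigma=4.0):
          flags = pd.Series("good", index=series.index)
          flags[series.isna()] = "missing"
          flags[(series < valid_min) | (series > valid_max)] = "out_of_range"
          # Flag spikes as large departures from a centered rolling median.
          resid = series - series.rolling(7, center=True, min_periods=3).median()
          flags[resid.abs() > spike_sigma * resid.std()] = "spike"
          return flags

      wl = pd.read_csv("groundwater_level.csv", parse_dates=["timestamp"],
                       index_col="timestamp")["water_level_m"]
      wl_flags = qaqc_flags(wl, valid_min=0.0, valid_max=15.0)
      print(wl_flags.value_counts())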

  5. Exploring the Changes in Students' Understanding of the Scientific Method Using Word Associations

    ERIC Educational Resources Information Center

    Gulacar, Ozcan; Sinan, Olcay; Bowman, Charles R.; Yildirim, Yetkin

    2015-01-01

    A study is presented that explores how students' knowledge structures, as related to the scientific method, compare at different student ages. A word association test comprised of ten total stimulus words, among them "experiment," "science fair," and "hypothesis," is used to probe the students' knowledge structures.…

  6. Using network projections to explore co-incidence and context in large clinical datasets: Application to homelessness among U.S. Veterans.

    PubMed

    Pettey, Warren B P; Toth, Damon J A; Redd, Andrew; Carter, Marjorie E; Samore, Matthew H; Gundlapalli, Adi V

    2016-06-01

    Network projections of data can provide an efficient format for data exploration of co-incidence in large clinical datasets. We present and explore the utility of a network projection approach to finding patterns in health care data that could be exploited to prevent homelessness among U.S. Veterans. We divided Veteran ICD-9-CM (ICD9) data into two time periods (0-59 and 60-364 days prior to the first evidence of homelessness) and then used Pajek social network analysis software to visualize these data as three different networks. A multi-relational network simultaneously displayed the magnitude of ties between the most frequent ICD9 pairings. A new association network visualized ICD9 pairings that greatly increased or decreased. A signed, subtraction network visualized the presence, absence, and magnitude difference between ICD9 associations by time period. A cohort of 9468 U.S. Veterans was identified as having administrative evidence of homelessness and visits in both time periods. They were seen in 222,599 outpatient visits that generated 484,339 ICD9 codes (an average of 11.4 (range 1-23) visits and 2.2 (range 1-60) ICD9 codes per visit). Using the three network projection methods, we were able to show distinct differences in the pattern of co-morbidities in the two time periods. In the more distant time period preceding homelessness, the network was dominated by routine health maintenance visits and physical ailment diagnoses. In the 59 days immediately prior to the homelessness identification, alcohol-related diagnoses were noted, along with economic circumstances such as unemployment, legal issues, and housing instability. Network visualizations of large clinical datasets traditionally treated as tabular and difficult to manipulate reveal rich, previously hidden connections between data variables related to homelessness. A key feature is the ability to visualize changes in variables with temporality and in proximity to the event of interest. These
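
    A minimal Python sketch of the underlying projection step, assuming made-up visit-level ICD-9 codes: pairs of codes that co-occur within a visit become weighted edges of a network (here built with networkx rather than Pajek).

      # Build a weighted co-occurrence network from visit-level diagnosis codes.
      from itertools import combinations
      from collections import Counter
      import networkx as nx

      visits = [
          ["303.90", "V60.0", "E849.8"],   # hypothetical ICD-9 codes per visit
          ["303.90", "V60.0"],
          ["401.9", "V70.0"],
      ]

      pair_counts = Counter()
      for codes in visits:
          for a, b in combinations(sorted(set(codes)), 2):
              pair_counts[(a, b)] += 1

      G = nx.Graph()
      for (a, b), w in pair_counts.items():
          G.add_edge(a, b, weight=w)

      # Strongest associations first.
      edges = sorted(G.edges(data=True), key=lambda e: -e[2]["weight"])
      print(edges[:5])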

  7. Extension of research data repository system to support direct compute access to biomedical datasets: enhancing Dataverse to support large datasets.

    PubMed

    McKinney, Bill; Meyer, Peter A; Crosas, Mercè; Sliz, Piotr

    2017-01-01

    Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements for structural biology datasets. In this paper, we describe one such extension, functionality supporting preservation of file system structure within Dataverse, which is essential for both in-place computation and supporting non-HTTP data transfers. © 2016 New York Academy of Sciences.

  8. Extension of research data repository system to support direct compute access to biomedical datasets: enhancing Dataverse to support large datasets

    PubMed Central

    McKinney, Bill; Meyer, Peter A.; Crosas, Mercè; Sliz, Piotr

    2016-01-01

    Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements for structural biology datasets. In this paper, we describe one such extension—functionality supporting preservation of filesystem structure within Dataverse—which is essential for both in-place computation and supporting non-http data transfers. PMID:27862010

  9. [German national consensus on wound documentation of leg ulcer : Part 1: Routine care - standard dataset and minimum dataset].

    PubMed

    Heyer, K; Herberger, K; Protz, K; Mayer, A; Dissemond, J; Debus, S; Augustin, M

    2017-09-01

    Standards for basic documentation and the course of treatment increase quality assurance and efficiency in health care. To date, no standards for the treatment of patients with leg ulcers are available in Germany. The aim of the study was to develop standards under routine conditions for the documentation of patients with leg ulcers. This article shows the recommended variables of a "standard dataset" and a "minimum dataset". Consensus building among experts from 38 scientific societies, professional associations, insurance and supply networks (n = 68 experts) took place. After a systematic international literature search, available standards were reviewed and supplemented with the expert group's own considerations. From 2012 to 2015, standards for documentation were defined in multistage online rounds and personal meetings. A consensus was achieved on 18 variables for the minimum dataset and 48 variables for the standard dataset in a total of seven meetings and nine online Delphi rounds. The datasets involve patient baseline data, data on general health status, wound characteristics, diagnostic and therapeutic interventions, patient-reported outcomes, nutrition, and education status. Based on a multistage continuous decision-making process, a standard for the measurement of events in routine care of patients with a leg ulcer was developed.

  10. Exploring Two Approaches for an End-to-End Scientific Analysis Workflow

    NASA Astrophysics Data System (ADS)

    Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; Paterno, Marc; Sehrish, Saba

    2015-12-01

    The scientific discovery process can be advanced by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally, it is important for scientists to be able to share their workflows with collaborators. We have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC); the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In this paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.
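
    The second approach named above, a plain-Python workflow, can be illustrated with a small sketch in which steps declare their inputs and a tiny driver runs them in dependency order. This is only an illustration of the idea, not the DESC framework itself; the step names and functions are invented.

      # Minimal dependency-ordered workflow driver in plain Python.
      from typing import Callable, Dict, List

      class Step:
          def __init__(self, name: str, func: Callable[..., object], inputs: List[str]):
              self.name, self.func, self.inputs = name, func, inputs

      def run_workflow(steps: List[Step]) -> Dict[str, object]:
          results: Dict[str, object] = {}
          pending = list(steps)
          while pending:
              ready = [s for s in pending if all(i in results for i in s.inputs)]
              if not ready:
                  raise RuntimeError("unsatisfiable dependencies")
              for s in ready:
                  results[s.name] = s.func(*[results[i] for i in s.inputs])
                  pending.remove(s)
          return results

      steps = [
          Step("catalog", lambda: list(range(10)), []),
          Step("selection", lambda cat: [x for x in cat if x % 2 == 0], ["catalog"]),
          Step("statistic", lambda sel: sum(sel) / len(sel), ["selection"]),
      ]
      print(run_workflow(steps)["statistic"])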

  11. Exploring Two Approaches for an End-to-End Scientific Analysis Workflow

    DOE PAGES

    Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; ...

    2015-12-23

    The scientific discovery process can be advanced by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally, it is important for scientists to be able to share their workflows with collaborators. We have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC); the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In this paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.

  12. Publicly Releasing a Large Simulation Dataset with NDS Labs

    NASA Astrophysics Data System (ADS)

    Goldbaum, Nathan

    2016-03-01

    Optimally, all publicly funded research should be accompanied by the tools, code, and data necessary to fully reproduce the analysis performed in journal articles describing the research. This ideal can be difficult to attain, particularly when dealing with large (>10 TB) simulation datasets. In this lightning talk, we describe the process of publicly releasing a large simulation dataset to accompany the submission of a journal article. The simulation was performed using Enzo, an open source, community-developed N-body/hydrodynamics code and was analyzed using a wide range of community-developed tools in the scientific Python ecosystem. Although the simulation was performed and analyzed using an ecosystem of sustainably developed tools, we enable sustainable science using our data by making it publicly available. Combining the data release with the NDS Labs infrastructure allows a substantial amount of added value, including web-based access to analysis and visualization using the yt analysis package through an IPython notebook interface. In addition, we are able to accompany the paper submission to the arXiv preprint server with links to the raw simulation data as well as interactive real-time data visualizations that readers can explore on their own or share with colleagues during journal club discussions. It is our hope that the value added by these services will substantially increase the impact and readership of the paper.
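
    A hedged sketch of the kind of analysis readers could run against such a release: loading an Enzo output with yt and saving a projection plot. The dataset path is hypothetical, and the field tuple follows yt's usual ("gas", "density") convention.

      # Load an Enzo output with yt and make a quick projection plot.
      import yt

      ds = yt.load("DD0046/DD0046")         # hypothetical Enzo output directory
      ad = ds.all_data()
      print(ad.quantities.total_mass())     # quick sanity check on the data

      p = yt.ProjectionPlot(ds, "z", ("gas", "density"))
      p.save("density_projection.png")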

  13. Uvf - Unified Volume Format: A General System for Efficient Handling of Large Volumetric Datasets.

    PubMed

    Krüger, Jens; Potter, Kristin; Macleod, Rob S; Johnson, Christopher

    2008-01-01

    With the continual increase in computing power, volumetric datasets with sizes ranging from only a few megabytes to petascale are generated thousands of times per day. Such data may come from an ordinary source such as simple everyday medical imaging procedures, while larger datasets may be generated from cluster-based scientific simulations or measurements of large scale experiments. In computer science an incredible amount of work worldwide is put into the efficient visualization of these datasets. As researchers in the field of scientific visualization, we often have to face the task of handling very large data from various sources. This data usually comes in many different data formats. In medical imaging, the DICOM standard is well established, however, most research labs use their own data formats to store and process data. To simplify the task of reading the many different formats used with all of the different visualization programs, we present a system for the efficient handling of many types of large scientific datasets (see Figure 1 for just a few examples). While primarily targeted at structured volumetric data, UVF can store just about any type of structured and unstructured data. The system is composed of a file format specification with a reference implementation of a reader. It is not only a common, easy to implement format but also allows for efficient rendering of most datasets without the need to convert the data in memory.
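
    The following Python sketch is a generic illustration of why a well-specified volume format helps, not the UVF specification itself: given known dimensions and data type, a large raw volume can be memory-mapped and read slice by slice or brick by brick. The file name, shape and dtype are assumptions.

      # Memory-map a large raw structured volume and access it out of core.
      import numpy as np

      nx, ny, nz = 512, 512, 1024           # grid dimensions taken from a header
      vol = np.memmap("ct_scan.raw", dtype=np.uint16, mode="r", shape=(nz, ny, nx))

      mid_slice = np.asarray(vol[nz // 2])   # touches only one slice on disk
      print(mid_slice.min(), mid_slice.max())

      # A simple brick iterator, the access pattern out-of-core renderers rely on.
      def bricks(volume, size=64):
          for z in range(0, volume.shape[0], size):
              for y in range(0, volume.shape[1], size):
                  for x in range(0, volume.shape[2], size):
                      yield volume[z:z+size, y:y+size, x:x+size]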

  14. PARLO: PArallel Run-Time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Pattern

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gong, Zhenhuan; Boyuka, David; Zou, X

    The size and scope of cutting-edge scientific simulations are growing much faster than the I/O and storage capabilities of their run-time environments. The growing gap is exacerbated by exploratory, data-intensive analytics, such as querying simulation data with multivariate, spatio-temporal constraints, which induces heterogeneous access patterns that stress the performance of the underlying storage system. Previous work addresses data layout and indexing techniques to improve query performance for a single access pattern, which is not sufficient for complex analytics jobs. We present PARLO, a parallel run-time layout optimization framework, to achieve multi-level data layout optimization for scientific applications at run-time before data is written to storage. The layout schemes optimize for heterogeneous access patterns with user-specified priorities. PARLO is integrated with ADIOS, a high-performance parallel I/O middleware for large-scale HPC applications, to achieve user-transparent, light-weight layout optimization for scientific datasets. It offers simple XML-based configuration for users to achieve flexible layout optimization without the need to modify or recompile application codes. Experiments show that PARLO improves performance by 2 to 26 times for queries with heterogeneous access patterns compared to state-of-the-art scientific database management systems. Compared to traditional post-processing approaches, its underlying run-time layout optimization achieves a 56% savings in processing time and a reduction in storage overhead of up to 50%. PARLO also exhibits a low run-time resource requirement, while limiting the performance impact on running applications to a reasonable level.
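
    The underlying idea, matching chunked storage layouts to expected access patterns, can be sketched with HDF5 chunking via h5py (used here instead of ADIOS purely for illustration). The array size and chunk shapes are assumptions, not PARLO's optimization output.

      # Store the same array under two chunk layouts tuned to different queries.
      import numpy as np
      import h5py

      data = np.random.rand(1024, 1024).astype(np.float32)   # e.g. one 2-D variable

      with h5py.File("layout_demo.h5", "w") as f:
          # Chunks shaped for whole-row reads (time-series style queries).
          f.create_dataset("var_rows", data=data, chunks=(1, 1024))
          # Chunks shaped for small spatial sub-regions (region queries).
          f.create_dataset("var_tiles", data=data, chunks=(64, 64))

      with h5py.File("layout_demo.h5", "r") as f:
          row = f["var_rows"][10, :]         # reads a single chunk
          tile = f["var_tiles"][0:64, 0:64]  # reads a single chunk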

  15. [Scientific significance and prospective application of digitized virtual human].

    PubMed

    Zhong, Shi-zhen

    2003-03-01

    As a cutting-edge research project, digitization of human anatomical information combines conventional medicine with information technology, computer technology, and virtual reality technology. Recent years have seen the establishment of, or the ongoing effort to establish, various virtual human models in many countries, on the basis of continuous sections of the human body that are digitized by means of computational medicine incorporating information technology to quantitatively simulate human physiological and pathological conditions, and to provide wide prospective applications in the fields of medicine and other disciplines. This article addresses 4 issues concerning the progress of virtual human model research: (1) Worldwide survey of the sectioning and modeling of visible humans. The American Visible Human database was completed in 1994; it contains both a male and a female dataset and has found wide application internationally. South Korea also finished the data collection for a male visible Korean human dataset in 2000. (2) Application of the dataset of the Visible Human Project (VHP). This dataset has yielded plentiful fruits in medical education and clinical research, and further plans are being pursued to construct a Physical Human and a Physiological Human. (3) Scientific significance and prospects of virtual human studies. Digitized human datasets may eventually contribute to the development of many new high-tech industries. (4) Progress of the virtual Chinese human project. The 174th session of the Xiangshan Science Conferences, held in 2001, marked the initiation of the digitized virtual human project in China, and some key techniques have been explored. By now, the data-collection process for 4 Chinese virtual human datasets has been successfully completed.

  16. A dataset of forest biomass structure for Eurasia.

    PubMed

    Schepaschenko, Dmitry; Shvidenko, Anatoly; Usoltsev, Vladimir; Lakyda, Petro; Luo, Yunjian; Vasylyshyn, Roman; Lakyda, Ivan; Myklush, Yuriy; See, Linda; McCallum, Ian; Fritz, Steffen; Kraxner, Florian; Obersteiner, Michael

    2017-05-16

    The most comprehensive dataset of in situ destructive sampling measurements of forest biomass in Eurasia has been compiled from a combination of experiments undertaken by the authors and from scientific publications. Biomass is reported as four components: live trees (stem, bark, branches, foliage, roots); understory (above- and below ground); green forest floor (above- and below ground); and coarse woody debris (snags, logs, dead branches of living trees and dead roots), consisting of 10,351 unique records of sample plots and 9,613 sample trees from ca 1,200 experiments for the period 1930-2014 where there is overlap between these two datasets. The dataset also contains other forest stand parameters such as tree species composition, average age, tree height, growing stock volume, etc., when available. Such a dataset can be used for the development of models of biomass structure, biomass extension factors, change detection in biomass structure, investigations into biodiversity and species distribution and the biodiversity-productivity relationship, as well as the assessment of the carbon pool and its dynamics, among many others.

  17. A dataset of forest biomass structure for Eurasia

    NASA Astrophysics Data System (ADS)

    Schepaschenko, Dmitry; Shvidenko, Anatoly; Usoltsev, Vladimir; Lakyda, Petro; Luo, Yunjian; Vasylyshyn, Roman; Lakyda, Ivan; Myklush, Yuriy; See, Linda; McCallum, Ian; Fritz, Steffen; Kraxner, Florian; Obersteiner, Michael

    2017-05-01

    The most comprehensive dataset of in situ destructive sampling measurements of forest biomass in Eurasia has been compiled from a combination of experiments undertaken by the authors and from scientific publications. Biomass is reported as four components: live trees (stem, bark, branches, foliage, roots); understory (above- and below ground); green forest floor (above- and below ground); and coarse woody debris (snags, logs, dead branches of living trees and dead roots), consisting of 10,351 unique records of sample plots and 9,613 sample trees from ca 1,200 experiments for the period 1930-2014 where there is overlap between these two datasets. The dataset also contains other forest stand parameters such as tree species composition, average age, tree height, growing stock volume, etc., when available. Such a dataset can be used for the development of models of biomass structure, biomass extension factors, change detection in biomass structure, investigations into biodiversity and species distribution and the biodiversity-productivity relationship, as well as the assessment of the carbon pool and its dynamics, among many others.

  18. Genomics dataset of unidentified disclosed isolates.

    PubMed

    Rekadwad, Bhagwan N

    2016-09-01

    Analysis of DNA sequences is necessary for the higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates, and the AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset of cleavage codes and enzyme codes from the restriction digestion study, which is helpful for performing studies using short DNA sequences, is reported. The dataset disclosed here provides new revelatory data for the exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
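
    The two simple analyses mentioned above can be reproduced with a few lines of Python. The sequence below is invented, and the qrcode package is an assumed tool choice; the dataset does not specify which QR generator was used.

      # Compute AT/GC content and generate a QR code for a sequence record.
      import qrcode

      seq = "ATGCGTACGTTAGCCGGATCGATCGTTAACGGCT"   # hypothetical isolate fragment

      gc = sum(seq.count(b) for b in "GC") / len(seq) * 100
      at = 100.0 - gc
      print(f"GC content: {gc:.1f}%  AT content: {at:.1f}%")

      img = qrcode.make(seq)                       # quick-response code for the record
      img.save("isolate_fragment_qr.png")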

  19. Building the Next Generation of Scientific Explorers through Active Engagement with STEM Experts and International Space Station Resources

    NASA Technical Reports Server (NTRS)

    Graff, P. V.; Vanderbloemen, L.; Higgins, M.; Stefanov, W. L.; Rampe, E.

    2015-01-01

    Connecting students and teachers in classrooms with science, technology, engineering, and mathematics (STEM) experts provides an invaluable opportunity for all. These experts can share the benefits and utilization of resources from the International Space Station (ISS) while sharing and "translating" exciting science being conducted by professional scientists. Active engagement with these STEM experts involves students in the journey of science and exploration in an enthralling and understandable manner. This active engagement, connecting classrooms with scientific experts, helps inspire and build the next generation of scientific explorers in academia, private industry, and government.

  20. Building Scientific Data's list of recommended data repositories

    NASA Astrophysics Data System (ADS)

    Hufton, A. L.; Khodiyar, V.; Hrynaszkiewicz, I.

    2016-12-01

    When Scientific Data launched in 2014 we provided our authors with a list of recommended data repositories to help them identify data hosting options that were likely to meet the journal's requirements. This list has grown in size and scope, and is now a central resource for authors across the Nature-titled journals. It has also been used in the development of data deposition policies and recommended repository lists across Springer Nature and at other publishers. Each new addition to the list is assessed according to a series of criteria that emphasize the stability of the resource, its commitment to principles of open science and its implementation of relevant community standards and reporting guidelines. A preference is expressed for repositories that issue digital object identifiers (DOIs) through the DataCite system and that share data under the Creative Commons CC0 waiver. Scientific Data currently lists fourteen repositories that focus on specific areas within the Earth and environmental sciences, as well as the broad-scope repositories Dryad and figshare. Readers can browse and filter datasets published at the journal by the host repository using ISA-explorer, a demo tool built by the ISA-tools team at Oxford University. We believe that well-maintained lists like this one help publishers build a network of trust with community data repositories and provide an important complement to more comprehensive data repository indices and more formal certification efforts. In parallel, Scientific Data has also improved its policies to better support submissions from authors using institutional and project-specific repositories, without requiring each to apply for listing individually. Online resources: Journal homepage: http://www.nature.com/scientificdata; Data repository criteria: http://www.nature.com/sdata/policies/data-policies#repo-criteria; Recommended data repositories: http://www.nature.com/sdata/policies/repositories; Archived copies of the list: https

  1. Exploring Nominalization in Scientific Textbooks: A Cross-Disciplinary Study of Hard and Soft Sciences

    ERIC Educational Resources Information Center

    Jalilifar, Alireza; White, Peter; Malekizadeh, N.

    2017-01-01

    Given the importance of disciplinary specificity in terms of the potential differences in the functionality of nominalizations in scientific textbooks and the dearth of studies of this type, the current study explores the extent to which nominalization is realized across two disciplines. To this aim, eight academic textbooks from Physics and…

  2. Scientific rationale for Uranus and Neptune in situ explorations

    NASA Astrophysics Data System (ADS)

    Mousis, O.; Atkinson, D. H.; Cavalié, T.; Fletcher, L. N.; Amato, M. J.; Aslam, S.; Ferri, F.; Renard, J.-B.; Spilker, T.; Venkatapathy, E.; Wurz, P.; Aplin, K.; Coustenis, A.; Deleuil, M.; Dobrijevic, M.; Fouchet, T.; Guillot, T.; Hartogh, P.; Hewagama, T.; Hofstadter, M. D.; Hue, V.; Hueso, R.; Lebreton, J.-P.; Lellouch, E.; Moses, J.; Orton, G. S.; Pearl, J. C.; Sánchez-Lavega, A.; Simon, A.; Venot, O.; Waite, J. H.; Achterberg, R. K.; Atreya, S.; Billebaud, F.; Blanc, M.; Borget, F.; Brugger, B.; Charnoz, S.; Chiavassa, T.; Cottini, V.; d'Hendecourt, L.; Danger, G.; Encrenaz, T.; Gorius, N. J. P.; Jorda, L.; Marty, B.; Moreno, R.; Morse, A.; Nixon, C.; Reh, K.; Ronnet, T.; Schmider, F.-X.; Sheridan, S.; Sotin, C.; Vernazza, P.; Villanueva, G. L.

    2018-06-01

    The ice giants Uranus and Neptune are the least understood class of planets in our solar system but the most frequently observed type of exoplanets. Presumed to have a small rocky core, a deep interior comprising ∼70% heavy elements surrounded by a more dilute outer envelope of H2 and He, Uranus and Neptune are fundamentally different from the better-explored gas giants Jupiter and Saturn. Because of the lack of dedicated exploration missions, our knowledge of the composition and atmospheric processes of these distant worlds is primarily derived from remote sensing from Earth-based observatories and space telescopes. As a result, Uranus's and Neptune's physical and atmospheric properties remain poorly constrained, and their roles in the evolution of the Solar System are not well understood. Exploration of an ice giant system is therefore a high-priority science objective as these systems (including the magnetosphere, satellites, rings, atmosphere, and interior) challenge our understanding of planetary formation and evolution. Here we describe the main scientific goals to be addressed by a future in situ exploration of an ice giant. An atmospheric entry probe targeting the 10-bar level, about 5 scale heights beneath the tropopause, would yield insight into two broad themes: i) the formation history of the ice giants and, to a broader extent, that of the Solar System, and ii) the processes at play in planetary atmospheres. The probe would descend under parachute to measure composition, structure, and dynamics, with data returned to Earth using a Carrier Relay Spacecraft as a relay station. In addition, possible mission concepts and partnerships are presented, and a strawman ice-giant probe payload is described. An ice-giant atmospheric probe could represent a significant ESA contribution to a future NASA ice-giant flagship mission.

  3. Accuracy assessment of the U.S. Geological Survey National Elevation Dataset, and comparison with other large-area elevation datasets: SRTM and ASTER

    USGS Publications Warehouse

    Gesch, Dean B.; Oimoen, Michael J.; Evans, Gayla A.

    2014-01-01

    The National Elevation Dataset (NED) is the primary elevation data product produced and distributed by the U.S. Geological Survey. The NED provides seamless raster elevation data of the conterminous United States, Alaska, Hawaii, U.S. island territories, Mexico, and Canada. The NED is derived from diverse source datasets that are processed to a specification with consistent resolutions, coordinate system, elevation units, and horizontal and vertical datums. The NED serves as the elevation layer of The National Map, and it provides basic elevation information for earth science studies and mapping applications in the United States and most of North America. An important part of supporting scientific and operational use of the NED is provision of thorough dataset documentation including data quality and accuracy metrics. The focus of this report is on the vertical accuracy of the NED and on comparison of the NED with other similar large-area elevation datasets, namely data from the Shuttle Radar Topography Mission (SRTM) and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER).
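
    A minimal sketch of the kind of vertical-accuracy check reported above: DEM elevations sampled at reference benchmarks are compared against the surveyed heights. The values below are placeholders, not NED or benchmark data.

      # Compare DEM elevations at reference points against surveyed heights.
      import numpy as np

      surveyed = np.array([231.4, 198.7, 305.2, 412.9, 287.3])  # benchmark heights (m)
      dem      = np.array([232.1, 197.9, 306.0, 411.8, 288.0])  # DEM values at same points

      err = dem - surveyed
      rmse = np.sqrt(np.mean(err ** 2))
      nmad = 1.4826 * np.median(np.abs(err - np.median(err)))   # robust spread measure
      print(f"mean error {err.mean():+.2f} m, RMSE {rmse:.2f} m, NMAD {nmad:.2f} m")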

  4. The health care and life sciences community profile for dataset descriptions

    PubMed Central

    Alexiev, Vladimir; Ansell, Peter; Bader, Gary; Baran, Joachim; Bolleman, Jerven T.; Callahan, Alison; Cruz-Toledo, José; Gaudet, Pascale; Gombocz, Erich A.; Gonzalez-Beltran, Alejandra N.; Groth, Paul; Haendel, Melissa; Ito, Maori; Jupp, Simon; Juty, Nick; Katayama, Toshiaki; Kobayashi, Norio; Krishnaswami, Kalpana; Laibe, Camille; Le Novère, Nicolas; Lin, Simon; Malone, James; Miller, Michael; Mungall, Christopher J.; Rietveld, Laurens; Wimalaratne, Sarala M.; Yamaguchi, Atsuko

    2016-01-01

    Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets. PMID:27602295
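
    In the spirit of the guideline described above, a machine-readable dataset description can be assembled with rdflib using DCAT and Dublin Core terms, as in the hedged sketch below. The identifiers and metadata values are fictitious, and the real profile specifies a richer element set than shown.

      # Describe a dataset with DCAT and Dublin Core terms and emit Turtle.
      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import DCTERMS, RDF

      DCAT = Namespace("http://www.w3.org/ns/dcat#")

      g = Graph()
      g.bind("dcat", DCAT)
      g.bind("dct", DCTERMS)

      ds = URIRef("http://example.org/dataset/expression-atlas-subset")
      g.add((ds, RDF.type, DCAT.Dataset))
      g.add((ds, DCTERMS.title, Literal("Example expression dataset")))
      g.add((ds, DCTERMS.creator, Literal("Example Lab")))
      g.add((ds, DCTERMS.issued, Literal("2016-01-01")))
      g.add((ds, DCAT.keyword, Literal("gene expression")))

      print(g.serialize(format="turtle"))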

  5. Lunar scout missions: Galileo encounter results and application to scientific problems and exploration requirements

    NASA Technical Reports Server (NTRS)

    Head, J. W.; Belton, M.; Greeley, R.; Pieters, C.; Mcewen, A.; Neukum, G.; Mccord, T.

    1993-01-01

    The Lunar Scout Missions (payload: x-ray fluorescence spectrometer, high-resolution stereocamera, neutron spectrometer, gamma-ray spectrometer, imaging spectrometer, gravity experiment) will provide a global data set for the chemistry, mineralogy, geology, topography, and gravity of the Moon. These data will in turn provide an important baseline for the further scientific exploration of the Moon by all-purpose landers and micro-rovers, and sample return missions from sites shown to be of primary interest from the global orbital data. These data would clearly provide the basis for intelligent selection of sites for the establishment of lunar base sites for long-term scientific and resource exploration and engineering studies. The two recent Galileo encounters with the Moon (December, 1990 and December, 1992) illustrate how modern technology can be applied to significant lunar problems. We emphasize the regional results of the Galileo SSI to show the promise of geologic unit definition and characterization as an example of what can be done with the global coverage to be obtained by the Lunar Scout Missions.

  6. Phobos spectral clustering: first results using the MRO-CRISM 0.4-2.5 micron dataset

    NASA Astrophysics Data System (ADS)

    Pajola, M.; Roush, T. L.; Marzo, G. A.; Simioni, E.

    2016-12-01

    Whether Phobos is a captured asteroid or formed in situ around Mars is still an outstanding question within the scientific community. The proposed Japanese Mars Moon eXploration (MMX) sample return mission has the chief scientific objective of solving this conundrum, reaching Phobos in the early 2020s and returning Phobos samples to Earth a few years later. Nonetheless, well before surface samples are returned to Earth, there are important spectral datasets that can be mined in order to constrain Phobos' surface properties and address implications regarding Phobos' origin. One of these is the MRO-CRISM multispectral observations of Phobos. The MRO-CRISM visible and infrared observations (0.4-2.5 micron) are here corrected for the incidence and emission angles of the observation. Unlike previous studies of the MRO-CRISM data that selected specific regions for analyses, we apply a statistical technique that identifies different clusters based on a K-means partitioning algorithm. Selecting specific wavelength ranges of Phobos' reflectance spectra permits identification of possible mineralogical compounds and the spatial distribution of these on the surface of Phobos. This work paves the way for a deeper analysis of the available dataset regarding Phobos, potentially identifying regions of interest on the surface of Phobos that may warrant more detailed investigation by the MMX mission as potential sampling areas. Acknowledgments: M. Pajola was supported for this research by an appointment to the NASA Postdoctoral Program at the Ames Research Center administered by USRA.
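
    The clustering step described above can be sketched in Python with scikit-learn's K-means, applied to per-pixel spectra of an image cube. The cube below is random placeholder data with assumed dimensions, not CRISM observations.

      # K-means partitioning of per-pixel reflectance spectra from an image cube.
      import numpy as np
      from sklearn.cluster import KMeans

      n_rows, n_cols, n_bands = 120, 100, 72
      cube = np.random.rand(n_rows, n_cols, n_bands)    # stand-in for a spectral cube

      spectra = cube.reshape(-1, n_bands)               # one spectrum per pixel
      labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(spectra)
      cluster_map = labels.reshape(n_rows, n_cols)      # spatial distribution of clusters

      # Mean spectrum of each cluster, for comparison with reference spectra.
      means = np.array([spectra[labels == k].mean(axis=0) for k in range(5)])
      print(cluster_map.shape, means.shape)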

  7. iSBatch: a batch-processing platform for data analysis and exploration of live-cell single-molecule microscopy images and other hierarchical datasets.

    PubMed

    Caldas, Victor E A; Punter, Christiaan M; Ghodke, Harshad; Robinson, Andrew; van Oijen, Antoine M

    2015-10-01

    Recent technical advances have made it possible to visualize single molecules inside live cells. Microscopes with single-molecule sensitivity enable the imaging of low-abundance proteins, allowing for a quantitative characterization of molecular properties. Such data sets contain information on a wide spectrum of important molecular properties, with different aspects highlighted in different imaging strategies. The time-lapsed acquisition of images provides information on protein dynamics over long time scales, giving insight into expression dynamics and localization properties. Rapid burst imaging reveals properties of individual molecules in real-time, informing on their diffusion characteristics, binding dynamics and stoichiometries within complexes. This richness of information, however, adds significant complexity to analysis protocols. In general, large datasets of images must be collected and processed in order to produce statistically robust results and identify rare events. More importantly, as live-cell single-molecule measurements remain on the cutting edge of imaging, few protocols for analysis have been established and thus analysis strategies often need to be explored for each individual scenario. Existing analysis packages are geared towards either single-cell imaging data or in vitro single-molecule data and typically operate with highly specific algorithms developed for particular situations. Our tool, iSBatch, instead allows users to exploit the inherent flexibility of the popular open-source package ImageJ, providing a hierarchical framework in which existing plugins or custom macros may be executed over entire datasets or portions thereof. This strategy affords users freedom to explore new analysis protocols within large imaging datasets, while maintaining hierarchical relationships between experiments, samples, fields of view, cells, and individual molecules.

  8. The Texture of Educational Inquiry: An Exploration of George Herbert Mead's Concept of the Scientific.

    ERIC Educational Resources Information Center

    Franzosa, Susan Douglas

    1984-01-01

    Explores the implications of Mead's philosophic social psychology for current disputes concerning the nature of the scientific in educational studies. Mead's contextualization of the knower and the known are found to be compatible with a contemporary critique of positivist paradigms and a critical reconceptualization of educational inquiry.…

  9. A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge

    PubMed Central

    Gururaj, Anupama E.; Chen, Xiaoling; Pournejati, Saeid; Alter, George; Hersh, William R.; Demner-Fushman, Dina; Ohno-Machado, Lucila

    2017-01-01

    The rapid proliferation of publicly available biomedical datasets has provided abundant resources that are potentially of value as a means to reproduce prior experiments, and to generate and explore novel hypotheses. However, there are a number of barriers to the re-use of such datasets, which are distributed across a broad array of dataset repositories, focusing on different data types and indexed using different terminologies. New methods are needed to enable biomedical researchers to locate datasets of interest within this rapidly expanding information ecosystem, and new resources are needed for the formal evaluation of these methods as they emerge. In this paper, we describe the design and generation of a benchmark for information retrieval of biomedical datasets, which was developed and used for the 2016 bioCADDIE Dataset Retrieval Challenge. In the tradition of the seminal Cranfield experiments, and as exemplified by the Text Retrieval Conference (TREC), this benchmark includes a corpus (biomedical datasets), a set of queries, and relevance judgments relating these queries to elements of the corpus. This paper describes the process through which each of these elements was derived, with a focus on those aspects that distinguish this benchmark from typical information retrieval reference sets. Specifically, we discuss the origin of our queries in the context of a larger collaborative effort, the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium, and the distinguishing features of biomedical dataset retrieval as a task. The resulting benchmark set has been made publicly available to advance research in the area of biomedical dataset retrieval. Database URL: https://biocaddie.org/benchmark-data PMID:29220453
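
    A small sketch of how such a benchmark is typically consumed: a system's ranked run is scored against the relevance judgments with precision@k and a binary NDCG@k. The query and dataset identifiers are invented for the example.

      # Score a ranked retrieval run against relevance judgments (qrels).
      import math

      qrels = {"q1": {"ds17": 1, "ds42": 1, "ds99": 0}}        # judged relevance
      run   = {"q1": ["ds42", "ds99", "ds17", "ds03"]}         # system ranking

      def precision_at_k(ranked, judged, k=3):
          return sum(judged.get(d, 0) > 0 for d in ranked[:k]) / k

      def ndcg_at_k(ranked, judged, k=3):
          dcg = sum(judged.get(d, 0) / math.log2(i + 2) for i, d in enumerate(ranked[:k]))
          ideal = sorted(judged.values(), reverse=True)[:k]
          idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
          return dcg / idcg if idcg else 0.0

      for q in run:
          print(q, precision_at_k(run[q], qrels[q]), round(ndcg_at_k(run[q], qrels[q]), 3))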

  10. Assessment of the NASA-USGS Global Land Survey (GLS) Datasets

    USGS Publications Warehouse

    Gutman, Garik; Huang, Chengquan; Chander, Gyanesh; Noojipady, Praveen; Masek, Jeffery G.

    2013-01-01

    The Global Land Survey (GLS) datasets are a collection of orthorectified, cloud-minimized Landsat-type satellite images, providing near complete coverage of the global land area decadally since the early 1970s. The global mosaics are centered on 1975, 1990, 2000, 2005, and 2010, and consist of data acquired from four sensors: Enhanced Thematic Mapper Plus, Thematic Mapper, Multispectral Scanner, and Advanced Land Imager. The GLS datasets have been widely used in land-cover and land-use change studies at local, regional, and global scales. This study evaluates the GLS datasets with respect to their spatial coverage, temporal consistency, geodetic accuracy, radiometric calibration consistency, image completeness, extent of cloud contamination, and residual gaps. In general, the three latest GLS datasets are of a better quality than the GLS-1990 and GLS-1975 datasets, with most of the imagery (85%) having cloud cover of less than 10%, the acquisition years clustered much more tightly around their target years, better co-registration relative to GLS-2000, and better radiometric absolute calibration. Probably, the most significant impediment to scientific use of the datasets is the variability of image phenology (i.e., acquisition day of year). This paper provides end-users with an assessment of the quality of the GLS datasets for specific applications, and where possible, suggestions for mitigating their deficiencies.

  11. The Greenwich Photo-heliographic Results (1874 - 1976): Summary of the Observations, Applications, Datasets, Definitions and Errors

    NASA Astrophysics Data System (ADS)

    Willis, D. M.; Coffey, H. E.; Henwood, R.; Erwin, E. H.; Hoyt, D. V.; Wild, M. N.; Denig, W. F.

    2013-11-01

    The measurements of sunspot positions and areas that were published initially by the Royal Observatory, Greenwich, and subsequently by the Royal Greenwich Observatory (RGO), as the Greenwich Photo-heliographic Results (GPR), 1874-1976, exist in both printed and digital forms. These printed and digital sunspot datasets have been archived in various libraries and data centres. Unfortunately, however, typographic, systematic and isolated errors can be found in the various datasets. The purpose of the present paper is to begin the task of identifying and correcting these errors. In particular, the intention is to provide in one foundational paper all the necessary background information on the original solar observations, their various applications in scientific research, the format of the different digital datasets, the necessary definitions of the quantities measured, and the initial identification of errors in both the printed publications and the digital datasets. Two companion papers address the question of specific identifiable errors; namely, typographic errors in the printed publications, and both isolated and systematic errors in the digital datasets. The existence of two independently prepared digital datasets, which both contain information on sunspot positions and areas, makes it possible to outline a preliminary strategy for the development of an even more accurate digital dataset. Further work is in progress to generate an extremely reliable sunspot digital dataset, based on the programme of solar observations supported for more than a century by the Royal Observatory, Greenwich, and the Royal Greenwich Observatory. This improved dataset should be of value in many future scientific investigations.

  12. A hybrid organic-inorganic perovskite dataset

    NASA Astrophysics Data System (ADS)

    Kim, Chiho; Huan, Tran Doan; Krishnan, Sridevi; Ramprasad, Rampi

    2017-05-01

    Hybrid organic-inorganic perovskites (HOIPs) have been attracting a great deal of attention due to their versatility of electronic properties and fabrication methods. We prepare a dataset of 1,346 HOIPs, which features 16 organic cations, 3 group-IV cations and 4 halide anions. Using a combination of an atomic structure search method and density functional theory calculations, the optimized structures, the bandgap, the dielectric constant, and the relative energies of the HOIPs are uniformly prepared and validated by comparing with relevant experimental and/or theoretical data. We make the dataset available at Dryad Digital Repository, NoMaD Repository, and Khazana Repository (http://khazana.uconn.edu/), hoping that it could be useful for future data-mining efforts that can explore possible structure-property relationships and phenomenological models. Progressive extension of the dataset is expected as new organic cations become appropriate within the HOIP framework, and as additional properties are calculated for the new compounds found.
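
    Since the record lists the properties tabulated for each compound (bandgap, dielectric constant, relative energies), a simple screening pass is easy to sketch. The snippet below is illustrative only: the file name and column names are assumptions, not the actual schema of the Dryad/NoMaD/Khazana deposits.

    ```python
    # Illustrative sketch only: screen a HOIP property table for candidate compositions.
    # The file name and column names below are assumptions, not the dataset's actual schema.
    import pandas as pd

    df = pd.read_csv("hoip_dataset.csv")           # hypothetical export of the dataset
    candidates = df[(df["bandgap_eV"].between(1.0, 2.0)) &
                    (df["dielectric_constant"] > 15)]
    print(candidates.sort_values("relative_energy_eV").head(10))

    # Simple structure-property exploration: correlation between two properties
    print(df[["bandgap_eV", "dielectric_constant"]].corr())
    ```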

  13. Comparison of CORA and EN4 in-situ datasets validation methods, toward a better quality merged dataset.

    NASA Astrophysics Data System (ADS)

    Szekely, Tanguy; Killick, Rachel; Gourrion, Jerome; Reverdin, Gilles

    2017-04-01

    CORA and EN4 are both global, delayed-time-mode, validated in-situ ocean temperature and salinity datasets distributed by the Met Office (http://www.metoffice.gov.uk/) and Copernicus (www.marine.copernicus.eu). A large part of the profiles distributed by CORA and EN4 in recent years are Argo profiles from the Argo DAC, but profiles are also extracted from the World Ocean Database, along with TESAC profiles from GTSPP. In the case of CORA, data coming from the EuroGOOS Regional Operational Observing Systems (ROOS) operated by European institutes not managed by National Data Centres, as well as other profile datasets provided by scientific sources (sea-mammal profiles from MEOP, XBT datasets from cruises, ...), can also be found. (EN4 also takes data from the ASBO dataset to supplement observations in the Arctic.) The first advantage of this new merged product is to enhance the space and time coverage at global and European scales for the period from 1950 up to the year before the current year. This product is updated once a year, and T&S gridded fields are also generated for the period from 1990 to year n-1. The enhancement compared to the previous CORA product will be presented. Although the profiles distributed by both datasets are mostly the same, the quality-control procedures developed by the Met Office and Copernicus teams differ, sometimes leading to different quality-control flags for the same profile. A new study started in 2016 that aims to compare both validation procedures in order to move towards a Copernicus Marine Service dataset with the best features of CORA and EN4 validation. A reference dataset composed of the full set of in-situ temperature and salinity measurements collected by Coriolis during 2015 is used. These measurements were made with a wide range of instruments (XBTs, CTDs, Argo floats, instrumented sea mammals, ...) covering the global ocean. The reference dataset has been validated simultaneously by both teams. An exhaustive comparison of the

  14. Use of Electronic Health-Related Datasets in Nursing and Health-Related Research.

    PubMed

    Al-Rawajfah, Omar M; Aloush, Sami; Hewitt, Jeanne Beauchamp

    2015-07-01

    Datasets of gigabyte size are common in medical sciences. There is increasing consensus that significant untapped knowledge lies hidden in these large datasets. This review article aims to discuss Electronic Health-Related Datasets (EHRDs) in terms of types, features, advantages, limitations, and possible uses in nursing and health-related research. Major scientific databases, MEDLINE, ScienceDirect, and Scopus, were searched for studies or review articles regarding the use of EHRDs in research. A total of 442 articles were located. After application of the study inclusion criteria, 113 articles were included in the final review. EHRDs were categorized into Electronic Administrative Health-Related Datasets and Electronic Clinical Health-Related Datasets. Subcategories of each major category were identified. EHRDs are invaluable assets for nursing and health-related research. Advanced research skills, such as using analytical software, applying advanced statistical procedures, and dealing with missing data and missing variables, will maximize the efficient utilization of EHRDs in research. © The Author(s) 2014.

  15. The Role of Scientific Collections in Scientific Preparedness

    PubMed Central

    2015-01-01

    Building on the findings and recommendations of the Interagency Working Group on Scientific Collections, Scientific Collections International (SciColl) aims to improve rapid access to science collections across disciplines within the federal government and globally, between government agencies and private research institutions. SciColl offered a novel opportunity for the US Department of Health and Human Services, Office of the Assistant Secretary for Preparedness and Response, to explore the value of scientific research collections under the science preparedness initiative and to integrate them as a research resource at each stage of the emerging infectious disease cycle. Under the leadership of SciColl’s executive secretariat at the Smithsonian Institution, and with multiple federal and international partners, a workshop during October 2014 fully explored the intersections of the infectious disease cycle and the role scientific collections could play as an evidentiary scientific resource to mitigate risks associated with emerging infectious diseases. PMID:26380390

  16. Providing Geographic Datasets as Linked Data in SDI

    NASA Astrophysics Data System (ADS)

    Hietanen, E.; Lehto, L.; Latvala, P.

    2016-06-01

    In this study, a prototype service to provide data from a Web Feature Service (WFS) as linked data is implemented. At first, persistent and unique Uniform Resource Identifiers (URIs) are created for all spatial objects in the dataset. The objects are available from those URIs in the Resource Description Framework (RDF) data format. Next, a Web Ontology Language (OWL) ontology is created to describe the dataset information content using the Open Geospatial Consortium's (OGC) GeoSPARQL vocabulary. The existing data model is modified in order to take into account the linked data principles. The implemented service produces an HTTP response dynamically. The data for the response are first fetched from the existing WFS. Then the Geography Markup Language (GML) output of the WFS is transformed on-the-fly into the RDF format. Content negotiation is used to serve the data in different RDF serialization formats. This solution facilitates the use of a dataset in different applications without replicating the whole dataset. In addition, individual spatial objects in the dataset can be referred to by URIs. Furthermore, the needed information content of the objects can be easily extracted from the RDF serializations available from those URIs. A solution for linking data objects to the dataset URI is also introduced by using the Vocabulary of Interlinked Datasets (VoID). The dataset is divided into subsets, and each subset is given its own persistent and unique URI. This enables the whole dataset to be explored with a web browser and all individual objects to be indexed by search engines.
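
    The core of the approach, a spatial feature exposed at its own URI with an RDF description carrying a GeoSPARQL geometry, can be sketched in a few lines. The example below is not the paper's implementation; it uses the rdflib library, and the example.org URIs are placeholders for the dataset's real persistent identifiers.

    ```python
    # Sketch, not the paper's implementation: describe one WFS feature as linked data
    # using rdflib and the OGC GeoSPARQL vocabulary. URIs are hypothetical.
    from rdflib import Graph, Namespace, URIRef, Literal, RDF

    GEO = Namespace("http://www.opengis.net/ont/geosparql#")
    EX = Namespace("http://example.org/dataset/")   # assumed base URI for the dataset

    g = Graph()
    g.bind("geo", GEO)

    feature = URIRef(EX["building/42"])             # persistent URI for one spatial object
    geom = URIRef(EX["building/42/geometry"])
    g.add((feature, RDF.type, GEO.Feature))
    g.add((feature, GEO.hasGeometry, geom))
    g.add((geom, RDF.type, GEO.Geometry))
    g.add((geom, GEO.asWKT, Literal("POINT(24.94 60.17)", datatype=GEO.wktLiteral)))

    # Content negotiation would choose between serializations such as these:
    print(g.serialize(format="turtle"))
    print(g.serialize(format="xml"))
    ```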

  17. FastQuery: A Parallel Indexing System for Scientific Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chou, Jerry; Wu, Kesheng; Prabhat,

    2011-07-29

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies such as FastBit can significantly improve accesses to these datasets by augmenting the user data with indexes and other secondary information. However, a challenge is that the indexes assume the relational data model but the scientific data generally follows the array data model. To match the two data models, we design a generic mapping mechanism and implement an efficient input and output interface for reading and writing the data and their corresponding indexes. To take advantage of the emerging many-core architectures, we also develop a parallel strategy for indexing using threading technology. This approach complements our on-going MPI-based parallelization efforts. We demonstrate the flexibility of our software by applying it to two of the most commonly used scientific data formats, HDF5 and NetCDF. We present two case studies using data from a particle accelerator model and a global climate model. We also conducted a detailed performance study using these scientific datasets. The results show that FastQuery speeds up the query time by a factor of 2.5x to 50x, and it reduces the indexing time by a factor of 16 on 24 cores.
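
    FastQuery itself is a C++ system built on FastBit bitmap indexes, but the underlying idea, precomputing which array entries fall into which value bins so that range queries only touch candidate records, can be illustrated with a small Python sketch. The HDF5 file and dataset names below are hypothetical.

    ```python
    # Toy illustration of the idea behind index-accelerated queries on array data
    # (FastQuery itself is a C++ library built on FastBit); file/dataset names assumed.
    import h5py
    import numpy as np

    with h5py.File("particles.h5", "r") as f:        # hypothetical file
        energy = f["/particles/energy"][...]         # hypothetical 1-D dataset

    # Build a simple binned index: for each value bin, the ids of matching records.
    edges = np.linspace(energy.min(), energy.max(), 65)
    bin_of = np.digitize(energy, edges[1:-1])
    index = {b: np.flatnonzero(bin_of == b) for b in np.unique(bin_of)}

    def query_range(lo, hi):
        """Rows with lo <= energy < hi, touching only the candidate bins."""
        b_lo, b_hi = np.digitize([lo, hi], edges[1:-1])
        chunks = [index[b] for b in range(b_lo, b_hi + 1) if b in index]
        cand = np.concatenate(chunks) if chunks else np.array([], dtype=int)
        return cand[(energy[cand] >= lo) & (energy[cand] < hi)]

    hits = query_range(1.0e3, 5.0e3)
    print(len(hits), "matching records")
    ```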

  18. Publishing datasets with eSciDoc and panMetaDocs

    NASA Astrophysics Data System (ADS)

    Ulbricht, D.; Klump, J.; Bertelmann, R.

    2012-04-01

    Currently, several research institutions worldwide undertake considerable efforts to have their scientific datasets published and to syndicate them to data portals as extensively described objects identified by a persistent identifier. This is done to foster the reuse of data, to make scientific work more transparent, and to create a citable entity that can be referenced unambiguously in written publications. GFZ Potsdam established a publishing workflow for file-based research datasets. Key software components are an eSciDoc infrastructure [1] and multiple instances of the data curation tool panMetaDocs [2]. The eSciDoc repository holds data objects and their associated metadata in container objects, called eSciDoc items. A key metadata element in this context is the publication status of the referenced dataset. PanMetaDocs, which is based on PanMetaWorks [3], is a PHP-based web application that allows data to be described with any XML-based metadata schema. The metadata fields can be filled with static or dynamic content to reduce the number of fields that require manual entry to a minimum and to make use of contextual information in a project setting. Access rights can be applied to control the visibility of datasets to other project members, and to allow collaboration on datasets, notification about them (RSS), and interaction with the internal messaging system inherited from panMetaWorks. When a dataset is to be published, panMetaDocs allows the publication status of the eSciDoc item to be changed from "private" to "submitted" and the dataset to be prepared for verification by an external reviewer. After quality checks, the item publication status can be changed to "published". This makes the data and metadata available worldwide through the internet. PanMetaDocs is developed as an eSciDoc application. It is an easy-to-use graphical user interface to eSciDoc items, their data and metadata. It is also an application supporting a DOI publication agent during the process of
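
    The publication workflow described above ("private" to "submitted" to "published") is essentially a small state machine. The following sketch illustrates that logic only; it is not panMetaDocs or eSciDoc code, and the item identifier is hypothetical.

    ```python
    # Minimal sketch of the publication-status workflow described above
    # (private -> submitted -> published); not panMetaDocs/eSciDoc code.
    ALLOWED = {
        "private": {"submitted"},               # author submits dataset for review
        "submitted": {"published", "private"},  # reviewer approves, or returns for rework
        "published": set(),                     # published items are immutable here
    }

    class DatasetItem:
        def __init__(self, item_id):
            self.item_id = item_id
            self.status = "private"

        def transition(self, new_status):
            if new_status not in ALLOWED[self.status]:
                raise ValueError(f"{self.status} -> {new_status} not allowed")
            self.status = new_status

    item = DatasetItem("escidoc:1234")   # hypothetical item id
    item.transition("submitted")         # hand over to external reviewer
    item.transition("published")         # after quality checks
    print(item.item_id, item.status)
    ```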

  19. DATS, the data tag suite to enable discoverability of datasets.

    PubMed

    Sansone, Susanna-Assunta; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Alter, George; Grethe, Jeffrey S; Xu, Hua; Fore, Ian M; Lyle, Jared; Gururaj, Anupama E; Chen, Xiaoling; Kim, Hyeon-Eui; Zong, Nansu; Li, Yueling; Liu, Ruiling; Ozyurt, I Burak; Ohno-Machado, Lucila

    2017-06-06

    Today's science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the National Institutes of Health (NIH)'s Big Data to Knowledge (BD2K) initiative, we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed's goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model also available as an annotated serialization in schema.org, which in turn is widely used by major search engines like Google, Microsoft, Yahoo and Yandex.
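
    A DATS record is submitted as structured metadata with a small set of core elements plus optional extensions. The snippet below is an illustrative, loosely DATS-shaped record with a naive completeness check; the normative element names and cardinalities are defined by the published DATS schemas, not by this sketch.

    ```python
    # Illustrative only: a minimal DATS-style dataset description and a naive check
    # for a few core elements. Field names follow the published DATS core loosely;
    # consult the DATS schemas for the normative definitions.
    import json

    record = {
        "title": "Example proteomics dataset",
        "description": "Mass-spectrometry measurements from a pilot study.",
        "identifier": {"identifier": "doi:10.1234/example", "identifierSource": "DOI"},
        "types": [{"information": {"value": "proteomics"}}],
        "creators": [{"name": "Jane Doe"}],
        "distributions": [{"access": {"landingPage": "https://example.org/dataset/1"}}],
    }

    REQUIRED = ["title", "types", "creators", "distributions"]
    missing = [k for k in REQUIRED if k not in record]
    print("missing core fields:", missing or "none")
    print(json.dumps(record, indent=2))
    ```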

  20. Exploring English Language Learners (ELL) Experiences with Scientific Language and Inquiry within a Real Life Context

    ERIC Educational Resources Information Center

    Algee, Lisa M.

    2012-01-01

    English Language Learners (ELL) are often at a distinct disadvantage in receiving authentic science learning opportunities. This study explored English Language Learners' (ELL) learning experiences with scientific language and inquiry within a real life context. This research was theoretically informed by sociocultural theory and literature on…

  1. Exploring South African High School Teachers' Conceptions of the Nature of Scientific Inquiry: A Case Study

    ERIC Educational Resources Information Center

    Dudu, Washington T.

    2014-01-01

    The paper explores conceptions of the nature of scientific inquiry (NOSI) held by five teachers who were purposively and conveniently sampled. Teachers' conceptions of the NOSI were determined using a Probes questionnaire. To confirm teachers' responses, a semi-structured interview was conducted with each teacher. The Probes questionnaire was…

  2. Exploring Venus: the Venus Exploration Analysis Group (VEXAG)

    NASA Astrophysics Data System (ADS)

    Ocampo, A.; Atreya, S.; Thompson, T.; Luhmann, J.; Mackwell, S.; Baines, K.; Cutts, J.; Robinson, J.; Saunders, S.

    In July 2005, NASA's Planetary Division established the Venus Exploration Analysis Group (VEXAG, http://www.lpi.usra.edu/vexag/) in order to engage the scientific community at large in identifying scientific priorities and strategies for the exploration of Venus. VEXAG is a community-based forum open to all interested in the exploration of Venus. VEXAG was designed to provide scientific input and technology development plans for planning and prioritizing the study of Venus over the next several decades, including a Venus surface sample return. VEXAG regularly evaluates NASA's Venus exploration goals, scientific objectives, investigations, and critical measurement requirements, including the recommendations in the National Research Council Decadal Survey and NASA's Solar System Exploration Strategic Roadmap. VEXAG will take into consideration the latest scientific results from ESA's Venus Express mission and the MESSENGER flybys, as well as the results anticipated from JAXA's Venus Climate Orbiter, together with science community inputs from venues such as the February 13-16, 2006 AGU Chapman Conference, to identify the scientific priorities and strategies for future NASA Venus exploration. VEXAG is composed of two co-chairs: Sushil Atreya (University of Michigan, Ann Arbor) and Janet Luhmann (University of California, Berkeley). VEXAG has formed three focus groups in the areas of (1) Planetary Formation and Evolution: Surface and Interior, Volcanism, Geodynamics, etc. (Focus Group Lead: Steve Mackwell, LPI); (2) Atmospheric Evolution, Dynamics, Meteorology

  3. BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.

    PubMed

    Biswas, Mithun; Islam, Rafiqul; Shom, Gautam Kumar; Shopon, Md; Mohammed, Nabeel; Momen, Sifat; Abedin, Anowarul

    2017-06-01

    BanglaLekha-Isolated, a Bangla handwritten isolated-character dataset, is presented in this article. This dataset contains 84 different characters, comprising 50 Bangla basic characters, 10 Bangla numerals, and 24 selected compound characters. For each of the 84 characters, 2,000 handwriting samples were collected, digitized, and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and the gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.
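
    A typical first step with such a dataset is loading the images and class labels for model training. The sketch below assumes one sub-folder per character class containing PNG files; the actual layout of the Mendeley archive may differ, so treat the paths as placeholders.

    ```python
    # Sketch, assuming a directory layout with one sub-folder per character class
    # (the actual layout of the Mendeley archive may differ).
    from pathlib import Path
    from PIL import Image
    import numpy as np

    root = Path("BanglaLekha-Isolated")          # hypothetical extraction path
    images, labels = [], []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for img_path in class_dir.glob("*.png"):
            img = Image.open(img_path).convert("L").resize((32, 32))
            images.append(np.asarray(img, dtype=np.float32) / 255.0)
            labels.append(class_dir.name)

    X = np.stack(images)
    print(X.shape, len(set(labels)), "classes")
    ```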

  4. Educational and Scientific Applications of Climate Model Diagnostic Analyzer

    NASA Astrophysics Data System (ADS)

    Lee, S.; Pan, L.; Zhai, C.; Tang, B.; Kubar, T. L.; Zhang, J.; Bao, Q.

    2016-12-01

    Climate Model Diagnostic Analyzer (CMDA) is a web-based information system designed for the climate modeling and model analysis community to analyze climate data from models and observations. CMDA provides tools to diagnostically analyze climate data for model validation and improvement, and to systematically manage analysis provenance for sharing results with other investigators. CMDA utilizes cloud computing resources, multi-threaded computing, machine-learning algorithms, web service technologies, and provenance-supporting technologies to address technical challenges that the Earth science modeling and model analysis community faces in evaluating and diagnosing climate models. As the CMDA infrastructure and technology have matured, we have developed educational and scientific applications of CMDA. Educationally, CMDA has supported the summer school of the JPL Center for Climate Sciences for three years since 2014. In the summer school, the students work on group research projects for which CMDA provides datasets and analysis tools. Each student is assigned a virtual machine with CMDA installed in Amazon Web Services. A provenance management system for CMDA was developed to keep track of students' usage of CMDA and to recommend datasets and analysis tools for their research topics. The provenance system also allows students to revisit their analysis results and share them with their group. Scientifically, we have developed several science use cases of CMDA covering various topics, datasets, and analysis types. Each use case developed is described and listed in terms of a scientific goal, the datasets used, the analysis tools used, the scientific results discovered from the use case, an analysis result such as output plots and data files, and a link to the exact analysis service call with all the input arguments filled in. For example, one science use case is the evaluation of the NCAR CAM5 model with MODIS total cloud fraction. The analysis service used is the Difference Plot Service of

  5. Exploring scientific creativity of eleventh-grade students in Taiwan

    NASA Astrophysics Data System (ADS)

    Liang, Jia-Chi

    2002-04-01

    Although most researchers focus on scientists' creativity, students' scientific creativity should be considered, especially for high school and college students. It is generally assumed that most professional creators in science emerge from amateur creators. Therefore, the purpose of this study is to investigate the relationship between students' scientific creativity and selected variables including creativity, problem finding, formulating hypotheses, science achievement, the nature of science, and attitudes toward science for finding significant predictors of eleventh-grade students' scientific creativity. A total of 130 male eleventh-grade students in three biology classes participated in this study. The main instruments included the Test of Divergent Thinking (TDT) for creativity measurement, the Creativity Rating Scale (CRS) and the Creative Activities and Accomplishments Check Lists (CAACL) for measurement of scientific creativity, the Nature of Scientific Knowledge Scale (NSKS) for measurement of the nature of science, and the Science Attitude Inventory II (SAI II) for measurement of attitudes toward science. In addition, two instruments measuring students' abilities of problem finding and of formulating hypotheses were developed by the researcher in this study. Data analysis involved descriptive statistics, Pearson product-moment correlations, and stepwise multiple regressions. The major findings suggested the following: (1) students' scientific creativity significantly correlated with some of the selected variables, such as attitudes toward science, problem finding, formulating hypotheses, the nature of science, resistance to closure, originality, and elaboration; (2) four significant predictors including attitudes toward science, problem finding, resistance to closure, and originality accounted for 48% of the variance of students' scientific creativity; (3) there were big differences between students with a higher and a lower degree of scientific

  6. Lessons Learned while Exploring Cloud-Native Architectures for NASA EOSDIS Applications and Systems

    NASA Technical Reports Server (NTRS)

    Pilone, Dan

    2016-01-01

    As new, high-data-rate missions begin collecting data, NASA's Earth Observing System Data and Information System (EOSDIS) archive is projected to grow roughly 20x to over 300 PB by 2025. To prepare for the dramatic increase in data and enable broad scientific inquiry into larger time series and datasets, NASA has been exploring the impact of applying cloud technologies throughout EOSDIS. In this talk we will provide an overview of NASA's prototyping and lessons learned in applying cloud architectures.

  7. Efficient genotype compression and analysis of large genetic variation datasets

    PubMed Central

    Layer, Ryan M.; Kindlon, Neil; Karczewski, Konrad J.; Quinlan, Aaron R.

    2015-01-01

    Genotype Query Tools (GQT) is a new indexing strategy that expedites analyses of genome variation datasets in VCF format based on sample genotypes, phenotypes, and relationships. GQT’s compressed genotype index minimizes decompression for analysis, and its performance relative to existing methods improves with cohort size. We show substantial (up to 443-fold) performance gains over existing methods and demonstrate GQT’s utility for exploring massive datasets involving thousands to millions of genomes. PMID:26550772
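
    GQT's index is built around compressed, genotype-oriented bit representations, which is what makes sample-centric queries fast. The toy sketch below illustrates only that idea, per-variant boolean masks over samples queried with vectorized operations, and does not reflect GQT's actual on-disk format or API.

    ```python
    # Toy sketch of the idea behind genotype indexes such as GQT: per-variant
    # boolean masks over samples allow fast queries. Not GQT's actual format.
    import numpy as np

    rng = np.random.default_rng(0)
    n_variants, n_samples = 1_000, 2_000
    # genotypes: 0=hom-ref, 1=het, 2=hom-alt (a toy stand-in for VCF GT fields)
    gt = rng.choice([0, 1, 2], size=(n_variants, n_samples), p=[0.85, 0.10, 0.05])

    # Boolean "index": one mask per genotype class
    is_het = gt == 1
    is_hom_alt = gt == 2

    cases = np.arange(0, 1_000)          # hypothetical phenotype grouping
    controls = np.arange(1_000, 2_000)

    # Variants whose alternate allele is carried by >=5% of cases but no controls
    case_carriers = (is_het[:, cases] | is_hom_alt[:, cases]).sum(axis=1)
    ctrl_carriers = (is_het[:, controls] | is_hom_alt[:, controls]).sum(axis=1)
    hits = np.flatnonzero((case_carriers >= 0.05 * len(cases)) & (ctrl_carriers == 0))
    print(len(hits), "candidate variants")
    ```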

  8. Connecting the Public to Scientific Research Data - Science On a Sphere®

    NASA Astrophysics Data System (ADS)

    Henderson, M. A.; Russell, E. L.; Science on a Sphere Datasets

    2011-12-01

    Science On a Sphere® is a six-foot animated globe developed by the National Oceanic and Atmospheric Administration (NOAA) as a means to display global scientific research data in an intuitive, engaging format in public forums. With over 70 permanent installations of SOS around the world in science museums, visitor centers, and universities, the audience that enjoys SOS yearly is substantial, wide-ranging, and diverse. Through partnerships with the National Aeronautics and Space Administration (NASA), the SOS Data Catalog (http://sos.noaa.gov/datasets/) has grown to a collection of over 350 datasets from NOAA, NASA, and many others. Using an external projection system, these datasets are displayed onto the sphere, creating a seamless global image. In a cross-site evaluation of Science On a Sphere®, 82% of participants said yes, seeing information displayed on a sphere changed their understanding of the information. This unique technology captivates viewers and exposes them to scientific research data in a way that is accessible, presentable, and understandable. The datasets that comprise the SOS Data Catalog are scientific research data that have been formatted for display on SOS. By formatting research data into visualizations that can be used on SOS, NOAA and NASA are able to turn research data into educational materials that are easily accessible for users. In many cases, visualizations do not need to be modified because SOS uses a common map projection. The SOS Data Catalog has become a "one-stop shop" for a broad range of global datasets from across NOAA and NASA, and as a result, the traffic on the site is more than just SOS users. While the target audience for this site is SOS users, many

  9. Interactive visualization and analysis of multimodal datasets for surgical applications.

    PubMed

    Kirmizibayrak, Can; Yim, Yeny; Wakid, Mike; Hahn, James

    2012-12-01

    Surgeons use information from multiple sources when making surgical decisions. These include volumetric datasets (such as CT, PET, MRI, and their variants), 2D datasets (such as endoscopic videos), and vector-valued datasets (such as computer simulations). Presenting all the information to the user in an effective manner is a challenging problem. In this paper, we present a visualization approach that displays the information from various sources in a single coherent view. The system allows the user to explore and manipulate volumetric datasets, display analysis of dataset values in local regions, combine 2D and 3D imaging modalities and display results of vector-based computer simulations. Several interaction methods are discussed: in addition to traditional interfaces including mouse and trackers, gesture-based natural interaction methods are shown to control these visualizations with real-time performance. An example of a medical application (medialization laryngoplasty) is presented to demonstrate how the combination of different modalities can be used in a surgical setting with our approach.

  10. Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades.

    PubMed

    Orchard, Garrick; Jayawant, Ajinkya; Cohen, Gregory K; Thakor, Nitish

    2015-01-01

    Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation rather than collecting and labeling existing data. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing and eliminates timing artifacts introduced by monitor updates when simulating motion on a computer monitor. We present conversion of two popular image datasets (MNIST and Caltech101) which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.

  11. Securely Measuring the Overlap between Private Datasets with Cryptosets

    PubMed Central

    Swamidass, S. Joshua; Matlock, Matthew; Rozenblit, Leon

    2015-01-01

    Many scientific questions are best approached by sharing data—collected by different groups or across large collaborative networks—into a combined analysis. Unfortunately, some of the most interesting and powerful datasets—like health records, genetic data, and drug discovery data—cannot be freely shared because they contain sensitive information. In many situations, knowing if private datasets overlap determines if it is worthwhile to navigate the institutional, ethical, and legal barriers that govern access to sensitive, private data. We report the first method of publicly measuring the overlap between private datasets that is secure under a malicious model without relying on private protocols or message passing. This method uses a publicly shareable summary of a dataset’s contents, its cryptoset, to estimate its overlap with other datasets. Cryptosets approach “information-theoretic” security, the strongest type of security possible in cryptography, which is not even crackable with infinite computing power. We empirically and theoretically assess both the accuracy of these estimates and the security of the approach, demonstrating that cryptosets are informative, with a stable accuracy, and secure. PMID:25714898
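
    One way to picture the idea is a fixed-length, hash-based count summary per dataset, from which overlap is estimated after subtracting the collisions expected by chance. The sketch below is a toy approximation for intuition only: it is not the paper's construction, carries none of its security guarantees, and its estimate is noisy.

    ```python
    # Toy approximation of the idea (NOT the paper's exact construction and with no
    # security guarantees): summarize each private dataset as a fixed-length vector
    # of hashed-identifier counts, then estimate overlap from the two summaries.
    import hashlib
    import numpy as np

    L = 512  # summary length

    def summary(identifiers):
        counts = np.zeros(L, dtype=np.int64)
        for ident in identifiers:
            h = int(hashlib.sha256(ident.encode()).hexdigest(), 16) % L
            counts[h] += 1
        return counts

    def estimated_overlap(c_a, c_b):
        n_a, n_b = c_a.sum(), c_b.sum()
        # Subtract the number of bucket collisions expected by chance alone;
        # the result is a rough (noisy) estimate of the true intersection size.
        return float(c_a @ c_b - n_a * n_b / L)

    a = {f"patient-{i}" for i in range(0, 6000)}
    b = {f"patient-{i}" for i in range(4000, 9000)}   # true overlap: 2000
    print(round(estimated_overlap(summary(a), summary(b))))
    ```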

  12. Research Infrastructure and Scientific Collections: The Supply and Demand of Scientific Research

    NASA Astrophysics Data System (ADS)

    Graham, E.; Schindel, D. E.

    2016-12-01

    Research infrastructure is essential in both experimental and observational sciences and is commonly thought of as single-sited facilities. In contrast, object-based scientific collections are distributed in nearly every way, including by location, taxonomy, geologic epoch, discipline, collecting processes, benefits sharing rules, and many others. These diffused collections may have been amassed for a particular discipline, but their potential for use and impact in other fields needs to be explored. Through a series of cross-disciplinary activities, Scientific Collections International (SciColl) has explored and developed new ways in which the supply of scientific collections can meet the demand of researchers in unanticipated ways. From cross-cutting workshops on emerging infectious diseases and food security, to an online portal of collections, SciColl aims to illustrate the scope and value of object-based scientific research infrastructure. As distributed infrastructure, the full impact of scientific collections to the research community is a result of discovering, utilizing, and networking these resources. Examples and case studies from infectious disease research, food security topics, and digital connectivity will be explored.

  13. Sciologer: Visualizing and Exploring Scientific Communities

    ERIC Educational Resources Information Center

    Bales, Michael Eliot

    2009-01-01

    Despite the recognized need to increase interdisciplinary collaboration, there are few information resources available to provide researchers with an overview of scientific communities--topics under investigation by various groups, and patterns of collaboration among groups. The tools that are available are designed for expert social network…

  14. Exploring English Language Learners (ELL) experiences with scientific language and inquiry within a real life context

    NASA Astrophysics Data System (ADS)

    Algee, Lisa M.

    English Language Learners (ELL) are often at a distinct disadvantage in receiving authentic science learning opportunities. This study explored English Language Learners' (ELL) learning experiences with scientific language and inquiry within a real life context. This research was theoretically informed by sociocultural theory and literature on student learning and science teaching for ELL. A qualitative case study was used to explore students' learning experiences. Data from multiple sources were collected: student interviews, science letters, an assessment in another context, field notes, student presentations, inquiry assessment, instructional group conversations, parent interviews, parent letters, parent homework, teacher-researcher evaluation, teacher-researcher reflective journal, and student ratings of learning activities. These data sources informed the following research questions: (1) Does participation in an out-of-school contextualized inquiry science project increase ELL use of scientific language? (2) Does participation in an out-of-school contextualized inquiry science project increase ELL understanding of scientific inquiry and their motivation to learn? (3) What are parents' funds of knowledge about the local ecology, and does this inform students' experiences in the science project? All data sources concerning students were analyzed for similar patterns and trends, and triangulation was sought through the use of these data sources. The remaining data sources concerning the teacher-researcher were used to inform and assess whether the pedagogical and research practices were in alignment with the proposed theoretical framework. Data sources concerning parental participation accessed funds of knowledge, which informed the curriculum in order to create continuity and connections between home and school. To ensure accuracy in the researchers' interpretations of student and parent responses during interviews, member checking was employed. The findings

  15. Crew Roles and Interactions in Scientific Space Exploration

    NASA Technical Reports Server (NTRS)

    Love, Stanley G.; Bleacher, Jacob E.

    2013-01-01

    Future piloted space exploration missions will focus more on science than engineering, a change which will challenge existing concepts for flight crew tasking and demand that participants with contrasting skills, values, and backgrounds learn to cooperate as equals. In terrestrial space flight analogs such as Desert Research And Technology Studies, engineers, pilots, and scientists can practice working together, taking advantage of the full breadth of all team members' training to produce harmonious, effective missions that maximize the time and attention the crew can devote to science. This paper presents, in a format usable as a reference by participants in the field, a successfully tested crew interaction model for such missions. The model builds upon the basic framework of a scientific field expedition by adding proven concepts from aviation and human spaceflight, including expeditionary behavior and cockpit resource management, cooperative crew tasking and adaptive leadership and followership, formal techniques for radio communication, and increased attention to operational considerations. The crews of future spaceflight analogs can use this model to demonstrate effective techniques, learn from each other, develop positive working relationships, and make their expeditions more successful, even if they have limited time to train together beforehand. This model can also inform the preparation and execution of actual future spaceflights.

  16. Crew roles and interactions in scientific space exploration

    NASA Astrophysics Data System (ADS)

    Love, Stanley G.; Bleacher, Jacob E.

    2013-10-01

    Future piloted space exploration missions will focus more on science than engineering, a change which will challenge existing concepts for flight crew tasking and demand that participants with contrasting skills, values, and backgrounds learn to cooperate as equals. In terrestrial space flight analogs such as Desert Research And Technology Studies, engineers, pilots, and scientists can practice working together, taking advantage of the full breadth of all team members' training to produce harmonious, effective missions that maximize the time and attention the crew can devote to science. This paper presents, in a format usable as a reference by participants in the field, a successfully tested crew interaction model for such missions. The model builds upon the basic framework of a scientific field expedition by adding proven concepts from aviation and human space flight, including expeditionary behavior and cockpit resource management, cooperative crew tasking and adaptive leadership and followership, formal techniques for radio communication, and increased attention to operational considerations. The crews of future space flight analogs can use this model to demonstrate effective techniques, learn from each other, develop positive working relationships, and make their expeditions more successful, even if they have limited time to train together beforehand. This model can also inform the preparation and execution of actual future space flights.

  17. ESSG-based global spatial reference frame for datasets interrelation

    NASA Astrophysics Data System (ADS)

    Yu, J. Q.; Wu, L. X.; Jia, Y. J.

    2013-10-01

    To understand the highly complex Earth system, a large volume, as well as a large variety, of datasets on the planet Earth are being obtained, distributed, and shared worldwide every day. However, few existing systems concentrate on the distribution and interrelation of different datasets in a common Global Spatial Reference Frame (GSRF), which poses an invisible obstacle to data sharing and scientific collaboration. The Group on Earth Observations (GEO) has recently established a new GSRF, named the Earth System Spatial Grid (ESSG), for global dataset distribution, sharing, and interrelation in its 2012-2015 Work Plan. The ESSG may bridge the gap among different spatial datasets and hence overcome these obstacles. This paper presents the implementation of the ESSG-based GSRF. A reference spheroid, a grid subdivision scheme, and a suitable encoding system are required to implement it. The radius of the ESSG reference spheroid was set to double the approximate Earth radius so that datasets from different areas of Earth system science are covered. The same positioning and orientation parameters as the Earth-Centred Earth-Fixed (ECEF) frame were adopted for the ESSG reference spheroid, so that any other GSRF can be freely transformed into the ESSG-based GSRF. The spheroid degenerated octree grid with radius refinement (SDOG-R) and its encoding method were taken as the grid subdivision and encoding scheme because of their good performance in many aspects. A triple (C, T, A) model is introduced to represent and link different datasets based on the ESSG-based GSRF. Finally, methods of coordinate transformation between the ESSG-based GSRF and other GSRFs are presented to make the ESSG-based GSRF operable and propagable.
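
    The encoding side of such a grid can be illustrated generically: each observation is mapped to a hierarchical cell code over (latitude, longitude, radius), and nearby observations from different datasets share a code prefix. The sketch below uses a simple halving rule at every level; it is not the actual SDOG-R subdivision scheme, only the encoding idea.

    ```python
    # Generic illustration of hierarchical grid codes over (lat, lon, radius);
    # this is NOT the actual SDOG-R subdivision rules, only the encoding idea.
    def cell_code(lat, lon, radius, max_radius, levels=8):
        """Return a string of octal digits; each level halves the lat, lon and radius ranges."""
        lat_lo, lat_hi = -90.0, 90.0
        lon_lo, lon_hi = -180.0, 180.0
        r_lo, r_hi = 0.0, max_radius
        code = []
        for _ in range(levels):
            digit = 0
            lat_mid, lon_mid, r_mid = (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2, (r_lo + r_hi) / 2
            if lat >= lat_mid:
                digit |= 1; lat_lo = lat_mid
            else:
                lat_hi = lat_mid
            if lon >= lon_mid:
                digit |= 2; lon_lo = lon_mid
            else:
                lon_hi = lon_mid
            if radius >= r_mid:
                digit |= 4; r_lo = r_mid
            else:
                r_hi = r_mid
            code.append(str(digit))
        return "".join(code)

    # Two nearby observations share a long code prefix, so they can be interrelated
    # by prefix matching regardless of which dataset they come from.
    print(cell_code(52.5, 13.4, 6371.0, max_radius=2 * 6371.0))
    print(cell_code(52.6, 13.5, 6371.0, max_radius=2 * 6371.0))
    ```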

  18. U.S. Geological Survey scientific activities in the exploration of Antarctica: 1995-96 field season

    USGS Publications Warehouse

    Meunier, Tony K.; Williams, Richard S.; Ferrigno, Jane G.

    2007-01-01

    The U.S. Geological Survey (USGS) mapping program in Antarctica is one of the longest continuously funded projects in the United States Antarctic Program (USAP). This is the 46th U.S. expedition to Antarctica in which USGS scientists have participated. The financial support from the National Science Foundation, which extends back to the time of the International Geophysical Year (IGY) in 1956-57, can be attributed to the need for accurate maps of specific field areas or regions where NSF-funded science projects were planned. The epoch of Antarctic exploration during the IGY was being driven by science and, in a spirit of peaceful cooperation, the international scientific community wanted to limit military activities on the continent to logistical support. The USGS, a Federal civilian science agency in the Department of the Interior, had, since its founding in 1879, carried out numerous field-based national (and some international) programs in biology, geology, hydrology, and mapping. Therefore, the USGS was the obvious choice for these tasks, because it already had a professional staff of experienced mapmakers and program managers with the foresight, dedication, and understanding of the need for accurate maps to support the science programs in Antarctica when asked to do so by the U.S. National Academy of Sciences. Public Laws 85-743 and 87-626, signed in August 1958 and in September 1962, respectively, authorized the Secretary, U.S. Department of the Interior, through the USGS, to support mapping and scientific work in Antarctica. The USGS mapping and science programs still play a significant role in the advancement of science in Antarctica today. Antarctica is the planet's 5th largest continent (13.2 million km2 (5.1 million mi2)), it contains the world's largest (of two) remaining ice sheet, and it is considered to be one of the most important scientific laboratories on Earth. This report provides documentation of USGS scientific activities in the exploration of

  19. U.S. Geological Survey scientific activities in the exploration of Antarctica: 2002-03 field season

    USGS Publications Warehouse

    Meunier, Tony K.; Williams, Richard S.; Ferrigno, Jane G.

    2007-01-01

    The U.S. Geological Survey (USGS) mapping program in Antarctica is one of the longest continuously funded projects in the United States Antarctic Program (USAP). This is the 53rd U.S. expedition to Antarctica in which USGS scientists have participated. The financial support from the National Science Foundation, which extends back to the time of the International Geophysical Year (IGY) in 1956–57, can be attributed to the need for accurate maps of specific field areas or regions where NSF-funded science projects were planned. The epoch of Antarctic exploration during the IGY was being driven by science, and, in a spirit of peaceful cooperation, the international scientific community wanted to limit military activities on the continent to logistical support. The USGS, a Federal civilian science agency in the Department of the Interior, had, since its founding in 1879, carried out numerous field-based national (and some international) programs in biology, geology, hydrology, and mapping. Therefore, the USGS was the obvious choice for these tasks, because it already had a professional staff of experienced mapmakers and program managers with the foresight, dedication, and understanding of the need for accurate maps to support the science programs in Antarctica when asked to do so by the U.S. National Academy of Sciences. Public Laws 85-743 and 87-626, signed in August 1958 and in September 1962, respectively, authorized the Secretary, U.S. Department of the Interior, through the USGS, to support mapping and scientific work in Antarctica. The USGS mapping and science programs still play a significant role in the advancement of science in Antarctica today. Antarctica is the planet's 5th largest continent [13.2 million km2 (5.1 million mi2)], it contains the world's largest (of two) remaining ice sheets, and it is considered to be one of the most important scientific laboratories on Earth. This report provides documentation of USGS scientific activities in the

  20. Mission to the Moon: Europe's priorities for the scientific exploration and utilisation of the Moon

    NASA Astrophysics Data System (ADS)

    Battrick, Bruce; Barron, C.

    1992-06-01

    A study to determine Europe's potential role in the future exploration and utilisation of the Moon is presented. To establish the scientific justification, the Lunar Study Steering Group (LSSG) was established, reflecting all scientific disciplines benefitting from a lunar base (Moon studies, astronomy, fusion, life sciences, etc.). Scientific issues were divided into three main areas: science of the Moon, including all investigations concerning the Moon as a planetary body; science from the Moon, using the Moon as a platform and therefore including observatories in the broadest sense; and science on the Moon, including not only questions relating to human activities in space, but also the development of artificial ecosystems beyond the Earth. Science of the Moon focuses on geographical, geochemical and geological observations of the Earth-Moon system. Science from the Moon takes advantage of the stable lunar ground, its atmosphere-free sky and, on the far side, its radio-quiet environment. The Moon provides an attractive platform for the observation and study of the Universe. Two techniques that can make unique use of the lunar platform are ultraviolet-to-submillimeter interferometric imaging, and very low frequency astronomy. One of the goals of life sciences studies (Science on the Moon) is obviously to provide the prerequisite information for establishing a manned lunar base. This includes studies of human physiology under reduced gravity, radiation protection and life support systems, and feasibility studies based on existing hardware. The overall recommendations are essentially to set up specific study teams for those fields judged to be the most promising for Europe, with the aim of providing more detailed scientific and technological specifications. It is also suggested that the scope of the overall study activities be expanded in order to derive mission scenarios for a viable ESA lunar exploration program and to consider economic, legal and policy matters

  1. Artificial intelligence support for scientific model-building

    NASA Technical Reports Server (NTRS)

    Keller, Richard M.

    1992-01-01

    Scientific model-building can be a time-intensive and painstaking process, often involving the development of large and complex computer programs. Despite the effort involved, scientific models cannot easily be distributed and shared with other scientists. In general, implemented scientific models are complex, idiosyncratic, and difficult for anyone but the original scientific development team to understand. We believe that artificial intelligence techniques can facilitate both the model-building and model-sharing process. In this paper, we overview our effort to build a scientific modeling software tool that aids the scientist in developing and using models. This tool includes an interactive intelligent graphical interface, a high-level domain specific modeling language, a library of physics equations and experimental datasets, and a suite of data display facilities.

  2. Using Graph Indices for the Analysis and Comparison of Chemical Datasets.

    PubMed

    Fourches, Denis; Tropsha, Alexander

    2013-10-01

    In cheminformatics, compounds are represented as points in multidimensional space of chemical descriptors. When all pairs of points found within certain distance threshold in the original high dimensional chemistry space are connected by distance-labeled edges, the resulting data structure can be defined as Dataset Graph (DG). We show that, similarly to the conventional description of organic molecules, many graph indices can be computed for DGs as well. We demonstrate that chemical datasets can be effectively characterized and compared by computing simple graph indices such as the average vertex degree or Randic connectivity index. This approach is used to characterize and quantify the similarity between different datasets or subsets of the same dataset (e.g., training, test, and external validation sets used in QSAR modeling). The freely available ADDAGRA program has been implemented to build and visualize DGs. The approach proposed and discussed in this report could be further explored and utilized for different cheminformatics applications such as dataset diversification by acquiring external compounds, dataset processing prior to QSAR modeling, or (dis)similarity modeling of multiple datasets studied in chemical genomics applications. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
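
    A Dataset Graph and the two indices named above are straightforward to compute from a descriptor matrix. The sketch below is an independent illustration (ADDAGRA is the authors' own tool) using random descriptors in place of real chemical ones.

    ```python
    # Sketch of the Dataset Graph (DG) idea: connect compounds whose descriptor
    # vectors lie within a distance threshold, then summarize the graph with simple
    # indices. (ADDAGRA is the authors' tool; this is an independent illustration.)
    import numpy as np
    import networkx as nx
    from scipy.spatial.distance import pdist, squareform

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 16))       # 200 compounds, 16 hypothetical descriptors
    D = squareform(pdist(X))             # pairwise Euclidean distances
    threshold = np.percentile(D[D > 0], 5)

    G = nx.Graph()
    G.add_nodes_from(range(len(X)))
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if D[i, j] <= threshold:
                G.add_edge(i, j, weight=D[i, j])

    degrees = dict(G.degree())
    avg_degree = sum(degrees.values()) / G.number_of_nodes()
    # Randic connectivity index: sum over edges of 1/sqrt(deg(u) * deg(v))
    randic = sum(1.0 / np.sqrt(degrees[u] * degrees[v]) for u, v in G.edges())
    print(f"average vertex degree: {avg_degree:.2f}, Randic index: {randic:.2f}")
    ```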

  3. A Physics MOSAIC: Scientific Skills and Explorations for Students

    NASA Astrophysics Data System (ADS)

    May, S.; Clements, C.; Erickson, P. J.; Rogers, A.

    2010-12-01

    The MOSAIC unit begins with a series of activities and lessons designed to take advantage of the large data sets MOSAIC is collecting all the time to teach students about measurement, uncertainty, and data analysis. The curriculum develops an intuitive approach to thinking about numbers in science, focusing on both implicit and explicit expressions of uncertainty. Our teaching unit concludes with a final research project to provide students with the opportunity to pursue an area of interest within mesospheric ozone. This project is conceived in such a way that it can be as self-directed as a teacher or student needs. Given current concern for the state of our atmosphere and ozone, MOSAIC provides a unique opportunity for student engagement in an area of scientific research that has not been extensively explored. MOSAIC data can be compared with online resources for other atmospheric, astronomical, or geophysical data, and have been analyzed for the effects of such variables as seasonal and solar flux variations, lunar phases, shuttle and rocket launches, and sudden stratospheric warming events.

  4. Exploring Institutional Mechanisms for Scientific Input into the Management Cycle of the National Protected Area Network of Peru: Gaps and Opportunities.

    PubMed

    López-Rodríguez, M D; Castro, H; Arenas, M; Requena-Mullor, J M; Cano, A; Valenzuela, E; Cabello, J

    2017-12-01

    Understanding how to improve decision makers' use of scientific information across their different scales of management is a core challenge for narrowing the gap between science and conservation practice. Here, we present a study conducted in collaboration with decision makers that aims to explore the functionality of the mechanisms for scientific input within the institutional setting of the National Protected Area Network of Peru. First, we analyzed institutional mechanisms to assess the scientific information recorded by decision makers. Second, we developed two workshops involving scientists, decision makers and social actors to identify barriers to evidence-based conservation practice. Third, we administered 482 questionnaires to stakeholders to explore social perceptions of the role of science and the willingness to collaborate in the governance of protected areas. The results revealed that (1) the institutional mechanisms did not effectively promote the compilation and application of scientific knowledge for conservation practice; (2) six important barriers hindered scientific input in management decisions; and (3) stakeholders showed positive perceptions about the involvement of scientists in protected areas and expressed their willingness to collaborate in conservation practice. This collaborative research helped to (1) identify gaps and opportunities that should be addressed for increasing the effectiveness of the institutional mechanisms and (2) support institutional changes integrating science-based strategies for strengthening scientific input in decision-making. These insights provide a useful contextual orientation for scholars and decision makers interested in conducting empirical research to connect scientific inputs with operational aspects of the management cycle in other institutional settings around the world.

  5. Exploring Institutional Mechanisms for Scientific Input into the Management Cycle of the National Protected Area Network of Peru: Gaps and Opportunities

    NASA Astrophysics Data System (ADS)

    López-Rodríguez, M. D.; Castro, H.; Arenas, M.; Requena-Mullor, J. M.; Cano, A.; Valenzuela, E.; Cabello, J.

    2017-12-01

    Understanding how to improve decision makers' use of scientific information across their different scales of management is a core challenge for narrowing the gap between science and conservation practice. Here, we present a study conducted in collaboration with decision makers that aims to explore the functionality of the mechanisms for scientific input within the institutional setting of the National Protected Area Network of Peru. First, we analyzed institutional mechanisms to assess the scientific information recorded by decision makers. Second, we developed two workshops involving scientists, decision makers and social actors to identify barriers to evidence-based conservation practice. Third, we administered 482 questionnaires to stakeholders to explore social perceptions of the role of science and the willingness to collaborate in the governance of protected areas. The results revealed that (1) the institutional mechanisms did not effectively promote the compilation and application of scientific knowledge for conservation practice; (2) six important barriers hindered scientific input in management decisions; and (3) stakeholders showed positive perceptions about the involvement of scientists in protected areas and expressed their willingness to collaborate in conservation practice. This collaborative research helped to (1) identify gaps and opportunities that should be addressed for increasing the effectiveness of the institutional mechanisms and (2) support institutional changes integrating science-based strategies for strengthening scientific input in decision-making. These insights provide a useful contextual orientation for scholars and decision makers interested in conducting empirical research to connect scientific inputs with operational aspects of the management cycle in other institutional settings around the world.

  6. Statistical tests and identifiability conditions for pooling and analyzing multisite datasets

    PubMed Central

    Zhou, Hao Henry; Singh, Vikas; Johnson, Sterling C.; Wahba, Grace

    2018-01-01

    When sample sizes are small, the ability to identify weak (but scientifically interesting) associations between a set of predictors and a response may be enhanced by pooling existing datasets. However, variations in acquisition methods and the distribution of participants or observations between datasets, especially due to the distributional shifts in some predictors, may obfuscate real effects when datasets are combined. We present a rigorous statistical treatment of this problem and identify conditions where we can correct the distributional shift. We also provide an algorithm for the situation where the correction is identifiable. We analyze various properties of the framework for testing model fit, constructing confidence intervals, and evaluating consistency characteristics. Our technical development is motivated by Alzheimer’s disease (AD) studies, and we present empirical results showing that our framework enables harmonizing of protein biomarkers, even when the assays across sites differ. Our contribution may, in part, mitigate a bottleneck that researchers face in clinical research when pooling smaller sized datasets and may offer benefits when the subjects of interest are difficult to recruit or when resources prohibit large single-site studies. PMID:29386387
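
    A rough feel for the problem can be had by fitting the same model on each site and checking whether the coefficients are compatible before pooling. The sketch below is an illustrative Wald-style check on toy data, not the authors' tests or their identifiability conditions.

    ```python
    # Illustrative check only (not the authors' procedure): compare per-site
    # regression coefficients before pooling two datasets, using plain OLS.
    import numpy as np

    def ols(X, y):
        XtX_inv = np.linalg.inv(X.T @ X)
        beta = XtX_inv @ X.T @ y
        resid = y - X @ beta
        sigma2 = resid @ resid / (len(y) - X.shape[1])
        return beta, sigma2 * XtX_inv          # coefficients and their covariance

    rng = np.random.default_rng(0)
    def make_site(n, shift):                   # toy data with a covariate shift
        X = np.column_stack([np.ones(n), rng.normal(loc=shift, size=n)])
        y = 1.0 + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n)
        return X, y

    (X1, y1), (X2, y2) = make_site(80, 0.0), make_site(60, 1.5)
    b1, V1 = ols(X1, y1)
    b2, V2 = ols(X2, y2)

    # Wald-style statistic for "same coefficients at both sites"
    diff = b1 - b2
    stat = diff @ np.linalg.inv(V1 + V2) @ diff   # ~ chi-square(2) if sites are compatible
    print("Wald statistic:", round(float(stat), 2), "(compare to chi2(2) critical value 5.99)")
    ```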

  7. Statistical tests and identifiability conditions for pooling and analyzing multisite datasets.

    PubMed

    Zhou, Hao Henry; Singh, Vikas; Johnson, Sterling C; Wahba, Grace

    2018-02-13

    When sample sizes are small, the ability to identify weak (but scientifically interesting) associations between a set of predictors and a response may be enhanced by pooling existing datasets. However, variations in acquisition methods and the distribution of participants or observations between datasets, especially due to the distributional shifts in some predictors, may obfuscate real effects when datasets are combined. We present a rigorous statistical treatment of this problem and identify conditions where we can correct the distributional shift. We also provide an algorithm for the situation where the correction is identifiable. We analyze various properties of the framework for testing model fit, constructing confidence intervals, and evaluating consistency characteristics. Our technical development is motivated by Alzheimer's disease (AD) studies, and we present empirical results showing that our framework enables harmonizing of protein biomarkers, even when the assays across sites differ. Our contribution may, in part, mitigate a bottleneck that researchers face in clinical research when pooling smaller sized datasets and may offer benefits when the subjects of interest are difficult to recruit or when resources prohibit large single-site studies. Copyright © 2018 the Author(s). Published by PNAS.

  8. A dataset of human decision-making in teamwork management.

    PubMed

    Yu, Han; Shen, Zhiqi; Miao, Chunyan; Leung, Cyril; Chen, Yiqiang; Fauvel, Simon; Lin, Jun; Cui, Lizhen; Pan, Zhengxiang; Yang, Qiang

    2017-01-17

    Today, most endeavours require teamwork by people with diverse skills and characteristics. In managing teamwork, decisions are often made under uncertainty and resource constraints. The strategies and the effectiveness of the strategies different people adopt to manage teamwork under different situations have not yet been fully explored, partially due to a lack of detailed large-scale data. In this paper, we describe a multi-faceted large-scale dataset to bridge this gap. It is derived from a game simulating complex project management processes. It presents the participants with different conditions in terms of team members' capabilities and task characteristics for them to exhibit their decision-making strategies. The dataset contains detailed data reflecting the decision situations, decision strategies, decision outcomes, and the emotional responses of 1,144 participants from diverse backgrounds. To our knowledge, this is the first dataset simultaneously covering these four facets of decision-making. With repeated measurements, the dataset may help establish baseline variability of decision-making in teamwork management, leading to more realistic decision theoretic models and more effective decision support approaches.

  9. A dataset of human decision-making in teamwork management

    PubMed Central

    Yu, Han; Shen, Zhiqi; Miao, Chunyan; Leung, Cyril; Chen, Yiqiang; Fauvel, Simon; Lin, Jun; Cui, Lizhen; Pan, Zhengxiang; Yang, Qiang

    2017-01-01

    Today, most endeavours require teamwork by people with diverse skills and characteristics. In managing teamwork, decisions are often made under uncertainty and resource constraints. The strategies and the effectiveness of the strategies different people adopt to manage teamwork under different situations have not yet been fully explored, partially due to a lack of detailed large-scale data. In this paper, we describe a multi-faceted large-scale dataset to bridge this gap. It is derived from a game simulating complex project management processes. It presents the participants with different conditions in terms of team members’ capabilities and task characteristics for them to exhibit their decision-making strategies. The dataset contains detailed data reflecting the decision situations, decision strategies, decision outcomes, and the emotional responses of 1,144 participants from diverse backgrounds. To our knowledge, this is the first dataset simultaneously covering these four facets of decision-making. With repeated measurements, the dataset may help establish baseline variability of decision-making in teamwork management, leading to more realistic decision theoretic models and more effective decision support approaches. PMID:28094787

  10. A dataset of human decision-making in teamwork management

    NASA Astrophysics Data System (ADS)

    Yu, Han; Shen, Zhiqi; Miao, Chunyan; Leung, Cyril; Chen, Yiqiang; Fauvel, Simon; Lin, Jun; Cui, Lizhen; Pan, Zhengxiang; Yang, Qiang

    2017-01-01

    Today, most endeavours require teamwork by people with diverse skills and characteristics. In managing teamwork, decisions are often made under uncertainty and resource constraints. The strategies and the effectiveness of the strategies different people adopt to manage teamwork under different situations have not yet been fully explored, partially due to a lack of detailed large-scale data. In this paper, we describe a multi-faceted large-scale dataset to bridge this gap. It is derived from a game simulating complex project management processes. It presents the participants with different conditions in terms of team members' capabilities and task characteristics for them to exhibit their decision-making strategies. The dataset contains detailed data reflecting the decision situations, decision strategies, decision outcomes, and the emotional responses of 1,144 participants from diverse backgrounds. To our knowledge, this is the first dataset simultaneously covering these four facets of decision-making. With repeated measurements, the dataset may help establish baseline variability of decision-making in teamwork management, leading to more realistic decision theoretic models and more effective decision support approaches.

  11. Exploring Learners' Beliefs about Science Reading and Scientific Epistemic Beliefs, and Their Relations with Science Text Understanding

    ERIC Educational Resources Information Center

    Yang, Fang-Ying; Chang, Cheng-Chieh; Chen, Li-Ling; Chen, Yi-Chun

    2016-01-01

    The main purpose of this study was to explore learners' beliefs about science reading and scientific epistemic beliefs, and how these beliefs were associating with their understanding of science texts. About 400 10th graders were involved in the development and validation of the Beliefs about Science Reading Inventory (BSRI). To find the effects…

  12. Operational use of spaceborne lidar datasets

    NASA Astrophysics Data System (ADS)

    Marenco, Franco; Halloran, Gemma; Forsythe, Mary

    2018-04-01

    The Met Office plans to use space lidar datasets from CALIPSO, CATS, Aeolus and EarthCARE operationally in near real time (NRT), for the detection of aerosols. The first step is the development of NRT imagery for nowcasting of volcanic events, air quality, and mineral dust episodes. Model verification and possibly assimilation will be explored. Assimilation trials of Aeolus winds are also planned. Here we will present our first in-house imagery and our operational requirements.

  13. In Situ Resource Utilization Technologies for Enhancing and Expanding Mars Scientific and Exploration Missions

    NASA Technical Reports Server (NTRS)

    Sridhar, K. R.; Finn, J. E.

    2000-01-01

    The primary objectives of the Mars exploration program are to collect data for planetary science in a quest to answer questions related to Origins, to search for evidence of extinct and extant life, and to expand the human presence in the solar system. The public and political engagement that is critical for support of a Mars exploration program is based on all of these objectives. In order to retain and to build public and political support, it is important for NASA to have an integrated Mars exploration plan, not separate robotic and human plans that exist in parallel or in sequence. The resolutions stemming from the current architectural review and prioritization of payloads may be pivotal in determining whether NASA will have such a unified plan and retain public support. There are several potential scientific and technological links between the robotic-only missions that have been flown and planned to date, and the combined robotic and human missions that will come in the future. Taking advantage of and leveraging those links are central to the idea of a unified Mars exploration plan. One such link is in situ resource utilization (ISRU) as an enabling technology to provide consumables such as fuels, oxygen, sweep and utility gases from the Mars atmosphere.

  14. Global Data Spatially Interrelate System for Scientific Big Data Spatial-Seamless Sharing

    NASA Astrophysics Data System (ADS)

    Yu, J.; Wu, L.; Yang, Y.; Lei, X.; He, W.

    2014-04-01

    A data sharing system that offers spatially seamless services spares scientists the tedious and time-consuming work of spatial transformation, and hence encourages the use of scientific data and fosters scientific innovation. Having been adopted by the Group on Earth Observations (GEO) as a framework for Earth datasets, the Earth System Spatial Grid (ESSG) is a potential common spatial reference for Earth datasets. Based on SDOG-ESSG, an implementation of ESSG, a data sharing system named the global data spatially interrelate system (GASE) was designed to make data sharing spatially seamless. The architecture of GASE is introduced, and the implementation of its two key components, the V-Pools and the interrelating engine, is presented together with a prototype. Each dataset is first resampled into SDOG-ESSG, divided into small blocks, and mapped into the hierarchical structure of the distributed file system in the V-Pools, which together allows the data to be served at a uniform spatial reference with high efficiency. In addition, datasets from different data centres are interrelated by the interrelating engine at the uniform spatial reference of SDOG-ESSG, which enables the system to share open datasets on the internet in a spatially seamless manner.
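
    The SDOG-ESSG grid and the V-Pools are specific to the system described above; as a loose, hypothetical illustration of the resample-then-block pattern, the sketch below splits a regular global grid into fixed-size blocks keyed by a hierarchical path, as blocks might be laid out in a distributed file system. The grid size, block size, and path scheme are invented for the example:

      import numpy as np

      def tile_grid(data, block, level):
          """Split a 2-D global grid into fixed-size blocks and key each block
          by a hierarchical path such as 'level/row/col' (hypothetical scheme)."""
          blocks = {}
          rows, cols = data.shape
          for i in range(0, rows, block):
              for j in range(0, cols, block):
                  key = f"{level}/{i // block}/{j // block}"
                  blocks[key] = data[i:i + block, j:j + block]
          return blocks

      # A coarse 0.5-degree global grid (360 x 720 cells) tiled into 90 x 90 blocks.
      grid = np.random.rand(360, 720)
      tiles = tile_grid(grid, block=90, level=3)
      print(len(tiles), "blocks,", tiles["3/0/0"].shape)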

  15. Exploring Korean Middle School Students' View about Scientific Inquiry

    ERIC Educational Resources Information Center

    Yang, Il-Ho; Park, Sang-Woo; Shin, Jung-Yun; Lim, Sung-Man

    2017-01-01

    The aim of this study is to examine Korean middle school students' view about scientific inquiry with the Views about Scientific Inquiry (VASI) questionnaire, an instrument that deals with eight aspects of scientific inquiry. 282 Korean middle school students participated in this study, and their responses were classified as informed, mixed, and…

  16. Statistical Reference Datasets

    National Institute of Standards and Technology Data Gateway

    Statistical Reference Datasets (Web, free access)   The Statistical Reference Datasets is also supported by the Standard Reference Data Program. The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software.
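
    As an illustration of how such certified datasets are typically used, the sketch below fits a least-squares line to a locally saved copy of an StRD linear-regression dataset and reports agreement with the certified parameter values in terms of the log relative error commonly used for this purpose; the file name, column layout, and certified numbers shown are placeholders:

      import numpy as np

      # Hypothetical local copy of an StRD linear-regression dataset with two columns
      # (response y, predictor x); certified parameter values are published alongside
      # each StRD dataset and are entered here as placeholders only.
      data = np.loadtxt("strd_linear_dataset.csv", delimiter=",", skiprows=1)
      y, x = data[:, 0], data[:, 1]

      certified_intercept, certified_slope = 0.0, 1.0   # placeholders, not real values

      # Fit with the software under test and report the deviation from the certified
      # results as an approximate count of agreeing significant digits (log relative error).
      slope, intercept = np.polyfit(x, y, 1)
      for name, est, cert in [("slope", slope, certified_slope),
                              ("intercept", intercept, certified_intercept)]:
          lre = -np.log10(abs(est - cert) / abs(cert)) if cert != 0 else float("nan")
          print(f"{name}: estimate={est!r}, certified={cert!r}, agreeing digits ~ {lre:.1f}")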

  17. Hands-on and Online: Scientific Explorations through Distance Learning

    ERIC Educational Resources Information Center

    Mawn, Mary V.; Carrico, Pauline; Charuk, Ken; Stote, Kim S.; Lawrence, Betty

    2011-01-01

    Laboratory experiments are often considered the defining characteristic of science courses. Such activities provide students with real-world contexts for applying scientific concepts, while also allowing them to develop scientific ways of thinking and promoting an interest in science. In recent years, an increasing number of campuses have moved…

  18. Dataset Lifecycle Policy

    NASA Technical Reports Server (NTRS)

    Armstrong, Edward; Tauer, Eric

    2013-01-01

    The presentation focused on describing a new dataset lifecycle policy that the NASA Physical Oceanography DAAC (PO.DAAC) has implemented for its new and current datasets to foster improved stewardship and consistency across its archive. The overarching goal is to implement this dataset lifecycle policy for all new GHRSST GDS2 datasets and bridge the mission statements from the GHRSST Project Office and PO.DAAC to provide the best quality SST data in a cost-effective, efficient manner, preserving its integrity so that it will be available and usable to a wide audience.

  19. A test-retest dataset for assessing long-term reliability of brain morphology and resting-state brain activity.

    PubMed

    Huang, Lijie; Huang, Taicheng; Zhen, Zonglei; Liu, Jia

    2016-03-15

    We present a test-retest dataset for evaluation of long-term reliability of measures from structural and resting-state functional magnetic resonance imaging (sMRI and rfMRI) scans. The repeated scan dataset was collected from 61 healthy adults in two sessions using highly similar imaging parameters at an interval of 103-189 days. However, as the imaging parameters were not completely identical, the reliability estimated from this dataset shall reflect the lower bounds of the true reliability of sMRI/rfMRI measures. Furthermore, in conjunction with other test-retest datasets, our dataset may help explore the impact of different imaging parameters on reliability of sMRI/rfMRI measures, which is especially critical for assessing datasets collected from multiple centers. In addition, intelligence quotient (IQ) was measured for each participant using Raven's Advanced Progressive Matrices. The data can thus be used for purposes other than assessing reliability of sMRI/rfMRI alone. For example, data from each single session could be used to associate structural and functional measures of the brain with the IQ metrics to explore brain-IQ association.
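
    The abstract does not fix a particular reliability metric; a common choice for test-retest data is the intraclass correlation coefficient (ICC), and the sketch below computes a one-way ICC(1,1) estimate from a subjects-by-sessions array. The simulated measurements are placeholders:

      import numpy as np

      def icc_oneway(measurements):
          """One-way random-effects ICC(1,1) for an (n_subjects, k_sessions) array."""
          n, k = measurements.shape
          grand_mean = measurements.mean()
          subject_means = measurements.mean(axis=1)
          # Between-subject and within-subject mean squares from one-way ANOVA.
          ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
          ms_within = np.sum((measurements - subject_means[:, None]) ** 2) / (n * (k - 1))
          return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

      # Hypothetical example: one morphometric measure for 61 subjects scanned twice.
      rng = np.random.default_rng(1)
      true_score = rng.normal(2.5, 0.2, 61)
      session1 = true_score + rng.normal(0, 0.05, 61)
      session2 = true_score + rng.normal(0, 0.05, 61)
      print("ICC:", icc_oneway(np.column_stack([session1, session2])))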

  20. Exploring the Impacts of Cognitive and Metacognitive Prompting on Students' Scientific Inquiry Practices within an E-Learning Environment

    ERIC Educational Resources Information Center

    Zhang, Wen-Xin; Hsu, Ying-Shao; Wang, Chia-Yu; Ho, Yu-Ting

    2015-01-01

    This study explores the effects of metacognitive and cognitive prompting on the scientific inquiry practices of students with various levels of initial metacognition. Two junior high school classes participated in this study. One class, the experimental group (n?=?26), which received an inquiry-based curriculum with a combination of cognitive and…

  1. Access to primary care for socio-economically disadvantaged older people in rural areas: exploring realist theory using structural equation modelling in a linked dataset.

    PubMed

    Ford, John A; Jones, Andy; Wong, Geoff; Clark, Allan; Porter, Tom; Steel, Nick

    2018-06-19

    Realist approaches seek to answer questions such as 'how?', 'why?', 'for whom?', 'in what circumstances?' and 'to what extent?' interventions 'work' using context-mechanism-outcome (CMO) configurations. Quantitative methods are not well-established in realist approaches, but structural equation modelling (SEM) may be useful to explore CMO configurations. Our aim was to assess the feasibility and appropriateness of SEM to explore CMO configurations and, if appropriate, make recommendations based on our access to primary care research. Our specific objectives were to map variables from two large population datasets to CMO configurations from our realist review looking at access to primary care, generate latent variables where needed, and use SEM to quantitatively test the CMO configurations. A linked dataset was created by merging individual patient data from the English Longitudinal Study of Ageing and practice data from the GP Patient Survey. Patients registered in rural practices and who were in the highest deprivation tertile were included. Three latent variables were defined using confirmatory factor analysis. SEM was used to explore the nine full CMOs. All models were estimated using robust maximum likelihoods and accounted for clustering at practice level. Ordinal variables were treated as continuous to ensure convergence. We successfully explored our CMO configurations, but analysis was limited because of data availability. Two hundred seventy-six participants were included. We found a statistically significant direct (context to outcome) or indirect effect (context to outcome via mechanism) for two of nine CMOs. The strongest association was between 'ease of getting through to the surgery' and 'being able to get an appointment' with an indirect mediated effect through convenience (proportion of the indirect effect of the total was 21%). Healthcare experience was not directly associated with getting an appointment, but there was a statistically significant
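
    The study estimates full structural equation models with latent variables; as a much simpler illustration of the reported indirect (context to outcome via mechanism) effect, the sketch below estimates a single mediated path with two ordinary least-squares regressions and the product-of-coefficients method. Variable names and the simulated data are hypothetical:

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(2)
      n = 276  # same order of magnitude as the study sample

      # context: ease of getting through to the surgery (simulated)
      context = rng.normal(size=n)
      # mechanism: perceived convenience, partly driven by context
      mechanism = 0.5 * context + rng.normal(size=n)
      # outcome: being able to get an appointment, driven by both
      outcome = 0.3 * context + 0.4 * mechanism + rng.normal(size=n)

      # Path a: context -> mechanism
      a = sm.OLS(mechanism, sm.add_constant(context)).fit().params[1]
      # Paths b (mechanism -> outcome) and c' (direct effect), jointly estimated
      X = sm.add_constant(np.column_stack([context, mechanism]))
      fit = sm.OLS(outcome, X).fit()
      c_direct, b = fit.params[1], fit.params[2]

      indirect = a * b
      total = c_direct + indirect
      print("proportion of total effect mediated:", indirect / total)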

  2. Extra-Vehicular Activity (EVA) and Mission Support Center (MSC) Design Elements for Future Human Scientific Exploration of Our Solar System

    NASA Astrophysics Data System (ADS)

    Miller, M. J.; Abercromby, A. F. J.; Chappell, S.; Beaton, K.; Kobs Nawotniak, S.; Brady, A. L.; Garry, W. B.; Lim, D. S. S.

    2017-02-01

    For future missions, there is a need to better understand how we can merge EVA operations concepts with the established purpose of performing scientific exploration and examine how human spaceflight could be successful under communication latency.

  3. Inter-comparison of multiple statistically downscaled climate datasets for the Pacific Northwest, USA

    PubMed Central

    Jiang, Yueyang; Kim, John B.; Still, Christopher J.; Kerns, Becky K.; Kline, Jeffrey D.; Cunningham, Patrick G.

    2018-01-01

    Statistically downscaled climate data have been widely used to explore possible impacts of climate change in various fields of study. Although many studies have focused on characterizing differences in the downscaling methods, few studies have evaluated actual downscaled datasets being distributed publicly. Spatially focusing on the Pacific Northwest, we compare five statistically downscaled climate datasets distributed publicly in the US: ClimateNA, NASA NEX-DCP30, MACAv2-METDATA, MACAv2-LIVNEH and WorldClim. We compare the downscaled projections of climate change, and the associated observational data used as training data for downscaling. We map and quantify the variability among the datasets and characterize the spatio-temporal patterns of agreement and disagreement among the datasets. Pair-wise comparisons of datasets identify the coast and high-elevation areas as areas of disagreement for temperature. For precipitation, high-elevation areas, rainshadows and the dry, eastern portion of the study area have high dissimilarity among the datasets. By spatially aggregating the variability measures into watersheds, we develop guidance for selecting datasets within the Pacific Northwest climate change impact studies. PMID:29461513
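
    The study's specific variability measures are not reproduced here; as a minimal sketch of a pair-wise comparison, the code below computes the root-mean-square difference between two datasets' projected temperature change on a common grid and aggregates the per-cell disagreement over hypothetical watershed labels. The arrays and the aggregation scheme are assumptions:

      import numpy as np

      rng = np.random.default_rng(3)
      ny, nx = 100, 150                        # grid over the study region (hypothetical)

      # Projected temperature change (degC) from two downscaled datasets.
      delta_a = rng.normal(2.0, 0.5, (ny, nx))
      delta_b = delta_a + rng.normal(0.0, 0.3, (ny, nx))

      # Per-cell absolute disagreement and a single pair-wise RMS difference.
      disagreement = np.abs(delta_a - delta_b)
      rms_diff = np.sqrt(np.mean((delta_a - delta_b) ** 2))

      # Aggregate the disagreement into watersheds given an integer label grid.
      watersheds = rng.integers(0, 8, (ny, nx))     # hypothetical watershed IDs
      per_watershed = {w: disagreement[watersheds == w].mean() for w in np.unique(watersheds)}
      print("pair-wise RMS difference:", rms_diff)
      print("most dissimilar watershed:", max(per_watershed, key=per_watershed.get))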

  4. Inter-comparison of multiple statistically downscaled climate datasets for the Pacific Northwest, USA.

    PubMed

    Jiang, Yueyang; Kim, John B; Still, Christopher J; Kerns, Becky K; Kline, Jeffrey D; Cunningham, Patrick G

    2018-02-20

    Statistically downscaled climate data have been widely used to explore possible impacts of climate change in various fields of study. Although many studies have focused on characterizing differences in the downscaling methods, few studies have evaluated actual downscaled datasets being distributed publicly. Spatially focusing on the Pacific Northwest, we compare five statistically downscaled climate datasets distributed publicly in the US: ClimateNA, NASA NEX-DCP30, MACAv2-METDATA, MACAv2-LIVNEH and WorldClim. We compare the downscaled projections of climate change, and the associated observational data used as training data for downscaling. We map and quantify the variability among the datasets and characterize the spatio-temporal patterns of agreement and disagreement among the datasets. Pair-wise comparisons of datasets identify the coast and high-elevation areas as areas of disagreement for temperature. For precipitation, high-elevation areas, rainshadows and the dry, eastern portion of the study area have high dissimilarity among the datasets. By spatially aggregating the variability measures into watersheds, we develop guidance for selecting datasets within the Pacific Northwest climate change impact studies.

  5. The Problem with Big Data: Operating on Smaller Datasets to Bridge the Implementation Gap.

    PubMed

    Mann, Richard P; Mushtaq, Faisal; White, Alan D; Mata-Cervantes, Gabriel; Pike, Tom; Coker, Dalton; Murdoch, Stuart; Hiles, Tim; Smith, Clare; Berridge, David; Hinchliffe, Suzanne; Hall, Geoff; Smye, Stephen; Wilkie, Richard M; Lodge, J Peter A; Mon-Williams, Mark

    2016-01-01

    Big datasets have the potential to revolutionize public health. However, there is a mismatch between the political and scientific optimism surrounding big data and the public's perception of its benefit. We suggest a systematic and concerted emphasis on developing models derived from smaller datasets to illustrate to the public how big data can produce tangible benefits in the long term. In order to highlight the immediate value of a small data approach, we produced a proof-of-concept model predicting hospital length of stay. The results demonstrate that existing small datasets can be used to create models that generate a reasonable prediction, facilitating health-care delivery. We propose that greater attention (and funding) needs to be directed toward the utilization of existing information resources in parallel with current efforts to create and exploit "big data."
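
    The variables in the authors' proof-of-concept model are not listed in the abstract; the sketch below only shows the general shape of a small-data length-of-stay model built from a handful of routinely collected fields, with hypothetical features and simulated records:

      import numpy as np
      from sklearn.ensemble import GradientBoostingRegressor
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(4)
      n = 800  # a deliberately small dataset

      # Hypothetical routinely collected fields: age, emergency admission flag,
      # number of prior admissions, and an admission-severity score.
      X = np.column_stack([
          rng.integers(18, 95, n),
          rng.integers(0, 2, n),
          rng.poisson(1.0, n),
          rng.normal(0.0, 1.0, n),
      ])
      length_of_stay = 2 + 0.05 * X[:, 0] + 3 * X[:, 1] + X[:, 2] + rng.exponential(2.0, n)

      model = GradientBoostingRegressor()
      scores = cross_val_score(model, X, length_of_stay, cv=5,
                               scoring="neg_mean_absolute_error")
      print("cross-validated MAE (days):", -scores.mean())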

  6. A multimodal MRI dataset of professional chess players.

    PubMed

    Li, Kaiming; Jiang, Jing; Qiu, Lihua; Yang, Xun; Huang, Xiaoqi; Lui, Su; Gong, Qiyong

    2015-01-01

    Chess is a good model to study high-level human brain functions such as spatial cognition, memory, planning, learning and problem solving. Recent studies have demonstrated that non-invasive MRI techniques are valuable for researchers to investigate the underlying neural mechanisms of playing chess. For professional chess players (e.g., chess grand masters and masters, or GM/Ms), the structural and functional alterations that result from long-term professional practice, and how these alterations relate to behavior, remain largely unknown. Here, we report a multimodal MRI dataset from 29 professional Chinese chess players (most of whom are GM/Ms) and 29 age-matched novices. We hope that this dataset will provide researchers with new materials to further explore high-level human brain functions.

  7. Exploring the Philosophical Underpinnings of Research: Relating Ontology and Epistemology to the Methodology and Methods of the Scientific, Interpretive, and Critical Research Paradigms

    ERIC Educational Resources Information Center

    Scotland, James

    2012-01-01

    This paper explores the philosophical underpinnings of three major educational research paradigms: scientific, interpretive, and critical. The aim was to outline and explore the interrelationships between each paradigm's ontology, epistemology, methodology and methods. This paper reveals and then discusses some of the underlying assumptions of…

  8. A global dataset of crowdsourced land cover and land use reference data.

    PubMed

    Fritz, Steffen; See, Linda; Perger, Christoph; McCallum, Ian; Schill, Christian; Schepaschenko, Dmitry; Duerauer, Martina; Karner, Mathias; Dresel, Christopher; Laso-Bayas, Juan-Carlos; Lesiv, Myroslava; Moorthy, Inian; Salk, Carl F; Danylo, Olha; Sturn, Tobias; Albrecht, Franziska; You, Liangzhi; Kraxner, Florian; Obersteiner, Michael

    2017-06-13

    Global land cover is an essential climate variable and a key biophysical driver for earth system models. While remote sensing technology, particularly satellites, has played a key role in providing land cover datasets, large discrepancies have been noted among the available products. Global land use is typically more difficult to map and in many cases cannot be remotely sensed. In-situ or ground-based data and high resolution imagery are thus an important requirement for producing accurate land cover and land use datasets, and this is precisely what is lacking. Here we describe the global land cover and land use reference data derived from the Geo-Wiki crowdsourcing platform via four campaigns. These global datasets provide information on human impact, land cover disagreement, wilderness and land cover and land use. Hence, they are relevant for the scientific community that requires reference data for global satellite-derived products, as well as those interested in monitoring global terrestrial ecosystems in general.

  9. A global dataset of crowdsourced land cover and land use reference data

    PubMed Central

    Fritz, Steffen; See, Linda; Perger, Christoph; McCallum, Ian; Schill, Christian; Schepaschenko, Dmitry; Duerauer, Martina; Karner, Mathias; Dresel, Christopher; Laso-Bayas, Juan-Carlos; Lesiv, Myroslava; Moorthy, Inian; Salk, Carl F.; Danylo, Olha; Sturn, Tobias; Albrecht, Franziska; You, Liangzhi; Kraxner, Florian; Obersteiner, Michael

    2017-01-01

    Global land cover is an essential climate variable and a key biophysical driver for earth system models. While remote sensing technology, particularly satellites, has played a key role in providing land cover datasets, large discrepancies have been noted among the available products. Global land use is typically more difficult to map and in many cases cannot be remotely sensed. In-situ or ground-based data and high resolution imagery are thus an important requirement for producing accurate land cover and land use datasets, and this is precisely what is lacking. Here we describe the global land cover and land use reference data derived from the Geo-Wiki crowdsourcing platform via four campaigns. These global datasets provide information on human impact, land cover disagreement, wilderness and land cover and land use. Hence, they are relevant for the scientific community that requires reference data for global satellite-derived products, as well as those interested in monitoring global terrestrial ecosystems in general. PMID:28608851

  10. Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses

    PubMed Central

    Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M.; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V.; Ma’ayan, Avi

    2018-01-01

    Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated ‘canned’ analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also contains the indexing of 4,901 published bioinformatics software tools, and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools. PMID:29485625
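
    The abstract does not document the API's endpoints; the sketch below only illustrates the general pattern of querying such a search API with the requests library, and the endpoint path, query parameters, and response fields are hypothetical rather than the real Datasets2Tools interface:

      import requests

      # Hypothetical search endpoint and parameters under the Datasets2Tools base URL;
      # consult the service documentation for the real API.
      BASE_URL = "http://amp.pharm.mssm.edu/datasets2tools"
      params = {"object_type": "canned_analysis", "q": "breast cancer", "page_size": 10}

      response = requests.get(f"{BASE_URL}/api/search", params=params, timeout=30)
      response.raise_for_status()
      for hit in response.json().get("results", []):
          print(hit.get("title"), hit.get("dataset_accession"))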

  11. Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses.

    PubMed

    Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V; Ma'ayan, Avi

    2018-02-27

    Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated 'canned' analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also contains the indexing of 4,901 published bioinformatics software tools, and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools.

  12. Measuring the effectiveness of scientific gatekeeping.

    PubMed

    Siler, Kyle; Lee, Kirby; Bero, Lisa

    2015-01-13

    Peer review is the main institution responsible for the evaluation and gestation of scientific research. Although peer review is widely seen as vital to scientific evaluation, anecdotal evidence abounds of gatekeeping mistakes in leading journals, such as rejecting seminal contributions or accepting mediocre submissions. Systematic evidence regarding the effectiveness, or lack thereof, of scientific gatekeeping is scant, largely because access to rejected manuscripts from journals is rarely available. Using a dataset of 1,008 manuscripts submitted to three elite medical journals, we show differences in citation outcomes for articles that received different appraisals from editors and peer reviewers. Among rejected articles, desk-rejected manuscripts, deemed as unworthy of peer review by editors, received fewer citations than those sent for peer review. Among both rejected and accepted articles, manuscripts with lower scores from peer reviewers received relatively fewer citations when they were eventually published. However, hindsight reveals numerous questionable gatekeeping decisions. Of the 808 eventually published articles in our dataset, our three focal journals rejected many highly cited manuscripts, including the 14 most popular; roughly the top 2 percent. Of those 14 articles, 12 were desk-rejected. This finding raises concerns regarding whether peer review is ill-suited to recognize and gestate the most impactful ideas and research. Despite this finding, results show that in our case studies, on the whole, there was value added in peer review. Editors and peer reviewers generally, but not always, made good decisions regarding the identification and promotion of quality in scientific manuscripts.

  13. Making Sense of Scientific Biographies: Scientific Achievement, Nature of Science, and Storylines in College Students' Essays

    ERIC Educational Resources Information Center

    Hwang, Seyoung

    2015-01-01

    In this article, the educative value of scientific biographies will be explored, especially for non-science major college students. During the "Scientist's life and thought" course, 66 college students read nine scientific biographies including five biologists, covering the canonical scientific achievements in Western scientific history.…

  14. Ice-Penetrating Robot for Scientific Exploration

    NASA Technical Reports Server (NTRS)

    Zimmerman, Wayne; Carsey, Frank; French, Lloyd

    2007-01-01

    The cryo-hydro integrated robotic penetrator system (CHIRPS) is a partially developed instrumentation system that includes a probe designed to deeply penetrate the Europan ice sheet in a search for signs of life. The CHIRPS could also be used on Earth for similar exploration of the polar ice caps, especially at Lake Vostok in Antarctica. The CHIRPS probe advances downward by simple melting of ice (typically for upper, non-compacted layers of an ice sheet) or by a combination of melting of ice and pumping of meltwater (typically, for deeper, compacted layers). The heat and electric power for melting, pumping, and operating all of the onboard instrumentation and electronic circuitry are supplied by radioisotope power sources (RPSs) and thermoelectric converters energized by the RPSs. The instrumentation and electronic circuitry include miniature guidance and control sensors and an advanced autonomous control system that has fault-management capabilities. The CHIRPS probe is about 1 m long and 15 cm in diameter. The RPSs generate a total thermal power of 1.8 kW. Initially, as this power melts the surrounding ice, a meltwater jacket about 1 mm thick forms around the probe. The center of gravity of the probe is well forward (down), so that the probe is vertically stabilized like a pendulum. Heat is circulated to the nose by means of miniature pumps and heat pipes. The probe melts ice to advance in a step-wise manner: Heat is applied to the nose to open up a melt void, then heat is applied to the side to allow the probe to slip down into the melt void. The melt void behind the probe is allowed to re-freeze. Four quadrant heaters on the nose and another four quadrant heaters on the rear (upper) surface of the probe are individually controllable for steering: Turning on two adjacent heaters on the nose and two adjacent heaters on the opposite side at the rear causes melt voids to form on opposing sides, such that the probe descends at an angle from

  15. Learning to recognize rat social behavior: Novel dataset and cross-dataset application.

    PubMed

    Lorbach, Malte; Kyriakou, Elisavet I; Poppe, Ronald; van Dam, Elsbeth A; Noldus, Lucas P J J; Veltkamp, Remco C

    2018-04-15

    Social behavior is an important aspect of rodent models. Automated measuring tools that make use of video analysis and machine learning are an increasingly attractive alternative to manual annotation. Because machine learning-based methods need to be trained, it is important that they are validated using data from different experiment settings. To develop and validate automated measuring tools, there is a need for annotated rodent interaction datasets. Currently, the availability of such datasets is limited to two mouse datasets. We introduce the first, publicly available rat social interaction dataset, RatSI. We demonstrate the practical value of the novel dataset by using it as the training set for a rat interaction recognition method. We show that behavior variations induced by the experiment setting can lead to reduced performance, which illustrates the importance of cross-dataset validation. Consequently, we add a simple adaptation step to our method and improve the recognition performance. Most existing methods are trained and evaluated in one experimental setting, which limits the predictive power of the evaluation to that particular setting. We demonstrate that cross-dataset experiments provide more insight in the performance of classifiers. With our novel, public dataset we encourage the development and validation of automated recognition methods. We are convinced that cross-dataset validation enhances our understanding of rodent interactions and facilitates the development of more sophisticated recognition methods. Combining them with adaptation techniques may enable us to apply automated recognition methods to a variety of animals and experiment settings. Copyright © 2017 Elsevier B.V. All rights reserved.
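
    The RatSI features and classifier are not specified here; the sketch below only illustrates the cross-dataset validation pattern the abstract argues for, training on one annotated dataset and evaluating on another recorded under a different setting, compared against within-dataset cross-validation. The feature arrays and labels are simulated placeholders:

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score
      from sklearn.metrics import accuracy_score

      rng = np.random.default_rng(5)

      # Placeholder trajectory features and behaviour labels for two datasets
      # recorded in different experimental settings.
      X_a, y_a = rng.normal(size=(600, 12)), rng.integers(0, 4, 600)
      X_b, y_b = rng.normal(0.3, 1.2, size=(400, 12)), rng.integers(0, 4, 400)

      clf = RandomForestClassifier(n_estimators=200, random_state=0)

      # Within-dataset estimate (cross-validation on dataset A alone).
      within = cross_val_score(clf, X_a, y_a, cv=5).mean()

      # Cross-dataset estimate (train on A, test on B), typically lower.
      cross = accuracy_score(y_b, clf.fit(X_a, y_a).predict(X_b))
      print(f"within-dataset accuracy: {within:.2f}, cross-dataset accuracy: {cross:.2f}")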

  16. The Transcriptome Analysis and Comparison Explorer--T-ACE: a platform-independent, graphical tool to process large RNAseq datasets of non-model organisms.

    PubMed

    Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P

    2012-03-15

    Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. Especially database interfaces with transcriptome analysis modules going beyond mere read counts are missing. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a php-script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword- or BLAST-search. Additionally, T-ACE provides within and between transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments, which have only a limited amount of bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
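
    T-ACE's actual schema is not documented in the abstract; as an illustration of the general pattern of linking contigs to annotation and read coverage in PostgreSQL, the sketch below queries a hypothetical pair of tables with psycopg2. The connection string, table names, and columns are assumptions:

      import psycopg2

      # Hypothetical schema: contigs(id, name, length, read_coverage)
      # and annotations(contig_id, go_term, kegg_pathway, description).
      conn = psycopg2.connect("dbname=transcriptome user=tace")
      with conn, conn.cursor() as cur:
          cur.execute(
              """
              SELECT c.name, c.read_coverage, a.go_term, a.description
              FROM contigs c
              JOIN annotations a ON a.contig_id = c.id
              WHERE a.description ILIKE %s
              ORDER BY c.read_coverage DESC
              LIMIT 20
              """,
              ("%heat shock%",),
          )
          for name, coverage, go_term, description in cur.fetchall():
              print(name, coverage, go_term, description)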

  17. Explore with Us

    NASA Technical Reports Server (NTRS)

    Morales, Lester

    2012-01-01

    The fundamental goal of this vision is to advance U.S. scientific, security and economic interests through a robust space exploration program. Implement a sustained and affordable human and robotic program to explore the solar system and beyond. Extend human presence across the solar system, starting with a human return to the Moon by the year 2020, in preparation for human exploration of Mars and other destinations. Develop the innovative technologies, knowledge, and infrastructures both to explore and to support decisions about the destinations for human exploration. Promote international and commercial participation in exploration to further U.S. scientific, security, and economic interests.

  18. Politics and the Erosion of Federal Scientific Capacity: Restoring Scientific Integrity to Public Health Science

    PubMed Central

    Rest, Kathleen M.; Halpern, Michael H.

    2007-01-01

    Our nation’s health and prosperity are based on a foundation of independent scientific discovery. Yet in recent years, political interference in federal government science has become widespread, threatening this legacy. We explore the ways science has been misused, the attempts to measure the pervasiveness of this problem, and the effects on our long-term capacity to meet today’s most complex public health challenges. Good government and a functioning democracy require public policy decisions to be informed by independent science. The scientific and public health communities must speak out to defend taxpayer-funded science from political interference. Encouragingly, both the scientific community and Congress are exploring ways to restore scientific integrity to federal policymaking. PMID:17901422

  19. Politics and the erosion of federal scientific capacity: restoring scientific integrity to public health science.

    PubMed

    Rest, Kathleen M; Halpern, Michael H

    2007-11-01

    Our nation's health and prosperity are based on a foundation of independent scientific discovery. Yet in recent years, political interference in federal government science has become widespread, threatening this legacy. We explore the ways science has been misused, the attempts to measure the pervasiveness of this problem, and the effects on our long-term capacity to meet today's most complex public health challenges. Good government and a functioning democracy require public policy decisions to be informed by independent science. The scientific and public health communities must speak out to defend taxpayer-funded science from political interference. Encouragingly, both the scientific community and Congress are exploring ways to restore scientific integrity to federal policymaking.

  20. Ontology-Driven Discovery of Scientific Computational Entities

    ERIC Educational Resources Information Center

    Brazier, Pearl W.

    2010-01-01

    Many geoscientists use modern computational resources, such as software applications, Web services, scientific workflows and datasets that are readily available on the Internet, to support their research and many common tasks. These resources are often shared via human contact and sometimes stored in data portals; however, they are not necessarily…

  1. Being an honest broker of hydrology: Uncovering, communicating and addressing model error in a climate change streamflow dataset

    NASA Astrophysics Data System (ADS)

    Chegwidden, O.; Nijssen, B.; Pytlak, E.

    2017-12-01

    Any model simulation has errors, including errors in meteorological data, process understanding, model structure, and model parameters. These errors may express themselves as bias, timing lags, and differences in sensitivity between the model and the physical world. The evaluation and handling of these errors can greatly affect the legitimacy, validity and usefulness of the resulting scientific product. In this presentation we will discuss a case study of handling and communicating model errors during the development of a hydrologic climate change dataset for the Pacific Northwestern United States. The dataset was the result of a four-year collaboration between the University of Washington, Oregon State University, the Bonneville Power Administration, the United States Army Corps of Engineers and the Bureau of Reclamation. Along the way, the partnership facilitated the discovery of multiple systematic errors in the streamflow dataset. Through an iterative review process, some of those errors could be resolved. For the errors that remained, honest communication of the shortcomings promoted the dataset's legitimacy. Thoroughly explaining errors also improved ways in which the dataset would be used in follow-on impact studies. Finally, we will discuss the development of the "streamflow bias-correction" step often applied to climate change datasets that will be used in impact modeling contexts. We will describe the development of a series of bias-correction techniques through close collaboration among universities and stakeholders. Through that process, both universities and stakeholders learned about the others' expectations and workflows. This mutual learning process allowed for the development of methods that accommodated the stakeholders' specific engineering requirements. The iterative revision process also produced a functional and actionable dataset while preserving its scientific merit. We will describe how encountering earlier techniques' pitfalls allowed us
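
    The bias-correction techniques developed in this collaboration are not detailed in the abstract; the sketch below shows one widely used generic approach, empirical quantile mapping of simulated streamflow onto an observed historical distribution, purely as an illustration and not as the authors' method. The flow series are simulated:

      import numpy as np

      def quantile_map(simulated, sim_reference, obs_reference):
          """Empirical quantile mapping: replace each simulated value by the observed
          value at the same quantile of the reference (historical) period."""
          quantiles = np.searchsorted(np.sort(sim_reference), simulated) / len(sim_reference)
          quantiles = np.clip(quantiles, 0.0, 1.0)
          return np.quantile(obs_reference, quantiles)

      rng = np.random.default_rng(6)
      obs_hist = rng.gamma(2.0, 500.0, 3650)          # observed daily flow, historical
      sim_hist = rng.gamma(2.0, 650.0, 3650)          # model flow, historical (biased high)
      sim_future = rng.gamma(2.2, 650.0, 3650)        # model flow, future scenario

      corrected = quantile_map(sim_future, sim_hist, obs_hist)
      print("raw future mean:", sim_future.mean(), "corrected mean:", corrected.mean())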

  2. Teaching through Trade Books: Recording Scientific Explorations

    ERIC Educational Resources Information Center

    Royce, Christine Anne

    2016-01-01

    Keeping a log of scientific investigations, discoveries, and notes is a process that scientists have used throughout history. Elementary-age children engage in similar types of documentation when they perform investigations and sketch, label, or provide details about their work and findings. This column includes activities inspired by children's…

  3. CROSS DRIVE: A Collaborative and Distributed Virtual Environment for Exploitation of Atmospherical and Geological Datasets of Mars

    NASA Astrophysics Data System (ADS)

    Cencetti, Michele

    2016-07-01

    European space exploration missions have produced huge data sets of potentially immense value for research as well as for planning and operating future missions. For instance, Mars Exploration programs comprise a series of missions with launches ranging from the past to beyond present, which are anticipated to produce exceptional volumes of data which provide prospects for research breakthroughs and advancing further activities in space. These collected data include a variety of information, such as imagery, topography, atmospheric, geochemical datasets and more, which has resulted in and still demands, databases, versatile visualisation tools and data reduction methods. Such rate of valuable data acquisition requires the scientists, researchers and computer scientists to coordinate their storage, processing and relevant tools to enable efficient data analysis. However, the current position is that expert teams from various disciplines, the databases and tools are fragmented, leaving little scope for unlocking its value through collaborative activities. The benefits of collaborative virtual environments have been implemented in various industrial fields allowing real-time multi-user collaborative work among people from different disciplines. Exploiting the benefits of advanced immersive virtual environments (IVE) has been recognized as an important interaction paradigm to facilitate future space exploration. The current work is mainly aimed towards the presentation of the preliminary results coming from the CROSS DRIVE project. This research received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 607177 and is mainly aimed towards the implementation of a distributed virtual workspace for collaborative scientific discovery, mission planning and operations. The purpose of the CROSS DRIVE project is to lay foundations of collaborative European workspaces for space science. It will demonstrate the feasibility and

  4. Dynamic analysis, transformation, dissemination and applications of scientific multidimensional data in ArcGIS Platform

    NASA Astrophysics Data System (ADS)

    Shrestha, S. R.; Collow, T. W.; Rose, B.

    2016-12-01

    Scientific datasets are generated from various sources and platforms but they are typically produced either by earth observation systems or by modelling systems. These are widely used for monitoring, simulating, or analyzing measurements that are associated with physical, chemical, and biological phenomena over the ocean, atmosphere, or land. A significant subset of scientific datasets stores values directly as rasters or in a form that can be rasterized. This is where a value exists at every cell in a regular grid spanning the spatial extent of the dataset. Government agencies like NOAA, NASA, EPA, USGS produce large volumes of near real-time, forecast, and historical data that drives climatological and meteorological studies, and underpins operations ranging from weather prediction to sea ice loss. Modern science is computationally intensive because of the availability of an enormous amount of scientific data, the adoption of data-driven analysis, and the need to share these datasets and research results with the public. ArcGIS as a platform is sophisticated and capable of handling such a complex domain. We'll discuss constructs and capabilities applicable to multidimensional gridded data that can be conceptualized as a multivariate space-time cube. Building on the concept of a two-dimensional raster, a typical multidimensional raster dataset could contain several "slices" within the same spatial extent. We will share a case from the NOAA Climate Forecast System Reanalysis (CFSR) multidimensional data as an example of how large collections of rasters can be efficiently organized and managed through a data model within a geodatabase called "Mosaic dataset" and dynamically transformed and analyzed using raster functions. A raster function is a lightweight, raster-valued transformation defined over a mixed set of raster and scalar input. That means, just like any tool, you can provide a raster function with input parameters. It enables dynamic processing of only the
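
    The mosaic dataset and raster functions are Esri-specific constructs; as a library-neutral sketch of the same idea, the code below builds a small multidimensional space-time cube with xarray and applies a lightweight on-the-fly transformation (an anomaly relative to the time mean), loosely analogous to a raster function. Dimension names and sizes are hypothetical:

      import numpy as np
      import xarray as xr

      rng = np.random.default_rng(7)

      # A small multidimensional "space-time cube": temperature on a
      # (time, pressure level, lat, lon) grid, loosely mimicking a reanalysis slice.
      cube = xr.DataArray(
          rng.normal(280.0, 10.0, (24, 3, 90, 180)),
          dims=("time", "level", "lat", "lon"),
          coords={
              "time": np.arange(24),
              "level": [1000, 850, 500],
              "lat": np.linspace(-89, 89, 90),
              "lon": np.linspace(-179, 179, 180),
          },
          name="air_temperature",
      )

      # A lightweight transformation over the cube: the anomaly of each "slice"
      # with respect to the time mean, computed on demand from the stored values.
      anomaly = cube - cube.mean(dim="time")
      print(anomaly.sel(level=500).isel(time=0).mean().item())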

  5. A sophisticated lander for scientific exploration of Mars: scientific objectives and implementation of the Mars-96 Small Station

    NASA Astrophysics Data System (ADS)

    Linkin, V.; Harri, A.-M.; Lipatov, A.; Belostotskaja, K.; Derbunovich, B.; Ekonomov, A.; Khloustova, L.; Kremnev, R.; Makarov, V.; Martinov, B.; Nenarokov, D.; Prostov, M.; Pustovalov, A.; Shustko, G.; Järvinen, I.; Kivilinna, H.; Korpela, S.; Kumpulainen, K.; Lehto, A.; Pellinen, R.; Pirjola, R.; Riihelä, P.; Salminen, A.; Schmidt, W.; Siili, T.; Blamont, J.; Carpentier, T.; Debus, A.; Hua, C. T.; Karczewski, J.-F.; Laplace, H.; Levacher, P.; Lognonné, Ph.; Malique, C.; Menvielle, M.; Mouli, G.; Pommereau, J.-P.; Quotb, K.; Runavot, J.; Vienne, D.; Grunthaner, F.; Kuhnke, F.; Musmann, G.; Rieder, R.; Wänke, H.; Economou, T.; Herring, M.; Lane, A.; McKay, C. P.

    1998-02-01

    A mission to Mars including two Small Stations, two Penetrators and an Orbiter was launched at Baikonur, Kazakhstan, on 16 November 1996. This was called the Mars-96 mission. The Small Stations were expected to land in September 1997 (Ls approximately 178°), nominally to Amazonis-Arcadia region on locations (33 N, 169.4 W) and (37.6 N, 161.9 W). The fourth stage of the Mars-96 launcher malfunctioned and hence the mission was lost. However, the state of the art concept of the Small Station can be applied to future Martian lander missions. Also, from the manufacturing and performance point of view, the Mars-96 Small Station could be built as such at low cost, and be fairly easily accommodated on almost any forthcoming Martian mission. This is primarily due to the very simple interface between the Small Station and the spacecraft. The Small Station is a sophisticated piece of equipment. With the total available power of approximately 400 mW the Station successfully supports an ambitious scientific program. The Station accommodates a panoramic camera, an alpha-proton-x-ray spectrometer, a seismometer, a magnetometer, an oxidant instrument, equipment for meteorological observations, and sensors for atmospheric measurement during the descent phase, including images taken by a descent phase camera. The total mass of the Small Station with payload on the Martian surface, including the airbags, is only 32 kg. Lander observations on the surface of Mars combined with data from Orbiter instruments will shed light on the contemporary Mars and its evolution. As in the Mars-96 mission, specific science goals could be exploration of the interior and surface of Mars, investigation of the structure and dynamics of the atmosphere, the role of water and other materials containing volatiles and in situ studies of the atmospheric boundary layer processes. To achieve the scientific goals of the mission the lander should carry a versatile set of instruments. The Small Station

  6. A sophisticated lander for scientific exploration of Mars: scientific objectives and implementation of the Mars-96 Small Station.

    PubMed

    Linkin, V; Harri, A M; Lipatov, A; Belostotskaja, K; Derbunovich, B; Ekonomov, A; Khloustova, L; Kremnev, R; Makarov, V; Martinov, B; Nenarokov, D; Prostov, M; Pustovalov, A; Shustko, G; Jarvinen, I; Kivilinna, H; Korpela, S; Kumpulainen, K; Lehto, A; Pellinen, R; Pirjola, R; Riihela, P; Salminen, A; Schmidt, W; McKay, C P

    1998-01-01

    A mission to Mars including two Small Stations, two Penetrators and an Orbiter was launched at Baikonur, Kazakhstan, on 16 November 1996. This was called the Mars-96 mission. The Small Stations were expected to land in September 1997 (Ls approximately 178 degrees), nominally to Amazonis-Arcadia region on locations (33 N, 169.4 W) and (37.6 N, 161.9 W). The fourth stage of the Mars-96 launcher malfunctioned and hence the mission was lost. However, the state of the art concept of the Small Station can be applied to future Martian lander missions. Also, from the manufacturing and performance point of view, the Mars-96 Small Station could be built as such at low cost, and be fairly easily accommodated on almost any forthcoming Martian mission. This is primarily due to the very simple interface between the Small Station and the spacecraft. The Small Station is a sophisticated piece of equipment. With the total available power of approximately 400 mW the Station successfully supports an ambitious scientific program. The Station accommodates a panoramic camera, an alpha-proton-x-ray spectrometer, a seismometer, a magnetometer, an oxidant instrument, equipment for meteorological observations, and sensors for atmospheric measurement during the descent phase, including images taken by a descent phase camera. The total mass of the Small Station with payload on the Martian surface, including the airbags, is only 32 kg. Lander observations on the surface of Mars combined with data from Orbiter instruments will shed light on the contemporary Mars and its evolution. As in the Mars-96 mission, specific science goals could be exploration of the interior and surface of Mars, investigation of the structure and dynamics of the atmosphere, the role of water and other materials containing volatiles and in situ studies of the atmospheric boundary layer processes. To achieve the scientific goals of the mission the lander should carry a versatile set of instruments. The Small Station

  7. Scientific exploration of low-gravity planetary bodies using the Highland Terrain Hopper

    NASA Astrophysics Data System (ADS)

    Mège, D.; Grygorczuk, J.; Gurgurewicz, J.; Wiśniewski, Ł.; Rickman, H.; Banaszkiewicz, M.; Kuciński, T.; Skocki, K.

    2013-09-01

    Field geoscientists need to collect three-dimensional data in order to characterise the lithologic succession and structure of terrains, reconstruct their evolution, and eventually reveal the history of a portion of the planet. This is achieved by walking up and down mountains and valleys, interpreting geological and geophysical traverses, and reading measurements made at stations located at key sites on mountain peaks or rocky promontories. These activities have been denied to conventional planetary exploration rovers because engineering constraints for landing are strong, especially in terms of allowed terrain roughness and slopes. The Highland Terrain Hopper, a new, light and robust locomotion system, addresses the challenge of accessing most areas on low-gravity planetary bodies for performing scientific observations and measurements, alone or as part of a hopper commando. Examples of geological applications on Mars and the Moon are given.

  8. A TT&C Performance Simulator for Space Exploration and Scientific Satellites - Architecture and Applications

    NASA Astrophysics Data System (ADS)

    Donà, G.; Faletra, M.

    2015-09-01

    This paper presents the TT&C performance simulator toolkit developed internally at Thales Alenia Space Italia (TAS-I) to support the design of TT&C subsystems for space exploration and scientific satellites. The simulator has a modular architecture and has been designed using a model-based approach using standard engineering tools such as MATLAB/SIMULINK and mission analysis tools (e.g. STK). The simulator is easily reconfigurable to fit different types of satellites, different mission requirements and different scenarios parameters. This paper provides a brief description of the simulator architecture together with two examples of applications used to demonstrate some of the simulator’s capabilities.

  9. Dynamics Explorer 2: Continued FPI and NACS instrument data analysis and associated scientific activity at the University of Michigan

    NASA Technical Reports Server (NTRS)

    Burns, Alan; Killeen, T. L.

    1993-01-01

    The grant entitled 'Dynamics Explorer 2 - continued FPI and NACS instrument data analysis and associated scientific activity at the University of Michigan' is a continuation of a grant that began with instrument development for the Dynamics Explorer 2 (DE 2) satellite. Over the years, many publications and presentations at scientific meetings have occurred under the aegis of this grant. This present report details the progress that has been made in the final three years of the grant. In these last 4 years of the grant 26 papers have been published or are in press and about 10 more are in preparation or have been submitted. A large number of presentations have been made in the same time span: 36 are listed in Appendix 2. Evidence of the high educational utility of this research is indicated by the list of Ph. D. and M. S. theses that have been completed in the last 3 years that have involved work connected with NAG5-465. The structure of this report is as follows: a brief synopsis of the aims of the grant NAG5-465 is given in the next section; then there is a summary of the scientific accomplishments that have occurred over the grant period; last, we make some brief concluding remarks. Reprints of articles that have recently appeared in refereed journals are appended to the end of this document.

  10. Lakatos' Scientific Research Programmes as a Framework for Analysing Informal Argumentation about Socio-Scientific Issues

    ERIC Educational Resources Information Center

    Chang, Shu-Nu; Chiu, Mei-Hung

    2008-01-01

    The purpose of this study is to explore how Lakatos' scientific research programmes might serve as a theoretical framework for representing and evaluating informal argumentation about socio-scientific issues. Seventy undergraduate science and non-science majors were asked to make written arguments about four socio-scientific issues. Our analysis…

  11. Exploring Scientific Information for Policy Making under Deep Uncertainty

    NASA Astrophysics Data System (ADS)

    Forni, L.; Galaitsi, S.; Mehta, V. K.; Escobar, M.; Purkey, D. R.; Depsky, N. J.; Lima, N. A.

    2016-12-01

    Each actor evaluating potential management strategies brings her/his own distinct set of objectives to a complex decision space of system uncertainties. The diversity of these objectives requires detailed and rigorous analyses that respond to multifaceted challenges. However, the utility of this information depends on the accessibility of scientific information to decision makers. This paper demonstrates data visualization tools for presenting scientific results to decision makers in two case studies: La Paz/El Alto, Bolivia, and Yuba County, California. Visualization output from the case studies combines spatiotemporal, multivariate and multirun/multiscenario information to produce information corresponding to the objectives defined by key actors and stakeholders. These tools can manage complex data and distill scientific information into accessible formats. Using the visualizations, scientists and decision makers can navigate the decision space and potential objective trade-offs to facilitate discussion and consensus building. These efforts can support the identification of stable negotiated agreements between different stakeholders.

  12. Atlas-Guided Cluster Analysis of Large Tractography Datasets

    PubMed Central

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses an hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292

  13. Atlas-guided cluster analysis of large tractography datasets.

    PubMed

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses an hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.
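
    The record above describes the clustering step only at a high level. As a minimal sketch of the core idea, hierarchical clustering of fiber tracts from a pairwise distance matrix, the following Python fragment resamples toy tracts, computes a simple direct-flip distance, and groups them with SciPy; the distance measure, resampling length and cluster count are illustrative assumptions, not the authors' atlas-guided implementation.

    ```python
    # Minimal sketch of hierarchical clustering of fiber tracts (illustrative only;
    # not the atlas-guided framework described in the abstract).
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    def resample(tract, k=20):
        """Resample a tract (N x 3 polyline) to k equally spaced points."""
        t = np.linspace(0, 1, len(tract))
        ti = np.linspace(0, 1, k)
        return np.column_stack([np.interp(ti, t, tract[:, d]) for d in range(3)])

    def mdf(a, b):
        """Minimum average direct-flip distance between two resampled tracts."""
        direct = np.linalg.norm(a - b, axis=1).mean()
        flipped = np.linalg.norm(a - b[::-1], axis=1).mean()
        return min(direct, flipped)

    def cluster_tracts(tracts, n_clusters=10):
        pts = [resample(t) for t in tracts]
        n = len(pts)
        dist = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                dist[i, j] = dist[j, i] = mdf(pts[i], pts[j])
        z = linkage(squareform(dist), method="average")   # hierarchical clustering
        return fcluster(z, t=n_clusters, criterion="maxclust")

    # Toy example: 30 random short "tracts"
    rng = np.random.default_rng(0)
    tracts = [np.cumsum(rng.normal(size=(15, 3)), axis=0) for _ in range(30)]
    print(cluster_tracts(tracts))
    ```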

  14. The Interplay between Scientific Overlap and Cooperation and the Resulting Gain in Co-Authorship Interactions.

    PubMed

    Mayrose, Itay; Freilich, Shiri

    2015-01-01

    Considering the importance of scientific interactions, understanding the principles that govern fruitful scientific research is crucial to policy makers and scientists alike. The outcome of an interaction is to a large extent dependent on the balancing of contradicting motivations accompanying the establishment of collaborations. Here, we assembled a dataset of nearly 20,000 publications authored by researchers affiliated with ten top universities. Based on this data collection, we estimated the extent of different interaction types between pairwise combinations of researchers. We explored the interplay between the overlap in scientific interests and the tendency to collaborate, and associated these estimates with measures of scientific quality and social accessibility aiming at studying the typical resulting gain of different interaction patterns. Our results show that scientists tend to collaborate more often with colleagues with whom they share moderate to high levels of mutual interests and knowledge while cooperative tendency declines at higher levels of research-interest overlap, suggesting fierce competition, and at the lower levels, suggesting communication gaps. Whereas the relative number of alliances dramatically differs across a gradient of research overlap, the scientific impact of the resulting articles remains similar. When considering social accessibility, we find that though collaborations between remote researchers are relatively rare, their quality is significantly higher than studies produced by close-circle scientists. Since current collaboration patterns do not necessarily overlap with gaining optimal scientific quality, these findings should encourage scientists to reconsider current collaboration strategies.

  15. Systematic Processing of Clementine Data for Scientific Analyses

    NASA Technical Reports Server (NTRS)

    Mcewen, A. S.

    1993-01-01

    If fully successful, the Clementine mission will return about 3,000,000 lunar images and more than 5000 images of Geographos. Effective scientific analyses of such large datasets require systematic processing efforts. Concepts for two such efforts are described: global multispectral imaging of the Moon, and videos of Geographos.

  16. Scientific Exploration of Near-Earth Objects via the Crew Exploration Vehicle

    NASA Technical Reports Server (NTRS)

    Abell, Paul A.; Korsmeyer, D. J.; Landis, R. R.; Lu, E.; Adamo, D.; Jones, T.; Lemke, L.; Gonzales, A.; Gershman, B.; Morrison, D.

    2007-01-01

    The concept of a crewed mission to a Near-Earth Object (NEO) was analyzed in depth in 1989 as part of the Space Exploration Initiative. Since that time, two other studies have investigated the possibility of sending similar missions to NEOs. A more recent study has been sponsored by the Advanced Programs Office within NASA's Constellation Program. This study team has representatives from across NASA and is currently examining the feasibility of sending a Crew Exploration Vehicle (CEV) to a near-Earth object (NEO). The ideal mission profile would involve a crew of 2 or 3 astronauts on a 90 to 120 day flight, which would include a 7 to 14 day stay for proximity operations at the target NEO. One of the significant advantages of this type of mission is that it strengthens and validates the foundational infrastructure for the Vision for Space Exploration (VSE) and Exploration Systems Architecture Study (ESAS) in the run-up to the lunar sorties at the end of the next decade (approx. 2020). Sending a human expedition to a NEO, within the context of the VSE and ESAS, demonstrates the broad utility of the Constellation Program's Orion (CEV) crew capsule and Ares (CLV) launch systems. This mission would be the first human expedition to an interplanetary body outside of the cislunar system. It would also help NASA regain crucial operational experience conducting human exploration missions outside of low Earth orbit, which humanity has not attempted in nearly 40 years.

  17. Scientific Misconduct.

    ERIC Educational Resources Information Center

    Goodstein, David

    2002-01-01

    Explores scientific fraud, asserting that while few scientists actually falsify results, the field has become so competitive that many are misbehaving in other ways; an example would be unreasonable criticism by anonymous peer reviewers. (EV)

  18. LRO-LAMP Observations of Illumination Conditions in the Lunar South Pole: Multi-Dataset and Model Comparison

    NASA Astrophysics Data System (ADS)

    Mandt, Kathleen; Mazarico, Erwan; Greathouse, Thomas K.; Byron, Ben; Retherford, Kurt D.; Gladstone, Randy; Liu, Yang; Hendrix, Amanda R.; Hurley, Dana; Stickle, Angela; Wes Patterson, G.; Cahill, Joshua; Williams, Jean-Pierre

    2017-10-01

    The south pole of the Moon is an area of great interest for exploration and scientific research because many low-lying regions are permanently shaded and are likely to trap volatiles for extended periods of time, while adjacent topographic highs can experience extended periods of sunlight. One of the goals of the Lunar Reconnaissance Orbiter (LRO) mission is to characterize the temporal variability of illumination of the lunar polar regions for the benefit of future exploration efforts. We use far ultraviolet (FUV) observations made by the Lyman Alpha Mapping Project (LAMP) to evaluate illumination at the lunar south pole (within 5° of the pole). LAMP observations are made through passive remote sensing in the FUV wavelength range of 57-196 nm, using reflected sunlight during daytime observations and reflected light from the interplanetary medium (IPM) and UV-bright stars during nighttime observations. In this study we focused on the region within 5° of the pole and produced maps using nighttime data taken between September 2009 and February 2014. Summing over long time periods is necessary to obtain sufficient signal to noise. Many of the maps produced for this study show excess brightness in the “Off Band” (155-190 nm), because sunlight scattered into the permanently shadowed regions (PSRs) is most evident in this wavelength range. LAMP observes the highest rate of scattered sunlight in two large PSRs during nighttime observations: Haworth and Shoemaker. We focus on these craters for comparisons with an illumination model and other LRO datasets. We find that the observations of scattered sunlight do not agree with model predictions. However, preliminary results comparing LAMP maps with other LRO datasets show a correlation between LAMP observations of scattered sunlight and Diviner measurements of maximum temperature.

  19. Improving average ranking precision in user searches for biomedical research datasets

    PubMed Central

    Gobeill, Julien; Gaudinat, Arnaud; Vachon, Thérèse; Ruch, Patrick

    2017-01-01

    Abstract Availability of research datasets is a keystone of health and life science study reproducibility and scientific progress. Due to the heterogeneity and complexity of these data, a main challenge for research data management systems is to provide users with the best answers to their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorization method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus with 800k datasets and 21 annotated user queries, and provided competitive results when compared to the other challenge participants. In the official run, it achieved the highest infAP, +22.3% higher than the median infAP of the participants' best submissions. Overall, it ranked in the top 2 when an aggregated metric using the best official measures per participant is considered. The query expansion method showed a positive impact on the system's performance, increasing our baseline by up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively. The similarity measure algorithm showed robust performance in different training conditions, with small performance variations compared to the Divergence from Randomness framework. Finally, the result categorization did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. The use of data-driven expansion methods, such as those based on word embeddings, could be an alternative to the complexity of biomedical terminologies. Nevertheless, due to the limited size of the assessment set, further experiments need to be performed to draw
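
    The abstract names a word-embedding query expansion model without implementation detail. The sketch below illustrates the general idea under stated assumptions: each query term is expanded with its nearest neighbours in an embedding space. The toy vectors, similarity threshold and neighbour count are hypothetical; a real system would load pretrained biomedical embeddings.

    ```python
    # Illustrative sketch of embedding-based query expansion (toy vectors, not the
    # bioCADDIE system described in the abstract).
    import numpy as np

    # Hypothetical toy embedding table; a real system would load pretrained vectors.
    EMB = {
        "cancer":   np.array([0.9, 0.1, 0.0]),
        "tumor":    np.array([0.85, 0.15, 0.05]),
        "leukemia": np.array([0.8, 0.2, 0.1]),
        "protein":  np.array([0.1, 0.9, 0.0]),
        "rna":      np.array([0.05, 0.85, 0.2]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def expand_query(terms, k=2, threshold=0.9):
        """Add the k nearest vocabulary terms above a similarity threshold."""
        expanded = list(terms)
        for term in terms:
            if term not in EMB:
                continue
            sims = [(cosine(EMB[term], v), w) for w, v in EMB.items() if w != term]
            for s, w in sorted(sims, reverse=True)[:k]:
                if s >= threshold and w not in expanded:
                    expanded.append(w)
        return expanded

    print(expand_query(["cancer"]))   # e.g. ['cancer', 'tumor', 'leukemia']
    ```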

  20. Does Anyone Really Know Anything? An Exploration of Constructivist Meaning and Identity in the Tension between Scientific and Religious Knowledge

    ERIC Educational Resources Information Center

    Starr, Lisa J.

    2010-01-01

    In this paper I discuss the tension created by religion and science in one student's understanding of knowledge and truth by exploring two questions: "How do individuals accommodate their religious beliefs with their understanding of science?" and "How does religious knowledge interact with scientific knowledge to construct meaning?" A…

  1. Scientific Balloons for Venus Exploration

    NASA Astrophysics Data System (ADS)

    Cutts, James; Yavrouian, Andre; Nott, Julian; Baines, Kevin; Limaye, Sanjay; Wilson, Colin; Kerzhanovich, Viktor; Voss, Paul; Hall, Jeffery

    Almost 30 years ago, two balloons were successfully deployed into the atmosphere of Venus as an element of the VeGa - Venus Halley mission conducted by the Soviet Union. As interest in further Venus exploration grows among the established planetary exploration agencies - in Europe, Japan, Russia and the United States - the use of balloons is emerging as an essential part of that investigative program. Venus balloons have been proposed in NASA's Discovery program and ESA's Cosmic Vision program and are a key element in NASA's strategic plan for Venus exploration. At JPL, the focus for the last decade has been on the development of a 7 m diameter superpressure balloon (twice the diameter of the VeGa balloons) capable of carrying a 100 kg payload (14 times that of the VeGa balloons), operating for more than 30 days (15 times the 2-day flight duration of the VeGa balloons) and transmitting up to 20 Mbit of data (300 times that of the VeGa balloons). This new generation of balloons must tolerate day-night transitions on Venus as well as extended exposure to the sulfuric acid environment. These constant-altitude balloons, operating at an altitude of about 55 km on Venus where temperatures are benign, can also deploy sondes to sound the atmosphere beneath the probe and deliver deep sondes equipped to survive and operate down to the surface. The technology for these balloons is now maturing rapidly, and we are now looking forward to the prospects for altitude-control balloons that can cycle repeatedly through the Venus cloud region. One concept, which has been used for tropospheric profiling in Antarctica, is the pumped-helium balloon, which has heritage from the anchor balloon and would be best adapted for flight above the 55 km level. Phase-change balloons, which use the atmosphere as a heat engine, can be used to investigate the lower cloud region down to 30 km. Progress in components for high temperature operation may also enable investigation of the deep atmosphere of Venus with metal-based balloons.

  2. SADI, SHARE, and the in silico scientific method

    PubMed Central

    2010-01-01

    Background The emergence and uptake of Semantic Web technologies by the Life Sciences provides exciting opportunities for exploring novel ways to conduct in silico science. Web Service Workflows are already becoming first-class objects in “the new way”, and serve as explicit, shareable, referenceable representations of how an experiment was done. In turn, Semantic Web Service projects aim to facilitate workflow construction by biological domain-experts such that workflows can be edited, re-purposed, and re-published by non-informaticians. However the aspects of the scientific method relating to explicit discourse, disagreement, and hypothesis generation have remained relatively impervious to new technologies. Results Here we present SADI and SHARE - a novel Semantic Web Service framework, and a reference implementation of its client libraries. Together, SADI and SHARE allow the semi- or fully-automatic discovery and pipelining of Semantic Web Services in response to ad hoc user queries. Conclusions The semantic behaviours exhibited by SADI and SHARE extend the functionalities provided by Description Logic Reasoners such that novel assertions can be automatically added to a data-set without logical reasoning, but rather by analytical or annotative services. This behaviour might be applied to achieve the “semantification” of those aspects of the in silico scientific method that are not yet supported by Semantic Web technologies. We support this suggestion using an example in the clinical research space. PMID:21210986

  3. Mining and Utilizing Dataset Relevancy from Oceanographic Dataset (MUDROD) Metadata, Usage Metrics, and User Feedback to Improve Data Discovery and Access

    NASA Astrophysics Data System (ADS)

    Li, Y.; Jiang, Y.; Yang, C. P.; Armstrong, E. M.; Huang, T.; Moroni, D. F.; McGibbney, L. J.

    2016-12-01

    Big oceanographic data have been produced, archived and made available online, but finding the right data for scientific research and application development is still a significant challenge. A long-standing problem in data discovery is how to find the interrelationships between keywords and data, as well as the intra-relationships of each individually. Most previous research attempted to solve this problem by building domain-specific ontologies, either manually or through automatic machine learning techniques. The former is costly, labor intensive and hard to keep up to date, while the latter is prone to noise and may be difficult for humans to understand. Large-scale user behavior data modelling represents a largely untapped, unique, and valuable source for discovering semantic relationships among domain-specific vocabulary. In this article, we propose a search engine framework for mining and utilizing dataset relevancy from oceanographic dataset metadata, user behaviors, and existing ontology. The objective is to improve the discovery accuracy of oceanographic data and reduce the time scientists spend discovering, downloading and reformatting data for their projects. Experiments and a search example show that the proposed search engine helps both scientists and general users search with better ranking results, recommendation, and ontology navigation.
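
    The record does not specify how relevancy is mined from user behaviour. One common and minimal approach, sketched below purely as an illustration, is to score dataset pairs by how often they co-occur in the same user sessions; the session logs and dataset identifiers here are hypothetical.

    ```python
    # Hedged sketch: session co-occurrence as a simple proxy for dataset relevancy
    # mined from user behaviour (illustrative; not the MUDROD implementation).
    from collections import Counter
    from itertools import combinations

    # Hypothetical session logs: each entry is the set of dataset IDs a user touched.
    sessions = [
        {"sst_daily", "sst_monthly", "chlorophyll"},
        {"sst_daily", "sea_level"},
        {"sst_daily", "sst_monthly"},
        {"chlorophyll", "sea_level"},
    ]

    pair_counts = Counter()
    single_counts = Counter()
    for s in sessions:
        single_counts.update(s)
        pair_counts.update(frozenset(p) for p in combinations(sorted(s), 2))

    def relevancy(a, b):
        """Jaccard-style co-occurrence score between two datasets."""
        co = pair_counts[frozenset((a, b))]
        return co / (single_counts[a] + single_counts[b] - co)

    print(relevancy("sst_daily", "sst_monthly"))   # relatively high
    print(relevancy("sst_monthly", "sea_level"))   # zero co-occurrence
    ```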

  4. Exploration of the Solar System's Ocean Worlds as a Scientific (and Societal) Imperative

    NASA Astrophysics Data System (ADS)

    Lunine, J. I.

    2017-12-01

    The extraordinary discoveries made by multiple planetary spacecraft in the past 20 years have changed planetary scientists' perception of various objects as potential abodes for life, in particular a newly-recognized class of solar system objects called ocean worlds: those bodies with globe-girdling liquids on their surfaces or in their interiors. A reasonably complete list would include 13 bodies, of which the Earth is one, with Mars and Ceres classified as bodies with evidence for past oceans. For three bodies on this list—Europa, Titan and Enceladus—there are multiple independent lines of evidence for subsurface salty liquid water oceans. Of these, Enceladus' ocean has been directly sampled through its persistent plume, and Titan possesses not only an internal ocean but surface seas and lakes of methane and other hydrocarbons. All three of these moons are candidates for hosting microbial life, although in the case of Titan much of the interest is in a putative biochemistry dramatically different from ours, that would work in liquid methane. The possibility that after a half century of planetary exploration we may finally know where to find alien life raises the issue of the priority of life detection missions. Do they supersede ambitious plans for Mars or for Cassini-like explorations of Uranus and Neptune? I consider three possible imperatives: the scientific (elimination of the N=1 problem from biology), the cultural (proper framing of our place in the cosmos) and the political (the value propositions for planetary exploration that we offer the taxpayers).

  5. A call for virtual experiments: accelerating the scientific process.

    PubMed

    Cooper, Jonathan; Vik, Jon Olav; Waltemath, Dagmar

    2015-01-01

    Experimentation is fundamental to the scientific method, whether for exploration, description or explanation. We argue that promoting the reuse of virtual experiments (the in silico analogues of wet-lab or field experiments) would vastly improve the usefulness and relevance of computational models, encouraging critical scrutiny of models and serving as a common language between modellers and experimentalists. We review the benefits of reusable virtual experiments: in specifying, assaying, and comparing the behavioural repertoires of models; as prerequisites for reproducible research; to guide model reuse and composition; and for quality assurance in the translational application of models. A key step towards achieving this is that models and experimental protocols should be represented separately, but annotated so as to facilitate the linking of models to experiments and data. Lastly, we outline how the rigorous, streamlined confrontation between experimental datasets and candidate models would enable a "continuous integration" of biological knowledge, transforming our approach to systems biology. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. An Interactive Virtual 3D Tool for Scientific Exploration of Planetary Surfaces

    NASA Astrophysics Data System (ADS)

    Traxler, Christoph; Hesina, Gerd; Gupta, Sanjeev; Paar, Gerhard

    2014-05-01

    In this paper we present an interactive 3D visualization tool for scientific analysis and planning of planetary missions. At the moment, scientists have to look at individual camera images separately; there is no tool to combine them in three dimensions and look at them seamlessly as a geologist would do (by walking backwards and forwards, resulting in different scales). For this reason a virtual 3D reconstruction of the terrain that can be interactively explored is necessary. Such a reconstruction has to consider multiple scales, ranging from orbital image data to close-up surface image data from rover cameras. The 3D viewer allows seamless zooming between these various scales, giving scientists the possibility to relate small surface features (e.g. rock outcrops) to larger geological contexts. For a reliable geologic assessment a realistic surface rendering is important. Therefore the material properties of the rock surfaces will be considered for real-time rendering. This is achieved by an appropriate Bidirectional Reflectance Distribution Function (BRDF) estimated from the image data. The BRDF is implemented to run on the Graphics Processing Unit (GPU) to enable realistic real-time rendering, which allows a naturalistic perception for scientific analysis. Another important aspect for realism is the consideration of natural lighting conditions, which means skylight to illuminate the reconstructed scene. In our case we provide skylights from Mars and Earth, which allows switching between these two modes of illumination. This gives geologists the opportunity to perceive rock outcrops from Mars as they would appear on Earth, facilitating scientific assessment. Besides viewing the virtual reconstruction on multiple scales, scientists can also perform various measurements, i.e. the geo-coordinates of a selected point or the distance between two surface points. Rover or other models can be placed into the scene and snapped onto certain locations of the terrain. These are
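
    For readers unfamiliar with the term, the BRDF mentioned above is the function f_r in the standard reflectance (rendering) equation, reproduced here for context rather than taken from the paper:

    ```latex
    % Outgoing radiance at surface point x in direction omega_o, given a BRDF f_r,
    % incident radiance L_i and surface normal n (standard reflectance equation).
    L_o(x,\omega_o) = \int_{\Omega} f_r(x,\omega_i,\omega_o)\, L_i(x,\omega_i)\,
                      (\mathbf{n}\cdot\omega_i)\, \mathrm{d}\omega_i
    ```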

  7. Parton Distributions based on a Maximally Consistent Dataset

    NASA Astrophysics Data System (ADS)

    Rojo, Juan

    2016-04-01

    The choice of data that enters a global QCD analysis can have a substantial impact on the resulting parton distributions and their predictions for collider observables. One of the main reasons for this has to do with the possible presence of inconsistencies, either internal within an experiment or external between different experiments. In order to assess the robustness of the global fit, different definitions of a conservative PDF set, that is, a PDF set based on a maximally consistent dataset, have been introduced. However, these approaches are typically affected by theory biases in the selection of the dataset. In this contribution, after a brief overview of recent NNPDF developments, we propose a new, fully objective, definition of a conservative PDF set, based on the Bayesian reweighting approach. Using the new NNPDF3.0 framework, we produce various conservative sets, which turn out to be mutually in agreement within the respective PDF uncertainties, as well as with the global fit. We explore some of their implications for LHC phenomenology, finding also good consistency with the global fit result. These results provide a non-trivial validation test of the new NNPDF3.0 fitting methodology, and indicate that possible inconsistencies in the fitted dataset do not affect substantially the global fit PDFs.
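
    The abstract refers to the Bayesian reweighting approach without quoting formulas. For context, the weights conventionally used in NNPDF-style reweighting of N replicas f_k against n new data points, and the corresponding reweighted expectation of an observable O, take the form below; this is quoted as the standard expression, not extracted from this contribution:

    ```latex
    % Weight of replica k given its chi^2 against n new data points, and the
    % reweighted expectation of an observable O (weights normalised to sum to N).
    w_k \propto \left(\chi^2_k\right)^{(n-1)/2} \exp\!\left(-\tfrac{1}{2}\chi^2_k\right),
    \qquad
    \langle \mathcal{O} \rangle_{\mathrm{new}} = \frac{1}{N}\sum_{k=1}^{N} w_k\, \mathcal{O}[f_k]
    ```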

  8. Smallsats, Cubesats and Scientific Exploration

    NASA Astrophysics Data System (ADS)

    Stofan, E. R.

    2015-12-01

    Smallsats (including Cubesats) have taken off in the aerospace research community, moving beyond simple tools for undergraduate and graduate students and into the mainstream of science research. Cubesats started the "smallsat" trend back in the late 1990s and early 2000s, with the first Cubesats launching in 2003. NASA anticipates a number of future benefits from small satellite missions, including lower costs, more rapid development, higher risk tolerance, and lower barriers to entry for universities and small businesses. The Agency's Space Technology Mission Directorate is currently addressing technology gaps in small satellite platforms, while the Science Mission Directorate pursues miniaturization of science instruments. Launch opportunities are managed through the Cubesat Launch Initiative, and the Agency manages these projects as sub-orbital payloads with little program overhead. In this session we bring together scientists and technologists to discuss the current state of the smallsat field. We explore ideas for new investments, new instruments, or new applications that NASA should be investing in to expand the utility of smallsats. We discuss the status of a NASA-directed NRC study on the utility of small satellites. Looking to the future, what does NASA need to invest in now to enable high-impact ("decadal survey" level) science with smallsats? How do we push the envelope? We anticipate smallsats will contribute significantly to a more robust exploration and science program for NASA and the country.

  9. RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets.

    PubMed

    Scheuch, Matthias; Höper, Dirk; Beer, Martin

    2015-03-03

    Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck. To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS - Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets. RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.
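
    The defining feature of the workflow, cascading analyses with decreasing stringency so that only reads left unassigned by one stage are passed to the next, can be sketched generically as follows. The two classifier functions are hypothetical placeholders, not the tools actually wired into RIEMS.

    ```python
    # Generic sketch of a cascading classification workflow with decreasing
    # stringency (illustrative; the classifiers below are hypothetical stand-ins,
    # not the actual tools used by RIEMS).
    def strict_classifier(read):
        """Placeholder for a high-stringency assignment (e.g. exact-match search)."""
        return {"ACGTACGT": "Virus A"}.get(read)

    def relaxed_classifier(read):
        """Placeholder for a lower-stringency assignment (e.g. translated search)."""
        return "Bacterium B" if read.startswith("GG") else None

    def cascade(reads, classifiers):
        assignments, unassigned = {}, list(reads)
        for classify in classifiers:           # run tools in order of stringency
            still_unassigned = []
            for read in unassigned:
                taxon = classify(read)
                if taxon is not None:
                    assignments[read] = taxon
                else:
                    still_unassigned.append(read)
            unassigned = still_unassigned      # only leftovers go to the next stage
        return assignments, unassigned

    reads = ["ACGTACGT", "GGATTACA", "TTTTTTTT"]
    print(cascade(reads, [strict_classifier, relaxed_classifier]))
    ```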

  10. Tracing Young Children's Scientific Reasoning

    NASA Astrophysics Data System (ADS)

    Tytler, Russell; Peterson, Suzanne

    2003-08-01

    This paper explores the scientific reasoning of 14 children across their first two years of primary school. Children's view of experimentation, their approach to exploration, and their negotiation of competing knowledge claims, are interpreted in terms of categories of epistemological reasoning. Children's epistemological reasoning is distinguished from their ability to control variables. While individual children differ substantially, they show a relatively steady growth in their reasoning, with some contextual variation. A number of these children are reasoning at a level well in advance of curriculum expectations, and it is argued that current recommended practice in primary science needs to be rethought. The data is used to explore the relationship between reasoning and knowledge, and to argue that the generation and exploration of ideas must be the key driver of scientific activity in the primary school.

  11. Enhancing endorsement of scientific inquiry increases support for pro-environment policies.

    PubMed

    Drummond, Aaron; Palmer, Matthew A; Sauer, James D

    2016-09-01

    Pro-environment policies require public support and engagement, but in countries such as the USA, public support for pro-environment policies remains low. Increasing public scientific literacy is unlikely to solve this, because increased scientific literacy does not guarantee increased acceptance of critical environmental issues (e.g. that climate change is occurring). We distinguish between scientific literacy (basic scientific knowledge) and endorsement of scientific inquiry (perceiving science as a valuable way of accumulating knowledge), and examine the relationship between people's endorsement of scientific inquiry and their support for pro-environment policy. Analysis of a large, publicly available dataset shows that support for pro-environment policies is more strongly related to endorsement of scientific inquiry than to scientific literacy among adolescents. An experiment demonstrates that a brief intervention can increase support for pro-environment policies via increased endorsement of scientific inquiry among adults. Public education about the merits of scientific inquiry may facilitate increased support for pro-environment policies.

  12. Enhancing endorsement of scientific inquiry increases support for pro-environment policies

    PubMed Central

    Palmer, Matthew A.; Sauer, James D.

    2016-01-01

    Pro-environment policies require public support and engagement, but in countries such as the USA, public support for pro-environment policies remains low. Increasing public scientific literacy is unlikely to solve this, because increased scientific literacy does not guarantee increased acceptance of critical environmental issues (e.g. that climate change is occurring). We distinguish between scientific literacy (basic scientific knowledge) and endorsement of scientific inquiry (perceiving science as a valuable way of accumulating knowledge), and examine the relationship between people's endorsement of scientific inquiry and their support for pro-environment policy. Analysis of a large, publicly available dataset shows that support for pro-environment policies is more strongly related to endorsement of scientific inquiry than to scientific literacy among adolescents. An experiment demonstrates that a brief intervention can increase support for pro-environment policies via increased endorsement of scientific inquiry among adults. Public education about the merits of scientific inquiry may facilitate increased support for pro-environment policies. PMID:27703700

  13. ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets

    PubMed Central

    Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas

    2018-01-01

    Abstract ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. PMID:29149270

  14. Dissecting the space-time structure of tree-ring datasets using the partial triadic analysis.

    PubMed

    Rossi, Jean-Pierre; Nardin, Maxime; Godefroid, Martin; Ruiz-Diaz, Manuela; Sergent, Anne-Sophie; Martinez-Meier, Alejandro; Pâques, Luc; Rozenberg, Philippe

    2014-01-01

    Tree-ring datasets are used in a variety of circumstances, including archeology, climatology, forest ecology, and wood technology. These data are based on microdensity profiles and consist of a set of tree-ring descriptors, such as ring width or early/latewood density, measured for a set of individual trees. Because successive rings correspond to successive years, the resulting dataset is a ring variables × trees × time datacube. Multivariate statistical analyses, such as principal component analysis, have been widely used for extracting worthwhile information from ring datasets, but they typically address two-way matrices, such as ring variables × trees or ring variables × time. Here, we explore the potential of the partial triadic analysis (PTA), a multivariate method dedicated to the analysis of three-way datasets, to apprehend the space-time structure of tree-ring datasets. We analyzed a set of 11 tree-ring descriptors measured in 149 georeferenced individuals of European larch (Larix decidua Miller) during the period of 1967-2007. The processing of densitometry profiles led to a set of ring descriptors for each tree and for each year from 1967-2007. The resulting three-way data table was subjected to two distinct analyses in order to explore i) the temporal evolution of spatial structures and ii) the spatial structure of temporal dynamics. We report the presence of a spatial structure common to the different years, highlighting the inter-individual variability of the ring descriptors at the stand scale. We found a temporal trajectory common to the trees that could be separated into a high and low frequency signal, corresponding to inter-annual variations possibly related to defoliation events and a long-term trend possibly related to climate change. We conclude that PTA is a powerful tool to unravel and hierarchize the different sources of variation within tree-ring datasets.
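
    As a purely numerical illustration of the method named above, the sketch below runs a bare-bones partial triadic analysis on a random trees x descriptors x years cube: yearly tables are standardised, an interstructure of RV coefficients weights them into a compromise table, and a PCA of the compromise summarises the structure common to all years. Real analyses would rely on dedicated implementations (e.g. the pta routine in the R package ade4); the data and dimensions here are synthetic.

    ```python
    # Minimal numeric sketch of a partial triadic analysis (PTA) on a
    # trees x descriptors x years datacube (illustrative only).
    import numpy as np

    rng = np.random.default_rng(1)
    n_trees, n_vars, n_years = 40, 5, 10
    cube = rng.normal(size=(n_years, n_trees, n_vars))   # one table per year

    # Centre and scale each yearly table
    tables = [(t - t.mean(0)) / t.std(0) for t in cube]

    # Interstructure: RV coefficients between yearly tables
    def rv(x, y):
        sx, sy = x @ x.T, y @ y.T
        return np.trace(sx @ sy) / np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))

    K = len(tables)
    R = np.array([[rv(tables[i], tables[j]) for j in range(K)] for i in range(K)])

    # Compromise: weighted mean of the tables, weights = first eigenvector of R
    vals, vecs = np.linalg.eigh(R)
    w = np.abs(vecs[:, -1]); w /= w.sum()
    compromise = sum(wi * t for wi, t in zip(w, tables))

    # PCA of the compromise summarises the structure common to all years
    u, s, vt = np.linalg.svd(compromise, full_matrices=False)
    scores = u[:, :2] * s[:2]        # tree scores on the first two axes
    print(w.round(3), scores.shape)
    ```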

  15. Automating Geospatial Visualizations with Smart Default Renderers for Data Exploration Web Applications

    NASA Astrophysics Data System (ADS)

    Ekenes, K.

    2017-12-01

    This presentation will outline the process of creating a web application for exploring large amounts of scientific geospatial data using modern automated cartographic techniques. Traditional cartographic methods, including data classification, may inadvertently hide geospatial and statistical patterns in the underlying data. This presentation demonstrates how to use smart web APIs that quickly analyze the data as it loads and suggest the most appropriate visualizations based on the statistics of the data. Since there are just a few ways to visualize any given dataset well, and since many users never go beyond default values, it is imperative to provide smart default color schemes tailored to the dataset rather than static defaults. Multiple functions for automating visualizations are available in the Smart APIs, along with UI elements allowing users to create more than one visualization for a dataset, since there isn't a single best way to visualize a given dataset. Because bivariate and multivariate visualizations are particularly difficult to create effectively, this automated approach takes the guesswork out of the process and provides a number of ways to generate multivariate visualizations for the same variables, allowing the user to choose which visualization is most appropriate for their presentation. The methods used in these APIs and the renderers generated by them are not available elsewhere. The presentation will show how statistics can be used as the basis for automating default visualizations of data along continuous ramps, creating more refined visualizations while revealing the spread and outliers of the data. Adding interactive components to instantaneously alter visualizations allows users to unearth spatial patterns previously unknown among one or more variables. These applications may focus on a single dataset that is frequently updated, or configurable
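
    The presentation abstract does not name a specific API, so the fragment below only illustrates the underlying idea: deriving default continuous-ramp stops from the statistics of the loaded data (mean plus or minus one standard deviation) rather than from fixed classification defaults. The function name and stop layout are assumptions made for illustration.

    ```python
    # Hedged sketch: derive default color-ramp stops for a continuous renderer
    # from the statistics of the loaded data (illustrative, not a specific web API).
    import statistics

    def smart_default_stops(values):
        """Return ramp stops centred on the mean and spread by +/- one std dev."""
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        lo, hi = min(values), max(values)
        return {
            "min":  lo,
            "low":  max(lo, mean - sd),   # values below this get the palest color
            "mid":  mean,
            "high": min(hi, mean + sd),   # values above this get the deepest color
            "max":  hi,
        }

    ozone = [0.012, 0.031, 0.022, 0.045, 0.018, 0.090, 0.027]
    print(smart_default_stops(ozone))
    ```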

  16. Exploring the Possibilities: Earth and Space Science Missions in the Context of Exploration

    NASA Technical Reports Server (NTRS)

    Pfarr, Barbara; Calabrese, Michael; Kirkpatrick, James; Malay, Jonathan T.

    2006-01-01

    According to Dr. Edward J. Weiler, Director of the Goddard Space Flight Center, "Exploration without science is tourism". At the American Astronautical Society's 43rd Annual Robert H. Goddard Memorial Symposium it was quite apparent to all that NASA's current Exploration Initiative is tightly coupled to multiple scientific initiatives: exploration will enable new science and science will enable exploration. NASA's Science Mission Directorate plans to develop priority science missions that deliver science that is vital, compelling and urgent. This paper will discuss the theme of the Goddard Memorial Symposium that science plays a key role in exploration. It will summarize the key scientific questions and some of the space and Earth science missions proposed to answer them, including the Mars and Lunar Exploration Programs, the Beyond Einstein and Navigator Programs, and the Earth-Sun System missions. It will also discuss some of the key technologies that will enable these missions, including the latest in instruments and sensors, large space optical system technologies and optical communications, and briefly discuss developments and achievements since the Symposium. Throughout history, humans have made the biggest scientific discoveries by visiting unknown territories; by going to the Moon and other planets and by seeking out habitable worlds, NASA is continuing humanity's quest for scientific knowledge.

  17. Kernel-based discriminant feature extraction using a representative dataset

    NASA Astrophysics Data System (ADS)

    Li, Honglin; Sancho Gomez, Jose-Luis; Ahalt, Stanley C.

    2002-07-01

    Discriminant Feature Extraction (DFE) is widely recognized as an important pre-processing step in classification applications. Most DFE algorithms are linear and thus can only explore the linear discriminant information among the different classes. Recently, there have been several promising attempts to develop nonlinear DFE algorithms, among which is Kernel-based Feature Extraction (KFE). The efficacy of KFE has been experimentally verified by both synthetic data and real problems. However, KFE has some known limitations. First, KFE does not work well for strongly overlapped data. Second, KFE employs all of the training set samples during the feature extraction phase, which can result in significant computation when applied to very large datasets. Finally, KFE can result in overfitting. In this paper, we propose a substantial improvement to KFE that overcomes the above limitations by using a representative dataset, which consists of critical points that are generated from data-editing techniques and centroid points that are determined by using the Frequency Sensitive Competitive Learning (FSCL) algorithm. Experiments show that this new KFE algorithm performs well on significantly overlapped datasets, and it also reduces computational complexity. Further, by controlling the number of centroids, the overfitting problem can be effectively alleviated.
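
    The centroid points of the representative dataset are selected with Frequency Sensitive Competitive Learning (FSCL), whose core update is easy to sketch: the winning unit is chosen by distance biased with a win counter, moved towards the input, and its counter incremented. The learning rate, epoch count and centroid number below are illustrative, not taken from the paper.

    ```python
    # Minimal sketch of Frequency Sensitive Competitive Learning (FSCL) for picking
    # centroid points of a representative dataset (illustrative parameters).
    import numpy as np

    def fscl(data, n_centroids=5, lr=0.05, epochs=20, seed=0):
        rng = np.random.default_rng(seed)
        centroids = data[rng.choice(len(data), n_centroids, replace=False)].copy()
        wins = np.ones(n_centroids)                      # frequency counters
        for _ in range(epochs):
            for x in rng.permutation(data):
                d = np.linalg.norm(centroids - x, axis=1)
                winner = np.argmin(wins * d)             # distance biased by win count
                centroids[winner] += lr * (x - centroids[winner])
                wins[winner] += 1                        # discourage over-used units
        return centroids

    data = np.random.default_rng(1).normal(size=(300, 2))
    print(fscl(data).round(2))
    ```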

  18. Data Integration for Heterogenous Datasets

    PubMed Central

    2014-01-01

    Abstract More and more, the needs of data analysts are requiring the use of data outside the control of their own organizations. The increasing amount of data available on the Web, the new technologies for linking data across datasets, and the increasing need to integrate structured and unstructured data are all driving this trend. In this article, we provide a technical overview of the emerging “broad data” area, in which the variety of heterogeneous data being used, rather than the scale of the data being analyzed, is the limiting factor in data analysis efforts. The article explores some of the emerging themes in data discovery, data integration, linked data, and the combination of structured and unstructured data. PMID:25553272

  19. Fixing Dataset Search

    NASA Technical Reports Server (NTRS)

    Lynnes, Chris

    2014-01-01

    Three current search engines are queried for ozone data at the GES DISC. The results range from sub-optimal to counter-intuitive. We propose a method to fix dataset search by implementing a robust relevancy ranking scheme. The relevancy ranking scheme is based on several heuristics culled from more than 20 years of helping users select datasets.
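
    The heuristics themselves are not listed in this record, so the sketch below only shows the general shape such a relevancy ranking might take, combining a few plausible signals (query terms matched in the title weighted above matches in the summary, plus a small recency boost). The fields, weights and example records are assumptions for illustration.

    ```python
    # Hedged sketch of heuristic dataset relevancy ranking (illustrative weights,
    # not the GES DISC scheme referenced in the abstract).
    def score(dataset, query_terms, current_year=2014):
        s = 0.0
        title = dataset["title"].lower()
        summary = dataset["summary"].lower()
        for term in query_terms:
            if term in title:
                s += 3.0            # title hits count most
            if term in summary:
                s += 1.0
        s += max(0, 5 - (current_year - dataset["year"])) * 0.2   # recency boost
        return s

    datasets = [
        {"title": "OMI Ozone Daily L3", "summary": "Total column ozone.", "year": 2012},
        {"title": "AIRS Temperature", "summary": "Includes ozone profiles.", "year": 2006},
    ]
    query = ["ozone"]
    for d in sorted(datasets, key=lambda d: score(d, query), reverse=True):
        print(round(score(d, query), 2), d["title"])
    ```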

  20. A dataset on human navigation strategies in foreign networked systems.

    PubMed

    Kőrösi, Attila; Csoma, Attila; Rétvári, Gábor; Heszberger, Zalán; Bíró, József; Tapolcai, János; Pelle, István; Klajbár, Dávid; Novák, Márton; Halasi, Valentina; Gulyás, András

    2018-03-13

    Humans are involved in various real-life networked systems. The most obvious examples are social and collaboration networks, but the language and the related mental lexicon they use, or the physical map of their territory, can also be interpreted as networks. How do they find paths between endpoints in these networks? How do they obtain information about a foreign networked world they find themselves in, how do they build a mental model for it, and how well do they succeed in using it? Large, open datasets allowing the exploration of such questions are hard to find. Here we report a dataset collected by a smartphone application, in which players navigate between fixed-length source and destination English words step-by-step by changing only one letter at a time. The paths reflect how the players master their navigation skills in such a foreign networked world. The dataset can be used in the study of human mental models for the world around us, or in a broader scope to investigate the navigation strategies in complex networked systems.

  1. A dataset on human navigation strategies in foreign networked systems

    PubMed Central

    Kőrösi, Attila; Csoma, Attila; Rétvári, Gábor; Heszberger, Zalán; Bíró, József; Tapolcai, János; Pelle, István; Klajbár, Dávid; Novák, Márton; Halasi, Valentina; Gulyás, András

    2018-01-01

    Humans are involved in various real-life networked systems. The most obvious examples are social and collaboration networks, but the language and the related mental lexicon they use, or the physical map of their territory, can also be interpreted as networks. How do they find paths between endpoints in these networks? How do they obtain information about a foreign networked world they find themselves in, how do they build a mental model for it, and how well do they succeed in using it? Large, open datasets allowing the exploration of such questions are hard to find. Here we report a dataset collected by a smartphone application, in which players navigate between fixed-length source and destination English words step-by-step by changing only one letter at a time. The paths reflect how the players master their navigation skills in such a foreign networked world. The dataset can be used in the study of human mental models for the world around us, or in a broader scope to investigate the navigation strategies in complex networked systems. PMID:29533391
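
    The navigation task behind this dataset is path-finding in a word graph in which equal-length words are adjacent when they differ in exactly one letter. The sketch below rebuilds such a graph over a toy vocabulary and finds a shortest ladder with breadth-first search; the published dataset uses the game's own word list rather than this example.

    ```python
    # Minimal sketch of the word-ladder network underlying the navigation game:
    # words of equal length are linked if they differ in exactly one letter
    # (toy vocabulary; the published dataset uses the game's own word list).
    from collections import deque

    WORDS = {"cold", "cord", "card", "ward", "warm", "word", "wood", "good"}

    def neighbours(word, vocab):
        return [w for w in vocab
                if len(w) == len(word)
                and sum(a != b for a, b in zip(w, word)) == 1]

    def shortest_path(source, target, vocab):
        """Breadth-first search for a shortest ladder from source to target."""
        queue, seen = deque([[source]]), {source}
        while queue:
            path = queue.popleft()
            if path[-1] == target:
                return path
            for nxt in neighbours(path[-1], vocab):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

    print(shortest_path("cold", "warm", WORDS))   # e.g. cold -> cord -> card -> ward -> warm
    ```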

  2. Three visualization approaches for communicating and exploring PIT tag data

    USGS Publications Warehouse

    Letcher, Benjamin; Walker, Jeffrey D.; O'Donnell, Matthew; Whiteley, Andrew R.; Nislow, Keith; Coombs, Jason

    2018-01-01

    As the number, size and complexity of ecological datasets has increased, narrative and interactive raw data visualizations have emerged as important tools for exploring and understanding these large datasets. As a demonstration, we developed three visualizations to communicate and explore passive integrated transponder tag data from two long-term field studies. We created three independent visualizations for the same dataset, allowing separate entry points for users with different goals and experience levels. The first visualization uses a narrative approach to introduce users to the study. The second visualization provides interactive cross-filters that allow users to explore multi-variate relationships in the dataset. The last visualization allows users to visualize the movement histories of individual fish within the stream network. This suite of visualization tools allows a progressive discovery of more detailed information and should make the data accessible to users with a wide variety of backgrounds and interests.

  3. Collaboration tools and techniques for large model datasets

    USGS Publications Warehouse

    Signell, R.P.; Carniel, S.; Chiggiato, J.; Janekovic, I.; Pullen, J.; Sherwood, C.R.

    2008-01-01

    In MREA and many other marine applications, it is common to have multiple models running with different grids, run by different institutions. Techniques and tools are described for low-bandwidth delivery of data from large multidimensional datasets, such as those from meteorological and oceanographic models, directly into generic analysis and visualization tools. Output is stored using the NetCDF CF Metadata Conventions, and then delivered to collaborators over the web via OPeNDAP. OPeNDAP datasets served by different institutions are then organized via THREDDS catalogs. Tools and procedures are then used which enable scientists to explore data on the original model grids using tools they are familiar with. The approach is also low-bandwidth, enabling users to extract just the data they require, an important feature for access from ships or remote areas. The entire implementation is simple enough to be handled by modelers working with their webmasters - no advanced programming support is necessary. © 2007 Elsevier B.V. All rights reserved.
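
    Accessing such a server from an analysis tool can be illustrated with xarray, which opens OPeNDAP endpoints lazily so that only the requested subset crosses the network. The URL, variable and coordinate names below are hypothetical placeholders, not an actual THREDDS catalogue entry.

    ```python
    # Hedged sketch: lazy, low-bandwidth access to a CF-compliant model dataset
    # served via OPeNDAP (the URL and variable/coordinate names are hypothetical).
    import xarray as xr

    URL = "https://example.org/thredds/dodsC/adriatic/roms_model_output"  # placeholder

    ds = xr.open_dataset(URL)                 # metadata only; no bulk data transfer yet
    sst = ds["temp"].isel(s_rho=-1)           # surface layer of the temperature field
    subset = sst.sel(
        ocean_time=slice("2007-01-01", "2007-01-07"),   # one week of output
    )
    subset.load()                             # only this slice crosses the network
    print(subset.mean().values)
    ```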

  4. Development of a SPARK Training Dataset

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sayre, Amanda M.; Olson, Jarrod R.

    2015-03-01

    In its first five years, the National Nuclear Security Administration’s (NNSA) Next Generation Safeguards Initiative (NGSI) sponsored more than 400 undergraduate, graduate, and post-doctoral students in internships and research positions (Wyse 2012). In the past seven years, the NGSI program has produced, and continues to produce, a large body of scientific, technical, and policy work in targeted core safeguards capabilities and human capital development activities. Not only does the NGSI program carry out activities across multiple disciplines, but also across all U.S. Department of Energy (DOE)/NNSA locations in the United States. However, products are not readily shared among disciplines and across locations, nor are they archived in a comprehensive library. Rather, knowledge of NGSI-produced literature is localized to the researchers, clients, and internal laboratory/facility publication systems such as the Electronic Records and Information Capture Architecture (ERICA) at the Pacific Northwest National Laboratory (PNNL). There is also no incorporated way of analyzing existing NGSI literature to determine whether the larger NGSI program is achieving its core safeguards capabilities and activities. A complete library of NGSI literature could prove beneficial to a cohesive, sustainable, and more economical NGSI program. The Safeguards Platform for Automated Retrieval of Knowledge (SPARK) has been developed to be a knowledge storage, retrieval, and analysis capability to capture safeguards knowledge to exist beyond the lifespan of NGSI. During the development process, it was necessary to build a SPARK training dataset (a corpus of documents) for initial entry into the system and for demonstration purposes. We manipulated these data to gain new information about the breadth of NGSI publications, and evaluated the science-policy interface at PNNL as a practical demonstration of SPARK's intended analysis capability. The analysis demonstration sought to answer

  5. Large and linked in scientific publishing

    PubMed Central

    2012-01-01

    We are delighted to announce the launch of GigaScience, an online open-access journal that focuses on research using or producing large datasets in all areas of biological and biomedical sciences. GigaScience is a new type of journal that provides standard scientific publishing linked directly to a database that hosts all the relevant data. The primary goals for the journal, detailed in this editorial, are to promote more rapid data release, broader use and reuse of data, improved reproducibility of results, and direct, easy access between analyses and their data. Direct and permanent connections of scientific analyses and their data (achieved by assigning all hosted data a citable DOI) will enable better analysis and deeper interpretation of the data in the future. PMID:23587310

  6. Large and linked in scientific publishing.

    PubMed

    Goodman, Laurie; Edmunds, Scott C; Basford, Alexandra T

    2012-07-12

    We are delighted to announce the launch of GigaScience, an online open-access journal that focuses on research using or producing large datasets in all areas of biological and biomedical sciences. GigaScience is a new type of journal that provides standard scientific publishing linked directly to a database that hosts all the relevant data. The primary goals for the journal, detailed in this editorial, are to promote more rapid data release, broader use and reuse of data, improved reproducibility of results, and direct, easy access between analyses and their data. Direct and permanent connections of scientific analyses and their data (achieved by assigning all hosted data a citable DOI) will enable better analysis and deeper interpretation of the data in the future.

  7. Future scientific exploration of Taurus-Littrow

    NASA Technical Reports Server (NTRS)

    Taylor, G. Jeffrey

    1992-01-01

    The Apollo 17 site was surveyed with great skill and the collected samples have been studied thoroughly (but not completely) in the 20 years since. Ironically, the success of the field and sample studies makes the site an excellent candidate for a return mission. Rather than solving all the problems, the Apollo 17 mission provided a set of sophisticated questions that can be answered only by returning to the site and exploring further. This paper addresses the major unsolved problems in lunar science and points out the units at the Apollo 17 site that are most suitable for addressing each problem. It then discusses how crucial data can be obtained by robotic rovers and human field work. I conclude that, in general, the most important information can be obtained only by human exploration. The paper ends with some guesses about what we could have learned at the Apollo 17 site from a fairly sophisticated rover capable of in situ analyses, instead of sending people.

  8. Stewardship of Integrity in Scientific Communication.

    PubMed

    Albertine, Kurt H

    2018-06-14

    Integrity in the pursuit of discovery through application of the scientific method and reporting the results is an obligation for each of us as scientists. We cannot let the value of science be diminished because discovering knowledge is vital to understand ourselves and our impacts on the earth. We support the value of science by our stewardship of integrity in the conduct, training, reporting, and proposing of scientific investigation. The players who have these responsibilities are authors, reviewers, editors, and readers. Each role has to be played with vigilance for ethical behavior, including compliance with regulations for protections of study subjects, use of select agents and biohazards, regulations of use of stem cells, resource sharing, posting datasets to public repositories, etc. The positive take-home message is that the scientific community is taking steps in behavior to protect the integrity of science. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  9. An Interdisciplinary Exploration of the Mariana Region with the NOAA ship Okeanos Explorer: Scientific Highlights from the April-July 2016 Expedition

    NASA Astrophysics Data System (ADS)

    Glickson, D.; Amon, D.; Pomponi, S. A.; Fryer, P. B.; Elliott, K.; Lobecker, E.; Cantwell, K. L.; Kelley, C.

    2016-12-01

    From April to July 2016, an interdisciplinary team of ship-based and shore-based scientists investigated the biology and geology of the Marianas region as part of the 3-year NOAA Campaign to Address the Pacific monument Science, Technology, and Ocean NEeds (CAPSTONE) using the telepresence-enabled NOAA ship Okeanos Explorer. The focus of the expedition was on the Marianas Trench Marine National Monument and the waters of the Commonwealth of the Northern Mariana Islands. A variety of habitats were explored, including deep-sea coral and sponge communities, bottom fisheries, mud volcanoes, hydrothermal vents, Prime Crust Zone seamounts, and the Trench subduction zone. The expedition successfully collected baseline information at 41 sites at depths from 240 to 6,000 m. High-resolution imagery was obtained along the dive tracks, both in the water column and on the seafloor. Over 130 biological and geologic samples were collected. Many of the organisms documented are likely to be new species or new records of occurrence, and dozens of observations were the first ever collected in situ. Almost 74,000 square kilometers of seafloor were mapped, greatly improving both coverage and resolution in the region. New geologic features were mapped and explored, including ridges and new lava flow fields. Public engagement was substantial, with over 3.1 million total views of the live streaming video/audio feeds. The telepresence paradigm was tested rigorously, with active participation from 100 scientists in five countries and at least nine time zones. The shore-based team provided strong scientific expertise, complementing and expanding the knowledge of the ship-based science leads.

  10. Resource Prospector, the Decadal Survey and the Scientific Context for the Exploration of the Moon

    NASA Technical Reports Server (NTRS)

    Elphic, R. C.; Colaprete, A.; Andrews, D. R.

    2017-01-01

    The Inner Planets Panel of the Planetary Exploration Decadal Survey defined several science questions related to the origins, emplacement, and sequestration of lunar polar volatiles: 1. What is the lateral and vertical distribution of the volatile deposits? 2. What is the chemical composition and variability of polar volatiles? 3. What is the isotopic composition of the volatiles? 4. What is the physical form of the volatiles? 5. What is the rate of the current volatile deposition? A mission concept study, the Lunar Polar Volatiles Explorer (LPVE), defined an approximately $1B New Frontiers mission to address these questions. The NAS/NRC report, 'Scientific Context for the Exploration of the Moon', identified the lunar poles as special environments with important implications. It put forth the following goals: Science Goal 4a-Determine the compositional state (elemental, isotopic, mineralogic) and compositional distribution (lateral and depth) of the volatile component in lunar polar regions. Science Goal 4b-Determine the source(s) for lunar polar volatiles. Science Goal 4c-Understand the transport, retention, alteration, and loss processes that operate on volatile materials at permanently shaded lunar regions. Science Goal 4d-Understand the physical properties of the extremely cold (and possibly volatile rich) polar regolith. Science Goal 4e-Determine what the cold polar regolith reveals about the ancient solar environment.

  11. Where Do the Sand-Dust Storms Come From?: Conversations with Specialists from the Exploring Sand-Dust Storms Scientific Expedition Team

    ERIC Educational Resources Information Center

    Shixin, Liu

    2004-01-01

    This article relates the different views from specialists of the scientific expedition team for the exploration of the origin of sand-dust storms. They observed and examined on-site the ecological environment of places of origin for sand-dust storms, and tried to find out causes of sand-dust storm and what harm it can cause in the hope of…

  12. Integrative Exploratory Analysis of Two or More Genomic Datasets.

    PubMed

    Meng, Chen; Culhane, Aedin

    2016-01-01

    Exploratory analysis is an essential step in the analysis of high throughput data. Multivariate approaches such as correspondence analysis (CA), principal component analysis, and multidimensional scaling are widely used in the exploratory analysis of a single dataset. Modern biological studies often assay multiple types of biological molecules (e.g., mRNA, protein, phosphoproteins) on the same set of biological samples, thereby creating multiple different types of omics data or multiassay data. Integrative exploratory analysis of these multiple omics data is required to leverage the potential of multiple omics studies. In this chapter, we describe the application of co-inertia analysis (CIA; for analyzing two datasets) and multiple co-inertia analysis (MCIA; for three or more datasets) to address this problem. These methods are powerful yet simple multivariate approaches that represent samples using a lower number of variables, allowing easier identification of the correlated structure in and between multiple high dimensional datasets. Graphical representations can be employed for this purpose. In addition, the methods simultaneously project samples and variables (genes, proteins) onto the same lower dimensional space, so the most variant variables from each dataset can be selected and associated with samples, which can be further used to facilitate biological interpretation and pathway analysis. We applied CIA to explore the concordance between mRNA and protein expression in a panel of 60 tumor cell lines from the National Cancer Institute. In the same 60 cell lines, we used MCIA to perform a cross-platform comparison of mRNA gene expression profiles obtained on four different microarray platforms. Last, as an example of integrative analysis of multiassay or multi-omics data, we analyzed transcriptomic, proteomic, and phosphoproteomic data from pluripotent (iPS) and embryonic stem (ES) cell lines.
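    To make the core idea of co-inertia analysis concrete, the following minimal numpy sketch (written for this listing, not taken from the chapter) centers two row-matched data matrices, takes the singular value decomposition of their cross-covariance, and compares the paired sample scores. All matrix sizes and variable names are illustrative assumptions.

      # Minimal sketch of the central step of co-inertia analysis (CIA) for two
      # datasets measured on the same samples (rows). This is an illustrative
      # simplification, not the published CIA implementation: both matrices are
      # column-centered, and the SVD of their cross-covariance yields paired axes
      # that maximize the covariance between the two sets of sample scores.
      import numpy as np

      rng = np.random.default_rng(0)
      n_samples = 60                              # e.g., 60 tumor cell lines
      mrna = rng.normal(size=(n_samples, 200))    # hypothetical mRNA matrix
      prot = rng.normal(size=(n_samples, 80))     # hypothetical protein matrix

      X = mrna - mrna.mean(axis=0)                # center each variable
      Y = prot - prot.mean(axis=0)

      C = X.T @ Y / n_samples                     # cross-covariance between datasets
      U, s, Vt = np.linalg.svd(C, full_matrices=False)

      k = 2                                       # keep the first two co-inertia axes
      scores_x = X @ U[:, :k]                     # sample scores from the mRNA space
      scores_y = Y @ Vt[:k, :].T                  # sample scores from the protein space

      # Per-axis agreement between the two projections (high values indicate
      # concordant structure across the omics layers, as described above).
      for axis in range(k):
          r = np.corrcoef(scores_x[:, axis], scores_y[:, axis])[0, 1]
          print(f"axis {axis + 1}: covariance {s[axis]:.3f}, correlation {r:.3f}")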

  13. Opposing ends of the spectrum: Exploring trust in scientific and religious authorities.

    PubMed

    Cacciatore, Michael A; Browning, Nick; Scheufele, Dietram A; Brossard, Dominique; Xenos, Michael A; Corley, Elizabeth A

    2018-01-01

    Given the ethical questions that surround emerging science, this study is interested in studying public trust in scientific and religious authorities for information about the risks and benefits of science. Using data from a nationally representative survey of American adults, we employ regression analysis to better understand the relationships between several variables-including values, knowledge, and media attention-and trust in religious organizations and scientific institutions. We found that Evangelical Christians are generally more trusting of religious authority figures to tell the truth about the risks and benefits of science and technology, and only slightly less likely than non-Evangelicals to trust scientific authorities for the same information. We also found that many Evangelicals use mediated information and science knowledge differently than non-Evangelicals, with both increased knowledge and attention to scientific media having positive impacts on trust in scientific authorities among the latter, but not the former group.

  14. Isfahan MISP Dataset

    PubMed Central

    Kashefpur, Masoud; Kafieh, Rahele; Jorjandi, Sahar; Golmohammadi, Hadis; Khodabande, Zahra; Abbasi, Mohammadreza; Teifuri, Nilufar; Fakharzadeh, Ali Akbar; Kashefpoor, Maryam; Rabbani, Hossein

    2017-01-01

    An online depository was introduced to share clinical ground truth with the public and provide open access for researchers to evaluate their computer-aided algorithms. PHP was used for web programming and MySQL for database management. The website was entitled “biosigdata.com.” It was a fast, secure, and easy-to-use online database for medical signals and images. Freely registered users could download the datasets and could also share their own supplementary materials while maintaining their privacy (citation and fee). Commenting was also available for all datasets, and an automatic sitemap and semi-automatic SEO indexing have been set up for the site. A comprehensive list of available websites for medical datasets is also presented as a Supplementary (http://journalonweb.com/tempaccess/4800.584.JMSS_55_16I3253.pdf). PMID:28487832

  15. Isfahan MISP Dataset.

    PubMed

    Kashefpur, Masoud; Kafieh, Rahele; Jorjandi, Sahar; Golmohammadi, Hadis; Khodabande, Zahra; Abbasi, Mohammadreza; Teifuri, Nilufar; Fakharzadeh, Ali Akbar; Kashefpoor, Maryam; Rabbani, Hossein

    2017-01-01

    An online depository was introduced to share clinical ground truth with the public and provide open access for researchers to evaluate their computer-aided algorithms. PHP was used for web programming and MySQL for database management. The website was entitled "biosigdata.com." It was a fast, secure, and easy-to-use online database for medical signals and images. Freely registered users could download the datasets and could also share their own supplementary materials while maintaining their privacy (citation and fee). Commenting was also available for all datasets, and an automatic sitemap and semi-automatic SEO indexing have been set up for the site. A comprehensive list of available websites for medical datasets is also presented as a Supplementary (http://journalonweb.com/tempaccess/4800.584.JMSS_55_16I3253.pdf).

  16. Exploring Careers. Scientific and Technical Occupations.

    ERIC Educational Resources Information Center

    Bureau of Labor Statistics (DOL), Washington, DC.

    "Exploring Careers" is a career education resource program, published in fifteen separate booklets, for junior high school-age students. It provides information about the world of work and offers its readers a way of learning about themselves and relating that information to career choices. The publications aim to build career awareness…

  17. Exploring Relationships in Big Data

    NASA Astrophysics Data System (ADS)

    Mahabal, A.; Djorgovski, S. G.; Crichton, D. J.; Cinquini, L.; Kelly, S.; Colbert, M. A.; Kincaid, H.

    2015-12-01

    Big Data are characterized by several different 'V's: Volume, Veracity, Volatility, Value, and so on. For many datasets, Volumes inflated by redundant features make the data noisier and harder to extract Value from. This is especially true when comparing or combining different datasets whose metadata are diverse. We have been exploring ways to exploit such datasets through a variety of statistical machinery and visualization. We show how we have applied these techniques to time-series from large astronomical sky-surveys, work done in the Virtual Observatory framework. More recently we have been doing similar work in a completely different domain, namely biology/cancer. The methodology reuse involves application to diverse datasets gathered through the various centers associated with the Early Detection Research Network (EDRN) for cancer, an initiative of the National Cancer Institute (NCI). Application to Geo datasets is a natural extension.
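    The point about Volume being inflated by redundant features can be illustrated with a small, self-contained sketch that drops one member of each highly correlated feature pair before datasets are compared or combined. The threshold and column names below are illustrative assumptions, not part of the work described above.

      # Hypothetical sketch: reduce inflated Volume by dropping near-duplicate
      # (highly correlated) features before combining datasets. The 0.95
      # threshold and column names are illustrative assumptions.
      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(1)
      base = rng.normal(size=(500, 4))
      df = pd.DataFrame(base, columns=["f1", "f2", "f3", "f4"])
      df["f1_copy"] = df["f1"] + rng.normal(scale=0.01, size=500)  # redundant feature

      corr = df.corr().abs()
      # Keep only the upper triangle so each pair is inspected once.
      upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
      to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
      reduced = df.drop(columns=to_drop)
      print("dropped:", to_drop, "remaining:", list(reduced.columns))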

  18. Communicating Ocean Acidification and Climate Change to Public Audiences Using Scientific Data, Interactive Exploration Tools, and Visual Narratives

    NASA Astrophysics Data System (ADS)

    Miller, M. K.; Rossiter, A.; Spitzer, W.

    2016-12-01

    The Exploratorium, a hands-on science museum, explores local environmental conditions of San Francisco Bay to connect audiences to the larger global implications of ocean acidification and climate change. The work is centered in the Fisher Bay Observatory at Pier 15, a glass-walled gallery sited for explorations of urban San Francisco and the Bay. Interactive exhibits, high-resolution data visualizations, and mediated activities and conversations communicate to public audiences the impacts of excess carbon dioxide in the atmosphere and ocean. Through a 10-year education partnership with NOAA and two environmental literacy grants funded by its Office of Education, the Exploratorium has been part of two distinct but complementary strategies to increase climate literacy beyond traditional classroom settings. We will discuss two projects that address the ways complex scientific information can be transformed into learning opportunities for the public, providing information citizens can use for decision-making in their personal lives and their communities. The Visualizing Change project developed "visual narratives" that combine scientific visualizations and other images with story telling about the science and potential solutions of climate impacts on the ocean. The narratives were designed to engage curiosity and provide the public with hopeful and useful information to stimulate solutions-oriented behavior rather than to communicate despair about climate change. Training workshops for aquarium and museum docents prepare informal educators to use the narratives and help them frame productive conversations with the pubic. The Carbon Networks project, led by the Exploratorium, uses local and Pacific Rim data to explore the current state of climate change and ocean acidification. The Exploratorium collects and displays local ocean and atmosphere data as a member of the Central and Northern California Ocean Observing System and as an observing station for NOAA's Pacific

  19. Preprocessed Consortium for Neuropsychiatric Phenomics dataset.

    PubMed

    Gorgolewski, Krzysztof J; Durnez, Joke; Poldrack, Russell A

    2017-01-01

    Here we present preprocessed MRI data of 265 participants from the Consortium for Neuropsychiatric Phenomics (CNP) dataset. The preprocessed dataset includes minimally preprocessed data in the native, MNI and surface spaces accompanied with potential confound regressors, tissue probability masks, brain masks and transformations. In addition the preprocessed dataset includes unthresholded group level and single subject statistical maps from all tasks included in the original dataset. We hope that availability of this dataset will greatly accelerate research.

  20. Exploring Arctic history through scientific drilling

    NASA Astrophysics Data System (ADS)

    ODP Leg 151 Shipboard Scientific Party

    During the brief Arctic summer of 1993, the Ocean Drilling Program's research vessel JOIDES Resolution recovered the first scientific drill cores from the eastern Arctic Ocean. Dodging rafts of pack ice shed from the Arctic ice cap, the science party sampled sediments north of 80°N latitude from the Yermak Plateau, as well as from sites in Fram Strait, the northeastern Greenland margin, and the Iceland Plateau (Figure 1).The sediments collected reveal the earliest history of the connection between the North Atlantic and Arctic Oceans through the Nordic Seas. The region between Greenland and Norway first formed a series of isolated basins, sometimes with restricted deep circulation, that eventually joined and allowed deep and surface Arctic Ocean water to invade the region. A record was also retrieved that shows major glaciation in the region began about 2.5 m.y.a.

  1. FUn: a framework for interactive visualizations of large, high-dimensional datasets on the web.

    PubMed

    Probst, Daniel; Reymond, Jean-Louis

    2018-04-15

    During the past decade, big data have become a major tool in scientific endeavors. Although statistical methods and algorithms are well-suited for analyzing and summarizing enormous amounts of data, the results do not allow for a visual inspection of the entire data. Current scientific software, including R packages and Python libraries such as ggplot2, matplotlib and plot.ly, does not support interactive visualizations of datasets exceeding 100 000 data points on the web. Other solutions enable the web-based visualization of big data only through data reduction or statistical representations. However, recent hardware developments, especially advancements in graphical processing units, allow for the rendering of millions of data points on a wide range of consumer hardware such as laptops, tablets and mobile phones. Similar to the challenges and opportunities brought to virtually every scientific field by big data, the visualization of and interaction with copious amounts of data are both demanding and hold great promise. Here we present FUn, a framework consisting of a client (Faerun) and server (Underdark) module, facilitating the creation of web-based, interactive 3D visualizations of large datasets and enabling record-level visual inspection. We also introduce a reference implementation providing access to SureChEMBL, a database containing patent information on more than 17 million chemical compounds. The source code and the most recent builds of Faerun and Underdark, Lore.js and the data preprocessing toolchain used in the reference implementation, are available on the project website (http://doc.gdb.tools/fun/). Contact: daniel.probst@dcb.unibe.ch or jean-louis.reymond@dcb.unibe.ch.

  2. ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets.

    PubMed

    Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick

    2018-01-04

    ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Efficiently Exploring Multilevel Data with Recursive Partitioning

    ERIC Educational Resources Information Center

    Martin, Daniel P.; von Oertzen, Timo; Rimm-Kaufman, Sara E.

    2015-01-01

    There is an increasing number of datasets with many participants, variables, or both, in education and other fields that often deal with large, multilevel data structures. Once initial confirmatory hypotheses are exhausted, it can be difficult to determine how best to explore the dataset to discover hidden relationships that could help to inform…

  4. Integration of multi-source and multi-scale datasets for 3D structural modeling for subsurface exploration targeting, Luanchuan Mo-polymetallic district, China

    NASA Astrophysics Data System (ADS)

    Wang, Gongwen; Ma, Zhenbo; Li, Ruixi; Song, Yaowu; Qu, Jianan; Zhang, Shouting; Yan, Changhai; Han, Jiangwei

    2017-04-01

    In this paper, multi-source (geophysical, geochemical, geological and remote sensing) datasets were used to construct multi-scale (district-, deposit-, and orebody-scale) 3D geological models and to extract 3D exploration criteria for subsurface Mo-polymetallic exploration targeting in the Luanchuan district, China. The results indicate that (i) a series of region- to district-scale NW-trending thrusts, formed during the regional Indosinian Qinling orogenic events, controlled the main Mo-polymetallic mineralization; the secondary NW-trending district-scale folds, NE-trending faults, and intrusive stock structures developed on this thrust framework during the Caledonian-Indosinian orogenic events and act as ore-bearing zones and ore-forming structures; (ii) the NW-trending district-scale and NE-trending deposit-scale normal faults intersect and are controlled by the Jurassic granite stocks in 3D space; they are associated with the magma-skarn Mo-polymetallic mineralization (the 3D buffer distance of the ore-forming granite stocks is 600 m) and with the NW-trending hydrothermal Pb-Zn deposits that are surrounded by the Jurassic granite stocks and constrained by NW-trending or NE-trending faults (the 3D buffer distance of the ore-forming faults is 700 m); and (iii) nine Mo-polymetallic and four Pb-Zn targets were identified in the subsurface of the Luanchuan district.
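    As a rough illustration of how a distance-based 3D exploration criterion such as the 600 m buffer around ore-forming granite stocks can be turned into a targeting layer, the sketch below flags voxels of a synthetic block model that lie within the buffer of a set of stock sample points. The coordinates and the KD-tree approach are illustrative assumptions, not the authors' workflow.

      # Illustrative sketch (not the authors' workflow): flag voxels of a 3D block
      # model that lie within a 600 m buffer of granite-stock sample points, the
      # kind of distance-based exploration criterion described above. Coordinates
      # are synthetic; the KD-tree query is just one convenient way to do this.
      import numpy as np
      from scipy.spatial import cKDTree

      # Synthetic voxel centres on a 100 m grid (5 km x 5 km x 2 km volume).
      xs = np.arange(0, 5000, 100.0)
      ys = np.arange(0, 5000, 100.0)
      zs = np.arange(-2000, 0, 100.0)
      voxels = np.array(np.meshgrid(xs, ys, zs, indexing="ij")).reshape(3, -1).T

      # Hypothetical points sampled on the surface of a granite stock.
      stock_points = np.array([[2500.0, 2500.0, -800.0],
                               [2600.0, 2400.0, -900.0],
                               [2400.0, 2600.0, -700.0]])

      tree = cKDTree(stock_points)
      dist, _ = tree.query(voxels)          # nearest distance to the stock, per voxel
      within_buffer = dist <= 600.0         # 600 m criterion from the abstract

      print(f"{within_buffer.sum()} of {len(voxels)} voxels fall inside the buffer")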

  5. Preliminary AirMSPI Datasets

    Atmospheric Science Data Center

    2018-02-26

    The data files available through this web page and FTP links are preliminary AirMSPI datasets from recent campaigns … and geometric corrections. Caution should be used for science analysis. At a later date, more qualified versions will be made public.

  6. Collaboration as a means toward a better dataset for both stakeholders and scientist

    NASA Astrophysics Data System (ADS)

    Chegwidden, O.; Rupp, D. E.; Nijssen, B.; Pytlak, E.; Knight, K.

    2016-12-01

    In 2013, the University of Washington (UW) and Oregon State University began a three-year project to evaluate climate change impacts in the Columbia River Basin (CRB) in the North American Pacific Northwest. The project was funded and coordinated by the River Management Joint Operating Committee (RMJOC), consisting of the Bonneville Power Administration (BPA), US Army Corps of Engineers (USACE), and US Bureau of Reclamation (USBR) and included a host of stakeholders in the region. The team worked to foster communication and collaboration throughout the production process, and also discovered effective collaborative strategies along the way. Project status updates occurred through a variety of outlets, ranging from monthly team check-ins to bi-annual workshops for a much larger audience. The workshops were used to solicit ongoing and timely feedback from a variety of stakeholders including RMJOC members, fish habitat advocates, tribal representatives and public utilities. To further facilitate collaboration, the team restructured the original project timeline, opting for delivering a provisional dataset nine months before the scheduled delivery of the final dataset. This allowed for a previously unplanned series of reviews from stakeholders in the region, who contributed their own expertise and interests to the dataset. The restructuring also encouraged the development of a streamlined infrastructure for performing the actual model simulation, resulting in two benefits: (1) reproducibility, an oft-touted goal within the scientific community, and (2) the ability to incorporate improvements from both stakeholders and scientists at a late stage in the project. We will highlight some of the key scientist-stakeholder engagement interactions throughout the project. We will show that active co-production resulted in a product more useful for not only stakeholders in the region, but also the scientific community.

  7. Exploring "The World around Us" in a Community of Scientific Enquiry

    ERIC Educational Resources Information Center

    Dunlop, Lynda; Compton, Kirsty; Clarke, Linda; McKelvey-Martin, Valerie

    2013-01-01

    The primary Communities of Scientific Enquiry project is one element of the outreach work in Science in Society in Biomedical Sciences in partnership with the School of Education at the University of Ulster. The project aims to develop scientific understanding and skills at key stage 2 and is a response to several contemporary issues in primary…

  8. Open University Learning Analytics dataset.

    PubMed

    Kuzilek, Jakub; Hlosta, Martin; Zdrahal, Zdenek

    2017-11-28

    Learning Analytics focuses on the collection and analysis of learners' data to improve their learning experience by providing informed guidance and to optimise learning materials. To support the research in this area we have developed a dataset, containing data from courses presented at the Open University (OU). What makes the dataset unique is the fact that it contains demographic data together with aggregated clickstream data of students' interactions in the Virtual Learning Environment (VLE). This enables the analysis of student behaviour, represented by their actions. The dataset contains the information about 22 courses, 32,593 students, their assessment results, and logs of their interactions with the VLE represented by daily summaries of student clicks (10,655,280 entries). The dataset is freely available at https://analyse.kmi.open.ac.uk/open_dataset under a CC-BY 4.0 license.

  9. Open University Learning Analytics dataset

    PubMed Central

    Kuzilek, Jakub; Hlosta, Martin; Zdrahal, Zdenek

    2017-01-01

    Learning Analytics focuses on the collection and analysis of learners’ data to improve their learning experience by providing informed guidance and to optimise learning materials. To support the research in this area we have developed a dataset, containing data from courses presented at the Open University (OU). What makes the dataset unique is the fact that it contains demographic data together with aggregated clickstream data of students’ interactions in the Virtual Learning Environment (VLE). This enables the analysis of student behaviour, represented by their actions. The dataset contains the information about 22 courses, 32,593 students, their assessment results, and logs of their interactions with the VLE represented by daily summaries of student clicks (10,655,280 entries). The dataset is freely available at https://analyse.kmi.open.ac.uk/open_dataset under a CC-BY 4.0 license. PMID:29182599
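    A typical first step with the dataset described above is to aggregate the daily VLE click summaries per student and join them to the demographic table; a hedged pandas sketch follows. The file and column names (studentVle.csv, studentInfo.csv, sum_click, final_result) are assumptions based on the public OULAD release and should be verified against the actual download.

      # Hedged sketch of working with the dataset described above: aggregate each
      # student's daily VLE click summaries into a per-student total and join it
      # to the demographic table. File and column names are assumptions based on
      # the public OULAD release and should be checked against the download.
      import pandas as pd

      clicks = pd.read_csv("studentVle.csv")      # daily click summaries per student
      info = pd.read_csv("studentInfo.csv")       # demographics and final results

      total_clicks = (clicks
                      .groupby(["code_module", "code_presentation", "id_student"])
                      ["sum_click"].sum()
                      .rename("total_clicks")
                      .reset_index())

      merged = info.merge(total_clicks,
                          on=["code_module", "code_presentation", "id_student"],
                          how="left")

      # Average engagement by course outcome, a typical first exploratory cut.
      print(merged.groupby("final_result")["total_clicks"].mean())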

  10. New public dataset for spotting patterns in medieval document images

    NASA Astrophysics Data System (ADS)

    En, Sovann; Nicolas, Stéphane; Petitjean, Caroline; Jurie, Frédéric; Heutte, Laurent

    2017-01-01

    With advances in technology, a large part of our cultural heritage is becoming digitally available. In particular, in the field of historical document image analysis, there is now a growing need for indexing and data mining tools that allow us to spot and retrieve the occurrences of an object of interest, called a pattern, in a large database of document images. Patterns may present some variability in terms of color, shape, or context, making the spotting of patterns a challenging task. Pattern spotting is a relatively new field of research, still hampered by the lack of available annotated resources. We present a new publicly available dataset named DocExplore dedicated to spotting patterns in historical document images. The dataset contains 1500 images and 1464 queries, and allows the evaluation of two tasks: image retrieval and pattern localization. A standardized benchmark protocol along with ad hoc metrics is provided for a fair comparison of the submitted approaches. We also provide first results obtained with our baseline system on this new dataset, which show that there is room for improvement and should encourage researchers in the document image analysis community to design new systems and submit improved results.

  11. EEGVIS: A MATLAB Toolbox for Browsing, Exploring, and Viewing Large Datasets.

    PubMed

    Robbins, Kay A

    2012-01-01

    Recent advances in data monitoring and sensor technology have accelerated the acquisition of very large data sets. Streaming data sets from instrumentation such as multi-channel EEG recording usually must undergo substantial pre-processing and artifact removal. Even when using automated procedures, most scientists engage in laborious manual examination and processing to assure high quality data and to identify interesting or problematic data segments. Researchers also do not have a convenient method of visually assessing the effects of applying any stage in a processing pipeline. EEGVIS is a MATLAB toolbox that allows users to quickly explore multi-channel EEG and other large array-based data sets using multi-scale drill-down techniques. Customizable summary views reveal potentially interesting sections of data, which users can explore further by clicking to examine using detailed viewing components. The viewer and a companion browser are built on our MoBBED framework, which has a library of modular viewing components that can be mixed and matched to best reveal structure. Users can easily create new viewers for their specific data without any programming during the exploration process. These viewers automatically support pan, zoom, resizing of individual components, and cursor exploration. The toolbox can be used directly in MATLAB at any stage in a processing pipeline, as a plug-in for EEGLAB, or as a standalone precompiled application without MATLAB running. EEGVIS and its supporting packages are freely available under the GNU general public license at http://visual.cs.utsa.edu/eegvis.
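    The summary-then-drill-down workflow described above is not tied to MATLAB; the following generic Python sketch (not EEGVIS itself) computes a per-window summary over a synthetic multi-channel array, flags outlying windows, and plots one of them in detail. Sampling rate, window length, and threshold are illustrative assumptions.

      # Generic illustration of the summary-then-drill-down idea (not EEGVIS
      # itself, which is a MATLAB toolbox): compute a per-window summary over a
      # multi-channel array, flag unusual windows, and plot one of them in detail.
      import numpy as np
      import matplotlib.pyplot as plt

      rng = np.random.default_rng(2)
      fs = 256                                   # hypothetical sampling rate (Hz)
      data = rng.normal(size=(32, 60 * fs))      # 32 channels, 60 s of synthetic "EEG"
      data[:, 30 * fs:31 * fs] += 8.0            # inject an artifact to find

      win = 1 * fs                               # 1-second summary windows
      n_win = data.shape[1] // win
      windows = data[:, :n_win * win].reshape(data.shape[0], n_win, win)
      summary = np.abs(windows).mean(axis=(0, 2))  # one amplitude score per window

      flagged = np.where(summary > summary.mean() + 3 * summary.std())[0]
      print("windows flagged for inspection:", flagged)

      # Drill down: plot the first flagged window across a few channels.
      if flagged.size:
          seg = windows[:4, flagged[0], :]
          t = np.arange(win) / fs
          for ch, trace in enumerate(seg):
              plt.plot(t, trace + ch * 10, label=f"chan {ch}")
          plt.xlabel("time within window (s)")
          plt.title(f"flagged window {flagged[0]}")
          plt.legend(loc="upper right")
          plt.show()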

  12. Organization of Biomedical Data for Collaborative Scientific Research: A Research Information Management System

    PubMed Central

    Myneni, Sahiti; Patel, Vimla L.

    2010-01-01

    Biomedical researchers often work with massive, detailed and heterogeneous datasets. These datasets raise new challenges of information organization and management for scientific interpretation, as they demand much of the researchers’ time and attention. The current study investigated the nature of the problems that researchers face when dealing with such data. Four major problems identified with existing biomedical scientific information management methods were related to data organization, data sharing, collaboration, and publications. Therefore, there is a compelling need to develop an efficient and user-friendly information management system to handle the biomedical research data. This study evaluated the implementation of an information management system, which was introduced as part of the collaborative research to increase scientific productivity in a research laboratory. Laboratory members seemed to exhibit frustration during the implementation process. However, empirical findings revealed that they gained new knowledge and completed specified tasks while working together with the new system. Hence, researchers are urged to persist and persevere when dealing with any new technology, including an information management system in a research laboratory environment. PMID:20543892

  13. Organization of Biomedical Data for Collaborative Scientific Research: A Research Information Management System.

    PubMed

    Myneni, Sahiti; Patel, Vimla L

    2010-06-01

    Biomedical researchers often work with massive, detailed and heterogeneous datasets. These datasets raise new challenges of information organization and management for scientific interpretation, as they demand much of the researchers' time and attention. The current study investigated the nature of the problems that researchers face when dealing with such data. Four major problems identified with existing biomedical scientific information management methods were related to data organization, data sharing, collaboration, and publications. Therefore, there is a compelling need to develop an efficient and user-friendly information management system to handle the biomedical research data. This study evaluated the implementation of an information management system, which was introduced as part of the collaborative research to increase scientific productivity in a research laboratory. Laboratory members seemed to exhibit frustration during the implementation process. However, empirical findings revealed that they gained new knowledge and completed specified tasks while working together with the new system. Hence, researchers are urged to persist and persevere when dealing with any new technology, including an information management system in a research laboratory environment.

  14. Relationships between Scientific Process Skills and Scientific Creativity: Mediating Role of Nature of Science Knowledge

    ERIC Educational Resources Information Center

    Ozdemir, Gokhan; Dikici, Ayhan

    2017-01-01

    The purpose of this study is to explore the strength of relationships between 7th grade students' Scientific Process Skills (SPS), Nature of Science (NOS) beliefs, and Scientific Creativity (SC) through Structural Equation Modeling (SEM). For this purpose, data were collected from 332 students at two public middle schools in Turkey. SPS,…

  15. Web-based visualization of very large scientific astronomy imagery

    NASA Astrophysics Data System (ADS)

    Bertin, E.; Pillay, R.; Marmo, C.

    2015-04-01

    Visualizing and navigating through large astronomy images from a remote location with current astronomy display tools can be a frustrating experience in terms of speed and ergonomics, especially on mobile devices. In this paper, we present a high-performance, versatile and robust client-server system for remote visualization and analysis of extremely large scientific images. Applications of this work include survey image quality control, interactive data query and exploration, citizen science, as well as public outreach. The proposed software is entirely open source and is designed to be generic and applicable to a variety of datasets. It provides access to floating point data at terabyte scales, with the ability to precisely adjust image settings in real-time. The proposed clients are light-weight, platform-independent web applications built on standard HTML5 web technologies and compatible with both touch and mouse-based devices. We put the system to the test, assess its performance, and show that a single server can comfortably handle more than a hundred simultaneous users accessing full precision 32-bit astronomy data.

  16. Future Visions for Scientific Human Exploration

    NASA Astrophysics Data System (ADS)

    Garvin, James

    2002-01-01

    Human exploration has always played a vital role within NASA, in spite of current perceptions that today it is adrift as a consequence of the resource challenges associated with construction and operation of the International Space Station (ISS). On the basis of the significance of human spaceflight within NASA's overall mission, periodic evaluation of its strategic position has been conducted by various groups, most recently exemplified by the recent Human Exploration and Development of Space Enterprise Strategic Plan. While such reports paint one potential future pathway, they are necessarily constrained by the ground rules and assumptions under which they are developed. An alternate approach, involving a small team of individuals selected as "brainstormers," has been ongoing within NASA for the past two years in an effort to capture a vision of a long-term future for human spaceflight not limited by nearer-term "point design" solutions. This paper describes the guiding principles and concepts developed by this team. It is not intended to represent an implementation plan, but rather one perspective on what could result as human beings extend their range of experience in spaceflight beyond today's beach-head of Low-Earth Orbit (LEO).

  17. Enhancing Conservation with High Resolution Productivity Datasets for the Conterminous United States

    NASA Astrophysics Data System (ADS)

    Robinson, Nathaniel Paul

    across the CONUS domain. The main results of this work are three publicly available datasets: 1) 30 m Landsat NDVI; 2) 250 m MODIS based GPP and NPP; and 3) 30 m Landsat based GPP and NPP. My goal is that these products prove useful for the wider scientific, conservation, and land management communities as we continue to strive for better conservation and management practices.
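    For readers unfamiliar with the first product listed, NDVI is computed per pixel from the red and near-infrared reflectance bands as (NIR - Red)/(NIR + Red); the sketch below applies this standard formula to synthetic Landsat-style arrays. Band scaling for a real scene must be taken from the product metadata.

      # Standard per-pixel NDVI computation, (NIR - Red) / (NIR + Red), applied to
      # hypothetical reflectance arrays. Band order/scaling for a real Landsat
      # scene must come from the product metadata; values here are synthetic.
      import numpy as np

      rng = np.random.default_rng(4)
      red = rng.uniform(0.02, 0.25, size=(512, 512))   # synthetic red reflectance
      nir = rng.uniform(0.10, 0.60, size=(512, 512))   # synthetic NIR reflectance

      with np.errstate(divide="ignore", invalid="ignore"):
          ndvi = (nir - red) / (nir + red)
      ndvi = np.clip(ndvi, -1.0, 1.0)                  # NDVI is bounded by definition

      print(f"mean NDVI: {ndvi.mean():.3f}, vegetated fraction (>0.3): "
            f"{(ndvi > 0.3).mean():.2%}")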

  18. MaGnET: Malaria Genome Exploration Tool.

    PubMed

    Sharman, Joanna L; Gerloff, Dietlind L

    2013-09-15

    The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive 'exploration-style' visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein-protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Availability: free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org. Contact: joanna.sharman@ed.ac.uk or dgerloff@ffame.org. Supplementary data are available at Bioinformatics online.

  19. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset.

    PubMed

    Seashore-Ludlow, Brinton; Rees, Matthew G; Cheah, Jaime H; Cokol, Murat; Price, Edmund V; Coletti, Matthew E; Jones, Victor; Bodycombe, Nicole E; Soule, Christian K; Gould, Joshua; Alexander, Benjamin; Li, Ava; Montgomery, Philip; Wawer, Mathias J; Kuru, Nurdan; Kotz, Joanne D; Hon, C Suk-Yee; Munoz, Benito; Liefeld, Ted; Dančík, Vlado; Bittker, Joshua A; Palmer, Michelle; Bradner, James E; Shamji, Alykhan F; Clemons, Paul A; Schreiber, Stuart L

    2015-11-01

    Identifying genetic alterations that prime a cancer cell to respond to a particular therapeutic agent can facilitate the development of precision cancer medicines. Cancer cell-line (CCL) profiling of small-molecule sensitivity has emerged as an unbiased method to assess the relationships between genetic or cellular features of CCLs and small-molecule response. Here, we developed annotated cluster multidimensional enrichment analysis to explore the associations between groups of small molecules and groups of CCLs in a new, quantitative sensitivity dataset. This analysis reveals insights into small-molecule mechanisms of action, and genomic features that associate with CCL response to small-molecule treatment. We are able to recapitulate known relationships between FDA-approved therapies and cancer dependencies and to uncover new relationships, including for KRAS-mutant cancers and neuroblastoma. To enable the cancer community to explore these data, and to generate novel hypotheses, we created an updated version of the Cancer Therapeutic Response Portal (CTRP v2). We present the largest CCL sensitivity dataset yet available, and an analysis method integrating information from multiple CCLs and multiple small molecules to identify CCL response predictors robustly. We updated the CTRP to enable the cancer research community to leverage these data and analyses. ©2015 American Association for Cancer Research.

  20. Future Visions for Scientific Human Exploration

    NASA Technical Reports Server (NTRS)

    Garvin, James

    2005-01-01

    Today, humans explore deep-space locations such as Mars, asteroids, and beyond vicariously here on Earth, with noteworthy success. However, achieving the revolutionary breakthroughs that have punctuated the history of science since the dawn of the Space Age has always required humans as "the discoverers," as Daniel Boorstin contends in his book of the same name. During Apollo 17, human explorers on the lunar surface discovered the "genesis rock," orange glass, and humans in space revamped the optically crippled Hubble Space Telescope to enable some of the greatest astronomical discoveries of all time. Science-driven human exploration is about developing the opportunities for such events, perhaps associated with challenging problems such as whether we can identify life beyond Earth within the universe. At issue, however, is how to safely insert humans and the spaceflight systems required to allow humans to operate as they do best in the hostile environment of deep space. The first issue is minimizing the problems associated with human adaptation to the most challenging aspects of deep space: space radiation and microgravity (or non-Earth gravity). One solution path is to develop technologies that allow for minimization of the exposure time of people to deep space, as was accomplished in Apollo. For a mission to the planet Mars, this might entail new technological solutions for in-space propulsion that would make possible time-minimized transfers to and from Mars. The problem of rapid, reliable in-space transportation is challenged by the celestial mechanics of moving in space and the so-called "rocket equation." To travel to Mars from Earth in less time than fuel-minimizing trajectories allow (i.e., Hohmann transfers) requires an exponential increase in the amount of fuel. Thus, month-long transits would require a mass of fuel as large as the dry mass of the ISS, assuming the existence of continuous acceleration engines. This raises the largest technological

  1. Hot Salsa: A Laboratory Exercise Exploring the Scientific Method.

    ERIC Educational Resources Information Center

    Levri, Edward P.; Levri, Maureen A.

    2003-01-01

    Presents a laboratory exercise on spicy food and body temperature that introduces the scientific method to introductory biology students. Suggests that when students perform their own experiments which they have developed, it helps with their understanding of and confidence in doing science. (Author/SOE)

  2. Characterizing scientific production and consumption in Physics

    PubMed Central

    Zhang, Qian; Perra, Nicola; Gonçalves, Bruno; Ciulla, Fabio; Vespignani, Alessandro

    2013-01-01

    We analyze the entire publication database of the American Physical Society, generating longitudinal (50 years) citation networks geolocalized at the level of single urban areas. We define a knowledge diffusion proxy and scientific production ranking algorithms to capture the spatio-temporal dynamics of Physics knowledge worldwide. By using the knowledge diffusion proxy we identify the key cities in the production and consumption of knowledge in Physics as a function of time. The results from the scientific production ranking algorithm allow us to characterize the top cities for scholarly research in Physics. Although we focus on a single dataset concerning a specific field, the methodology presented here opens the path to comparative studies of the dynamics of knowledge across disciplines and research areas. PMID:23571320

  3. Geospatial Visualization of Scientific Data Through Keyhole Markup Language

    NASA Astrophysics Data System (ADS)

    Wernecke, J.; Bailey, J. E.

    2008-12-01

    The development of virtual globes has provided a fun and innovative tool for exploring the surface of the Earth. However, it has been the parallel maturation of Keyhole Markup Language (KML) that has created a new medium and perspective through which to visualize scientific datasets. Originally created by Keyhole Inc., which was acquired by Google in 2004, KML was handed over to the Open Geospatial Consortium (OGC) in 2007. It became an OGC international standard on 14 April 2008, and has subsequently been adopted by all major geobrowser developers (e.g., Google, Microsoft, ESRI, NASA) and many smaller ones (e.g., Earthbrowser). By making KML a standard at a relatively young stage in its evolution, developers of the language are seeking to avoid the issues that plagued the early World Wide Web and the development of Hypertext Markup Language (HTML). The popularity and utility of Google Earth, in particular, has been enhanced by KML features such as the Smithsonian volcano layer and the dynamic weather layers. Through KML, users can view real-time earthquake locations (USGS), view animations of polar sea-ice coverage (NSIDC), or read about the daily activities of chimpanzees (Jane Goodall Institute). Perhaps even more powerful is the fact that any user can create, edit, and share their own KML with no or relatively little knowledge of manipulating computer code. We present an overview of the best current scientific uses of KML and a guide to how scientists can learn to use KML themselves.
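    Producing a shareable KML layer from a scientific data record takes only a few lines; the sketch below writes a minimal, valid KML Placemark for a hypothetical earthquake epicentre. The coordinates and magnitude are illustrative, not taken from the USGS feed mentioned above.

      # Minimal sketch of generating KML from a scientific data record: a single
      # Placemark for a hypothetical earthquake epicentre. Coordinates in KML are
      # longitude,latitude[,altitude]; the values here are illustrative only.
      kml = """<?xml version="1.0" encoding="UTF-8"?>
      <kml xmlns="http://www.opengis.net/kml/2.2">
        <Document>
          <Placemark>
            <name>M 5.1 earthquake (example)</name>
            <description>Depth 10 km; illustrative record, not real USGS data.</description>
            <Point>
              <coordinates>-122.42,37.77,0</coordinates>
            </Point>
          </Placemark>
        </Document>
      </kml>
      """

      with open("example_quake.kml", "w", encoding="utf-8") as fh:
          fh.write(kml)
      print("Wrote example_quake.kml; open it in Google Earth or another geobrowser.")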

  4. Exploring Global Exposure Factors Resources URLs

    EPA Pesticide Factsheets

    The dataset is a compilation of hyperlinks (URLs) for resources (databases, compendia, published articles, etc.) useful for exposure assessment specific to consumer product use. This dataset is associated with the following publication: Zaleski, R., P. Egeghy, and P. Hakkinen. Exploring Global Exposure Factors Resources for Use in Consumer Exposure Assessments. International Journal of Environmental Research and Public Health. Molecular Diversity Preservation International, Basel, SWITZERLAND, 13(7): 744, (2016).

  5. Comparing methods of analysing datasets with small clusters: case studies using four paediatric datasets.

    PubMed

    Marston, Louise; Peacock, Janet L; Yu, Keming; Brocklehurst, Peter; Calvert, Sandra A; Greenough, Anne; Marlow, Neil

    2009-07-01

    Studies of prematurely born infants contain a relatively large percentage of multiple births, so the resulting data have a hierarchical structure with small clusters of size 1, 2 or 3. Ignoring the clustering may lead to incorrect inferences. The aim of this study was to compare statistical methods which can be used to analyse such data: generalised estimating equations, multilevel models, multiple linear regression and logistic regression. Four datasets which differed in total size and in percentage of multiple births (n = 254, multiple 18%; n = 176, multiple 9%; n = 10 098, multiple 3%; n = 1585, multiple 8%) were analysed. With the continuous outcome, two-level models produced similar results in the larger dataset, while generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) produced divergent estimates using the smaller dataset. For the dichotomous outcome, most methods, except generalised least squares multilevel modelling (ML GH 'xtlogit' in Stata) gave similar odds ratios and 95% confidence intervals within datasets. For the continuous outcome, our results suggest using multilevel modelling. We conclude that generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) should be used with caution when the dataset is small. Where the outcome is dichotomous and there is a relatively large percentage of non-independent data, it is recommended that these are accounted for in analyses using logistic regression with adjusted standard errors or multilevel modelling. If, however, the dataset has a small percentage of clusters greater than size 1 (e.g. a population dataset of children where there are few multiples) there appears to be less need to adjust for clustering.
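    The same kind of comparison can be sketched outside Stata. The statsmodels calls below fit a random-intercept linear mixed model for a continuous outcome and a GEE logistic model with an exchangeable working correlation for a dichotomous outcome, on hypothetical multiple-birth data; all variable names and the simulated values are invented for illustration.

      # Hedged sketch (variable names invented): analogous analyses in Python's
      # statsmodels for clustered data with small clusters -- a random-intercept
      # linear mixed model for a continuous outcome and a GEE logistic model with
      # an exchangeable working correlation for a dichotomous outcome.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(3)
      n_mothers = 200
      sizes = rng.choice([1, 1, 1, 2, 3], size=n_mothers)   # mostly singletons, some multiples
      mother_id = np.repeat(np.arange(n_mothers), sizes)
      n_infants = mother_id.size
      gest_age = rng.normal(34, 3, size=n_infants)
      mother_effect = np.repeat(rng.normal(0, 300, n_mothers), sizes)
      birth_weight = 150 * gest_age - 2800 + mother_effect + rng.normal(0, 250, n_infants)
      low_apgar = (rng.random(n_infants) < 1 / (1 + np.exp(0.3 * (gest_age - 32)))).astype(int)

      df = pd.DataFrame({"mother_id": mother_id, "gest_age": gest_age,
                         "birth_weight": birth_weight, "low_apgar": low_apgar})

      # Continuous outcome: linear mixed model with a random intercept per mother.
      mixed = smf.mixedlm("birth_weight ~ gest_age", data=df, groups=df["mother_id"]).fit()
      print(mixed.summary())

      # Dichotomous outcome: GEE logistic regression clustered on mother.
      gee = smf.gee("low_apgar ~ gest_age", groups="mother_id", data=df,
                    family=sm.families.Binomial(),
                    cov_struct=sm.cov_struct.Exchangeable()).fit()
      print(gee.summary())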

  6. AMRZone: A Runtime AMR Data Sharing Framework For Scientific Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Wenzhao; Tang, Houjun; Harenberg, Steven

    Frameworks that facilitate runtime data sharing across multiple applications are of great importance for scientific data analytics. Although existing frameworks work well over uniform mesh data, they cannot effectively handle adaptive mesh refinement (AMR) data. The challenges in constructing an AMR-capable framework include: (1) designing an architecture that facilitates online AMR data management; (2) achieving a load-balanced AMR data distribution for the data staging space at runtime; and (3) building an effective online index to support the unique spatial data retrieval requirements for AMR data. Towards addressing these challenges to support runtime AMR data sharing across scientific applications, we present the AMRZone framework. Experiments over real-world AMR datasets demonstrate AMRZone's effectiveness at achieving a balanced workload distribution, reading/writing large-scale datasets with thousands of parallel processes, and satisfying queries with spatial constraints. Moreover, AMRZone's performance and scalability are even comparable with existing state-of-the-art work when tested over uniform mesh data with up to 16384 cores; in the best case, our framework achieves a 46% performance improvement.

  7. Autonomous localisation of rovers for future planetary exploration

    NASA Astrophysics Data System (ADS)

    Bajpai, Abhinav

    Future Mars exploration missions will have increasingly ambitious goals compared to current rover and lander missions. There will be a need for extremely long distance traverses over shorter periods of time. This will allow more varied and complex scientific tasks to be performed and increase the overall value of the missions. The missions may also include a sample return component, where items collected on the surface will be returned to a cache in order to be returned to Earth, for further study. In order to make these missions feasible, future rover platforms will require increased levels of autonomy, allowing them to operate without heavy reliance on a terrestrial ground station. Being able to autonomously localise the rover is an important element in increasing the rover's capability to independently explore. This thesis develops a Planetary Monocular Simultaneous Localisation And Mapping (PM-SLAM) system aimed specifically at a planetary exploration context. The system uses a novel modular feature detection and tracking algorithm called hybrid-saliency in order to achieve robust tracking, while maintaining low computational complexity in the SLAM filter. The hybrid saliency technique uses a combination of cognitive inspired saliency features with point-based feature descriptors as input to the SLAM filter. The system was tested on simulated datasets generated using the Planetary, Asteroid and Natural scene Generation Utility (PANGU) as well as two real world datasets which closely approximated images from a planetary environment. The system was shown to provide a higher accuracy of localisation estimate than a state-of-the-art VO system tested on the same data set. In order to be able to localise the rover absolutely, further techniques are investigated which attempt to determine the rover's position in orbital maps. Orbiter Mask Matching uses point-based features detected by the rover to associate descriptors with large features extracted from orbital

  8. Developing a Resource for Implementing ArcSWAT Using Global Datasets

    NASA Astrophysics Data System (ADS)

    Taggart, M.; Caraballo Álvarez, I. O.; Mueller, C.; Palacios, S. L.; Schmidt, C.; Milesi, C.; Palmer-Moloney, L. J.

    2015-12-01

    This project developed a comprehensive user manual outlining methods for adapting and implementing global datasets for use within ArcSWAT for international and worldwide applications. The Soil and Water Assessment Tool (SWAT) is a hydrologic model that estimates a number of hydrologic variables, including runoff and the chemical makeup of water at a given location on the Earth's surface, using Digital Elevation Models (DEM), land cover, soil, and weather data. However, applying ArcSWAT to projects outside of the United States is challenging because there is no standard framework for inputting global datasets into ArcSWAT. This project aims to remove this obstacle by outlining methods for adapting and implementing these global datasets via the user manual. The manual takes the user through the processes of data conditioning while providing solutions and suggestions for common errors. The efficacy of the manual was explored using examples from watersheds located in Puerto Rico, Mexico and Western Africa. Each run explored the various options for setting up an ArcSWAT project as well as a range of satellite data products and soil databases. Future work will incorporate in-situ data for validation and calibration of the model and outline additional resources to assist future users in efficiently implementing the model for worldwide applications. The capacity to manage and monitor freshwater availability is of critical importance in both developed and developing countries. As populations grow and climate changes, both the quality and quantity of freshwater are affected, resulting in negative impacts on the health of the surrounding population. The use of hydrologic models such as ArcSWAT can help stakeholders and decision makers understand the future impacts of these changes, enabling informed and substantiated decisions.

  9. Background qualitative analysis of the European Reference Life Cycle Database (ELCD) energy datasets - part I: fuel datasets.

    PubMed

    Garraín, Daniel; Fazio, Simone; de la Rúa, Cristina; Recchioni, Marco; Lechón, Yolanda; Mathieux, Fabrice

    2015-01-01

    The aim of this study is to identify areas of potential improvement of the European Reference Life Cycle Database (ELCD) fuel datasets. The revision is based on the data quality indicators described by the ILCD Handbook, applied on a sectorial basis. These indicators evaluate the technological, geographical and time-related representativeness of the dataset and the appropriateness in terms of completeness, precision and methodology. Results show that the ELCD fuel datasets have very good quality in general terms; nevertheless, some findings and recommendations to improve the quality of Life-Cycle Inventories have been derived. Moreover, these results assure the quality of the fuel-related datasets to any LCA practitioner, and provide insights into the limitations and assumptions underlying the datasets' modelling. Given this information, the LCA practitioner will be able to decide whether the use of the ELCD fuel datasets is appropriate based on the goal and scope of the analysis to be conducted. The methodological approach would also be useful for dataset developers and reviewers in order to improve the overall DQR of databases.

  10. Strategy for outer planets exploration

    NASA Technical Reports Server (NTRS)

    1975-01-01

    NASA's Planetary Programs Office formed a number of scientific working groups to study in depth the potential scientific return from the various candidate missions to the outer solar system. The results of these working group studies were brought together in a series of symposia to evaluate the potential outer planet missions and to discuss strategies for exploration of the outer solar system that were consistent with fiscal constraints and with anticipated spacecraft and launch vehicle capabilities. A logical, scientifically sound, and cost effective approach to exploration of the outer solar system is presented.

  11. AceTree: a major update and case study in the long term maintenance of open-source scientific software.

    PubMed

    Katzman, Braden; Tang, Doris; Santella, Anthony; Bao, Zhirong

    2018-04-04

    AceTree, a software application first released in 2006, facilitates exploration, curation and editing of tracked C. elegans nuclei in 4-dimensional (4D) fluorescence microscopy datasets. Since its initial release, AceTree has been continuously used to interact with, edit and interpret C. elegans lineage data. Over its 11-year lifetime, AceTree has been periodically updated to meet the technical and research demands of its community of users. This paper presents the newest iteration of AceTree, which contains extensive updates, demonstrates the new applicability of AceTree in other developmental contexts, and presents its evolutionary software development paradigm as a viable model for maintaining scientific software. Large-scale updates have been made to the user interface for an improved user experience. Tools have been grouped according to functionality and obsolete methods have been removed. Internal requirements have been changed to enable greater flexibility of use both in C. elegans contexts and in other model organisms. Additionally, the original 3-dimensional (3D) viewing window has been completely reimplemented. The new window provides a new suite of tools for data exploration. By responding to technical advancements and research demands, AceTree has remained a useful tool for scientific research for over a decade. The updates made to the codebase have extended AceTree's applicability beyond its initial use in C. elegans and enabled its usage with other model organisms. The evolution of AceTree demonstrates a viable model for maintaining scientific software over long periods of time.

  12. The garden as a laboratory: the role of domestic gardens as places of scientific exploration in the long 18th century.

    PubMed

    Hickman, Clare

    2014-06-01

    Eighteenth-century gardens have traditionally been viewed as spaces designed for leisure, and as representations of political status, power and taste. In contrast, this paper will explore the concept that gardens in this period could be seen as dynamic spaces where scientific experiment and medical practice could occur. Two examples have been explored in the pilot study which has led to this paper - the designed landscapes associated with John Hunter's Earl's Court residence, in London, and the garden at Edward Jenner's house in Berkeley, Gloucestershire. Garden history methodologies have been implemented in order to consider the extent to which these domestic gardens can be viewed as experimental spaces.

  13. Digital Rocks Portal: a sustainable platform for imaged dataset sharing, translation and automated analysis

    NASA Astrophysics Data System (ADS)

    Prodanovic, M.; Esteva, M.; Hanlon, M.; Nanda, G.; Agarwal, P.

    2015-12-01

    Recent advances in imaging have provided a wealth of 3D datasets that reveal pore space microstructure (nm to cm length scale) and allow investigation of nonlinear flow and mechanical phenomena from first principles using numerical approaches. This framework has popularly been called "digital rock physics". Researchers, however, have trouble storing and sharing the datasets both because of their size and because of the lack of standardized image types and associated metadata for volumetric datasets. This impedes scientific cross-validation of the numerical approaches that characterize large-scale porous media properties, as well as development of the multiscale approaches required for correct upscaling. A single research group typically specializes in one imaging modality and/or related modeling on a single length scale, and the lack of data-sharing infrastructure makes it difficult to integrate different length scales. We developed a sustainable, open and easy-to-use repository called the Digital Rocks Portal that (1) organizes images and related experimental measurements of different porous materials, and (2) improves access to them for a wider community of geoscience and engineering researchers not necessarily trained in computer science or data analysis. Once widely accepted, the repository will jumpstart productivity and enable scientific inquiry and engineering decisions founded on a data-driven basis. This is the first repository of its kind. We show initial results on incorporating essential software tools and pipelines that make it easier for researchers to store and reuse data, and for educators to quickly visualize and illustrate concepts to a wide audience. For data sustainability and continuous access, the portal is implemented within the reliable, 24/7 maintained High Performance Computing Infrastructure supported by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. Long-term storage is provided through the University of Texas System Research

  14. A Sample Data Publication: Interactive Access, Analysis and Display of Remotely Stored Datasets From Hurricane Charley

    NASA Astrophysics Data System (ADS)

    Weber, J.; Domenico, B.

    2004-12-01

    This paper is an example of what we call data interactive publications. With a properly configured workstation, the readers can click on "hotspots" in the document that launch an interactive analysis tool called the Unidata Integrated Data Viewer (IDV). The IDV will enable the readers to access, analyze and display datasets on remote servers as well as documents describing them. Beyond the parameters and datasets initially configured into the paper, the analysis tool will have access to all the other dataset parameters as well as to a host of other datasets on remote servers. These data interactive publications are built on top of several data delivery, access, discovery, and visualization tools developed by Unidata and its partner organizations. For purposes of illustrating this integrative technology, we will use data from Hurricane Charley over Florida from August 13-15, 2004. This event illustrates how the components of this process fit together. The Local Data Manager (LDM), Open-source Project for a Network Data Access Protocol (OPeNDAP) and Abstract Data Distribution Environment (ADDE) services, Thematic Realtime Environmental Distributed Data Service (THREDDS) cataloging services, and the IDV are highlighted in this example of a publication with embedded pointers for accessing and interacting with remote datasets. An important objective of this paper is to illustrate how these integrated technologies foster the creation of documents that allow the reader to learn the scientific concepts by direct interaction with illustrative datasets, and help build a framework for integrated Earth System science.

  15. Unified Access Architecture for Large-Scale Scientific Datasets

    NASA Astrophysics Data System (ADS)

    Karna, Risav

    2014-05-01

    Data-intensive sciences have to deploy diverse large-scale database technologies for data analytics, as scientists are now dealing with much larger data volumes than ever before. While array databases have bridged many gaps between the needs of data-intensive research fields and DBMS technologies (Zhang 2011), invocation of the other big data tools accompanying these databases is still manual and separate from the database management interface. We identify this as an architectural challenge that will increasingly complicate the user's workflow owing to the growing number of useful but isolated and niche database tools. Such use of data analysis tools in effect leaves the burden on the user's end to synchronize the results from other data manipulation and analysis tools with the database management system. To this end, we propose a unified access interface for using big data tools within a large-scale scientific array database, using the database queries themselves to embed foreign routines belonging to the big data tools. Such an invocation of foreign data manipulation routines inside a query to a database can be made possible through a user-defined function (UDF). UDFs that allow such levels of freedom as to call modules from another language and interface back and forth between the query body and the side-loaded functions would be needed for this purpose. For the purpose of this research we attempt coupling of four widely used tools, Hadoop (hadoop1), Matlab (matlab1), R (r1) and ScaLAPACK (scalapack1), with the UDF feature of rasdaman (Baumann 98), an array-based data manager, to investigate this concept. The native array data model used by an array-based data manager provides compact data storage and high-performance operations on ordered data such as spatial data, temporal data, and matrix-based data for linear algebra operations (scidbusr1). Performance issues arising due to the coupling of tools with different paradigms, niche functionalities, separate processes and output

  16. Lunar Daylight Exploration

    NASA Technical Reports Server (NTRS)

    Griffin, Brand Norman

    2010-01-01

    With 1 rover, 2 astronauts and 3 days, the Apollo 17 mission covered over 30 km, set up 10 scientific experiments and returned 110 kg of samples. This is a lot of science in a short time and the inspiration for a barebones, return-to-the-Moon strategy called Daylight Exploration. The Daylight Exploration approach poses an answer to the question: what could the Apollo crew have done with more time and today's robotics? In contrast to more ambitious and expensive strategies that create outposts and then rely on pressurized rovers to drive to the science sites, Daylight Exploration is a low-overhead approach conceived to land near the scientific site, conduct Apollo-like exploration, then leave before the sun goes down. A key motivation behind Daylight Exploration is cost reduction, but it does not come at the expense of scientific exploration. As a goal, Daylight Exploration provides access to the top 10 science sites by using the best capabilities of human and robotic exploration. Most science sites are within an equatorial band of 26 degrees latitude, and on the Moon, at the equator, the day is 14 Earth days long; even more important, the lunar night is 14 days long. Human missions are constrained to 12 days because the energy storage systems required to operate during the lunar night add mass, complexity and cost. In addition, short missions are beneficial because they require fewer consumables, do not require an airlock, reduce radiation exposure, minimize the dwell time for the ascent and orbiting propulsion systems and allow low-mass, campout accommodations. Key to Daylight Exploration is the use of piloted rovers as tele-operated science platforms. Rovers are launched before or with the crew, and continue to operate between crew visits, analyzing and collecting samples during the lunar daylight

  17. An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modelling.

    PubMed

    Mansouri, K; Grulke, C M; Richard, A M; Judson, R S; Williams, A J

    2016-11-01

    The increasing availability of large collections of chemical structures and associated experimental data provides an opportunity to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associated experimental data. Here we describe the development of an automated KNIME workflow to curate and correct errors in the structure and identity of chemicals using the publicly available PHYSPROP physicochemical properties and environmental fate datasets. The workflow first assembles structure-identity pairs using up to four provided chemical identifiers, including chemical name, CASRNs, SMILES, and MolBlock. Problems detected included errors and mismatches in chemical structure formats, identifiers and various structure validation issues, including hypervalency and stereochemistry descriptions. Subsequently, a machine learning procedure was applied to evaluate the impact of this curation process. The performance of QSAR models built on only the highest-quality subset of the original dataset was compared with the larger curated and corrected dataset. The latter showed statistically improved predictive performance. The final workflow was used to curate the full list of PHYSPROP datasets, and is being made publicly available for further usage and integration by the scientific community.
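
    The KNIME workflow itself is not shown in the record, but the kind of check it automates (parsing each structure, rejecting records that fail sanitization, for example because of hypervalent atoms, and emitting a canonical SMILES for the rest) can be illustrated with a short, hedged RDKit sketch. The two input records below are illustrative stand-ins, not entries from the PHYSPROP datasets.

    ```python
    # Sketch of structure curation in the spirit of the described workflow:
    # parse each SMILES, flag records that fail sanitization (e.g. valence
    # errors), and output a canonical SMILES for those that pass.
    from rdkit import Chem

    records = [
        ("103-90-2", "CC(=O)Nc1ccc(O)cc1"),  # acetaminophen (well-formed)
        ("00-00-0", "C(C)(C)(C)(C)C"),       # deliberately hypervalent carbon
    ]

    for casrn, smiles in records:
        mol = Chem.MolFromSmiles(smiles)     # returns None if sanitization fails
        if mol is None:
            print(f"{casrn}: structure rejected ({smiles})")
        else:
            print(f"{casrn}: canonical SMILES {Chem.MolToSmiles(mol)}")
    ```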

  18. Design of an audio advertisement dataset

    NASA Astrophysics Data System (ADS)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    As more and more advertisements crowd into radio broadcasts, it is necessary to establish an audio advertising dataset that can be used to analyze and classify advertisements. A method for establishing a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement sample is given in *.wav format and annotated with a txt file containing its file name, sampling frequency, channel number, broadcast time and class. The soundness of the advertisement classification in this dataset is demonstrated by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for related audio advertisement experimental studies.
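
    The record does not include the clustering code, but the PCA-based check it describes can be sketched as follows: reduce per-advertisement feature vectors with PCA and cluster them into four groups matching the annotated classes. The feature matrix here is random stand-in data; in practice it would be computed from the *.wav files (for example, spectral or MFCC features).

    ```python
    # Hedged sketch of the PCA + clustering sanity check described above.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 40))   # 200 ads x 40 audio features (assumed)

    reduced = PCA(n_components=2).fit_transform(features)
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(reduced)
    print(np.bincount(labels))              # how the ads split across 4 clusters
    ```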

  19. Science in Writing: Learning Scientific Argument in Principle and Practice

    ERIC Educational Resources Information Center

    Cope, Bill; Kalantzis, Mary; Abd-El-Khalick, Fouad; Bagley, Elizabeth

    2013-01-01

    This article explores the processes of writing in science and in particular the "complex performance" of writing a scientific argument. The article explores in general terms the nature of scientific argumentation in which the author-scientist makes claims, provides evidence to support these claims, and develops chains of scientific…

  20. Background qualitative analysis of the European reference life cycle database (ELCD) energy datasets - part II: electricity datasets.

    PubMed

    Garraín, Daniel; Fazio, Simone; de la Rúa, Cristina; Recchioni, Marco; Lechón, Yolanda; Mathieux, Fabrice

    2015-01-01

    The aim of this paper is to identify areas of potential improvement in the European Reference Life Cycle Database (ELCD) electricity datasets. The revision is based on the data quality indicators described in the International Life Cycle Data system (ILCD) Handbook, applied on a sectorial basis. These indicators evaluate the technological, geographical and time-related representativeness of the dataset and its appropriateness in terms of completeness, precision and methodology. Results show that the ELCD electricity datasets are of very good quality in general terms; nevertheless, some findings and recommendations for improving the quality of Life Cycle Inventories have been derived. Moreover, these results confirm the quality of the electricity-related datasets for any LCA practitioner, and provide insights into the limitations and assumptions underlying the dataset modelling. Given this information, the LCA practitioner will be able to decide whether the use of the ELCD electricity datasets is appropriate for the goal and scope of the analysis to be conducted. The methodological approach would also be useful for dataset developers and reviewers seeking to improve the overall data quality rating (DQR) of databases.

  1. Scientific Inquiry: A Model for Online Searching.

    ERIC Educational Resources Information Center

    Harter, Stephen P.

    1984-01-01

    Explores scientific inquiry as philosophical and behavioral model for online search specialist and information retrieval process. Nature of scientific research is described and online analogs to research concepts of variable, hypothesis formulation and testing, operational definition, validity, reliability, assumption, and cyclical nature of…

  2. Effects of VR system fidelity on analyzing isosurface visualization of volume datasets.

    PubMed

    Laha, Bireswar; Bowman, Doug A; Socha, John J

    2014-04-01

    Volume visualization is an important technique for analyzing datasets from a variety of different scientific domains. Volume data analysis is inherently difficult because volumes are three-dimensional, dense, and unfamiliar, requiring scientists to precisely control the viewpoint and to make precise spatial judgments. Researchers have proposed that more immersive (higher fidelity) VR systems might improve task performance with volume datasets, and significant results tied to different components of display fidelity have been reported. However, more information is needed to generalize these results to different task types, domains, and rendering styles. We visualized isosurfaces extracted from synchrotron microscopic computed tomography (SR-μCT) scans of beetles, in a CAVE-like display. We ran a controlled experiment evaluating the effects of three components of system fidelity (field of regard, stereoscopy, and head tracking) on a variety of abstract task categories that are applicable to various scientific domains, and also compared our results with those from our prior experiment using 3D texture-based rendering. We report many significant findings. For example, for search and spatial judgment tasks with isosurface visualization, a stereoscopic display provides better performance, but for tasks with 3D texture-based rendering, displays with higher field of regard were more effective, independent of the levels of the other display components. We also found that systems with high field of regard and head tracking improve performance in spatial judgment tasks. Our results extend existing knowledge and produce new guidelines for designing VR systems to improve the effectiveness of volume data analysis.

  3. GeoNotebook: Browser based Interactive analysis and visualization workflow for very large climate and geospatial datasets

    NASA Astrophysics Data System (ADS)

    Ozturk, D.; Chaudhary, A.; Votava, P.; Kotfila, C.

    2016-12-01

    Jointly developed by Kitware and NASA Ames, GeoNotebook is an open source tool designed to give the maximum amount of flexibility to analysts, while dramatically simplifying the process of exploring geospatially indexed datasets. Packages like Fiona (backed by GDAL), Shapely, Descartes, Geopandas, and PySAL provide a stack of technologies for reading, transforming, and analyzing geospatial data. Combined with the Jupyter notebook and libraries like matplotlib/Basemap, it is possible to generate detailed geospatial visualizations. Unfortunately, the visualizations generated are either static or do not perform well for very large datasets. This setup also requires a great deal of boilerplate code to create and maintain. Other extensions exist to remedy these problems, but they provide a separate map for each input cell and do not support map interactions that feed back into the Python environment. To support interactive data exploration and visualization on large datasets we have developed an extension to the Jupyter notebook that provides a single dynamic map that can be managed from the Python environment, and that can communicate back with a server which can perform operations like data subsetting on a cloud-based cluster.
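
    For contrast with the interactive extension the abstract describes, a minimal static-map notebook cell built on the same open-source stack (GeoPandas over Fiona/GDAL, rendered with Matplotlib) might look like the sketch below. The shapefile name and attribute column are hypothetical placeholders.

    ```python
    # Minimal static-map cell of the kind GeoNotebook aims to improve on:
    # read a vector layer with GeoPandas and render a choropleth with Matplotlib.
    import geopandas as gpd
    import matplotlib.pyplot as plt

    gdf = gpd.read_file("climate_regions.shp")   # hypothetical geospatial layer
    ax = gdf.plot(column="mean_temp", legend=True, figsize=(8, 5))
    ax.set_title("Static choropleth: mean temperature by region")
    plt.show()
    ```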

  4. Exploring the potential of using stories about diverse scientists and reflective activities to enrich primary students' images of scientists and scientific work

    NASA Astrophysics Data System (ADS)

    Sharkawy, Azza

    2012-06-01

    The purpose of this qualitative study was to explore the potential of using stories about diverse scientists to broaden primary students' images of scientists and scientific work. Stories featuring scientists from diverse socio-cultural backgrounds (i.e., physical ability, gender, ethnicity) were presented to 11 grade one students over a 15-week period. My analysis of pre- and post-intervention audio-taped interview transcripts, draw-a-scientist tests (Chambers 1983), participant observations and student work suggests that the stories about scientists and follow-up reflective activities provided resources that helped students: (a) acquire images of scientists from less dominant socio-cultural backgrounds; and (b) enrich their views of scientific work from predominantly hands-on/activity-oriented views to ones that include cognitive and positive affective dimensions. One of the limitations of using stories as a tool to extend students' thinking about science is highlighted in a case study of a student who expressed resistance to some of the counter-stereotypic images presented in the stories. I also present two additional case studies that illustrate how shifts in students' views of the nature of scientific work can change their interest in future participation in scientific work.

  5. Exploring Turkish Upper Primary Level Science Textbooks' Coverage of Scientific Literacy Themes

    ERIC Educational Resources Information Center

    Çakici, Yilmaz

    2012-01-01

    Problem Statement: Since the 1970s, scientific literacy has been a major goal of national educational systems throughout the world, and thus reform movements in science education call for all students to be scientifically literate. Despite some good curricular changes and developments across the globe, much remains to be achieved. Given that…

  6. Bring NASA Scientific Data into GIS

    NASA Astrophysics Data System (ADS)

    Xu, H.

    2016-12-01

    NASA's Earth Observation System (EOS) and many other missions produce huge volumes of data in near real time, which drive the research and understanding of climate change. Geographic Information System (GIS) technology is used for the management, visualization and analysis of spatial data. Since its inception in the 1960s, GIS has been applied to many fields at the city, state, national, and world scales. People continue to use it today to analyze and visualize trends, patterns, and relationships in massive scientific datasets. There is great interest in both the scientific and GIS communities in improving technologies that can bring scientific data into a GIS environment, where scientific research and analysis can be shared through the GIS platform with the public. Most NASA scientific data are delivered in the Hierarchical Data Format (HDF), a format that is both flexible and powerful. However, this flexibility results in challenges when developing GIS software support: data stored in HDF formats lack a unified standard and convention among these products. The presentation introduces an information model that enables ArcGIS software to ingest NASA scientific data and create a multidimensional raster - univariate and multivariate hypercubes - for scientific visualization and analysis. We will present the framework by which ArcGIS leverages the open-source GDAL (Geospatial Data Abstraction Library) for its raster data access, discuss how we overcame the limitations of the GDAL drivers in handling scientific products stored in HDF4 and HDF5 formats, and how we improved the modeling of multidimensionality with GDAL. In addition, we will talk about the direction of ArcGIS in handling NASA products and demonstrate how the multidimensional information model can help scientists work with various data products such as MODIS, MOPITT and SMAP, as well as many other data products, in a GIS environment.
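
    The presentation abstract does not include code, but the first step it alludes to (using GDAL to open an HDF container and enumerate its subdatasets before building a multidimensional raster) can be sketched as below. The granule name is a hypothetical MODIS-style placeholder, and the sketch uses only generic GDAL calls rather than the ArcGIS information model described in the talk.

    ```python
    # Sketch: open an HDF-based NASA product with GDAL, list its subdatasets,
    # then read one subdataset as an ordinary raster array.
    from osgeo import gdal

    gdal.UseExceptions()
    container = gdal.Open("MOD11A1.A2016001.h08v05.006.hdf")  # hypothetical granule

    subdatasets = container.GetSubDatasets()   # list of (name, description) pairs
    for name, description in subdatasets:
        print(description)

    band = gdal.Open(subdatasets[0][0]).GetRasterBand(1)
    print(band.ReadAsArray().shape)
    ```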

  7. Subsampling for dataset optimisation

    NASA Astrophysics Data System (ADS)

    Ließ, Mareike

    2017-04-01

    Soil-landscapes have formed by the interaction of soil-forming factors and pedogenic processes. In modelling these landscapes in their pedodiversity and the underlying processes, a representative unbiased dataset is required. This concerns model input as well as output data. However, very often big datasets are available which are highly heterogeneous and were gathered for various purposes, but not to model a particular process or data space. As a first step, the overall data space and/or landscape section to be modelled needs to be identified including considerations regarding scale and resolution. Then the available dataset needs to be optimised via subsampling to well represent this n-dimensional data space. A couple of well-known sampling designs may be adapted to suit this purpose. The overall approach follows three main strategies: (1) the data space may be condensed and de-correlated by a factor analysis to facilitate the subsampling process. (2) Different methods of pattern recognition serve to structure the n-dimensional data space to be modelled into units which then form the basis for the optimisation of an existing dataset through a sensible selection of samples. Along the way, data units for which there is currently insufficient soil data available may be identified. And (3) random samples from the n-dimensional data space may be replaced by similar samples from the available dataset. While being a presupposition to develop data-driven statistical models, this approach may also help to develop universal process models and identify limitations in existing models.
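
    As a hedged illustration of strategies (1) and (3) above, the sketch below condenses a stand-in dataset with a factor analysis and then replaces random target points in that condensed space with the nearest samples actually present in the dataset. The matrix dimensions, number of factors and sample counts are arbitrary assumptions.

    ```python
    # Hedged sketch of the subsampling idea: condense the data space, then map
    # random target points onto the nearest available real samples.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(1)
    big_dataset = rng.normal(size=(5000, 12))    # available samples (stand-in data)

    scores = FactorAnalysis(n_components=3).fit_transform(big_dataset)

    # Random target points spanning the condensed data space.
    targets = rng.uniform(scores.min(axis=0), scores.max(axis=0), size=(100, 3))
    _, idx = NearestNeighbors(n_neighbors=1).fit(scores).kneighbors(targets)

    subsample = big_dataset[np.unique(idx.ravel())]   # optimised representative subset
    print(subsample.shape)
    ```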

  8. Facilitating Stewardship of scientific data through standards based workflows

    NASA Astrophysics Data System (ADS)

    Bastrakova, I.; Kemp, C.; Potter, A. K.

    2013-12-01

    scientific data acquisition and analysis requirements and effective interoperable data management and delivery. This includes participating in national and international dialogue on the development of standards, embedding data management activities in business processes, and developing scientific staff as effective data stewards. A similar approach is applied to the geophysical data. By ensuring that the geophysical datasets at GA strictly follow metadata and industry standards, we are able to implement a provenance-based workflow in which the data are easily discoverable, geophysical processing can be applied to them, and the results can be stored. The provenance-based workflow enables metadata records for the results to be produced automatically from the input dataset metadata.

  9. Reports and recommendations from COSPAR Planetary Exploration Committee (PEX) & International Lunar Exploration Working Group (ILEWG)

    NASA Astrophysics Data System (ADS)

    Ehrenfreund, Pascale; Foing, Bernard

    2014-05-01

    In response to the growing importance of space exploration, the objectives of the COSPAR Panel on Exploration (PEX) are to provide high-quality, independent science input to support the development of a global space exploration program while working to safeguard the scientific assets of solar system bodies. PEX engages with COSPAR Commissions and Panels, science foundations, IAA, IAF, UN bodies, and IISL to support in particular national and international space exploration working groups and the new era of planetary exploration. COSPAR's input, as gathered by PEX, is intended to express the consensus view of the international scientific community and should ultimately provide a series of guidelines to support future space exploration activities and cooperative efforts, leading to outstanding scientific discoveries, opportunities for innovation, strategic partnerships, technology progression, and inspiration for people of all ages and cultures worldwide. We shall focus on the lunar exploration aspects, where COSPAR PEX is building on previous COSPAR, ILEWG and community conferences. An updated COSPAR PEX report is published and available online (Ehrenfreund P. et al., COSPAR planetary exploration panel report, http://www.gwu.edu/~spi/assets/COSPAR_PEX2012.pdf). We celebrate 20 years after the 1st International Conference on Exploration and Utilisation of the Moon at Beatenberg in June 1994. The International Lunar Exploration Working Group (ILEWG) was established the year after, in April 1995, at an EGS meeting in Hamburg, Germany. As established in its charter, this working group reports to COSPAR and is charged with developing an international strategy for the exploration of the Moon (http://sci.esa.int/ilewg/). It discusses coordination between missions and a road map for future international lunar exploration and utilisation. It fosters information exchange on potential and real future lunar robotic and human missions, as well as for new scientific and

  10. Development of a video tampering dataset for forensic investigation.

    PubMed

    Ismael Al-Sanjary, Omar; Ahmed, Ahmed Abdullah; Sulong, Ghazali

    2016-09-01

    Forgery is an act of modifying a document, product, image or video, among other media. Video tampering detection research requires an inclusive database of video modifications. This paper discusses a comprehensive proposal to create a dataset composed of modified videos for forensic investigation, in order to standardize existing techniques for detecting video tampering. The primary purpose of developing and designing this new video library is for use in video forensics, which can be consciously associated with reliable verification using dynamic and static camera recognition. To the best of the authors' knowledge, no similar library exists among the research community. Videos were sourced from YouTube and through extensive exploration of social networking sites, observing posted videos and their feedback ratings. The video tampering dataset (VTD) comprises a total of 33 videos, divided among three categories of video tampering: (1) copy-move, (2) splicing, and (3) frame swapping. Compared to existing datasets, this is a higher number of tampered videos, with longer durations. The duration of every video is 16 s, with a 1280×720 resolution and a frame rate of 30 frames per second. Moreover, all videos possess the same formatting quality (720p(HD).avi). Both temporal and spatial video features were considered carefully during the selection of the videos, and complete information is available about the doctored regions in every modified video in the VTD dataset. The database has been made publicly available for research on splicing, frame-swapping and copy-move tampering, and thus supports various video tampering detection studies with ground truth. The database has been utilised by many international researchers and research groups. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  11. Cell-Phone Use and Cancer: A Case Study Exploring the Scientific Method

    ERIC Educational Resources Information Center

    Colon Parrilla, Wilma V.

    2007-01-01

    Designed for an introductory nonmajors biology course, this case study presents students with a series of short news stories describing a scientific study of cell-phone use and its health effects. Students read the news stories and then the scientific paper they are based on, comparing the information presented by the news media to the information…

  12. Promoting Science Learning and Scientific Identification through Contemporary Scientific Investigations

    NASA Astrophysics Data System (ADS)

    Van Horne, Katie

    This dissertation investigates the implementation issues and the educational opportunities associated with "taking the practice turn" in science education. This pedagogical shift focuses instructional experiences on engaging students in the epistemic practices of science, both to learn the core ideas of the disciplines and to gain an understanding of, and personal connection to, the scientific enterprise. In Chapter 2, I examine the teacher-researcher co-design collaboration that supported the classroom implementation of a year-long, project-based biology curriculum that was under development. This study explores the dilemmas that arose when teachers implemented a new intervention and how those dilemmas were managed throughout the collaboration between researchers and teachers and between the teachers themselves. In the design-based research of Chapter 3, I demonstrate how students' engagement in epistemic practices in contemporary science investigations supported their conceptual development about genetics. The analysis shows how this involved a complex interaction between the scientific, school and community practices in students' lives and how, through varied participation in the practices, students come to write about and recognize how contemporary investigations can give them leverage for science-based action outside of the school setting. Finally, Chapter 4 explores the characteristics of learning environments for supporting the development of scientific practice-linked identities. Specific features of the learning environment (access to the intellectual work of the domain, authentic roles and accountability, space to make meaningful contributions in relation to personal interests, and practice-linked identity resources that arose from interactions in the learning setting) supported learners in stabilizing practice-linked science identities through their engagement in contemporary scientific practices. This set of studies shows that providing students with the

  13. The garden as a laboratory: the role of domestic gardens as places of scientific exploration in the long 18th century

    PubMed Central

    HICKMAN, CLARE

    2014-01-01

    Eighteenth-century gardens have traditionally been viewed as spaces designed for leisure, and as representations of political status, power and taste. In contrast, this paper will explore the concept that gardens in this period could be seen as dynamic spaces where scientific experiment and medical practice could occur. Two examples have been explored in the pilot study which has led to this paper — the designed landscapes associated with John Hunter’s Earl’s Court residence, in London, and the garden at Edward Jenner’s house in Berkeley, Gloucestershire. Garden history methodologies have been implemented in order to consider the extent to which these domestic gardens can be viewed as experimental spaces. PMID:26052165

  14. MaGnET: Malaria Genome Exploration Tool

    PubMed Central

    Sharman, Joanna L.; Gerloff, Dietlind L.

    2013-01-01

    Summary: The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive ‘exploration-style’ visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein–protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Availability and Implementation: Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org Contact: joanna.sharman@ed.ac.uk or dgerloff@ffame.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23894142

  15. A global distributed basin morphometric dataset

    NASA Astrophysics Data System (ADS)

    Shen, Xinyi; Anagnostou, Emmanouil N.; Mei, Yiwen; Hong, Yang

    2017-01-01

    Basin morphometry is vital information for relating storms to hydrologic hazards such as landslides and floods. In this paper we present the first comprehensive global dataset of distributed basin morphometry at 30 arc-second resolution. The dataset includes nine prime morphometric variables; in addition, we present formulas for generating twenty-one additional morphometric variables based on combinations of the prime variables. The dataset can aid different applications, including studies of land-atmosphere interaction and modelling of floods and droughts for sustainable water management. The validity of the dataset has been consolidated by successfully reproducing Hack's law.
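
    The record does not spell out the Hack's law check, but it amounts to fitting the exponent h in L = c * A^h, which is linear in log-log space. The sketch below, using synthetic stand-in values for basin area and main-stream length, shows the form of that fit; the numbers are not from the published dataset.

    ```python
    # Hedged sketch: recover the Hack exponent h in L = c * A**h by linear
    # regression of log L on log A. Area and length values are synthetic.
    import numpy as np

    rng = np.random.default_rng(2)
    area = 10 ** rng.uniform(1, 5, size=500)                  # basin area (synthetic)
    length = 1.4 * area ** 0.6 * rng.lognormal(0.0, 0.1, 500) # stream length (synthetic)

    h, log_c = np.polyfit(np.log10(area), np.log10(length), 1)
    print(f"fitted Hack exponent h = {h:.2f}, coefficient c = {10 ** log_c:.2f}")
    ```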

  16. Scientific Assessment of NASA's Solar System Exploration Roadmap

    NASA Technical Reports Server (NTRS)

    1996-01-01

    At its June 24-28, 1996, meeting, the Space Studies Board's Committee on Planetary and Lunar Exploration (COMPLEX), chaired by Ronald Greeley of Arizona State University, conducted an assessment of NASA's Mission to the Solar System Roadmap report. This assessment was made at the specific request of Dr. Jurgen Rahe, NASA's science program director for solar system exploration. The assessment includes consideration of the process by which the Roadmap was developed, comparison of the goals and objectives of the Roadmap with published National Research Council (NRC) recommendations, and suggestions for improving the Roadmap.

  17. Multilayered complex network datasets for three supply chain network archetypes on an urban road grid.

    PubMed

    Viljoen, Nadia M; Joubert, Johan W

    2018-02-01

    This article presents the multilayered complex network formulation for three different supply chain network archetypes on an urban road grid and describes how 500 instances were randomly generated for each archetype. Both the supply chain network layer and the urban road network layer are directed unweighted networks. The shortest path set is calculated for each of the 1,500 experimental instances. The datasets are used to empirically explore the impact that the supply chain's dependence on the transport network has on its vulnerability in Viljoen and Joubert (2017) [1]. The datasets are publicly available on Mendeley (Joubert and Viljoen, 2017) [2].
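
    The published instances are not reproduced here, but the structure they describe (a directed road-grid layer plus supply-chain origin/destination pairs, with a shortest road path computed for each pair) can be sketched with NetworkX. The 4x4 grid and the two O/D pairs below are illustrative assumptions, not the archetypes in the dataset.

    ```python
    # Hedged sketch of the shortest-path-set computation behind the dataset.
    import networkx as nx

    grid = nx.grid_2d_graph(4, 4)              # undirected 4x4 urban road grid
    road = nx.DiGraph()
    road.add_edges_from(grid.edges())          # one direction of each street
    road.add_edges_from((v, u) for u, v in grid.edges())  # and the reverse

    od_pairs = [((0, 0), (3, 3)), ((3, 0), (0, 2))]   # supply-chain layer (assumed)
    shortest_path_set = {(o, d): nx.shortest_path(road, o, d) for o, d in od_pairs}
    print(shortest_path_set)
    ```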

  18. [Scientific journalism and epidemiological risk].

    PubMed

    Luiz, Olinda do Carmo

    2007-01-01

    The importance of the communications media in the construction of symbols has been widely acknowledged. Many of the articles on health published in daily newspapers mention medical studies, sourced from scientific publications focusing on new risks. The disclosure of risk studies in the mass media is also a topic for editorials and articles in scientific journals, which focus on the problem of distortions and the appearance of contradictory news items. The purpose of this paper is to explore the meaning and content of disclosing scientific risk studies in large-circulation daily newspapers, analyzing news items published in Brazil and the scientific publications used as their sources during 2000. "Risk" is presented in the scientific research projects as a "black box" in Latour's sense, with the news items downplaying scientific disputes and underscoring associations between behavioral habits and the occurrence of diseases, emphasizing individual aspects of the epidemiological approach to the detriment of the group perspective.

  19. Attribute Utility Motivated k-anonymization of Datasets to Support the Heterogeneous Needs of Biomedical Researchers

    PubMed Central

    Ye, Huimin; Chen, Elizabeth S.

    2011-01-01

    In order to support the increasing need to share electronic health data for research purposes, various methods have been proposed for privacy preservation, including k-anonymity. Many k-anonymity models provide the same level of anonymization regardless of practical need, which may decrease the utility of the dataset for a particular research study. In this study, we explore extensions to the k-anonymity algorithm that aim to satisfy the heterogeneous needs of different researchers while preserving privacy as well as the utility of the dataset. The proposed algorithm, Attribute Utility Motivated k-anonymization (AUM), involves analyzing the characteristics of attributes and utilizing them to minimize information loss during the anonymization process. Through comparison with two existing algorithms, Mondrian and Incognito, preliminary results indicate that AUM may preserve more information from original datasets, thus providing higher-quality results with lower distortion. PMID:22195223
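
    The AUM algorithm itself is not reproduced in the record, but the property it ultimately has to guarantee, that every combination of quasi-identifier values occurs at least k times, is easy to state as a check. The toy table, quasi-identifier choice and k value below are illustrative assumptions.

    ```python
    # Hedged sketch of a k-anonymity check: every quasi-identifier combination
    # must appear at least k times in the released table.
    import pandas as pd

    k = 2
    df = pd.DataFrame({
        "age_band":   ["30-39", "30-39", "40-49", "40-49", "40-49"],
        "zip_prefix": ["021**", "021**", "100**", "100**", "100**"],
        "diagnosis":  ["asthma", "flu", "asthma", "diabetes", "flu"],
    })

    group_sizes = df.groupby(["age_band", "zip_prefix"]).size()
    print(group_sizes)
    print("k-anonymous for k =", k, ":", bool((group_sizes >= k).all()))
    ```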

  20. Exploring methods to expedite the recording of CEST datasets using selective pulse excitation

    NASA Astrophysics Data System (ADS)

    Yuwen, Tairan; Bouvignies, Guillaume; Kay, Lewis E.

    2018-07-01

    Chemical Exchange Saturation Transfer (CEST) has emerged as a powerful tool for studies of biomolecular conformational exchange involving the interconversion between a major, visible conformer and one or more minor, invisible states. Applications typically entail recording a large number of 2D datasets, each of which differs in the position of a weak radio frequency field, so as to generate a CEST profile for each nucleus from which the chemical shifts of spins in the invisible state(s) are obtained. Here we compare a number of band-selective CEST schemes for speeding up the process using either DANTE or cosine-modulated excitation approaches. We show that while both are essentially identical for applications such as 15N CEST, in cases where the probed spins are dipolar or scalar coupled to other like spins there can be advantages for the cosine-excitation scheme.
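
    The pulse sequences themselves are beyond the scope of this record, but the basic reason a cosine-modulated pulse excites two symmetric bands can be illustrated numerically: multiplying an envelope by cos(2*pi*offset*t) splits its frequency response into components at +offset and -offset. The time step, offset and envelope below are arbitrary illustrative values, not the experimental parameters of the paper.

    ```python
    # Hedged illustration: a cosine-modulated pulse concentrates its spectral
    # weight at +/- the modulation frequency (only the positive half is shown
    # by the real FFT used here).
    import numpy as np

    dt = 1e-5                                   # time step, s (arbitrary)
    t = np.arange(0, 0.02, dt)
    offset = 2000.0                             # band position, Hz (assumed)
    pulse = 2 * np.cos(2 * np.pi * offset * t)  # rectangular envelope x cosine

    spectrum = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(t.size, dt)
    print("spectral peak near", freqs[np.argmax(spectrum)], "Hz")
    ```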

  1. Exploration Science Opportunities for Students within Higher Education

    NASA Astrophysics Data System (ADS)

    Bailey, Brad; Minafra, Joseph; Schmidt, Gregory

    2016-10-01

    The NASA Solar System Exploration Research Virtual Institute (SSERVI) is a virtual institute focused on exploration science related to near-term human exploration targets, training the next generation of lunar scientists, and education and public outreach. As part of the SSERVI mission, we act as a hub for opportunities that engage the public through education and outreach efforts, in addition to forming new interdisciplinary scientific collaborations. SSERVI provides opportunities for students to bridge the scientific and generational gap currently existing in the planetary exploration field. This bridge is essential to the continued international success of scientific, as well as human and robotic, exploration. The decline in funding opportunities after the termination of the Apollo missions to the Moon in the early 1970s produced a large gap in both the scientific knowledge and experience of the original lunar Apollo researchers and the resurgent group of young lunar/NEA researchers that have emerged within the last 15 years. One of SSERVI's many goals is to bridge this gap through the many networking and scientific connections made between young researchers and established planetary principal investigators. To this end, SSERVI has supported the establishment of the NextGen Lunar Scientists and Engineers (NGLSE) group, a group of students and early-career professionals designed to build experience and provide networking opportunities for its members. SSERVI has also created the LunarGradCon, a scientific conference dedicated solely to graduate and undergraduate students working in the lunar field. Additionally, SSERVI produces monthly seminars and bi-yearly virtual workshops that introduce students to the wide variety of exploration science being performed in today's research labs. SSERVI also brokers opportunities for domestic and international student exchange between collaborating laboratories as well as internships at our member institutions. SSERVI provides a

  2. Bayesian correlated clustering to integrate multiple datasets

    PubMed Central

    Kirk, Paul; Griffin, Jim E.; Savage, Richard S.; Ghahramani, Zoubin; Wild, David L.

    2012-01-01

    Motivation: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct—but often complementary—information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured through parameters that describe the agreement among the datasets. Results: Using a set of six artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real Saccharomyces cerevisiae datasets. In the two-dataset case, we show that MDI’s performance is comparable with the present state-of-the-art. We then move beyond the capabilities of current approaches and integrate gene expression, chromatin immunoprecipitation–chip and protein–protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques—as well as to non-integrative approaches—demonstrate that MDI is competitive, while also providing information that would be difficult or impossible to extract using other methods. Availability: A Matlab implementation of MDI is available from http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/. Contact: D.L.Wild@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID

  3. The French Muséum national d'histoire naturelle vascular plant herbarium collection dataset

    NASA Astrophysics Data System (ADS)

    Le Bras, Gwenaël; Pignal, Marc; Jeanson, Marc L.; Muller, Serge; Aupic, Cécile; Carré, Benoît; Flament, Grégoire; Gaudeul, Myriam; Gonçalves, Claudia; Invernón, Vanessa R.; Jabbour, Florian; Lerat, Elodie; Lowry, Porter P.; Offroy, Bérangère; Pimparé, Eva Pérez; Poncy, Odile; Rouhan, Germinal; Haevermans, Thomas

    2017-02-01

    We provide a quantitative description of the French national herbarium vascular plants collection dataset. Held at the Muséum national d'histoire naturelle, Paris, it currently comprises records for 5,400,000 specimens, representing 90% of the estimated total of specimens. Ninety-nine percent of the specimen entries are linked to one or more images and 16% have field-collecting information available. This major botanical collection represents the results of over three centuries of exploration and study. The sources of the collection are global, with a strong representation for France, including overseas territories, and former French colonies. The compilation of this dataset was made possible through numerous national and international projects, the most important of which was linked to the renovation of the herbarium building. The vascular plant collection is actively expanding today, hence the continuous growth exhibited by the dataset, which can be fully accessed through the GBIF portal or the MNHN database portal (available at: https://science.mnhn.fr/institution/mnhn/collection/p/item/search/form). This dataset is a major source of data for systematics, global plants macroecological studies or conservation assessments.

  4. The French Muséum national d'histoire naturelle vascular plant herbarium collection dataset.

    PubMed

    Le Bras, Gwenaël; Pignal, Marc; Jeanson, Marc L; Muller, Serge; Aupic, Cécile; Carré, Benoît; Flament, Grégoire; Gaudeul, Myriam; Gonçalves, Claudia; Invernón, Vanessa R; Jabbour, Florian; Lerat, Elodie; Lowry, Porter P; Offroy, Bérangère; Pimparé, Eva Pérez; Poncy, Odile; Rouhan, Germinal; Haevermans, Thomas

    2017-02-14

    We provide a quantitative description of the French national herbarium vascular plants collection dataset. Held at the Muséum national d'histoire naturelle, Paris, it currently comprises records for 5,400,000 specimens, representing 90% of the estimated total of specimens. Ninety-nine percent of the specimen entries are linked to one or more images and 16% have field-collecting information available. This major botanical collection represents the results of over three centuries of exploration and study. The sources of the collection are global, with a strong representation for France, including overseas territories, and former French colonies. The compilation of this dataset was made possible through numerous national and international projects, the most important of which was linked to the renovation of the herbarium building. The vascular plant collection is actively expanding today, hence the continuous growth exhibited by the dataset, which can be fully accessed through the GBIF portal or the MNHN database portal (available at: https://science.mnhn.fr/institution/mnhn/collection/p/item/search/form). This dataset is a major source of data for systematics, global plants macroecological studies or conservation assessments.

  5. The French Muséum national d’histoire naturelle vascular plant herbarium collection dataset

    PubMed Central

    Le Bras, Gwenaël; Pignal, Marc; Jeanson, Marc L.; Muller, Serge; Aupic, Cécile; Carré, Benoît; Flament, Grégoire; Gaudeul, Myriam; Gonçalves, Claudia; Invernón, Vanessa R.; Jabbour, Florian; Lerat, Elodie; Lowry, Porter P.; Offroy, Bérangère; Pimparé, Eva Pérez; Poncy, Odile; Rouhan, Germinal; Haevermans, Thomas

    2017-01-01

    We provide a quantitative description of the French national herbarium vascular plants collection dataset. Held at the Muséum national d’histoire naturelle, Paris, it currently comprises records for 5,400,000 specimens, representing 90% of the estimated total of specimens. Ninety-nine percent of the specimen entries are linked to one or more images and 16% have field-collecting information available. This major botanical collection represents the results of over three centuries of exploration and study. The sources of the collection are global, with a strong representation for France, including overseas territories, and former French colonies. The compilation of this dataset was made possible through numerous national and international projects, the most important of which was linked to the renovation of the herbarium building. The vascular plant collection is actively expanding today, hence the continuous growth exhibited by the dataset, which can be fully accessed through the GBIF portal or the MNHN database portal (available at: https://science.mnhn.fr/institution/mnhn/collection/p/item/search/form). This dataset is a major source of data for systematics, global plants macroecological studies or conservation assessments. PMID:28195585

  6. Scientists and Scientific Thinking: Understanding Scientific Thinking through an Investigation of Scientists' Views about Superstitions and Religious Beliefs

    ERIC Educational Resources Information Center

    Coll, Richard K.; Lay, Mark C.; Taylor, Neil

    2008-01-01

    Scientific literacy is explored in this paper which describes two studies that seek to understand a particular feature of the nature of science; namely scientists' habits of mind. The research investigated scientists' views of scientific evidence and how scientists judge evidence claims. The first study is concerned with scientists' views of what…

  7. SATORI: a system for ontology-guided visual exploration of biomedical data repositories.

    PubMed

    Lekschas, Fritz; Gehlenborg, Nils

    2018-04-01

    The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use, but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. We developed SATORI, an integrative search and visual exploration interface for biomedical data repositories. The design is informed by a requirements analysis conducted through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. nils@hms.harvard.edu. Supplementary data are available at Bioinformatics online.

  8. Dam Removal Information Portal (DRIP)—A map-based resource linking scientific studies and associated geospatial information about dam removals

    USGS Publications Warehouse

    Duda, Jeffrey J.; Wieferich, Daniel J.; Bristol, R. Sky; Bellmore, J. Ryan; Hutchison, Vivian B.; Vittum, Katherine M.; Craig, Laura; Warrick, Jonathan A.

    2016-08-18

    The removal of dams has recently increased over historical levels due to aging infrastructure, changing societal needs, and modern safety standards rendering some dams obsolete. Where the possibilities for river restoration or improved safety exceed the benefits of retaining a dam, removal is more often being considered as a viable option. Yet, as this is a relatively new development in the history of river management, science is just beginning to guide our understanding of the physical and ecological implications of dam removal. Ultimately, the “lessons learned” from previous scientific studies on the outcomes of dam removal could inform future scientific understanding of ecosystem outcomes, as well as aid in decision-making by stakeholders. We created a database visualization tool, the Dam Removal Information Portal (DRIP), to display map-based, interactive information about the scientific studies associated with dam removals. Serving both as a bibliographic source and as a link to other existing databases such as the National Hydrography Dataset, the derived National Dam Removal Science Database serves as the foundation for a Web-based application that synthesizes the existing scientific studies associated with dam removals. Thus, using the DRIP application, users can explore information about completed dam removal projects (for example, their location, height, and date removed), as well as discover the sources and details of associated scientific studies. As such, DRIP is intended to be a dynamic collection of scientific information related to dams that have been removed in the United States and elsewhere. This report describes the architecture and concepts of this “metaknowledge” database and the DRIP visualization tool.

  9. Introducing the VISAGE project - Visualization for Integrated Satellite, Airborne, and Ground-based data Exploration

    NASA Astrophysics Data System (ADS)

    Gatlin, P. N.; Conover, H.; Berendes, T.; Maskey, M.; Naeger, A. R.; Wingo, S. M.

    2017-12-01

    A key component of NASA's Earth observation system is its field experiments, for intensive observation of particular weather phenomena, or for ground validation of satellite observations. These experiments collect data from a wide variety of airborne and ground-based instruments, on different spatial and temporal scales, often in unique formats. The field data are often used with high volume satellite observations that have very different spatial and temporal coverage. The challenges inherent in working with such diverse datasets make it difficult for scientists to rapidly collect and analyze the data for physical process studies and validation of satellite algorithms. The newly-funded VISAGE project will address these issues by combining and extending nascent efforts to provide on-line data fusion, exploration, analysis and delivery capabilities. A key building block is the Field Campaign Explorer (FCX), which allows users to examine data collected during field campaigns and simplifies data acquisition for event-based research. VISAGE will extend FCX's capabilities beyond interactive visualization and exploration of coincident datasets, to provide interrogation of data values and basic analyses such as ratios and differences between data fields. The project will also incorporate new, higher level fused and aggregated analysis products from the System for Integrating Multi-platform data to Build the Atmospheric column (SIMBA), which combines satellite and ground-based observations into a common gridded atmospheric column data product; and the Validation Network (VN), which compiles a nationwide database of coincident ground- and satellite-based radar measurements of precipitation for larger scale scientific analysis. The VISAGE proof-of-concept will target "golden cases" from Global Precipitation Measurement Ground Validation campaigns. This presentation will introduce the VISAGE project, initial accomplishments and near term plans.

  10. Map_plot and bgg_plot: software for integration of geoscience datasets

    NASA Astrophysics Data System (ADS)

    Gaillot, Philippe; Punongbayan, Jane T.; Rea, Brice

    2004-02-01

    Since 1985, the Ocean Drilling Program (ODP) has been supporting multidisciplinary research in exploring the structure and history of Earth beneath the oceans. After more than 200 Legs, complementary datasets covering different geological environments, periods and space scales have been obtained and distributed world-wide using the ODP-Janus and Lamont Doherty Earth Observatory-Borehole Research Group (LDEO-BRG) database servers. In Earth Sciences, more than in any other science, the ensemble of these data is characterized by heterogeneous formats and graphical representation modes. In order to fully and quickly assess this information, a set of Unix/Linux and Generic Mapping Tools (GMT)-based C programs has been designed to convert and integrate datasets acquired during the present ODP and the future Integrated ODP (IODP) Legs. Using ODP Leg 199 datasets, we show examples of the capabilities of the proposed programs. The program map_plot is used to easily display datasets onto 2-D maps. The program bgg_plot (borehole geology and geophysics plot) displays data with respect to depth and/or time. The latter program includes depth shifting, filtering and plotting of core summary information, continuous and discrete-sample core measurements (e.g. physical properties, geochemistry, etc.), in situ continuous logs, magneto- and bio-stratigraphies, specific sedimentological analyses (lithology, grain size, texture, porosity, etc.), as well as core and borehole wall images. Outputs from both programs are initially produced in PostScript format, which can be easily converted to Portable Document Format (PDF) or standard image formats (GIF, JPEG, etc.) using widely distributed conversion programs. Based on command line operations and customization of parameter files, these programs can be included in other shell- or database-scripts, automating plotting procedures of data requests. As open-source software, these programs can be customized and interfaced to fulfill any specific
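
    Because the abstract notes that the PostScript outputs are converted with widely distributed conversion programs and that the tools are meant to be wrapped in scripts, the sketch below shows one way such automation could look. The map_plot invocation and its -P flag are purely hypothetical (the real options are set via parameter files not documented here); the conversion step uses Ghostscript's ps2pdf utility.

```python
import subprocess
from pathlib import Path

def run_plot_and_convert(parameter_file: str, ps_out: str) -> Path:
    """Run a (hypothetical) map_plot invocation, then convert its
    PostScript output to PDF with Ghostscript's ps2pdf."""
    # The actual map_plot options live in its parameter file; the flag
    # name below is illustrative only, not the program's documented CLI.
    subprocess.run(["map_plot", "-P", parameter_file], check=True)

    pdf_out = Path(ps_out).with_suffix(".pdf")
    subprocess.run(["ps2pdf", ps_out, str(pdf_out)], check=True)
    return pdf_out

# Batch use, e.g. inside a data-request script (file names are placeholders):
# for leg in ["leg199_site1219", "leg199_site1220"]:
#     run_plot_and_convert(f"{leg}.par", f"{leg}.ps")
```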

  11. Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation

    PubMed Central

    Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B.

    2016-01-01

    Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing is now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarks and using digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations.
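
    A minimal sketch of the rate-based Poisson spike generation listed among the encoding techniques: pixel intensities are mapped linearly to firing rates and sampled at each time step. The maximum rate, duration, and time step below are illustrative choices, not the benchmark's published parameters.

```python
import numpy as np

def poisson_spike_trains(image, duration_ms=1000, dt_ms=1.0, max_rate_hz=100.0):
    """Convert pixel intensities in [0, 1] to Poisson spike trains.

    Returns a boolean array of shape (n_pixels, n_steps) where True marks a
    spike. Firing rate scales linearly with intensity up to max_rate_hz
    (an illustrative choice).
    """
    rates = np.asarray(image, dtype=float).ravel() * max_rate_hz   # Hz
    p_spike = rates * dt_ms / 1000.0                               # prob. per step
    n_steps = int(duration_ms / dt_ms)
    rng = np.random.default_rng(0)
    return rng.random((rates.size, n_steps)) < p_spike[:, None]

# A fake 28x28 "digit" with intensities in [0, 1]:
img = np.random.random((28, 28))
spikes = poisson_spike_trains(img)
print(spikes.shape, spikes.sum(), "spikes in 1 s")
```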

  12. Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation.

    PubMed

    Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B

    2016-01-01

    Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing is now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarks and using digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations.

  13. 77 FR 15052 - Dataset Workshop-U.S. Billion Dollar Disasters Dataset (1980-2011): Assessing Dataset Strengths...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-14

    ... and related methodology. Emphasis will be placed on dataset accuracy and time-dependent biases. Pathways to overcome accuracy and bias issues will be an important focus. Participants will consider... guidance for improving these methods and recommendations for rectifying any known time-dependent biases...

  14. A Network Mission: Completing the Scientific Foundation for the Exploration of Mars

    NASA Technical Reports Server (NTRS)

    W. B. Banerdt

    2000-01-01

    Despite recent setbacks and vacillations in the Mars Surveyor Program, in many respects the exploration of Mars has historically followed a relatively logical path. Early fly-bys provided brief glimpses of the planet and paved the way for the initial orbital reconnaissance of Mariner 9. The Viking orbiters completed the initial survey, while the Viking landers provided our first close-up look at the surface. Essentially, Mars Pathfinder served a similar role, giving a brief look at another place on the surface. And finally, Mars Global Surveyor (and the upcoming orbital mission in 2001) are taking the next step in providing in-depth, global observations of many of the fundamental characteristics of the planet, as well as selected high-resolution views of the surface. With this last step we are well on our way to acquiring the global scientific context that is necessary both for understanding Mars in general, its origin and evolution, and for use as a basis to plan and execute the next level of focused investigations. However, even with the successful completion of these missions, this context will be incomplete. Whereas we now know a great deal about the surface of Mars in a global sense, we know very little about its interior, even at depths of only a meter or so. Also, as most of this information has been acquired by remote sensing, we still lack much of the bridging knowledge between the global view and the processes and character of the surface environments themselves. Thus, in many ways we lack sufficient fundamental understanding to intelligently cast the critical investigations into important questions of the origins and evolution of Mars in general, and in particular, life. The next step in building our understanding of Mars has been identified by several previous groups who were charged with creating a strategy for Mars exploration (e.g., COMPLEX, MarSWG, Planetary Roadmap Team). This is a so-called "network" mission, which places a large number of science

  15. NP-PAH Interaction Dataset

    EPA Pesticide Factsheets

    Dataset presents concentrations of organic pollutants, such as polyaromatic hydrocarbon compounds, in water samples. Water samples of known volume and concentration were allowed to equilibrate with known mass of nanoparticles. The mixture was then ultracentrifuged and sampled for analysis. This dataset is associated with the following publication: Sahle-Demessie, E., A. Zhao, C. Han, B. Hann, and H. Grecsek. Interaction of engineered nanomaterials with hydrophobic organic pollutants. Journal of Nanotechnology. Hindawi Publishing Corporation, New York, NY, USA, 27(28): 284003, (2016).

  16. Scientific Investigations Associated with the Human Exploration of Mars in the Next 35 Years

    NASA Technical Reports Server (NTRS)

    Niles, P. B.; Beaty, David; Hays, Lindsay; Bass, Deborah; Bell, Mary Sue; Bleacher, Jake; Cabrol, Nathalie A.; Conrad, Pan; Eppler, Dean; Hamilton, Vicky

    2017-01-01

    A human mission to Mars would present an unprecedented opportunity to investigate the earliest history of the solar system. This history has largely been overwritten on Earth by active geological processing, but on Mars, large swaths of the ancient crust remain exposed at the surface, allowing us to investigate martian processes at the earliest time periods, when life first appeared on the Earth. Mars' surface has been largely frozen in place for 4 billion years, and after the planet lost its atmosphere and magnetic field, what remains is an ancient landscape of former hydrothermal systems, river beds, volcanic eruptions, and impact craters. This allows us to investigate scientific questions ranging from the nature of the impact history of the solar system to the origins of life. We present here a summary of the findings of the Human Science Objectives Science Analysis Group (HSO-SAG), chartered by MEPAG in 2015 to address science objectives and landing site criteria for future human missions to Mars (Niles, Beaty et al. 2015). Currently, NASA's plan to land astronauts on Mars in the mid-2030s would allow for robust human exploration of the surface in the next 35 years. We expect that crews would be able to traverse to sites up to 100 km away from the original landing site using robust rovers. A habitat outfitted with state-of-the-art laboratory facilities could enable the astronauts to perform cutting-edge science on the surface of Mars. Robotic/human partnership during exploration would further enhance the science return of the mission.

  17. The Planned Europa Clipper Mission: Exploring Europa to Investigate its Habitability

    NASA Astrophysics Data System (ADS)

    Pappalardo, Robert T.; Senske, David A.; Korth, Haje; Blaney, Diana L.; Blankenship, Donald D.; Christensen, Philip R.; Kempf, Sascha; Raymond, Carol Anne; Retherford, Kurt D.; Turtle, Elizabeth P.; Waite, J. Hunter; Westlake, Joseph H.; Collins, Geoffrey; Gudipati, Murthy; Lunine, Jonathan I.; Paty, Carol; Rathbun, Julie A.; Roberts, James; Schmidt, Britney E.; Soderblom, Jason M.; Europa Clipper Science Team

    2017-10-01

    A key driver of planetary exploration is to understand the processes that lead to habitability across the solar system. In this context, the science goal of the planned Europa Clipper mission is: Explore Europa to investigate its habitability. Following from this goal are three Mission Objectives: 1) Characterize the ice shell and any subsurface water, including their heterogeneity, ocean properties, and the nature of surface-ice-ocean exchange; 2) Understand the habitability of Europa's ocean through composition and chemistry; and 3) Understand the formation of surface features, including sites of recent or current activity, and characterize localities of high science interest. Folded into these three objectives is the desire to search for and characterize any current activity. To address the Europa science objectives, a highly capable and synergistic suite of nine instruments comprises the mission's scientific payload. This payload includes five remote-sensing instruments that observe the wavelength range from ultraviolet through radar, specifically: Europa UltraViolet Spectrograph (Europa-UVS), Europa Imaging System (EIS), Mapping Imaging Spectrometer for Europa (MISE), Europa THErMal Imaging System (E-THEMIS), and Radar for Europa Assessment and Sounding: Ocean to Near-surface (REASON). In addition, four in-situ instruments measure fields and particles: Interior Characterization of Europa using MAGnetometry (ICEMAG), Plasma Instrument for Magnetic Sounding (PIMS), MAss Spectrometer for Planetary EXploration (MASPEX), and SUrface Dust Analyzer (SUDA). Moreover, gravity science can be addressed via the spacecraft's telecommunication system, and scientifically valuable engineering data from the radiation monitoring system would augment the plasma dataset. Working together, the planned Europa mission’s science payload would allow testing of hypotheses relevant to the composition, interior, and geology of Europa, to address the potential habitability of this

  18. Constructing Scientific Applications from Heterogeneous Resources

    NASA Technical Reports Server (NTRS)

    Schlichting, Richard D.

    1995-01-01

    A new model for high-performance scientific applications in which such applications are implemented as heterogeneous distributed programs or, equivalently, meta-computations, is investigated. The specific focus of this grant was a collaborative effort with researchers at NASA and the University of Toledo to test and improve Schooner, a software interconnection system, and to explore the benefits of increased user interaction with existing scientific applications.

  19. A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection.

    PubMed

    Li, Jia; Xia, Changqun; Chen, Xiaowu

    2017-10-12

    Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 image-based classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.
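
    For orientation, the snippet below trains a single-layer autoencoder with plain gradient descent on toy "saliency cue" vectors. It is a highly simplified stand-in for the saliency-guided stacked autoencoders described above (no stacking, no saliency guidance); all shapes and hyperparameters are illustrative.

```python
import numpy as np

def train_autoencoder(X, n_hidden=16, lr=0.01, epochs=200, seed=0):
    """Train a single-layer autoencoder by gradient descent on squared error.

    X has shape (n_samples, n_features); each row stands in for the
    high-dimensional saliency-cue vector of one pixel or superpixel.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)

    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # encode
        X_hat = H @ W2 + b2                # decode (linear output)
        err = X_hat - X                    # error signal (loss gradient up to a constant)
        dW2 = H.T @ err / n; db2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1.0 - H**2)   # back-propagate through tanh
        dW1 = X.T @ dH / n; db1 = dH.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2

# Toy saliency-cue vectors for 500 superpixels with 32 cues each.
X = np.random.default_rng(1).random((500, 32))
W1, b1, W2, b2 = train_autoencoder(X)
codes = np.tanh(X @ W1 + b1)               # low-dimensional codes per superpixel
print(codes.shape)
```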

  20. The citation merit of scientific publications.

    PubMed

    Crespo, Juan A; Ortuño-Ortín, Ignacio; Ruiz-Castillo, Javier

    2012-01-01

    We propose a new method to assess the merit of any set of scientific papers in a given field based on the citations they receive. Given a field and a citation impact indicator, such as the mean citation or the h-index, the merit of a given set of n articles is identified with the probability that a randomly drawn set of n articles from a given pool of articles in that field has a lower citation impact according to the indicator in question. The method allows for comparisons between sets of articles of different sizes and fields. Using a dataset acquired from Thomson Scientific that contains the articles published in the periodical literature in the period 1998-2007, we show that the novel approach yields rankings of research units different from those obtained by a direct application of the mean citation or the h-index.
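
    The merit definition above lends itself to a direct Monte Carlo estimate: draw many random sets of the same size from the field's pool and count how often their citation impact falls below that of the target set. The sketch below uses the mean citation as the indicator and a synthetic citation pool; it illustrates the definition rather than reproducing the authors' implementation.

```python
import numpy as np

def citation_merit(set_citations, field_pool, n_draws=5000, seed=0):
    """Estimate merit as the probability that a randomly drawn set of the
    same size from the field pool has a lower mean citation count."""
    rng = np.random.default_rng(seed)
    k = len(set_citations)
    target = np.mean(set_citations)
    lower = 0
    for _ in range(n_draws):
        sample = rng.choice(field_pool, size=k, replace=False)
        if sample.mean() < target:
            lower += 1
    return lower / n_draws

# Synthetic field: heavy-tailed citation counts for 10,000 articles.
pool = np.random.default_rng(1).negative_binomial(1, 0.05, size=10_000)
my_set = pool[np.argsort(pool)[-50:]]      # a deliberately strong set of 50 papers
print(citation_merit(my_set, pool))        # close to 1.0 for a strong set
```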

  1. Data Discovery of Big and Diverse Climate Change Datasets - Options, Practices and Challenges

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Boden, T.; McCord, R. A.; Frame, M. T.

    2013-12-01

    Developing data search tools is a very common, but often confusing, task for most data-intensive scientific projects. These search interfaces need to be continually improved to handle the ever-increasing diversity and volume of data collections. Many aspects determine the type of search tool a project needs to provide to its user community. These include the number of datasets, the amount and consistency of discovery metadata, ancillary information such as the availability of quality information and provenance, and the availability of similar datasets from other distributed sources. The Environmental Data Science and Systems (EDSS) group within the Environmental Science Division at the Oak Ridge National Laboratory has a long history of successfully managing diverse and big observational datasets for various scientific programs via data centers such as DOE's Atmospheric Radiation Measurement (ARM) Program, DOE's Carbon Dioxide Information Analysis Center (CDIAC), USGS's Core Science Analytics and Synthesis (CSAS) metadata Clearinghouse and NASA's ORNL Distributed Active Archive Center (ORNL DAAC). This talk will showcase some of the recent developments for improving data discovery within these centers. The DOE ARM program recently developed a data discovery tool which allows users to search and discover over 4000 observational datasets. These datasets are key to research efforts related to global climate change. The ARM discovery tool features many new functions such as filtered and faceted search logic, multi-pass data selection, filtering data based on data quality, graphical views of data quality and availability, direct access to data quality reports, and data plots. The ARM Archive also provides discovery metadata to other broader metadata clearinghouses such as ESGF, IASOA, and GOS. In addition to the new interface, ARM is also currently working on providing DOI metadata records to publishers such as Thomson Reuters and Elsevier. The ARM

  2. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  3. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.
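
    As a conceptual sketch only, the snippet below captures the core idea that global clusters arise from combinations of local, per-dataset cluster assignments. It substitutes k-means for the paper's hierarchical Dirichlet mixture model, so it is not Clusternomics itself; the two toy "omics" layers are random data.

```python
import numpy as np
from sklearn.cluster import KMeans

def context_dependent_clusters(datasets, n_local_clusters=3, seed=0):
    """Cluster each dataset separately, then define a 'global' cluster for
    every observed combination of local labels. A simplified stand-in for
    context-dependent integrative clustering, for illustration only."""
    local_labels = [
        KMeans(n_clusters=n_local_clusters, random_state=seed, n_init=10)
        .fit_predict(X)
        for X in datasets
    ]
    combos = list(zip(*local_labels))               # one label tuple per sample
    combo_ids = {c: i for i, c in enumerate(sorted(set(combos)))}
    return np.array([combo_ids[c] for c in combos]), local_labels

# Two toy "omics" layers measured on the same 300 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(300, 50))
meth = rng.normal(size=(300, 20))
global_labels, local = context_dependent_clusters([expr, meth])
print(len(set(global_labels)), "global clusters")
```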

  4. Facing the Challenges of Accessing, Managing, and Integrating Large Observational Datasets in Ecology: Enabling and Enriching the Use of NEON's Observational Data

    NASA Astrophysics Data System (ADS)

    Thibault, K. M.

    2013-12-01

    As the construction of NEON and its transition to operations progresses, more and more data will become available to the scientific community, both from NEON directly and from the concomitant growth of existing data repositories. Many of these datasets include ecological observations of a diversity of taxa in both aquatic and terrestrial environments. Although observational data have been collected and used throughout the history of organismal biology, the field has not yet fully developed a culture of data management, documentation, standardization, sharing and discoverability to facilitate the integration and synthesis of datasets. Moreover, the tools required to accomplish these goals, namely database design, implementation, and management, and automation and parallelization of analytical tasks through computational techniques, have not historically been included in biology curricula, at either the undergraduate or graduate levels. To ensure the success of data-generating projects like NEON in advancing organismal ecology and to increase transparency and reproducibility of scientific analyses, an acceleration of the cultural shift to open science practices, the development and adoption of data standards, such as the DarwinCore standard for taxonomic data, and increased training in computational approaches for biologists need to be realized. Here I highlight several initiatives that are intended to increase access to and discoverability of publicly available datasets and equip biologists and other scientists with the skills that are needed to manage, integrate, and analyze data from multiple large-scale projects. The EcoData Retriever (ecodataretriever.org) is a tool that downloads publicly available datasets, re-formats the data into an efficient relational database structure, and then automatically imports the data tables onto a user's local drive into the database tool of the user's choice. The automation of these tasks results in nearly instantaneous execution
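
    A minimal sketch of the workflow such retriever tools automate (read a downloaded table, tidy column names, load it into a relational database), here using pandas and SQLite. The file, table, and column handling are placeholders and do not reproduce the EcoData Retriever's actual behavior.

```python
import sqlite3

import pandas as pd

def load_csv_to_sqlite(csv_path, db_path="ecology.sqlite", table="observations"):
    """Read a delimited dataset, tidy its column names, and load it into a
    local SQLite table; returns the number of rows loaded."""
    df = pd.read_csv(csv_path)
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    with sqlite3.connect(db_path) as conn:
        df.to_sql(table, conn, if_exists="replace", index=False)
        n = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return n

# Example (assuming a previously downloaded file; the name is a placeholder):
# print(load_csv_to_sqlite("mammal_community_survey.csv"))
```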

  5. Latest processing status and quality assessment of the GOMOS, MIPAS and SCIAMACHY ESA dataset

    NASA Astrophysics Data System (ADS)

    Niro, F.; Brizzi, G.; Saavedra de Miguel, L.; Scarpino, G.; Dehn, A.; Fehr, T.; von Kuhlmann, R.

    2011-12-01

    GOMOS, MIPAS and SCIAMACHY instruments have been successfully observing the changing Earth's atmosphere since the launch of the ENVISAT-ESA platform in March 2002. The measurements recorded by these instruments are relevant for the Atmospheric-Chemistry community both in terms of time extent and variety of observing geometries and techniques. In order to fully exploit these measurements, it is crucial to maintain a good reliability in the data processing and distribution and to continuously improve the scientific output. The goal is to meet the evolving needs of both near-real-time and research applications. Within this frame, the ESA operational processor remains the reference code, although many scientific algorithms are nowadays available to the users. In fact, the ESA algorithm has a well-established calibration and validation scheme, a certified quality assessment process and the possibility to reach a wide users' community. Moreover, the ESA algorithm upgrade procedures and the re-processing performances have much improved during the last two years, thanks to the recent updates of the Ground Segment infrastructure and overall organization. The aim of this paper is to promote the usage and stress the quality of the ESA operational dataset for the GOMOS, MIPAS and SCIAMACHY missions. The recent upgrades in the ESA processors (GOMOS V6, MIPAS V5 and SCIAMACHY V5) will be presented, with detailed information on improvements in the scientific output and preliminary validation results. The planned algorithm evolution and on-going re-processing campaigns, which involve the adoption of advanced set-ups such as the MIPAS V6 re-processing on a cloud-computing system, will also be mentioned. Finally, the quality control process that guarantees a standard of quality to the users will be illustrated. In fact, the operational ESA algorithm is carefully tested before switching into operations and the near-real time and off-line production is thoughtfully verified via the

  6. Multi-facetted Metadata - Describing datasets with different metadata schemas at the same time

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Klump, Jens; Bertelmann, Roland

    2013-04-01

    Inspired by the wish to re-use research data, much work is being done to bring the data systems of the Earth sciences together. Discovery metadata is disseminated to data portals to allow building of customized indexes of catalogued dataset items. Data that were once acquired in the context of a scientific project are open for reappraisal and can now be used by scientists who were not part of the original research team. To make data re-use easier, measurement methods and measurement parameters must be documented in an application metadata schema and described in a written publication. Linking datasets to publications - as DataCite [1] does - requires again a specific metadata schema, and every new use context of the measured data may require yet another metadata schema sharing only a subset of information with the meta information already present. To cope with the problem of metadata schema diversity in our common data repository at GFZ Potsdam, we established a solution to store file-based research data and describe these with an arbitrary number of metadata schemas. The core component of the data repository is an eSciDoc infrastructure that provides versioned container objects, called eSciDoc [2] "items". The eSciDoc content model allows assigning files to "items" and adding any number of metadata records to these "items". The eSciDoc items can be submitted, revised, and finally published, which makes the data and metadata available through the internet worldwide. GFZ Potsdam uses eSciDoc to support its scientific publishing workflow, including mechanisms for data review in peer review processes by providing temporary web links for external reviewers who do not have credentials to access the data. Based on the eSciDoc API, panMetaDocs [3] provides a web portal for data management in research projects. PanMetaDocs, which is based on panMetaWorks [4], is a PHP-based web application that allows data to be described with any XML-based schema. It uses the eSciDoc infrastructures

  7. TRI Preliminary Dataset

    EPA Pesticide Factsheets

    The TRI preliminary dataset includes the most current TRI data available and reflects toxic chemical releases and pollution prevention activities that occurred at TRI facilities during each calendar year.

  8. Ares V: Application to Solar System Scientific Exploration

    NASA Technical Reports Server (NTRS)

    Reh, Kim; Spilker, Tom; Elliott, John; Balint, Tibor; Donahue, Ben; McCormick, Dave; Smith, David B.; Tandon, Sunil; Woodcock, Gordon

    2008-01-01

    The following sections describe Ares V performance and its payoff to a wide array of potential solar system exploration missions. Application to potential Astrophysics missions is addressed in Reference 3.

  9. Comparison of recent SnIa datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez, J.C. Bueno; Perivolaropoulos, L.; Nesseris, S., E-mail: jbueno@cc.uoi.gr, E-mail: nesseris@nbi.ku.dk, E-mail: leandros@uoi.gr

    2009-11-01

    We rank the six latest Type Ia supernova (SnIa) datasets (Constitution (C), Union (U), ESSENCE (Davis) (E), Gold06 (G), SNLS 1yr (S) and SDSS-II (D)) in the context of the Chevallier-Polarski-Linder (CPL) parametrization w(a) = w0 + w1(1 − a), according to their Figure of Merit (FoM), their consistency with the cosmological constant (ΛCDM), their consistency with standard rulers (Cosmic Microwave Background (CMB) and Baryon Acoustic Oscillations (BAO)) and their mutual consistency. We find a significant improvement of the FoM (defined as the inverse area of the 95.4% parameter contour) with the number of SnIa of these datasets ((C) highest FoM, (U), (G), (D), (E), (S) lowest FoM). Standard rulers (CMB+BAO) have a better FoM by about a factor of 3, compared to the highest FoM SnIa dataset (C). We also find that the ranking sequence based on consistency with ΛCDM is identical with the corresponding ranking based on consistency with standard rulers ((S) most consistent, (D), (C), (E), (U), (G) least consistent). The ranking sequence of the datasets however changes when we consider the consistency with an expansion history corresponding to evolving dark energy (w0, w1) = (−1.4, 2) crossing the phantom divide line w = −1 (it is practically reversed to (G), (U), (E), (S), (D), (C)). The SALT2 and MLCS2k2 fitters are also compared and some peculiar features of the SDSS-II dataset when standardized with the MLCS2k2 fitter are pointed out. Finally, we construct a statistic to estimate the internal consistency of a collection of SnIa datasets. We find that even though there is good consistency among most samples taken from the above datasets, this consistency decreases significantly when the Gold06 (G) dataset is included in the sample.
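
    For reference, the CPL parametrization quoted above, and the dark-energy density evolution it implies under the standard continuity equation, can be written in a few lines of Python. The numerical values used are only the example (w0, w1) = (−1.4, 2) quoted in the abstract.

```python
import numpy as np

def w_cpl(a, w0, w1):
    """Chevallier-Polarski-Linder equation of state: w(a) = w0 + w1*(1 - a)."""
    return w0 + w1 * (1.0 - a)

def rho_de_ratio(a, w0, w1):
    """Dark-energy density relative to today implied by the CPL form:
    rho_DE(a)/rho_DE0 = a**(-3*(1 + w0 + w1)) * exp(-3*w1*(1 - a))."""
    return a ** (-3.0 * (1.0 + w0 + w1)) * np.exp(-3.0 * w1 * (1.0 - a))

a = np.linspace(0.3, 1.0, 8)            # scale factor, redshift ~2.3 down to 0
print(w_cpl(a, -1.4, 2.0))              # crosses w = -1, the "phantom divide"
print(rho_de_ratio(a, -1.4, 2.0))
```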

  10. Thermal analyses of the International Ultraviolet Explorer (IUE) scientific instrument using the NASTRAN thermal analyzer (NTA): A general purpose summary

    NASA Technical Reports Server (NTRS)

    Jackson, C. E., Jr.

    1976-01-01

    The NTA Level 15.5.2/3 was used to provide non-linear steady-state (NLSS) and non-linear transient (NLTR) thermal predictions for the International Ultraviolet Explorer (IUE) Scientific Instrument (SI). NASTRAN structural models were used as the basis for the thermal models, which were produced by a straightforward conversion procedure. The accuracy of this technique was subsequently demonstrated by a comparison of NTA predictions with the results of a thermal vacuum test of the IUE Engineering Test Unit (ETU). Completion of these tasks was aided by the use of NTA subroutines.

  11. Exploring Science in the Studio: NSF-Funded Initiatives to Increase Scientific Literacy in Undergraduate Art and Design Students

    NASA Astrophysics Data System (ADS)

    Metzger, C. A.

    2015-12-01

    The project Exploring Science in the Studio at California College of the Arts (CCA), one of the oldest and most influential art and design schools in the country, pursues ways to enable undergraduate students to become scientifically literate problem-solvers in a variety of careers and to give content and context to their creative practices. The two main branches of this National Science Foundation-funded project are a series of courses called Science in the Studio (SitS) and the design of the Mobile Units for Science Exploration (MUSE) system, which allows instructors to bring science equipment directly into the studios. Ongoing since 2010, a series of interdisciplinary SitS courses is offered each fall semester in the college's principal areas of study (architecture, design, fine arts, humanities and sciences, and diversity studies), thematically linked by Earth and environmental science topics such as water, waste, and sustainability. Each course receives funding to embed guest scientists from other colleges and universities, industry, or agriculture directly into the studio courses. These scientists worked in tandem with the studio faculty and gave lectures, led field trips, conducted studio visits, and advised the students' creative endeavors, culminating in an annual SitS exhibition of student work. The MUSE system, consisting of fillable carts and a storage and display unit, was designed by undergraduate students in a Furniture studio who explored, experimented, and researched various ways science materials and equipment are stored, collected, and displayed, for use in the current and future science and studio curricula at CCA. Sustainable practices and "smart design" underpinned all of the work completed in the studio. The materials selected for the new Science Collection at CCA include environmental monitoring equipment and test kits, a weather station, a stream table, a rock and fossil collection, and a vertebrate skull collection. The SitS courses and MUSE system

  12. Existing Instrumentation and Scientific Drivers for a Subduction Zone Observatory in Latin America

    NASA Astrophysics Data System (ADS)

    Frassetto, A.; Woodward, R.; Detrick, R. S.

    2015-12-01

    The subduction zones along the western shore of the Americas provide numerous societally relevant scientific questions that have yet to be fully explored and would make an excellent target for a comprehensive, integrated Subduction Zone Observatory (SZO). Further, recent discussions in Latin America indicate that there are a large number of existing stations that could serve as a backbone for an SZO. Such preexisting geophysical infrastructure commonly plays a vital role in new science initiatives, from small PI-led experiments to the establishment of the USArray Transportable Array, Reference Network, Cascadia Amphibious Array, and the redeployment of EarthScope Transportable Array stations to Alaska. Creating an SZO along the western coast of the Americas could strongly leverage the portfolio of existing seismic and geodetic stations across regions of interest. In this presentation, we will discuss the concept and experience of leveraging existing infrastructure in major new observational programs, outline the state of geophysical networks in the Americas (emphasizing current seismic networks but also looking back on historical temporary deployments), and provide an overview of potential scientific targets in the Americas that encompass a sampling of recently produced research results and datasets. Additionally, we will reflect on strategies for establishing meaningful collaborations across Latin America, an aspect that will be critical to the international partnerships, and associated capacity building, needed for a successful SZO initiative.

  13. NASA's Solar System Exploration Research Virtual Institute: Science and Technology for Lunar Exploration

    NASA Technical Reports Server (NTRS)

    Schmidt, Greg; Bailey, Brad; Gibbs, Kristina

    2015-01-01

    The NASA Solar System Exploration Research Virtual Institute (SSERVI) is a virtual institute focused on research at the intersection of science and exploration, training the next generation of lunar scientists, and development and support of the international community. As part of its mission, SSERVI acts as a hub for opportunities that engage the larger scientific and exploration communities in order to form new interdisciplinary, research-focused collaborations. The nine domestic SSERVI teams that comprise the U.S. complement of the Institute engage with the international science and exploration communities through workshops, conferences, online seminars and classes, student exchange programs and internships. SSERVI represents a close collaboration between science, technology and exploration enabling a deeper, integrated understanding of the Moon and other airless bodies as human exploration moves beyond low Earth orbit. SSERVI centers on the scientific aspects of exploration as they pertain to the Moon, Near Earth Asteroids (NEAs) and the moons of Mars, with additional aspects of related technology development, including a major focus on human exploration-enabling efforts such as resolving Strategic Knowledge Gaps (SKGs). The Institute focuses on interdisciplinary, exploration-related science focused on airless bodies targeted as potential human destinations. Areas of study represent the broad spectrum of lunar, NEA, and Martian moon sciences encompassing investigations of the surface, interior, exosphere, and near-space environments as well as science uniquely enabled from these bodies. This research profile integrates investigations of plasma physics, geology/geochemistry, technology integration, solar system origins/evolution, regolith geotechnical properties, analogues, volatiles, ISRU and exploration potential of the target bodies. New opportunities for both domestic and international partnerships are continually generated through these research and

  14. Seasonal evaluation of evapotranspiration fluxes from MODIS satellite and mesoscale model downscaled global reanalysis datasets

    NASA Astrophysics Data System (ADS)

    Srivastava, Prashant K.; Han, Dawei; Islam, Tanvir; Petropoulos, George P.; Gupta, Manika; Dai, Qiang

    2016-04-01

    Reference evapotranspiration (ETo) is an important variable in hydrological modeling that is not always available, especially for ungauged catchments. Satellite data, such as those available from the MODerate Resolution Imaging Spectroradiometer (MODIS), and global datasets via the European Centre for Medium Range Weather Forecasts (ECMWF) reanalysis (ERA) interim and National Centers for Environmental Prediction (NCEP) reanalysis are important sources of information for ETo. This study explored the seasonal performance of MODIS (MOD16) and Weather Research and Forecasting (WRF) model downscaled global reanalysis datasets, such as ERA interim and NCEP-derived ETo, against ground-based datasets. Overall, on the basis of the statistical metrics computed, ETo derived from ERA interim and MODIS was more accurate than the estimates from NCEP for all seasons. The pooled datasets also revealed a performance similar to the seasonal assessment, with higher agreement for the ERA interim (r = 0.96, RMSE = 2.76 mm/8 days; bias = 0.24 mm/8 days), followed by MODIS (r = 0.95, RMSE = 7.66 mm/8 days; bias = -7.17 mm/8 days) and NCEP (r = 0.76, RMSE = 11.81 mm/8 days; bias = -10.20 mm/8 days). The only limitation of downscaling ERA interim reanalysis datasets using WRF is that it is time-consuming, in contrast to the readily available MODIS operational product for use in mesoscale studies and practical applications.
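
    A minimal sketch of the agreement metrics quoted above (correlation r, RMSE, and bias), computed between a reference ETo series and an estimate. The series are synthetic and the bias convention (mean of estimate minus reference) is an assumption made for illustration.

```python
import numpy as np

def agreement_metrics(reference, estimate):
    """Return (r, RMSE, bias) between a reference series and an estimate,
    with bias defined here as mean(estimate - reference)."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    r = np.corrcoef(ref, est)[0, 1]
    rmse = np.sqrt(np.mean((est - ref) ** 2))
    bias = np.mean(est - ref)
    return r, rmse, bias

# Synthetic 8-day ETo totals (mm/8 days) over one year (46 periods).
rng = np.random.default_rng(0)
ground = 20 + 15 * np.sin(np.linspace(0, 2 * np.pi, 46))
model = ground + rng.normal(0, 3, size=ground.size) - 2.0   # noisy, low-biased
print(agreement_metrics(ground, model))
```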

  15. VIPER: a visualisation tool for exploring inheritance inconsistencies in genotyped pedigrees

    PubMed Central

    2012-01-01

    Background Pedigree genotype datasets are used for analysing genetic inheritance and to map genetic markers and traits. Such datasets consist of hundreds of related animals genotyped for thousands of genetic markers and invariably contain multiple errors in both the pedigree structure and in the associated individual genotype data. These errors manifest as apparent inheritance inconsistencies in the pedigree, and invalidate analyses of marker inheritance patterns across the dataset. Cleaning raw datasets of bad data points (incorrect pedigree relationships, unreliable marker assays, suspect samples, bad genotype results etc.) requires expert exploration of the patterns of exposed inconsistencies in the context of the inheritance pedigree. In order to assist this process we are developing VIPER (Visual Pedigree Explorer), a software tool that integrates an inheritance-checking algorithm with a novel space-efficient pedigree visualisation, so that reported inheritance inconsistencies are overlaid on an interactive, navigable representation of the pedigree structure. Methods and results This paper describes an evaluation of how VIPER displays the different scales and types of dataset that occur experimentally, with a description of how VIPER's display interface and functionality meet the challenges presented by such data. We examine a range of possible error types found in real and simulated pedigree genotype datasets, demonstrating how these errors are exposed and explored using the VIPER interface and we evaluate the utility and usability of the interface to the domain expert. Evaluation was performed as a two stage process with the assistance of domain experts (geneticists). The initial evaluation drove the iterative implementation of further features in the software prototype, as required by the users, prior to a final functional evaluation of the pedigree display for exploring the various error types, data scales and structures. Conclusions The VIPER display was
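
    The core of such an inheritance-checking algorithm is a per-marker Mendelian consistency test: a child's genotype must be formable from one allele of each parent. The sketch below shows that test for biallelic markers; it is a simplified illustration, not VIPER's implementation, and the trio genotypes are invented.

```python
from itertools import product

def mendelian_consistent(child, mother, father):
    """Check whether a child's biallelic genotype (e.g. ('A', 'B')) can be
    formed from one allele of each parent. Missing genotypes (None) are
    treated as compatible."""
    if None in (child, mother, father):
        return True
    possible = {tuple(sorted(p)) for p in product(mother, father)}
    return tuple(sorted(child)) in possible

# Flag inconsistencies across a small (invented) genotype table for one trio:
trio_genotypes = {
    "marker_1": {"child": ("A", "A"), "mother": ("A", "B"), "father": ("A", "B")},
    "marker_2": {"child": ("B", "B"), "mother": ("A", "A"), "father": ("A", "B")},
}
for marker, g in trio_genotypes.items():
    if not mendelian_consistent(g["child"], g["mother"], g["father"]):
        print(f"inheritance inconsistency at {marker}")
```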

  16. [Spatial domain display for interference image dataset].

    PubMed

    Wang, Cai-Ling; Li, Yu-Shan; Liu, Xue-Bin; Hu, Bing-Liang; Jing, Juan-Juan; Wen, Jia

    2011-11-01

    The requirement for visualization of imaging interferometer data is pressing for users engaged in image interpretation and information extraction. However, conventional research on visualization has focused only on spectral image datasets in the spectral domain. Hence, quick display of interference spectral image datasets is one of the bottlenecks in interference image processing. The conventional visualization of an interference dataset applies classical spectral image display methods after Fourier transformation. In the present paper, the problem of quick-look display of interferometer imagery in the image domain is addressed and an algorithm is proposed that simplifies the matter. The Fourier transformation is an obstacle, since its computation time is very large and the situation deteriorates further as the size of the dataset increases. The proposed algorithm, named interference weighted envelopes, frees the dataset from the transformation. The authors choose three interference weighted envelopes based respectively on the Fourier transformation, the features of interference data, and the human visual system. A comparison of the proposed method with conventional methods shows a large difference in display time.

  17. Agile data management for curation of genomes to watershed datasets

    NASA Astrophysics Data System (ADS)

    Varadharajan, C.; Agarwal, D.; Faybishenko, B.; Versteeg, R.

    2015-12-01

    A software platform is being developed for data management and assimilation [DMA] as part of the U.S. Department of Energy's Genomes to Watershed Sustainable Systems Science Focus Area 2.0. The DMA components and capabilities are driven by the project science priorities and the development is based on agile development techniques. The goal of the DMA software platform is to enable users to integrate and synthesize diverse and disparate field, laboratory, and simulation datasets, including geological, geochemical, geophysical, microbiological, hydrological, and meteorological data across a range of spatial and temporal scales. The DMA objectives are (a) developing an integrated interface to the datasets, (b) storing field monitoring data and laboratory analytical results of water and sediment samples in a database, (c) providing automated QA/QC analysis of data, and (d) working with data providers to modify high-priority field and laboratory data collection and reporting procedures as needed. The first three objectives are driven by user needs, while the last objective is driven by data management needs. The project needs and priorities are reassessed regularly with the users. After each user session we identify development priorities to match the identified user priorities. For instance, data QA/QC and collection activities have focused on the data and products needed for on-going scientific analyses (e.g. water level and geochemistry). We have also developed, tested and released a broker and portal that integrates diverse datasets from two different databases used for curation of project data. The development of the user interface was based on a user-centered design process involving several user interviews and constant interaction with data providers. The initial version focuses on the most requested feature - i.e. finding the data needed for analyses through an intuitive interface. Once the data is found, the user can immediately plot and download data
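
    A minimal sketch of the kind of automated QA/QC mentioned above, flagging out-of-range values and spikes in a monitoring series with pandas. The thresholds and flag names are placeholders rather than the project's operational limits, and later checks override earlier flags.

```python
import pandas as pd

def qaqc_flags(series, valid_min, valid_max, max_step=None):
    """Return a flag per sample: 'range' if outside [valid_min, valid_max],
    'spike' if the jump from the previous value exceeds max_step, else 'ok'.
    Spike flags are applied after (and override) range flags."""
    flags = pd.Series("ok", index=series.index)
    flags[(series < valid_min) | (series > valid_max)] = "range"
    if max_step is not None:
        flags[series.diff().abs() > max_step] = "spike"
    return flags

# Placeholder water-level series (m); the fourth value is an obvious outlier.
water_level = pd.Series([1.20, 1.22, 1.21, 9.99, 1.23, 1.25])
print(qaqc_flags(water_level, valid_min=0.0, valid_max=5.0, max_step=1.0))
```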

  18. Representation of Scientific Methodology in Secondary Science Textbooks

    ERIC Educational Resources Information Center

    Binns, Ian C.; Bell, Randy L.

    2015-01-01

    This study explored how eight widely used secondary science textbooks described scientific methodology and to what degree the textbooks' examples and investigations were consistent with this description. Data consisted of all text from student and teacher editions that referred to scientific methodology and all investigations. Analysis used an…

  19. Ethical muscle and scientific interests: a role for philosophy in scientific research.

    PubMed

    Kaposy, Chris

    2008-03-01

    Ethics, a branch of philosophy, has a place in the regulatory framework of human subjects research. Sometimes, however, ethical concepts and arguments play a more central role in scientific activity. This can happen, for example, when violations of research norms are also ethical violations. In such a situation, ethical arguments can be marshaled to improve the quality of the scientific research. I explore two different examples in which philosophers and scientists have used ethical arguments to plead for epistemological improvements in the conduct of research. The first example deals with research dishonesty in pharmaceutical development. The second example is concerned with neuropsychological research using fMRI technology.

  20. A Tropical Marine Microbial Natural Products Geobibliography as an Example of Desktop Exploration of Current Research Using Web Visualisation Tools

    PubMed Central

    Mukherjee, Joydeep; Llewellyn, Lyndon E; Evans-Illidge, Elizabeth A

    2008-01-01

    Microbial marine biodiscovery is a recent scientific endeavour developing at a time when information and other technologies are also undergoing great technical strides. Global visualisation of datasets is now becoming available to the world through powerful and readily available software such as Worldwind™, ArcGIS Explorer™ and Google Earth™. Overlaying custom information upon these tools is within the hands of every scientist and more and more scientific organisations are making data available that can also be integrated into these global visualisation tools. The integrated global view that these tools enable provides a powerful desktop exploration tool. Here we demonstrate the value of this approach to marine microbial biodiscovery by developing a geobibliography that incorporates citations on tropical and near-tropical marine microbial natural products research with Google Earth™ and additional ancillary global data sets. The tools and software used are all readily available and the reader is able to use and install the material described in this article. PMID:19172194
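
    Custom overlays of the kind described above are typically delivered to Google Earth as KML. The sketch below writes a minimal KML file with one placemark per citation record; the example record (title, reference, coordinates) is invented for illustration and is not part of the published geobibliography.

```python
from xml.sax.saxutils import escape

def citations_to_kml(records, path="geobibliography.kml"):
    """Write a minimal KML file with one placemark per citation record.
    Each record is a dict with 'title', 'reference', 'lat' and 'lon'."""
    placemarks = []
    for r in records:
        placemarks.append(
            "  <Placemark>\n"
            f"    <name>{escape(r['title'])}</name>\n"
            f"    <description>{escape(r['reference'])}</description>\n"
            f"    <Point><coordinates>{r['lon']},{r['lat']},0</coordinates></Point>\n"
            "  </Placemark>"
        )
    kml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
        + "\n".join(placemarks)
        + "\n</Document>\n</kml>\n"
    )
    with open(path, "w", encoding="utf-8") as f:
        f.write(kml)

# Invented example record near the Great Barrier Reef:
citations_to_kml([{
    "title": "Example sponge-associated actinomycete study",
    "reference": "Hypothetical et al. (2007), placeholder citation",
    "lat": -18.29, "lon": 147.70,
}])
```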

  1. Secondary analysis of national survey datasets.

    PubMed

    Boo, Sunjoo; Froelicher, Erika Sivarajan

    2013-06-01

    This paper describes the methodological issues associated with secondary analysis of large national survey datasets. Issues about survey sampling, data collection, and non-response and missing data in terms of methodological validity and reliability are discussed. Although reanalyzing large national survey datasets is an expedient and cost-efficient way of producing nursing knowledge, successful investigations require a methodological consideration of the intrinsic limitations of secondary survey analysis. Nursing researchers using existing national survey datasets should understand potential sources of error associated with survey sampling, data collection, and non-response and missing data. Although it is impossible to eliminate all potential errors, researchers using existing national survey datasets must be aware of the possible influence of errors on the results of the analyses. © 2012 The Authors. Japan Journal of Nursing Science © 2012 Japan Academy of Nursing Science.

  2. cellVIEW: a Tool for Illustrative and Multi-Scale Rendering of Large Biomolecular Datasets

    PubMed Central

    Le Muzic, Mathieu; Autin, Ludovic; Parulek, Julius; Viola, Ivan

    2017-01-01

    In this article we introduce cellVIEW, a new system to interactively visualize large biomolecular datasets on the atomic level. Our tool is unique and has been specifically designed to match the ambitions of our domain experts to model and interactively visualize structures comprised of several billion atoms. The cellVIEW system integrates acceleration techniques to allow for real-time graphics performance of 60 Hz display rate on datasets representing large viruses and bacterial organisms. Inspired by the work of scientific illustrators, we propose a level-of-detail scheme whose purpose is two-fold: accelerating the rendering and reducing visual clutter. The main part of our datasets is made out of macromolecules, but it also comprises nucleic acid strands which are stored as sets of control points. For that specific case, we extend our rendering method to support the dynamic generation of DNA strands directly on the GPU. It is noteworthy that our tool has been directly implemented inside a game engine. We chose to rely on a third party engine to reduce software development workload and to make bleeding-edge graphics techniques more accessible to the end-users. To our knowledge cellVIEW is the only suitable solution for interactive visualization of large biomolecular landscapes on the atomic level and is freely available to use and extend. PMID:29291131

  3. Utilizing the Antarctic Master Directory to find orphan datasets

    NASA Astrophysics Data System (ADS)

    Bonczkowski, J.; Carbotte, S. M.; Arko, R. A.; Grebas, S. K.

    2011-12-01

    While most Antarctic data are housed at an established disciplinary-specific data repository, there are data types for which no suitable repository exists. In some cases, these "orphan" data, without an appropriate national archive, are served from local servers by the principal investigators who produced the data. There are many pitfalls with data served privately, including the frequent lack of adequate documentation to ensure the data can be understood by others for re-use and the impermanence of personal web sites. For example, if an investigator leaves an institution and the data moves, the published link is no longer accessible. To ensure continued availability of data, submission to long-term national data repositories is needed. As stated in the National Science Foundation Office of Polar Programs (NSF/OPP) Guidelines and Award Conditions for Scientific Data, investigators are obligated to submit their data for curation and long-term preservation; this includes the registration of a dataset description into the Antarctic Master Directory (AMD), http://gcmd.nasa.gov/Data/portals/amd/. The AMD is a Web-based, searchable directory of thousands of dataset descriptions, known as DIF records, submitted by scientists from over 20 countries. It serves as a node of the International Directory Network/Global Change Master Directory (IDN/GCMD). The US Antarctic Program Data Coordination Center (USAP-DCC), http://www.usap-data.org/, funded through NSF/OPP, was established in 2007 to help streamline the process of data submission and DIF record creation. When data does not quite fit within any existing disciplinary repository, it can be registered within the USAP-DCC as the fallback data repository. Within the scope of the USAP-DCC we undertook the challenge of discovering and "rescuing" orphan datasets currently registered within the AMD. In order to find which DIF records led to data served privately, all records relating to US data within the AMD were parsed. After
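
    A sketch of the parsing step described above: scanning DIF records for related URLs and flagging links that do not point at known institutional hosts. The element names (Related_URL/URL) follow the GCMD DIF convention as an assumption, the sample record is invented, and the host list is illustrative only.

```python
import xml.etree.ElementTree as ET

def data_urls(dif_xml: str):
    """Extract URLs from <Related_URL><URL> elements of a DIF record.
    Element names follow the GCMD DIF convention as an assumption here;
    adjust to the schema (and any namespace) actually in use."""
    root = ET.fromstring(dif_xml)
    return [
        url.text.strip()
        for related in root.iter("Related_URL")
        for url in related.iter("URL")
        if url.text
    ]

# Invented sample record with one institutional and one personal data link.
sample_dif = """
<DIF>
  <Entry_ID>EXAMPLE-0001</Entry_ID>
  <Related_URL>
    <URL>http://www.usap-data.org/entry/EXAMPLE-0001</URL>
  </Related_URL>
  <Related_URL>
    <URL>http://some-university.edu/~investigator/data/</URL>
  </Related_URL>
</DIF>
"""
institutional_hosts = ("usap-data.org", "gcmd.nasa.gov")
for u in data_urls(sample_dif):
    if not any(host in u for host in institutional_hosts):
        print("possible orphan data link:", u)
```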

  4. Scientific drilling projects in ancient lakes: Integrating geological and biological histories

    NASA Astrophysics Data System (ADS)

    Wilke, Thomas; Wagner, Bernd; Van Bocxlaer, Bert; Albrecht, Christian; Ariztegui, Daniel; Delicado, Diana; Francke, Alexander; Harzhauser, Mathias; Hauffe, Torsten; Holtvoeth, Jens; Just, Janna; Leng, Melanie J.; Levkov, Zlatko; Penkman, Kirsty; Sadori, Laura; Skinner, Alister; Stelbrink, Björn; Vogel, Hendrik; Wesselingh, Frank; Wonik, Thomas

    2016-08-01

    Sedimentary sequences in ancient or long-lived lakes can reach several thousands of meters in thickness and often provide an unrivalled perspective of the lake's regional climatic, environmental, and biological history. Over the last few years, deep-drilling projects in ancient lakes became increasingly multi- and interdisciplinary, as, among others, seismological, sedimentological, biogeochemical, climatic, environmental, paleontological, and evolutionary information can be obtained from sediment cores. However, these multi- and interdisciplinary projects pose several challenges. The scientists involved typically approach problems from different scientific perspectives and backgrounds, and setting up the program requires clear communication and the alignment of interests. One of the most challenging tasks, besides the actual drilling operation, is to link diverse datasets with varying resolution, data quality, and age uncertainties to answer interdisciplinary questions synthetically and coherently. These problems are especially relevant when secondary data, i.e., datasets obtained independently of the drilling operation, are incorporated in analyses. Nonetheless, the inclusion of secondary information, such as isotopic data from fossils found in outcrops or genetic data from extant species, may help to achieve synthetic answers. Recent technological and methodological advances in paleolimnology are likely to increase the possibilities of integrating secondary information. Some of the new approaches have started to revolutionize scientific drilling in ancient lakes, but at the same time, they also add a new layer of complexity to the generation and analysis of sediment-core data. The enhanced opportunities presented by new scientific approaches to study the paleolimnological history of these lakes, therefore, come at the expense of higher logistic, communication, and analytical efforts. Here we review types of data that can be obtained in ancient lake drilling

  5. Provenance Challenges for Earth Science Dataset Publication

    NASA Technical Reports Server (NTRS)

    Tilmes, Curt

    2011-01-01

    Modern science is increasingly dependent on computational analysis of very large data sets. Organizing, referencing, and publishing those data has become a complex problem. Published research that depends on such data often fails to cite the data in sufficient detail to allow an independent scientist to reproduce the original experiments and analyses. This paper explores some of the challenges related to data identification, equivalence and reproducibility in the domain of data intensive scientific processing. It will use the example of Earth Science satellite data, but the challenges also apply to other domains.

  6. Scientific Exploration of Near-Earth Objects via the Crew Exploration Vehicle

    NASA Technical Reports Server (NTRS)

    Abell, P. A.; Korsmeyer, D. J.; Landis, R. R.; Lu, E.; Adamo, D.; Jones, T.; Lemke, L.; Gonzales, A.; Gershman, B.; Morrison, D.; hide

    2007-01-01

    The concept of a crewed mission to a near-Earth object (NEO) has been analyzed several times in the past. A more in-depth feasibility study has been sponsored by the Advanced Projects Office within NASA's Constellation Program to examine the ability of a Crew Exploration Vehicle (CEV) to support a mission to a NEO. The nominal mission profile would involve a crew of 2 or 3 astronauts on a 90 to 120 day mission, which would include a 7 to 14 day stay for proximity operations at the target NEO.

  7. Semi-supervised tracking of extreme weather events in global spatio-temporal climate datasets

    NASA Astrophysics Data System (ADS)

    Kim, S. K.; Prabhat, M.; Williams, D. N.

    2017-12-01

    Deep neural networks have been successfully applied to the problem of detecting extreme weather events in large-scale climate datasets, attaining superior performance that overshadows all previous hand-crafted methods. Recent work has shown that a multichannel spatiotemporal encoder-decoder CNN architecture is able to localize events with semi-supervised bounding boxes. Motivated by this work, we propose a new approach based on Variational Auto-Encoders (VAE) and Long Short-Term Memory (LSTM) networks to track extreme weather events in spatio-temporal datasets. We treat spatio-temporal object tracking as learning the probability distribution of continuous latent features of an auto-encoder using stochastic variational inference. For this, we assume that our datasets are i.i.d. and that the latent features can be modeled by a Gaussian distribution. In the proposed approach, we first train a VAE to generate an approximate posterior given multichannel climate input containing an extreme climate event at a fixed time. Then, we predict the bounding box, location and class of extreme climate events using convolutional layers whose input concatenates three features: the embedding, the sampled mean and the standard deviation. Lastly, we train an LSTM on the concatenated input to learn the temporal information of the dataset by recurrently feeding the output back into the VAE input at the next time step. Our contribution is two-fold. First, we show the first semi-supervised end-to-end architecture based on a VAE to track extreme weather events, which can be applied to massive unlabeled climate datasets. Second, the temporal movement of events is incorporated into bounding-box prediction using the LSTM, which can improve localization accuracy. To our knowledge, this technique has not been explored in either the climate community or the machine learning community.
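
    A minimal PyTorch sketch of the ingredients described above (a convolutional encoder producing a mean and standard deviation, the reparameterization trick, concatenation of embedding/mean/standard deviation, an LSTM over time steps, and a bounding-box head) is given below; all layer sizes, the 4-value box parameterization and the WeatherTracker class are invented for illustration and do not reproduce the authors' architecture.

    # Minimal PyTorch sketch: convolutional VAE encoder producing (mu, sigma),
    # an LSTM carrying information across time steps, and a bounding-box head
    # fed with the concatenated [z, mu, sigma] features. Illustrative only.
    import torch
    import torch.nn as nn

    class WeatherTracker(nn.Module):
        def __init__(self, in_channels=16, latent=64, hidden=128):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.to_mu = nn.Linear(64, latent)
            self.to_logvar = nn.Linear(64, latent)
            self.lstm = nn.LSTM(input_size=3 * latent, hidden_size=hidden,
                                batch_first=True)
            self.box_head = nn.Linear(hidden, 4)   # (x, y, w, h) of the event box

        def forward(self, frames):                 # frames: (B, T, C, H, W)
            B, T = frames.shape[:2]
            feats = []
            for t in range(T):
                h = self.encoder(frames[:, t])
                mu, logvar = self.to_mu(h), self.to_logvar(h)
                std = torch.exp(0.5 * logvar)
                z = mu + std * torch.randn_like(std)     # reparameterization trick
                feats.append(torch.cat([z, mu, std], dim=-1))
            seq = torch.stack(feats, dim=1)              # (B, T, 3*latent)
            out, _ = self.lstm(seq)
            return self.box_head(out)                    # one box per time step

    if __name__ == "__main__":
        model = WeatherTracker()
        boxes = model(torch.randn(2, 5, 16, 64, 64))     # toy batch: 2 sequences of 5 frames
        print(boxes.shape)                               # torch.Size([2, 5, 4])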

  8. When Scientific Knowledge, Daily Life Experience, Epistemological and Social Considerations Intersect: Students' Argumentation in Group Discussions on a Socio-Scientific Issue

    ERIC Educational Resources Information Center

    Albe, Virginie

    2008-01-01

    Socio-scientific issues in class have been proposed in an effort to democratise science in society. A micro-ethnographic approach has been used to explore how students elaborate arguments on a socio-scientific controversy in the context of small group discussions. Several processes of group argumentation have been identified. Students' arguments…

  9. International Ultraviolet Explorer Observatory operations

    NASA Technical Reports Server (NTRS)

    1985-01-01

    This volume contains the final report for the International Ultraviolet Explorer (IUE) Observatory Operations contract. The fundamental operational objective of the International Ultraviolet Explorer (IUE) program is to translate competitively selected observing programs into IUE observations, to reduce these observations into meaningful scientific data, and then to present these data to the Guest Observer in a form amenable to the pursuit of scientific research. The IUE Observatory is the key to this objective since it is the central control and support facility for all science operations functions within the IUE Project. In carrying out the operation of this facility, a number of complex functions were provided beginning with telescope scheduling and operation, proceeding to data processing, and ending with data distribution and scientific data analysis. In support of these critical-path functions, a number of other significant activities were also provided, including scientific instrument calibration, systems analysis, and software support. Routine activities have been summarized briefly whenever possible.

  10. Scientific Visualization & Modeling for Earth Systems Science Education

    NASA Technical Reports Server (NTRS)

    Chaudhury, S. Raj; Rodriguez, Waldo J.

    2003-01-01

    Providing research experiences for undergraduate students in Earth Systems Science (ESS) poses several challenges at smaller academic institutions that might lack dedicated resources for this area of study. This paper describes the development of an innovative model that involves students with majors in diverse scientific disciplines in authentic ESS research. In studying global climate change, experts typically use scientific visualization techniques applied to remote sensing data collected by satellites. In particular, many problems related to environmental phenomena can be quantitatively addressed by investigations based on datasets related to the scientific endeavours such as the Earth Radiation Budget Experiment (ERBE). Working with data products stored at NASA's Distributed Active Archive Centers, visualization software specifically designed for students and an advanced, immersive Virtual Reality (VR) environment, students engage in guided research projects during a structured 6-week summer program. Over the 5-year span, this program has afforded the opportunity for students majoring in biology, chemistry, mathematics, computer science, physics, engineering and science education to work collaboratively in teams on research projects that emphasize the use of scientific visualization in studying the environment. Recently, a hands-on component has been added through science student partnerships with school-teachers in data collection and reporting for the GLOBE Program (GLobal Observations to Benefit the Environment).

  11. Scientific objectives of human exploration of Mars

    USGS Publications Warehouse

    Carr, M.H.

    1996-01-01

    While human exploration of Mars is unlikely to be undertaken for science reasons alone, science will be the main beneficiary. A wide range of science problems can be addressed at Mars. The planet formed in a different part of the solar system from the Earth and retains clues concerning compositional and environmental conditions in that part of the solar system when the planets formed. Mars has had a long and complex history that has involved almost as wide a range of processes as occurred on Earth. Elucidation of this history will require a comprehensive program of field mapping, geophysical sounding, in situ analyses, and return of samples to Earth that are representative of the planet's diversity. The origin and evolution of Mars' atmosphere are very different from the Earth's, Mars having experienced major secular and cyclical changes in climate. Clues as to precisely how the atmosphere has evolved are embedded in its present chemistry, possibly in surface sinks of former atmosphere-forming volatiles, and in the various products of interaction between the atmosphere and surface. The present atmosphere also provides a means of testing general circulation models applicable to all planets. Although life is unlikely to be still extant on Mars, life may have started early in the planet's history. A major goal of any future exploration will, therefore, be to search for evidence of indigenous life.

  12. On the utility of 3D hand cursors to explore medical volume datasets with a touchless interface.

    PubMed

    Lopes, Daniel Simões; Parreira, Pedro Duarte de Figueiredo; Paulo, Soraia Figueiredo; Nunes, Vitor; Rego, Paulo Amaral; Neves, Manuel Cassiano; Rodrigues, Pedro Silva; Jorge, Joaquim Armando

    2017-08-01

    Analyzing medical volume datasets requires interactive visualization so that users can extract anatomo-physiological information in real-time. Conventional volume rendering systems rely on 2D input devices, such as mice and keyboards, which are known to hamper 3D analysis as users often struggle to obtain the desired orientation, which is only achieved after several attempts. In this paper, we address which 3D analysis tools are better performed with 3D hand cursors operating on a touchless interface compared to 2D input devices running on a conventional WIMP interface. The main goals of this paper are to explore the capabilities of (simple) hand gestures to facilitate sterile manipulation of 3D medical data on a touchless interface, without resorting to wearables, and to evaluate the surgical feasibility of the proposed interface with senior surgeons (N=5) and interns (N=2). To this end, we developed a touchless interface controlled via hand gestures and body postures to rapidly rotate and position medical volume images in three dimensions, where each hand acts as an interactive 3D cursor. User studies were conducted with laypeople, while informal evaluation sessions were carried out with senior surgeons, radiologists and professional biomedical engineers. Results demonstrate its usability: the proposed touchless interface improves spatial awareness and offers more fluent interaction with the 3D volume than traditional 2D input devices, as it requires fewer attempts to achieve the desired orientation by avoiding the composition of several cumulative rotations, which is typically necessary in WIMP interfaces. However, tasks requiring precision, such as clipping plane visualization and tagging, are best performed with mouse-based systems due to noise, incorrect gesture detection and problems in skeleton tracking that need to be addressed before tests in real medical environments can be performed. Copyright © 2017 Elsevier Inc. All rights reserved.
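
    One simple way two tracked hand positions can drive volume rotation is to apply whatever rotation maps the previous inter-hand direction onto the current one; the numpy sketch below (Rodrigues' formula) is a generic illustration under that assumption, not the interface evaluated in the paper.

    # Generic numpy sketch: derive a rotation matrix that maps the previous
    # inter-hand direction onto the current one (Rodrigues' rotation formula).
    # It illustrates how two tracked 3D hand positions could act as a rotation
    # cursor for a volume; it is not the authors' implementation.
    import numpy as np

    def rotation_between(v_prev: np.ndarray, v_curr: np.ndarray) -> np.ndarray:
        """Rotation matrix R such that R @ unit(v_prev) == unit(v_curr)."""
        a = v_prev / np.linalg.norm(v_prev)
        b = v_curr / np.linalg.norm(v_curr)
        axis = np.cross(a, b)
        s, c = np.linalg.norm(axis), float(np.dot(a, b))
        if s < 1e-9:
            if c > 0:
                return np.eye(3)               # directions coincide: no rotation
            # anti-parallel: 180 degree turn around any axis perpendicular to a
            perp = np.cross(a, [1.0, 0.0, 0.0])
            if np.linalg.norm(perp) < 1e-9:
                perp = np.cross(a, [0.0, 1.0, 0.0])
            k = perp / np.linalg.norm(perp)
            K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
            return np.eye(3) + 2 * (K @ K)     # Rodrigues with theta = pi
        k = axis / s
        K = np.array([[0, -k[2], k[1]],
                      [k[2], 0, -k[0]],
                      [-k[1], k[0], 0]])
        return np.eye(3) + s * K + (1 - c) * (K @ K)

    if __name__ == "__main__":
        left_prev, right_prev = np.array([-0.2, 0.0, 0.5]), np.array([0.2, 0.0, 0.5])
        left_now,  right_now  = np.array([-0.2, 0.1, 0.5]), np.array([0.2, -0.1, 0.5])
        R = rotation_between(right_prev - left_prev, right_now - left_now)
        volume_orientation = R @ np.eye(3)     # apply incremental rotation to the volume
        print(np.round(R, 3))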

  13. A Framework for Socio-Scientific Issues Based Education

    ERIC Educational Resources Information Center

    Presley, Morgan L.; Sickel, Aaron J.; Muslu, Nilay; Merle-Johnson, Dominike; Witzig, Stephen B.; Izci, Kemal; Sadler, Troy D.

    2013-01-01

    Science instruction based on student exploration of socio-scientific issues (SSI) has been presented as a powerful strategy for supporting science learning and the development of scientific literacy. This paper presents an instructional framework for SSI based education. The framework is based on a series of research studies conducted in a diverse…

  14. Communication System Architecture for Planetary Exploration

    NASA Technical Reports Server (NTRS)

    Braham, Stephen P.; Alena, Richard; Gilbaugh, Bruce; Glass, Brian; Norvig, Peter (Technical Monitor)

    2001-01-01

    Future human missions to Mars will require effective communications supporting exploration activities and scientific field data collection. Constraints on cost, size, weight and power consumption for all communications equipment make optimization of these systems very important. These information and communication systems connect people and systems together into coherent teams performing the difficult and hazardous tasks inherent in planetary exploration. The communication network supporting vehicle telemetry data, mission operations, and scientific collaboration must have excellent reliability and flexibility.

  15. ispace's Polar Ice Explorer: Commercially Exploring the Poles of the Moon

    NASA Astrophysics Data System (ADS)

    Calzada-Diaz, A.; Acierno, K.; Rasera, J. N.; Lamamy, J.-A.

    2018-04-01

    This work provides the background, rationales, and scientific objectives for the ispace Polar Ice Explorer Project, an ISRU exploratory mission that aims to provide data about the lunar polar environment.

  16. Modern Scientific Literacy: A Case Study of Multiliteracies and Scientific Practices in a Fifth Grade Classroom

    NASA Astrophysics Data System (ADS)

    Allison, Elizabeth; Goldston, M. Jenice

    2018-01-01

    This study investigates the convergence of multiliteracies and scientific practices in a fifth grade classroom. As students' lives become increasingly multimodal, diverse, and globalized, the traditional notions of literacy must be revisited (New London Group 1996). With the adoption of the Next Generation Science Standards (NGSS Lead States 2013a) in many states, either in their entirety or in adapted forms, it becomes useful to explore the interconnectedness of multiliteracies and scientific practices and the resulting implications for scientific literacy. The case study focused on a fifth grade classroom, including the students and teacher. In order to create a rich description of the cases involved, data were collected and triangulated through teacher interviews, student interviews and focus groups, and classroom observations. Findings reveal that as science activities were enriched with multiliteracies and scientific practices, students were engaged in developing skills and knowledge central to being scientifically literate. Furthermore, this study establishes that characteristics of scientific literacy, by its intent and purpose, are a form of multiliteracies in elementary classrooms. Therefore, the teaching and learning of science and its practices for scientific literacy are in turn reinforcing the development of broader multiliteracies.

  17. U.S. Datasets

    Cancer.gov

    Datasets for U.S. mortality, U.S. populations, standard populations, county attributes, and expected survival. Plus SEER-linked databases (SEER-Medicare, SEER-Medicare Health Outcomes Survey [SEER-MHOS], SEER-Consumer Assessment of Healthcare Providers and Systems [SEER-CAHPS]).

  18. Chemical datuments as scientific enablers.

    PubMed

    Rzepa, Henry S

    2013-01-23

    This article is an attempt to construct a chemical datument as a means of presenting insights into chemical phenomena in a scientific journal. An exploration of the interactions present in a small fragment of duplex Z-DNA and the nature of the catalytic centre of a carbon-dioxide/alkene epoxide alternating co-polymerisation is presented in this datument, with examples of the use of three software tools, one based on Java, the other two using Javascript and HTML5 technologies. The implications for the evolution of scientific journals are discussed.

  19. Chemical datuments as scientific enablers

    PubMed Central

    2013-01-01

    This article is an attempt to construct a chemical datument as a means of presenting insights into chemical phenomena in a scientific journal. An exploration of the interactions present in a small fragment of duplex Z-DNA and the nature of the catalytic centre of a carbon-dioxide/alkene epoxide alternating co-polymerisation is presented in this datument, with examples of the use of three software tools, one based on Java, the other two using Javascript and HTML5 technologies. The implications for the evolution of scientific journals are discussed. PMID:23343381

  20. Achieving a balance - Science and human exploration

    NASA Technical Reports Server (NTRS)

    Duke, Michael B.

    1992-01-01

    An evaluation is made of the opportunities for advancing the scientific understanding of Mars through a research program, conducted under the aegis of NASA's Space Exploration Initiative, which emphasizes the element of human exploration as well as the requisite robotic component. A Mars exploration program that involves such complementary human/robotic components will entail the construction of a closed ecological life-support system, long-duration spacecraft facilities for crews, and the development of extraterrestrial resources; these R&D imperatives will have great subsequent payoffs, both scientific and economic.

  1. Updated archaeointensity dataset from the SW Pacific

    NASA Astrophysics Data System (ADS)

    Hill, Mimi; Nilsson, Andreas; Holme, Richard; Hurst, Elliot; Turner, Gillian; Herries, Andy; Sheppard, Peter

    2016-04-01

    It is well known that there are far more archaeomagnetic data from the Northern Hemisphere than from the Southern. Here we present a compilation of archaeointensity data from the SW Pacific region covering the past 3000 years. The results have primarily been obtained from a collection of ceramics from the SW Pacific Islands including Fiji, Tonga, Papua New Guinea, New Caledonia and Vanuatu. In addition we present results obtained from heated clay balls from Australia. The microwave method has predominantly been used with a variety of experimental protocols including IZZI and Coe variants. Standard Thellier archaeointensity experiments using the IZZI protocol have also been carried out on selected samples. The dataset is compared to regional predictions from current global geomagnetic field models, and the influence of the new data on constraining the pfm9k family of global geomagnetic field models is explored.

  2. Simulation of Smart Home Activity Datasets

    PubMed Central

    Synnott, Jonathan; Nugent, Chris; Jeffers, Paul

    2015-01-01

    A globally ageing population is resulting in an increased prevalence of chronic conditions which affect older adults. Such conditions require long-term care and management to maximize quality of life, placing an increasing strain on healthcare resources. Intelligent environments such as smart homes facilitate long-term monitoring of activities in the home through the use of sensor technology. Access to sensor datasets is necessary for the development of novel activity monitoring and recognition approaches. Access to such datasets is limited due to issues such as sensor cost, availability and deployment time. The use of simulated environments and sensors may address these issues and facilitate the generation of comprehensive datasets. This paper provides a review of existing approaches for the generation of simulated smart home activity datasets, including model-based approaches and interactive approaches which implement virtual sensors, environments and avatars. The paper also provides recommendations for future work in intelligent environment simulation. PMID:26087371

  3. Simulation of Smart Home Activity Datasets.

    PubMed

    Synnott, Jonathan; Nugent, Chris; Jeffers, Paul

    2015-06-16

    A globally ageing population is resulting in an increased prevalence of chronic conditions which affect older adults. Such conditions require long-term care and management to maximize quality of life, placing an increasing strain on healthcare resources. Intelligent environments such as smart homes facilitate long-term monitoring of activities in the home through the use of sensor technology. Access to sensor datasets is necessary for the development of novel activity monitoring and recognition approaches. Access to such datasets is limited due to issues such as sensor cost, availability and deployment time. The use of simulated environments and sensors may address these issues and facilitate the generation of comprehensive datasets. This paper provides a review of existing approaches for the generation of simulated smart home activity datasets, including model-based approaches and interactive approaches which implement virtual sensors, environments and avatars. The paper also provides recommendations for future work in intelligent environment simulation.

  4. Mario Bunge's Scientific Realism

    ERIC Educational Resources Information Center

    Cordero, Alberto

    2012-01-01

    This paper presents and comments on Mario Bunge's scientific realism. After a brief introduction in Sect. 1, Sect. 2 outlines Bunge's conception of realism. Focusing on the case of quantum mechanics, Sect. 3 explores how his approach plays out for problematic theories. Section 4 comments on Bunge's project against the background of the current…

  5. Parallel Index and Query for Large Scale Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chou, Jerry; Wu, Kesheng; Ruebel, Oliver

    2011-07-18

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for processing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to the processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.
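
    The bitmap-index idea that underlies FastBit-style query acceleration can be sketched in a few lines: bin a variable once, keep one bitmask per bin, and answer range queries by combining a few precomputed masks. The BitmapIndex class below is a conceptual illustration only, not the FastQuery or FastBit API.

    # Conceptual numpy sketch of bitmap indexing: a bitmap pass yields candidate
    # rows cheaply, and the exact comparison is run only on those candidates.
    import numpy as np

    class BitmapIndex:
        def __init__(self, values: np.ndarray, n_bins: int = 32):
            self.edges = np.linspace(values.min(), values.max(), n_bins + 1)
            bin_of = np.clip(np.digitize(values, self.edges) - 1, 0, n_bins - 1)
            # one boolean bitmap per bin, built once
            self.bitmaps = [bin_of == b for b in range(n_bins)]

        def query_greater(self, threshold: float) -> np.ndarray:
            """Candidate rows with value possibly > threshold (superset of exact hits)."""
            first_bin = max(int(np.searchsorted(self.edges, threshold)) - 1, 0)
            mask = np.zeros_like(self.bitmaps[0])
            for b in range(first_bin, len(self.bitmaps)):
                mask |= self.bitmaps[b]
            return mask

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        energy = rng.exponential(scale=1.0, size=1_000_000)  # stand-in for particle energy
        index = BitmapIndex(energy)
        candidates = index.query_greater(5.0)                # cheap bitmap pass
        cand_idx = np.flatnonzero(candidates)
        hits = cand_idx[energy[cand_idx] > 5.0]              # exact check on candidates only
        print(len(hits), "high-energy particles found")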

  6. Secret Science: Exploring Cold War Greenland

    NASA Astrophysics Data System (ADS)

    Harper, K.

    2013-12-01

    During the early Cold War - from the immediate postwar period through the 1960s - the United States military carried out extensive scientific studies and pursued technological developments in Greenland. With few exceptions, most of these were classified - sometimes because new scientific knowledge was born classified, but mostly because the reasons behind the scientific explorations were. Meteorological and climatological, ionospheric, glaciological, seismological, and geological studies were among the geophysical undertakings carried out by military and civilian scientists--some in collaboration with the Danish government, and some carried out without their knowledge. This poster will present some of the results of the Exploring Greenland Project that is coming to a conclusion at Denmark's Aarhus University.

  7. Exploring frontiers of the deep biosphere through scientific ocean drilling

    NASA Astrophysics Data System (ADS)

    Inagaki, F.; D'Hondt, S.; Hinrichs, K. U.

    2015-12-01

    Since the first deep biosphere-dedicated Ocean Drilling Program (ODP) Leg 201 using the US drill ship JOIDES Resolution in 2002, scientific ocean drilling has offered unique opportunities to expand our knowledge of the nature and extent of the deep biosphere. The latest estimate of the global subseafloor microbial biomass is ~10^29 cells, accounting for 4 Gt of carbon and ~1% of the Earth's total living biomass. The subseafloor microbial communities are evolutionarily diverse and their metabolic rates are extraordinarily slow. Nevertheless, their accumulated activity most likely plays a significant role in elemental cycles over geological time. In 2010, during Integrated Ocean Drilling Program (IODP) Expedition 329, the JOIDES Resolution explored the deep biosphere in the open-ocean South Pacific Gyre—the largest oligotrophic province on our planet. During Expedition 329, relatively high concentrations of dissolved oxygen and significantly low biomass of microbial populations were observed in the entire sediment column, indicating that (i) there is no limit to life in open-ocean sediment and (ii) a significant amount of oxygen reaches through the sediment to the upper oceanic crust. This "deep aerobic biosphere" inhabits the sediment beneath up to ~37 percent of the world's oceans. The remaining ~63 percent of the oceans comprises higher-productivity areas that contain the "deep anaerobic biosphere". In 2012, during IODP Expedition 337, the Japanese drill ship Chikyu explored coal-bearing sediments down to 2,466 meters below the seafloor off the Shimokita Peninsula, Japan. Geochemical and microbiological analyses consistently showed the occurrence of methane-producing communities associated with the coal beds. Cell concentrations in deep sediments were notably lower than those expected from the global regression line, implying that the bottom of the deep biosphere is approached in these beds. Taxonomic composition of the deep coal-bearing communities profoundly

  8. The climate visualizer: Sense-making through scientific visualization

    NASA Astrophysics Data System (ADS)

    Gordin, Douglas N.; Polman, Joseph L.; Pea, Roy D.

    1994-12-01

    This paper describes the design of a learning environment, called the Climate Visualizer, intended to facilitate scientific sense-making in high school classrooms by providing students the ability to craft, inspect, and annotate scientific visualizations. The theoretical background for our design presents a view of learning as acquiring and critiquing cultural practices and stresses the need for students to appropriate the social and material aspects of practice when learning an area. This is followed by a description of the design of the Climate Visualizer, including detailed accounts of its provision of spatial and temporal context and the quantitative and visual representations it employs. A broader context is then explored by describing its integration into the high school science classroom. This discussion explores how visualizations can promote the creation of scientific theories, especially in conjunction with the Collaboratory Notebook, an embedded environment for creating and critiquing scientific theories and visualizations. Finally, we discuss the design trade-offs we have made in light of our theoretical orientation, and our hopes for further progress.

  9. Mars scientific investigations as a precursor for human exploration.

    PubMed

    Ahlf, P; Cantwell, E; Ostrach, L; Pline, A

    2000-01-01

    In the past two years, NASA has begun to develop and implement plans for investigations on robotic Mars missions which are focused toward returning data critical for planning human missions to Mars. The Mars Surveyor Program 2001 Orbiter and Lander missions will mark the first time that experiments dedicated to preparation for human exploration will be carried out. Investigations on these missions and future missions range from characterization of the physical and chemical environment of Mars, to predicting the response of biology to the Mars environment. Planning for such missions must take into account existing data from previous Mars missions which were not necessarily focused on human exploration preparation. At the same time, plans for near term missions by the international community must be considered to avoid duplication of effort. This paper reviews data requirements for human exploration and applicability of existing data. It will also describe current plans for investigations and place them within the context of related international activities. c 2000 International Astronautical Federation. Published by Elsevier Science Ltd. All rights reserved.

  10. Mars scientific investigations as a precursor for human exploration

    NASA Technical Reports Server (NTRS)

    Ahlf, P.; Cantwell, E.; Ostrach, L.; Pline, A.

    2000-01-01

    In the past two years, NASA has begun to develop and implement plans for investigations on robotic Mars missions which are focused toward returning data critical for planning human missions to Mars. The Mars Surveyor Program 2001 Orbiter and Lander missions will mark the first time that experiments dedicated to preparation for human exploration will be carried out. Investigations on these missions and future missions range from characterization of the physical and chemical environment of Mars, to predicting the response of biology to the Mars environment. Planning for such missions must take into account existing data from previous Mars missions which were not necessarily focused on human exploration preparation. At the same time, plans for near term missions by the international community must be considered to avoid duplication of effort. This paper reviews data requirements for human exploration and applicability of existing data. It will also describe current plans for investigations and place them within the context of related international activities. c 2000 International Astronautical Federation. Published by Elsevier Science Ltd. All rights reserved.

  11. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

    PubMed

    Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J

    2017-09-20

    There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities with a Vector Space Model and Cosine Similarity measures. To our knowledge, PIBAS FedSPARQL is unique among the systems that we found in that it allows the detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly
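
    The similar-item detection step can be illustrated with a small vector-space sketch: represent each data item by the counts of named entities in its description and compare the vectors with cosine similarity. The entity lists, the 0.8 threshold and the helper functions below are illustrative assumptions, not the PIBAS FedSPARQL implementation.

    # Minimal sketch of the vector-space / cosine-similarity idea used to flag
    # data items that different URIs may describe.
    import math
    from collections import Counter

    def cosine(a: Counter, b: Counter) -> float:
        common = set(a) & set(b)
        dot = sum(a[t] * b[t] for t in common)
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def entity_vector(entities: list[str]) -> Counter:
        return Counter(e.lower() for e in entities)

    if __name__ == "__main__":
        # Pretend these entity lists came from a named-entity recogniser run over
        # the descriptions attached to two different URIs.
        item_a = entity_vector(["cisplatin", "prostate cancer", "IC50", "cell line"])
        item_b = entity_vector(["Cisplatin", "cell line", "prostate cancer"])
        sim = cosine(item_a, item_b)
        print(f"cosine similarity = {sim:.2f}",
              "-> likely the same item" if sim > 0.8 else "")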

  12. Explorers Program Management

    NASA Technical Reports Server (NTRS)

    Volpe, Frank; Comberiate, Anthony B. (Technical Monitor)

    2001-01-01

    The mission of the Explorer Program is to provide frequent flight opportunities for world-class scientific investigations from space within the following space science themes: 1) Astronomical Search for Origins and Planetary Systems; 2) Structure and Evolution of the Universe; and 3) The Sun-Earth Connection. America's space exploration started with Explorer 1, which was launched February 1, 1958 and discovered the Van Allen Radiation Belts. Over 75 Explorer missions have flown. The program seeks to enhance public awareness of, and appreciation for, space science and to incorporate educational and public outreach activities as integral parts of space science investigations.

  13. Students' abilities to critique scientific evidence when reading and writing scientific arguments

    NASA Astrophysics Data System (ADS)

    Knight, Amanda M.

    Scientific arguments are used to persuade others of explanations that make sense of the natural world. Over time, through the accumulation of evidence, one explanation for a scientific phenomenon tends to take precedence. In science education, arguments make students' thinking and reasoning visible while also supporting the development of their conceptual, procedural, and epistemic knowledge. As such, argumentation has become a goal within recent policy documents, including the Next Generation Science Standards, which, in turn, presents a need for comprehensive, effective, and scalable assessments. This dissertation used assessments that measure students' abilities to critique scientific evidence, which is measured in terms of the form of justification and the support of empirical evidence, when reading and writing scientific arguments. Cognitive interviews were then conducted with a subset of the students to explore the criteria they used to critique scientific evidence. Specifically, the research investigated what characteristics of scientific evidence the students preferred, how they critiqued both forms of justification and empirical evidence, and whether the four constructs represented four separate abilities. Findings suggest that students prioritized the type of empirical evidence over the form of justification, and most often selected relevant-supporting justifications. When writing scientific arguments, most students constructed a justified claim, but struggled to justify their claims with empirical evidence. In comparison, when reading scientific arguments, students had trouble locating a justification when it was not empirical data. Additionally, it was more difficult for students to critique than to identify or locate empirical evidence, and it was more difficult for students to identify than to locate empirical evidence. Findings from the cognitive interviews suggest that students with more specific criteria tended to have more knowledge of the construct

  14. Scientific workflows as productivity tools for drug discovery.

    PubMed

    Shon, John; Ohkawa, Hitomi; Hammer, Juergen

    2008-05-01

    Large pharmaceutical companies annually invest tens to hundreds of millions of US dollars in research informatics to support their early drug discovery processes. Traditionally, most of these investments are designed to increase the efficiency of drug discovery. The introduction of do-it-yourself scientific workflow platforms has enabled research informatics organizations to shift their efforts toward scientific innovation, ultimately resulting in a possible increase in return on their investments. Unlike the handling of most scientific data and application integration approaches, researchers apply scientific workflows to in silico experimentation and exploration, leading to scientific discoveries that lie beyond automation and integration. This review highlights some key requirements for scientific workflow environments in the pharmaceutical industry that are necessary for increasing research productivity. Examples of the application of scientific workflows in research and a summary of recent platform advances are also provided.

  15. Lunar Exploration and Science in ESA

    NASA Astrophysics Data System (ADS)

    Carpenter, J.; Houdou, B.; Fisackerly, R.; De Rosa, D.; Espinasse, S.; Hufenbach, B.

    2013-09-01

    Lunar exploration continues to be a priority for the European Space Agency (ESA) and is recognized as the next step for human exploration beyond low Earth orbit. The Moon is also recognized as an important scientific target, providing vital information on the history of the inner solar system, the Earth and the emergence of life, and fundamental information on the formation and evolution of terrestrial planets. The Moon also provides a platform that can be utilized for fundamental science and to prepare the way for exploration deeper into space and towards a human Mars mission, the ultimate exploration goal. Lunar missions can also provide a means of preparing for a Mars sample return mission, which is an important long-term robotic milestone. ESA is preparing for future participation in lunar exploration through a combination of human and robotic activities, in cooperation with international partners. These include activities on the ISS and participation in the US-led Multi-Purpose Crew Vehicle, which is planned for a first unmanned lunar flight in 2017. Planned future activities also include participation in international robotic missions. These activities are performed with a view to generating the technologies, capabilities, knowledge and heritage that will make Europe an indispensable partner in the exploration missions of the future. We present ESA's plans for lunar exploration and the current status of activities. In particular we will show that this programme gives rise to unique scientific opportunities and prepares scientifically and technologically for future exploratory steps.

  16. Public understanding of science is not scientific literacy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McGowan, A.

    1995-12-31

    The author notes that public understanding of science has, in many quarters, been taken over by the wrong notion of scientific literacy. The need for the scientific community to develop the language that speaks to the public in general is explored. Methodologies to improve communication to the general public and increase their understanding with clearly developed metaphors are examined.

  17. NATIONAL HYDROGRAPHY DATASET

    EPA Science Inventory

    Resource Purpose:The National Hydrography Dataset (NHD) is a comprehensive set of digital spatial data that contains information about surface water features such as lakes, ponds, streams, rivers, springs and wells. Within the NHD, surface water features are combined to fo...

  18. Multiresolution comparison of precipitation datasets for large-scale models

    NASA Astrophysics Data System (ADS)

    Chun, K. P.; Sapriza Azuri, G.; Davison, B.; DeBeer, C. M.; Wheater, H. S.

    2014-12-01

    Gridded precipitation datasets are crucial for driving large-scale models used in weather forecasting and climate research. However, the quality of precipitation products is usually validated individually. Comparisons between gridded precipitation products, together with ground observations, provide another avenue for investigating how precipitation uncertainty would affect the performance of large-scale models. In this study, using data from a set of precipitation gauges over British Columbia and Alberta, we evaluate several widely used North American gridded products including the Canadian Gridded Precipitation Anomalies (CANGRD), the National Center for Environmental Prediction (NCEP) reanalysis, the Water and Global Change (WATCH) project, the thin plate spline smoothing algorithms (ANUSPLIN) and Canadian Precipitation Analysis (CaPA). Based on verification criteria for various temporal and spatial scales, results provide an assessment of possible applications for various precipitation datasets. For long-term climate variation studies (~100 years), CANGRD, NCEP, WATCH and ANUSPLIN have different comparative advantages in terms of their resolution and accuracy. For synoptic and mesoscale precipitation patterns, CaPA provides appealing spatial coherence. In addition to the product comparison, various downscaling methods are also surveyed to explore new verification and bias-reduction methods for improving gridded precipitation outputs for large-scale models.
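
    A generic flavour of such multi-resolution verification is sketched below: aggregate both gauge and gridded series to coarser time scales and report bias and RMSE at each scale. The aggregation windows, metrics and synthetic data are illustrative and do not reproduce the study's protocol.

    # Generic sketch of multi-resolution verification of a gridded precipitation
    # product against gauge observations.
    import numpy as np

    def aggregate(series: np.ndarray, window: int) -> np.ndarray:
        """Sum precipitation over consecutive blocks of `window` time steps."""
        n = len(series) // window * window
        return series[:n].reshape(-1, window).sum(axis=1)

    def verify(gauge: np.ndarray, product: np.ndarray, window: int):
        g, p = aggregate(gauge, window), aggregate(product, window)
        bias = float(np.mean(p - g))
        rmse = float(np.sqrt(np.mean((p - g) ** 2)))
        return bias, rmse

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        days = 3650                                         # ten years of daily data
        gauge = rng.gamma(shape=0.4, scale=5.0, size=days)
        product = gauge + rng.normal(0.0, 2.0, size=days)   # imperfect gridded product
        for label, window in [("daily", 1), ("monthly", 30), ("annual", 365)]:
            bias, rmse = verify(gauge, product, window)
            print(f"{label:8s} bias={bias:8.2f}  rmse={rmse:8.2f}")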

  19. Mars Exploration Rover Surface Operations

    NASA Astrophysics Data System (ADS)

    Erickson, J. K.; Adler, M.; Crisp, J.; Mishkin, A.; Welch, R.

    2002-01-01

    The Mars Exploration Rover Project is an ambitious mission to land two highly capable rovers on Mars and concurrently explore the Martian surface for three months each. Launching in 2003, surface operations will commence on January 4, 2004 with the first landing, followed by the second landing on January 25. The prime mission for the second rover will end on April 27, 2004. The science objectives of exploring multiple locations within each of two widely separated and scientifically distinct landing sites will be accomplished along with the demonstration of key surface exploration technologies for future missions. This paper will provide an overview of the planned mission, and also focus on the different operations challenges inherent in operating these two off-road vehicles, and the solutions adopted to enable the best utilization of their capabilities for high science return and responsiveness to scientific discovery.

  20. The Optimum Dataset method - examples of the application

    NASA Astrophysics Data System (ADS)

    Błaszczak-Bąk, Wioleta; Sobieraj-Żłobińska, Anna; Wieczorek, Beata

    2018-01-01

    Data reduction is a procedure to decrease the size of a dataset in order to make its analysis more effective and easier. Reduction of a dataset is an issue that requires proper planning, so that after reduction the dataset still meets all the user's expectations. Evidently, it is better if the result is an optimal solution in terms of the adopted criteria. Among the reduction methods that provide an optimal solution is the Optimum Dataset (OptD) method proposed by Błaszczak-Bąk (2016). The paper presents the application of this method to different datasets from LiDAR and the possibility of using the method for various purposes of study. The following reduced datasets are presented: (a) measurement of Sielska street in Olsztyn (Airborne Laser Scanning data - ALS data), (b) measurement of the bas-relief on the building in Gdańsk (Terrestrial Laser Scanning data - TLS data), (c) dataset from the Biebrza river measurement (TLS data).
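
    The OptD criteria themselves are not reproduced here; as a much simpler stand-in, the sketch below thins a LiDAR point cloud by keeping one representative point per planimetric grid cell, which conveys the flavour of dataset reduction but is explicitly not the OptD algorithm.

    # Generic illustration of LiDAR point-cloud reduction by planimetric grid
    # thinning: keep one representative point (here, the lowest) per XY cell.
    # This is NOT the OptD algorithm defined in Blaszczak-Bak (2016).
    import numpy as np

    def grid_thin(points: np.ndarray, cell: float) -> np.ndarray:
        """points: (N, 3) array of x, y, z; returns a reduced (M, 3) array."""
        keys = np.floor(points[:, :2] / cell).astype(np.int64)
        best = {}
        for i, (kx, ky) in enumerate(map(tuple, keys)):
            j = best.get((kx, ky))
            if j is None or points[i, 2] < points[j, 2]:
                best[(kx, ky)] = i                      # keep the lowest point per cell
        return points[sorted(best.values())]

    if __name__ == "__main__":
        rng = np.random.default_rng(42)
        cloud = rng.uniform(0, 100, size=(100_000, 3))  # synthetic 100 m x 100 m tile
        reduced = grid_thin(cloud, cell=1.0)
        print(f"{len(cloud)} -> {len(reduced)} points "
              f"({100 * len(reduced) / len(cloud):.1f}% kept)")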

  1. Exploring the QSAR's predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets.

    PubMed

    Martínez-Santiago, O; Marrero-Ponce, Y; Vivas-Reyes, R; Rivera-Borroto, O M; Hurtado, E; Treto-Suarez, M A; Ramos, Y; Vergara-Murillo, F; Orozco-Ugarriza, M E; Martínez-López, Y

    2017-05-01

    Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. However, when all families were used in combination, the results achieved were quantitatively
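
    The general MLR-plus-genetic-algorithm wrapper strategy mentioned above can be sketched as follows; the population size, genetic operators, scoring by training-set R^2 and the synthetic descriptor matrix are all invented for illustration and are not the MobyDigs implementation.

    # Toy sketch of wrapper feature selection: a small genetic algorithm searching
    # for descriptor subsets, scored by the training-set R^2 of a multiple linear
    # regression fit.
    import numpy as np

    rng = np.random.default_rng(0)

    def r_squared(X, y, mask):
        """Fit MLR on the selected columns and return the coefficient of determination."""
        if mask.sum() == 0:
            return -np.inf
        A = np.column_stack([X[:, mask], np.ones(len(y))])   # add intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return 1.0 - resid.var() / y.var()

    def ga_select(X, y, n_keep=5, pop=40, gens=60, p_mut=0.05):
        n_feat = X.shape[1]
        def random_mask():
            m = np.zeros(n_feat, bool)
            m[rng.choice(n_feat, n_keep, replace=False)] = True
            return m
        population = [random_mask() for _ in range(pop)]
        for _ in range(gens):
            scores = np.array([r_squared(X, y, m) for m in population])
            order = np.argsort(scores)[::-1]
            parents = [population[i] for i in order[: pop // 2]]   # truncation selection
            children = []
            while len(children) < pop - len(parents):
                a, b = rng.choice(len(parents), 2, replace=False)
                child = np.where(rng.random(n_feat) < 0.5, parents[a], parents[b])
                flip = rng.random(n_feat) < p_mut                  # mutation
                children.append(child ^ flip)
            population = parents + children
        best = max(population, key=lambda m: r_squared(X, y, m))
        return best, r_squared(X, y, best)

    if __name__ == "__main__":
        X = rng.normal(size=(120, 60))                  # 60 synthetic descriptors
        y = X[:, [3, 17, 42]] @ np.array([1.5, -2.0, 0.7]) + rng.normal(0, 0.3, 120)
        mask, score = ga_select(X, y)
        print("selected descriptors:", np.flatnonzero(mask), f"R^2={score:.3f}")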

  2. International health research monitoring: exploring a scientific and a cooperative approach using participatory action research

    PubMed Central

    Chantler, Tracey; Cheah, Phaik Yeong; Miiro, George; Hantrakum, Viriya; Nanvubya, Annet; Ayuo, Elizabeth; Kivaya, Esther; Kidola, Jeremiah; Kaleebu, Pontiano; Parker, Michael; Njuguna, Patricia; Ashley, Elizabeth; Guerin, Philippe J; Lang, Trudie

    2014-01-01

    Objectives To evaluate and determine the value of monitoring models developed by the Mahidol Oxford Tropical Research Unit and the East African Consortium for Clinical Research, consider how this can be measured and explore monitors’ and investigators’ experiences of and views about the nature, purpose and practice of monitoring. Research design A case study approach was used within the context of participatory action research because one of the aims was to guide and improve practice. 34 interviews, five focus groups and observations of monitoring practice were conducted. Setting and participants Fieldwork occurred in the places where the monitoring models are coordinated and applied in Thailand, Cambodia, Uganda and Kenya. Participants included those coordinating the monitoring schemes, monitors, senior investigators and research staff. Analysis Transcribed textual data from field notes, interviews and focus groups was imported into a qualitative data software program (NVIVO V. 10) and analysed inductively and thematically by a qualitative researcher. The initial coding framework was reviewed internally and two main categories emerged from the subsequent interrogation of the data. Results The categories that were identified related to the conceptual framing and nature of monitoring, and the practice of monitoring, including relational factors. Particular emphasis was given to the value of a scientific and cooperative style of monitoring as a means of enhancing data quality, trust and transparency. In terms of practice the primary purpose of monitoring was defined as improving the conduct of health research and increasing the capacity of researchers and trial sites. Conclusions The models studied utilise internal and network wide expertise to improve the ethics and quality of clinical research. They demonstrate how monitoring can be a scientific and constructive exercise rather than a threatening process. The value of cooperative relations needs to be given

  3. Scientific Rationale and Requirements for a Global Seismic Network on Mars

    NASA Technical Reports Server (NTRS)

    Solomon, Sean C.; Anderson, Don L.; Banerdt, W. Bruce; Butler, Rhett G.; Davis, Paul M.; Duennebier, Frederick K.; Nakamura, Yosio; Okal, Emile A.; Phillips, Roger J.

    1991-01-01

    Following a brief overview of the mission concepts for a Mars Global Network Mission as of the time of the workshop, we present the principal scientific objectives to be achieved by a Mars seismic network. We review the lessons for extraterrestrial seismology gained from experience to date on the Moon and on Mars. An important unknown on Mars is the expected rate of seismicity, but theoretical expectations and extrapolation from lunar experience both support the view that seismicity rates, wave propagation characteristics, and signal-to-noise ratios are favorable to the collection of a scientifically rich dataset during the multiyear operation of a global seismic experiment. We discuss how particular types of seismic waves will provide the most useful information to address each of the scientific objectives, and this discussion provides the basis for a strategy for station siting. Finally, we define the necessary technical requirements for the seismic stations.

  4. Far Travelers: The Exploring Machines.

    ERIC Educational Resources Information Center

    Nicks, Oran W.

    The National Aeronautics and Space Administration (NASA) program of lunar and planetary exploration produced a flood of scientific information about the moon, planets and the environment of interplanetary space. This book is an account of the people, machines, and the events of this scientific enterprise. It is a story of organizations,…

  5. Solar System Exploration, 1995-2000

    NASA Technical Reports Server (NTRS)

    Squyres, S.; Varsi, G.; Veverka, J.; Soderblom, L.; Black, D.; Stern, A.; Stetson, D.; Brown, R. A.; Niehoff, J.; Squibb, G.

    1994-01-01

    Goals for planetary exploration during the next decade include: (1) determine how our solar system formed, and understand whether planetary systems are a common phenomenon throughout the cosmos; (2) explore the diverse changes that planets have undergone throughout their history and that take place at present, including those that distinguish Earth as a planet; (3) understand how life might have formed on Earth, whether life began anywhere else in the solar system, and whether life (including intelligent beings) might be a common cosmic phenomenon; (4) discover and investigate natural phenomena that occur under conditions not realizable in laboratories; (5) discover and inventory resources in the solar system that could be used by human civilizations in the future; and (6) make the solar system a part of the human experience in the same way that Earth is, and hence lay the groundwork for human expansion into the solar system in the coming century. The plan for solar system exploration is motivated by these goals as well as the following principle: The solar system exploration program will conduct flight programs and supporting data analysis and scientific research commensurate with United States leadership in space exploration. These programs and research must be of the highest scientific merit, they must be responsive to public excitement regarding planetary exploration, and they must contribute to larger national goals in technology and education. The result will be new information, which is accessible to the public, creates new knowledge, and stimulates programs of education to increase the base of scientific knowledge in the general public.

  6. Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing

    NASA Astrophysics Data System (ADS)

    Meng, Xiang

    The domain of nanoscale optical science and technology is a combination of the classical world of electromagnetics and the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allow optical structures to be scaled down to nanoscale size or even to the atomic level, far smaller than the wavelength they are designed for. These nanostructures can have unique, controllable, and tunable optical properties, and their interactions with quantum materials can have important near-field and far-field optical responses. Undoubtedly, these optical properties can have many important applications, ranging from efficient and tunable light sources, detectors, filters, modulators, and high-speed all-optical switches, to next-generation classical and quantum computation and biophotonic medical sensors. This emerging field of nanoscience, known as nanophotonics, is highly interdisciplinary, requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales, where the nature of the nanostructured matter controls the interactions. In addition, fast advancements in computing capabilities, such as parallel computing, have become a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, as the design of these devices requires extensive memory and extremely long core hours. Thus distributed computing platforms associated with parallel computing are required for faster design processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses computing machines to analyze and solve otherwise intractable scientific challenges. In

  7. The Centennial Trends Greater Horn of Africa precipitation dataset.

    PubMed

    Funk, Chris; Nicholson, Sharon E; Landsfeld, Martin; Klotter, Douglas; Peterson, Pete; Harrison, Laura

    2015-01-01

    East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past decade, since the number of records in globally accessible archives has fallen precipitously. This lack of data coincides with an increasing scientific and humanitarian need to place recent seasonal and multi-annual East African precipitation extremes in a deep historic context. To serve this need, scientists from the UC Santa Barbara Climate Hazards Group and Florida State University have pooled their station archives and expertise to produce a high quality gridded 'Centennial Trends' precipitation dataset. Additional observations have been acquired from the national meteorological agencies and augmented with data provided by other universities. Extensive quality control of the data was carried out and seasonal anomalies interpolated using kriging. This paper documents the CenTrends methodology and data.
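
    As a hedged illustration of the interpolation step, the sketch below performs ordinary kriging of a few station anomalies with a Gaussian covariance model; the covariance function, range and station layout are assumptions and are not the settings used to build the CenTrends dataset.

    # Minimal ordinary-kriging sketch: interpolate station anomalies to a grid
    # point using a Gaussian covariance model and the unbiasedness constraint.
    import numpy as np

    def gaussian_cov(h, sill=1.0, rng_km=500.0):
        return sill * np.exp(-(h / rng_km) ** 2)

    def ordinary_krige(stations_xy, values, target_xy):
        """Return the kriged estimate of `values` at `target_xy` (ordinary kriging)."""
        d = np.linalg.norm(stations_xy[:, None, :] - stations_xy[None, :, :], axis=-1)
        C = gaussian_cov(d)
        n = len(values)
        # augment with the Lagrange multiplier row/column: weights must sum to 1
        A = np.ones((n + 1, n + 1))
        A[:n, :n] = C
        A[n, n] = 0.0
        b = np.ones(n + 1)
        b[:n] = gaussian_cov(np.linalg.norm(stations_xy - target_xy, axis=-1))
        w = np.linalg.solve(A, b)[:n]
        return float(w @ values)

    if __name__ == "__main__":
        xy = np.array([[0.0, 0.0], [300.0, 50.0], [150.0, 400.0], [500.0, 500.0]])  # km
        anom = np.array([0.8, -0.2, 0.4, -0.5])        # seasonal rainfall anomalies
        print(ordinary_krige(xy, anom, np.array([200.0, 200.0])))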

  8. The Centennial Trends Greater Horn of Africa precipitation dataset

    USGS Publications Warehouse

    Funk, Chris; Nicholson, Sharon E.; Landsfeld, Martin F.; Klotter, Douglas; Peterson, Pete J.; Harrison, Laura

    2015-01-01

    East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past decade, since the number of records in globally accessible archives has fallen precipitously. This lack of data coincides with an increasing scientific and humanitarian need to place recent seasonal and multi-annual East African precipitation extremes in a deep historic context. To serve this need, scientists from the UC Santa Barbara Climate Hazards Group and Florida State University have pooled their station archives and expertise to produce a high quality gridded ‘Centennial Trends’ precipitation dataset. Additional observations have been acquired from the national meteorological agencies and augmented with data provided by other universities. Extensive quality control of the data was carried out and seasonal anomalies interpolated using kriging. This paper documents the CenTrends methodology and data.

  9. A New Combinatorial Optimization Approach for Integrated Feature Selection Using Different Datasets: A Prostate Cancer Transcriptomic Study

    PubMed Central

    Puthiyedth, Nisha; Riveros, Carlos; Berretta, Regina; Moscato, Pablo

    2015-01-01

    Background The joint study of multiple datasets has become a common technique for increasing statistical power in detecting biomarkers obtained from smaller studies. The approach generally followed is based on the fact that as the total number of samples increases, we expect to have greater power to detect associations of interest. This methodology has been applied to genome-wide association and transcriptomic studies due to the availability of datasets in the public domain. While this approach is well established in biostatistics, the introduction of new combinatorial optimization models to address this issue has not been explored in depth. In this study, we introduce a new model for the integration of multiple datasets and we show its application in transcriptomics. Methods We propose a new combinatorial optimization problem that addresses the core issue of biomarker detection in integrated datasets. Optimal solutions for this model deliver a feature selection from a panel of prospective biomarkers. The model we propose is a generalised version of the (α,β)-k-Feature Set problem. We illustrate the performance of this new methodology via a challenging meta-analysis task involving six prostate cancer microarray datasets. The results are then compared to the popular RankProd meta-analysis tool and to what can be obtained by analysing the individual datasets by statistical and combinatorial methods alone. Results Application of the integrated method resulted in a more informative signature than the rank-based meta-analysis or individual dataset results, and overcomes problems arising from real world datasets. The set of genes identified is highly significant in the context of prostate cancer. The method used does not rely on homogenisation or transformation of values to a common scale, and at the same time is able to capture markers associated with subgroups of the disease. PMID:26106884

  10. A microarray whole-genome gene expression dataset in a rat model of inflammatory corneal angiogenesis.

    PubMed

    Mukwaya, Anthony; Lindvall, Jessica M; Xeroudaki, Maria; Peebo, Beatrice; Ali, Zaheer; Lennikov, Anton; Jensen, Lasse Dahl Ejby; Lagali, Neil

    2016-11-22

    In angiogenesis with concurrent inflammation, many pathways are activated, some linked to VEGF and others largely VEGF-independent. Pathways involving inflammatory mediators, chemokines, and micro-RNAs may play important roles in maintaining a pro-angiogenic environment or mediating angiogenic regression. Here, we describe a gene expression dataset to facilitate exploration of pro-angiogenic, pro-inflammatory, and remodelling/normalization-associated genes during both an active capillary sprouting phase, and in the restoration of an avascular phenotype. The dataset was generated by microarray analysis of the whole transcriptome in a rat model of suture-induced inflammatory corneal neovascularisation. Regions of active capillary sprout growth or regression in the cornea were harvested and total RNA extracted from four biological replicates per group. High quality RNA was obtained for gene expression analysis using microarrays. Fold change of selected genes was validated by qPCR, and protein expression was evaluated by immunohistochemistry. We provide a gene expression dataset that may be re-used to investigate corneal neovascularisation, and may also have implications in other contexts of inflammation-mediated angiogenesis.

  11. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

    DOE PAGES

    Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De; ...

    2017-01-28

    Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is computed tomography, which can generate tens of GB/s depending on the x-ray range. A large-scale tomographic dataset, such as a mouse brain, may require hours of computation time on a medium-sized workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for the implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called the replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide a 158x speedup (using 32 compute nodes) over a single-core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.

  12. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De

    Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is computed tomography, which can generate tens of GB/s depending on the x-ray range. A large-scale tomographic dataset, such as a mouse brain, may require hours of computation time on a medium-sized workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for the implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called the replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide a 158x speedup (using 32 compute nodes) over a single-core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.
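
    The replicated reconstruction object described in the two Trace records above can be illustrated with a small thread-level sketch: each worker accumulates its share of the projections into a private replica, and the replicas are merged in a final reduction. This is a toy Python analogue under assumed array sizes and a stand-in projection operator; it is not the Trace middleware itself, which combines MPI processes with threads and real back-projection kernels.

        import numpy as np
        from concurrent.futures import ThreadPoolExecutor

        # Hypothetical sizes: 360 projections x 64 detector bins, 64x64 image grid.
        N_PROJ, N_DET, N_PIX = 360, 64, 64 * 64
        sinogram = np.random.rand(N_PROJ, N_DET)

        def accumulate(chunk, replica):
            """Accumulate a chunk of projections into a thread-private replica."""
            start, stop = chunk
            for p in range(start, stop):
                # Stand-in for a real back-projection operator: spread each
                # projection's total signal uniformly over the image estimate.
                replica += sinogram[p].sum() / N_PIX
            return replica

        n_threads = 4
        chunks = [(i * N_PROJ // n_threads, (i + 1) * N_PROJ // n_threads)
                  for i in range(n_threads)]
        replicas = [np.zeros(N_PIX) for _ in range(n_threads)]

        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            partial_updates = list(pool.map(accumulate, chunks, replicas))

        # Reduction: merge the per-thread replicas into one image update.
        image_update = np.sum(partial_updates, axis=0)
        print(image_update.shape, image_update.mean())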

  13. Exploring the Assessment of and Relationship between Elementary Students' Scientific Creativity and Science Inquiry

    ERIC Educational Resources Information Center

    Yang, Kuay-Keng; Lin, Shu-Fen; Hong, Zuway-R; Lin, Huann-shyang

    2016-01-01

    The purposes of this study were to (a) develop and validate instruments to assess elementary students' scientific creativity and science inquiry, (b) investigate the relationship between the two competencies, and (c) compare the two competencies among different grade level students. The scientific creativity test was composed of 7 open-ended items…

  14. Nine martian years of dust optical depth observations: A reference dataset

    NASA Astrophysics Data System (ADS)

    Montabone, Luca; Forget, Francois; Kleinboehl, Armin; Kass, David; Wilson, R. John; Millour, Ehouarn; Smith, Michael; Lewis, Stephen; Cantor, Bruce; Lemmon, Mark; Wolff, Michael

    2016-07-01

    We present a multi-annual reference dataset of the horizontal distribution of airborne dust from martian year 24 to 32 using observations of the martian atmosphere from April 1999 to June 2015 made by the Thermal Emission Spectrometer (TES) aboard Mars Global Surveyor, the Thermal Emission Imaging System (THEMIS) aboard Mars Odyssey, and the Mars Climate Sounder (MCS) aboard Mars Reconnaissance Orbiter (MRO). Our methodology to build the dataset works by gridding the available retrievals of column dust optical depth (CDOD) from TES and THEMIS nadir observations, as well as the estimates of this quantity from MCS limb observations. The resulting (irregularly) gridded maps (one per sol) were validated with independent observations of CDOD by PanCam cameras and Mini-TES spectrometers aboard the Mars Exploration Rovers "Spirit" and "Opportunity", by the Surface Stereo Imager aboard the Phoenix lander, and by the Compact Reconnaissance Imaging Spectrometer for Mars aboard MRO. Finally, regular maps of CDOD are produced by spatially interpolating the irregularly gridded maps using a kriging method. These latter maps are used as dust scenarios in the Mars Climate Database (MCD) version 5, and are useful in many modelling applications. The two datasets (daily irregularly gridded maps and regularly kriged maps) for the nine available martian years are publicly available as NetCDF files and can be downloaded from the MCD website at the URL: http://www-mars.lmd.jussieu.fr/mars/dust_climatology/index.html

  15. Relevancy Ranking of Satellite Dataset Search Results

    NASA Technical Reports Server (NTRS)

    Lynnes, Christopher; Quinn, Patrick; Norton, James

    2017-01-01

    As the variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way for search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search providers have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
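
    To make the heuristics in the record above concrete, here is a hypothetical relevance-scoring sketch in Python that combines keyword matching, temporal overlap with the query, and version recency. The record fields, weights, and scoring formula are illustrative assumptions, not the actual Common Metadata Repository algorithm.

        from dataclasses import dataclass

        @dataclass
        class DatasetRecord:                     # hypothetical metadata record
            title: str
            version: int
            start_year: int
            end_year: int

        def relevance(query_terms, query_years, rec, latest_version):
            """Toy score: keyword match + temporal overlap + version recency."""
            title_terms = rec.title.lower().split()
            keyword = sum(t.lower() in title_terms for t in query_terms) / len(query_terms)

            q0, q1 = query_years
            overlap = max(0, min(q1, rec.end_year) - max(q0, rec.start_year))
            temporal = overlap / max(1, q1 - q0)

            recency = rec.version / latest_version   # prefer later, better versions

            # Weights are purely illustrative.
            return 0.5 * keyword + 0.3 * temporal + 0.2 * recency

        recs = [DatasetRecord("MODIS Aerosol Optical Depth v5", 5, 2000, 2015),
                DatasetRecord("MODIS Aerosol Optical Depth v6", 6, 2000, 2020)]
        for r in recs:
            print(r.title, round(relevance(["aerosol", "depth"], (2010, 2020), r, 6), 3))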

  16. Hydrodynamic modelling and global datasets: Flow connectivity and SRTM data, a Bangkok case study.

    NASA Astrophysics Data System (ADS)

    Trigg, M. A.; Bates, P. B.; Michaelides, K.

    2012-04-01

    The rise of globally interconnected manufacturing supply chains requires an understanding and consistent quantification of flood risk at a global scale. Flood risk is often better quantified (or at least more precisely defined) in regions where there has been an investment in comprehensive topographical data collection such as LiDAR coupled with detailed hydrodynamic modelling. Yet in regions where these data and modelling are unavailable, the implications of flooding and the knock-on effects for global industries can be dramatic, as evidenced by the recent floods in Bangkok, Thailand. There is a growing momentum in terms of global modelling initiatives to address this lack of a consistent understanding of flood risk, and they will rely heavily on the application of available global datasets relevant to hydrodynamic modelling, such as Shuttle Radar Topography Mission (SRTM) data and its derivatives. These global datasets bring opportunities to apply consistent methodologies on an automated basis in all regions, while the use of coarser scale datasets also brings many challenges such as sub-grid process representation and downscaled hydrology data from global climate models. There are significant opportunities for hydrological science in helping define new, realistic and physically based methodologies that can be applied globally, as well as the possibility of gaining new insights into flood risk through analysis of the many large datasets that will be derived from this work. We use Bangkok as a case study to explore some of the issues related to using these available global datasets for hydrodynamic modelling, with particular focus on using SRTM data to represent topography. Research has shown that flow connectivity on the floodplain is an important component in the dynamics of flood flows on to and off the floodplain, and indeed within different areas of the floodplain. A lack of representation of flow connectivity, often due to data resolution limitations, means

  17. Teaching Scientific Communication Skills in Science Studies: Does It Make a Difference?

    ERIC Educational Resources Information Center

    Spektor-Levy, Ornit; Eylon, Bat-Sheva; Scherz, Zahava

    2009-01-01

    This study explores the impact of "Scientific Communication" (SC) skills instruction on students' performances in scientific literacy assessment tasks. We present a general model for skills instruction, characterized by explicit and spiral instruction, integration into content learning, practice in several scientific topics, and application of…

  18. Development of a global historic monthly mean precipitation dataset

    NASA Astrophysics Data System (ADS)

    Yang, Su; Xu, Wenhui; Xu, Yan; Li, Qingxiang

    2016-04-01

    A global historic precipitation dataset is the basis for climate and water cycle research. Several global historic land surface precipitation datasets have been developed by international data centers such as the US National Climatic Data Center (NCDC), the European Climate Assessment & Dataset project team, and the Met Office, but so far no such datasets have been developed by any research institute in China. In addition, each dataset has its own regional focus, and the existing global precipitation datasets contain only sparse observational stations over China, which may result in uncertainties in East Asian precipitation studies. In order to take comprehensive historic information into account, users might need to employ two or more datasets. However, the non-uniform data formats, data units, station IDs, and so on add extra difficulties for users to exploit these datasets. For this reason, a complete historic precipitation dataset that takes advantage of the various datasets has been developed and produced at the National Meteorological Information Center of China. Precipitation observations from 12 sources are aggregated, and the data formats, data units, and station IDs are unified. Duplicated stations with the same ID are identified, and duplicated observations removed. A consistency test, a correlation coefficient test, a significance t-test at the 95% confidence level, and a significance F-test at the 95% confidence level are conducted first to ensure data reliability. Only those datasets that satisfy all four criteria are integrated to produce the China Meteorological Administration global precipitation (CGP) historic precipitation dataset version 1.0. It contains observations at 31 thousand stations with 1.87 × 10^7 data records, among which 4152 precipitation time series are longer than 100 years. This dataset plays a critical role in climate research due to its advantages in large data volume and high density of station network, compared to
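
    The aggregation and duplicate-removal steps described in the record above follow a common pattern that can be sketched with pandas. The schema, station IDs, and plausibility threshold below are assumptions for illustration only; they are not the CGP v1.0 processing code.

        import pandas as pd

        # Hypothetical monthly records from two source archives, already mapped
        # to a common schema and common units (mm).
        src_a = pd.DataFrame({"station_id": ["CHM001", "CHM002"],
                              "year": [1950, 1950], "month": [7, 7],
                              "precip_mm": [120.3, 88.1]})
        src_b = pd.DataFrame({"station_id": ["CHM002", "CHM003"],
                              "year": [1950, 1950], "month": [7, 7],
                              "precip_mm": [88.1, 45.0]})

        merged = pd.concat([src_a, src_b], ignore_index=True)

        # Remove duplicated observations sharing the same station ID and time stamp.
        merged = merged.drop_duplicates(subset=["station_id", "year", "month"])

        # Simple consistency screen: drop physically implausible monthly totals.
        merged = merged[(merged["precip_mm"] >= 0) & (merged["precip_mm"] < 3000)]
        print(merged)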

  19. Fostering Inquiry and Scientific Investigation in Students by Using GPS Data to Explore Plate Tectonics and Volcanic Deformation

    NASA Astrophysics Data System (ADS)

    Olds, S. E.; Eriksson, S.

    2007-12-01

    The Education and Outreach program at UNAVCO has developed free instructional materials using authentic high-precision GPS data for secondary education and undergraduate students in Earth science courses. Using inquiry-based, data-rich activities, students investigate crustal deformation and plate motion using GPS data and learn how these measurements are important to scientific discovery and understanding natural hazards and the current state of prediction. Because this deformation is expressed on Earth's surface over familiar time scales and on easily visualized orders of magnitude, GPS data represent an effective method for illustrating the geomorphic effects of plate tectonics and, in essence, allow students to 'see' plates move and volcanoes deform. The activities foster student skills to critically assess different forms of data, to visualize abstract concepts, and to evaluate multiple lines of evidence to analyze scientific problems. The activities are scaffolded to begin with basic concepts about GPS data and analyzing simple plate motion and move towards data analyses for more complex motion and crustal deformation. As part of assessment, students can apply new knowledge to explore other geographic regions independently. Learning activities currently include exploring motion along the San Andreas Fault, monitoring volcano deformation and ground movement at the Yellowstone Caldera, and analyzing ground motion along the subduction zone in the Cascadia region. To support educators and their students in their investigations, UNAVCO has developed the Data for Educators portal; http://www.unavco.org/edu_outreach/data.html. This portal provides a Google-map displaying the locations of GPS stations, web links to numerical GPS data that illustrate specific Earth processes, and educational activities that incorporate this data. The GPS data is freely available in a format compatible with standard spreadsheet and graphing programs as well as visualization and

  20. Exploring teachers' beliefs and knowledge about scientific inquiry and the nature of science: A collaborative action research project

    NASA Astrophysics Data System (ADS)

    Fazio, Xavier Eric

    Science curriculum reform goals espouse the need to foster and support the development of scientific literacy in students. Two critical goals of scientific literacy are students' engagement in, and developing more realistic conceptions about scientific inquiry (SI) and the nature of science (NOS). In order to promote the learning of these curriculum emphases, teachers themselves must possess beliefs and knowledge supportive of them. Collaborative action research is a viable form of curriculum and teacher development that can be used to support teachers in developing the requisite beliefs and knowledge that can promote these scientific literacy goals. This research study used a collective case study methodology to describe and interpret the views and actions of four teachers participating in a collaborative action research project. I explored the teachers' SI and NOS views throughout the project as they investigated ideas and theories, critically examined their current curricular practice, and implemented and reflected on these modified curricular practices. By the end of the research study, all participants had uniquely augmented their understanding of SI and NOS. The participants were better able to provide explanatory depth to some SI and NOS ideas; however, specific belief revision with respect to SI and NOS ideas was nominal. Furthermore, their idealized action research plans were not implemented to the extent that they were planned. Explanations for these findings include: impact of significant past educational experiences, prior understanding of SI and NOS, depth of content and pedagogical content knowledge of the discipline, and institutional and instructional constraints. Nonetheless, through participation in the collaborative action research process, the teachers developed professionally, personally, and socially. They identified many positive outcomes from participating in a collaborative action research project; however, they espoused constraints to

  1. A dataset on tail risk of commodities markets.

    PubMed

    Powell, Robert J; Vo, Duc H; Pham, Thach N; Singh, Abhay K

    2017-12-01

    This article contains the datasets related to the research article "The long and short of commodity tails and their relationship to Asian equity markets" (Powell et al., 2017) [1]. The datasets contain the daily prices (and price movements) of 24 different commodities decomposed from the S&P GSCI index and the daily prices (and price movements) of three share market indices (World, Asia, and South East Asia) for the period 2004-2015. The dataset is then divided into annual periods, showing the worst 5% of price movements for each year. The datasets are convenient for examining the tail risk of different commodities, as measured by Conditional Value at Risk (CVaR), as well as its changes over time. The datasets can also be used to investigate the association between commodity markets and share markets.
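
    As a worked illustration of the tail-risk measure mentioned in the record above, the snippet below computes the 5% Value at Risk and the corresponding Conditional Value at Risk (the mean of the worst 5% of daily price movements) for a simulated return series. The returns are synthetic and are not drawn from the published commodity dataset.

        import numpy as np

        rng = np.random.default_rng(0)
        # Synthetic daily returns for one commodity over ~250 trading days.
        returns = rng.normal(loc=0.0, scale=0.015, size=250)

        alpha = 0.05
        var_5 = np.quantile(returns, alpha)           # Value at Risk (5% quantile)
        cvar_5 = returns[returns <= var_5].mean()     # CVaR: mean of the worst 5% of days

        print(f"VaR(5%)  = {var_5: .4f}")
        print(f"CVaR(5%) = {cvar_5: .4f}")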

  2. Teaching the Thrill of Discovery: Student Exploration of the Large-Scale Structures of the Universe

    NASA Astrophysics Data System (ADS)

    Juneau, Stephanie; Dey, Arjun; Walker, Constance E.; NOAO Data Lab

    2018-01-01

    In collaboration with the Teen Astronomy Cafes program, the NOAO Data Lab is developing online Jupyter Notebooks as a free and publicly accessible tool for students and teachers. Each interactive activity teaches students simultaneously about coding and astronomy, with a focus on large datasets. Students therefore learn state-of-the-art techniques at the intersection of astronomy and data science. During the activity entitled “Our Vast Universe”, students use real spectroscopic data to measure the distance to galaxies before moving on to a catalog with distances to over 100,000 galaxies. Exploring this dataset gives students an appreciation of the large number of galaxies in the universe (2 trillion!), and leads them to discover how galaxies are arranged in large and impressive filamentary structures. During the Teen Astronomy Cafes program, the notebook is supplemented with visual material conducive to discussion and with hands-on activities involving cubes representing model universes. These steps help build the students’ physical intuition and give them a better grasp of the concepts before using software and coding. At the end of the activity, students have made their own measurements and have experienced scientific research directly. More information is available online for the Teen Astronomy Cafes (teensciencecafe.org/cafes) and the NOAO Data Lab (datalab.noao.edu).

  3. Publishing and Editing of Semantically-Enabled Scientific Metadata Across Multiple Web Platforms: Challenges and Experiences

    NASA Astrophysics Data System (ADS)

    Patton, E. W.; West, P.; Greer, R.; Jin, B.

    2011-12-01

    Following on work presented at the 2010 AGU Fall Meeting, we present a number of real-world collections of semantically-enabled scientific metadata ingested into the Tetherless World RDF2HTML system as structured data and presented and edited using that system. Two separate datasets from two different domains (oceanography and solar sciences) are made available using existing web standards and services, e.g. encoded using ontologies represented with the Web Ontology Language (OWL) and stored in a SPARQL endpoint for querying. These datasets are deployed for use in three different web environments, i.e. Drupal, MediaWiki, and a custom web portal written in Java, to highlight the cross-platform nature of the data presentation. Stylesheets used to transform concepts in each domain as well as shared terms into HTML will be presented to show the power of using common ontologies to publish data and support reuse of existing terminologies. In addition, a single domain dataset is shared between two separate portal instances to demonstrate the ability for this system to offer distributed access and modification of content across the Internet. Lastly, we will highlight challenges that arose in the software engineering process, outline the design choices we made in solving those issues, and discuss how future improvements to this and other systems will enable the evolution of distributed, decentralized collaborations for scientific data sharing across multiple research groups.

  4. A Dataset of Metaphors from the Italian Literature: Exploring Psycholinguistic Variables and the Role of Context

    PubMed Central

    Bambini, Valentina; Resta, Donatella; Grimaldi, Mirko

    2014-01-01

    Defining the specific role of the factors that affect metaphor processing is a fundamental step for fully understanding figurative language comprehension, whether in discourse and conversation or in reading poems and novels. This study extends the currently available materials on everyday metaphorical expressions by providing the first dataset of metaphors extracted from literary texts and scored for the major psycholinguistic variables, considering also the effect of context. A set of 115 Italian literary metaphors presented in isolation (Experiment 1) and a subset of 65 literary metaphors embedded in their original texts (Experiment 2) were rated on several dimensions (word and phrase frequency, readability, cloze probability, familiarity, concreteness, difficulty and meaningfulness). Overall, literary metaphors scored around medium-low values on all dimensions in both experiments. Collected data were subjected to correlation analysis, which showed the presence of a strong cluster of variables—mainly familiarity, difficulty, and meaningfulness—when literary metaphors were presented in isolation. A weaker cluster was observed when literary metaphors were presented in their original contexts, with familiarity no longer correlating with meaningfulness. Context manipulation influenced familiarity, concreteness and difficulty ratings, which were lower in context than out of context, while meaningfulness increased. Throughout the different dimensions, the literary context seems to promote a global interpretative activity that enhances the open-endedness of the metaphor as a semantic structure constantly open to all possible interpretations intended by the author and driven by the text. This dataset will be useful for the design of future experimental studies both on literary metaphor and on the role of context in figurative meaning, combining ecological validity and aesthetic aspects of language. PMID:25244522

  5. Provenance of Earth Science Datasets - How Deep Should One Go?

    NASA Astrophysics Data System (ADS)

    Ramapriyan, H.; Manipon, G. J. M.; Aulenbach, S.; Duggan, B.; Goldstein, J.; Hua, H.; Tan, D.; Tilmes, C.; Wilson, B. D.; Wolfe, R.; Zednik, S.

    2015-12-01

    For credibility of scientific research, transparency and reproducibility are essential. This fundamental tenet has been emphasized for centuries, and has been receiving increased attention in recent years. The Office of Management and Budget (2002) addressed reproducibility and other aspects of quality and utility of information from federal agencies. Specific guidelines from NASA (2002) are derived from the above. According to these guidelines, "NASA requires a higher standard of quality for information that is considered influential. Influential scientific, financial, or statistical information is defined as NASA information that, when disseminated, will have or does have clear and substantial impact on important public policies or important private sector decisions." For information to be compliant, "the information must be transparent and reproducible to the greatest possible extent." We present how the principles of transparency and reproducibility have been applied to NASA data supporting the Third National Climate Assessment (NCA3). The depth of trace needed of provenance of data used to derive conclusions in NCA3 depends on how the data were used (e.g., qualitatively or quantitatively). Given that the information is diligently maintained in the agency archives, it is possible to trace from a figure in the publication through the datasets, specific files, algorithm versions, instruments used for data collection, and satellites, as well as the individuals and organizations involved in each step. Such trace back permits transparency and reproducibility.

  6. Statistical Exploration of Electronic Structure of Molecules from Quantum Monte-Carlo Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prabhat, Mr; Zubarev, Dmitry; Lester, Jr., William A.

    seem to indicate that one needs 4 or 5 components to account for most of the variance in the data, hence this 5D dataset does not necessarily lie on a well-defined, low-dimensional manifold. In terms of specific clustering techniques, K-means was generally useful in exploring the dataset. The partition around medoids (PAM) technique produced the most definitive results for our data, showing distinctive patterns for both a sample of the complete data and the time series. The gap statistic with the Tibshirani criterion did not provide any distinction across the two datasets. The gap statistic with the DandF criteria, model-based clustering, and hierarchical modeling simply failed to run on our datasets. Thankfully, the vanilla PCA technique was successful in handling our entire dataset. PCA revealed some interesting patterns in the scalar value distribution. Kernel PCA techniques (vanilladot, RBF, polynomial) and MDS failed to run on the entire dataset, or even a significant fraction of the dataset, and we resorted to creating an explicit feature map followed by conventional PCA. Clustering using K-means and PAM in the new basis set seems to produce promising results. Understanding the new basis set in the scientific context of the problem is challenging, and we are currently working to further examine and interpret the results.
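
    A minimal scikit-learn sketch of the kind of exploration described above (variance explained by principal components, followed by K-means clustering in the reduced basis) is given below. The data are random stand-ins for the 5-D electronic-structure descriptors, and the number of clusters is arbitrary.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(1)
        X = rng.normal(size=(2000, 5))     # stand-in for the 5-D QMC descriptors

        pca = PCA(n_components=5).fit(X)
        print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))

        # Cluster in the PCA basis; k = 3 is chosen arbitrarily for illustration.
        Z = pca.transform(X)[:, :4]
        labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
        print("cluster sizes:", np.bincount(labels))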

  7. Technology Needs to Support Future Mars Exploration

    NASA Technical Reports Server (NTRS)

    Nilsen, Erik N.; Baker, John; Lillard, Randolph P.

    2013-01-01

    The Mars Program Planning Group (MPPG), under the direction of Dr. Orlando Figueroa, was chartered to develop options for a program-level architecture for robotic exploration of Mars consistent with the objective to send humans to Mars in the 2030's. Scientific pathways were defined for future exploration, and multiple architectural options were developed that meet current science goals and support the future human exploration objectives. Integral to the process was the identification of critical technologies which enable the future scientific and human exploration goals. This paper describes the process for technology capabilities identification and examines the critical capability needs identified in the MPPG process. Several critical enabling technologies have been identified that support the robotic exploration goals and have potential feed-forward application to human exploration goals. Potential roadmaps for the development and validation of these technologies are discussed, including options for subscale technology demonstrations of future human exploration technologies on robotic missions.

  8. Wind Integration National Dataset Toolkit | Grid Modernization | NREL

    Science.gov Websites

    The Wind Integration National Dataset (WIND) Toolkit is an update and expansion of the Eastern Wind Integration Data Set. The Toolkit includes meteorological conditions and turbine power data.

  9. Control Measure Dataset

    EPA Pesticide Factsheets

    The EPA Control Measure Dataset is a collection of documents describing air pollution control available to regulated facilities for the control and abatement of air pollution emissions from a range of regulated source types, whether directly through the use of technical measures, or indirectly through economic or other measures.

  10. Comparison of Shallow Survey 2012 Multibeam Datasets

    NASA Astrophysics Data System (ADS)

    Ramirez, T. M.

    2012-12-01

    The purpose of the Shallow Survey common dataset is a comparison of the different technologies utilized for data acquisition in the shallow survey marine environment. The common dataset consists of a series of surveys conducted over a common area of seabed using a variety of systems. It provides equipment manufacturers the opportunity to showcase their latest systems while giving hydrographic researchers and scientists a chance to test their latest algorithms on the dataset so that rigorous comparisons can be made. Five companies collected data for the Common Dataset in the Wellington Harbor area in New Zealand between May 2010 and May 2011, including Kongsberg, Reson, R2Sonic, GeoAcoustics, and Applied Acoustics. The Wellington harbor and surrounding coastal area was selected since it has a number of well-defined features, including the HMNZS South Seas and HMNZS Wellington wrecks, an armored seawall constructed of Tetrapods and Akmons, aquifers, wharves and marinas. The seabed inside the harbor basin is largely fine-grained sediment, with gravel and reefs around the coast. The area outside the harbor on the southern coast is an active environment, with moving sand and exposed reefs. A marine reserve is also in this area. For consistency between datasets, the coastal research vessel R/V Ikatere and crew were used for all surveys conducted for the common dataset. Using Triton's Perspective processing software, the multibeam datasets collected for the Shallow Survey were processed for detailed analysis. Datasets from each sonar manufacturer were processed using the CUBE algorithm developed by the Center for Coastal and Ocean Mapping/Joint Hydrographic Center (CCOM/JHC). Each dataset was gridded at 0.5 and 1.0 meter resolutions for cross comparison and compliance with International Hydrographic Organization (IHO) requirements. Detailed comparisons were made of equipment specifications (transmit frequency, number of beams, beam width), data density, total uncertainty, and

  11. Computer Series, 52: Scientific Exploration with a Microcomputer: Simulations for Nonscientists.

    ERIC Educational Resources Information Center

    Whisnant, David M.

    1984-01-01

    Describes two simulations, written for Apple II microcomputers, focusing on scientific methodology. The first is based on the tendency of colloidal iron in high concentrations to stick to fish gills and cause breathing difficulties. The second, modeled after the dioxin controversy, examines a hypothetical chemical thought to cause cancer. (JN)

  12. Two ultraviolet radiation datasets that cover China

    NASA Astrophysics Data System (ADS)

    Liu, Hui; Hu, Bo; Wang, Yuesi; Liu, Guangren; Tang, Liqin; Ji, Dongsheng; Bai, Yongfei; Bao, Weikai; Chen, Xin; Chen, Yunming; Ding, Weixin; Han, Xiaozeng; He, Fei; Huang, Hui; Huang, Zhenying; Li, Xinrong; Li, Yan; Liu, Wenzhao; Lin, Luxiang; Ouyang, Zhu; Qin, Boqiang; Shen, Weijun; Shen, Yanjun; Su, Hongxin; Song, Changchun; Sun, Bo; Sun, Song; Wang, Anzhi; Wang, Genxu; Wang, Huimin; Wang, Silong; Wang, Youshao; Wei, Wenxue; Xie, Ping; Xie, Zongqiang; Yan, Xiaoyuan; Zeng, Fanjiang; Zhang, Fawei; Zhang, Yangjian; Zhang, Yiping; Zhao, Chengyi; Zhao, Wenzhi; Zhao, Xueyong; Zhou, Guoyi; Zhu, Bo

    2017-07-01

    Ultraviolet (UV) radiation has significant effects on ecosystems, environments, and human health, as well as atmospheric processes and climate change. Two ultraviolet radiation datasets are described in this paper. One contains hourly observations of UV radiation measured at 40 Chinese Ecosystem Research Network stations from 2005 to 2015. CUV3 broadband radiometers were used to observe the UV radiation, with an accuracy of 5%, which meets the World Meteorological Organization's measurement standards. The extremum method was used to control the quality of the measured datasets. The other dataset contains daily cumulative UV radiation estimates that were calculated using an all-sky estimation model combined with a hybrid model. The reconstructed daily UV radiation data span from 1961 to 2014. The mean absolute bias error and root-mean-square error are smaller than 30% at most stations, and most of the mean bias error values are negative, which indicates underestimation of the UV radiation intensity. These datasets can improve our basic knowledge of the spatial and temporal variations in UV radiation. Additionally, these datasets can be used in studies of potential ozone formation and atmospheric oxidation, as well as simulations of ecological processes.

  13. Atmosphere Explorer set for launch

    NASA Technical Reports Server (NTRS)

    1975-01-01

    The Atmosphere Explorer-D (Explorer-54) is described which will explore in detail an area of the earth's outer atmosphere where important energy transfer, atomic and molecular processes, and chemical reactions occur that are critical to the heat balance of the atmosphere. Data are presented on the mission facts, launch vehicle operations, AE-D/Delta flight events, spacecraft description, scientific instruments, tracking, and data acquisition.

  14. Partition dataset according to amino acid type improves the prediction of deleterious non-synonymous SNPs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Jing; Li, Yuan-Yuan; Shanghai Center for Bioinformation Technology, Shanghai 200235

    2012-03-02

    Highlights: Proper dataset partitioning can improve the prediction of deleterious nsSNPs. Partitioning according to the original residue type at the nsSNP site is a good criterion. A similar strategy is expected to be promising in other machine learning problems. Abstract: Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either the original or the substituted amino acid type at the nsSNP site. Using a support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9%, depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, when the dataset was instead randomly divided into 20 subsets, the corresponding accuracy was only 73.2%. Our results demonstrate that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, significantly improves the performance of the trained classifiers, which should be valuable in developing better tools for predicting the disease-association of nsSNPs.
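
    The partition-then-train idea in the record above can be sketched as follows: split the training set by the original amino acid at the nsSNP site and fit one SVM per subset. The features, labels, and subset-size cutoff below are synthetic placeholders, not the published nsSNP data or parameters.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(2)
        n = 600
        X = rng.normal(size=(n, 10))                     # hypothetical nsSNP features
        y = rng.integers(0, 2, size=n)                   # disease-associated vs. neutral
        residues = rng.choice(list("ACDEFGHIKLMNPQRSTVWY"), size=n)

        # Train one classifier per original-residue subset instead of one global model.
        scores = []
        for aa in np.unique(residues):
            mask = residues == aa
            if mask.sum() < 20:                          # skip very small subsets
                continue
            acc = cross_val_score(SVC(kernel="rbf"), X[mask], y[mask], cv=3).mean()
            scores.append(acc)
        print("mean per-residue CV accuracy:", round(float(np.mean(scores)), 3))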

  15. Explaining Variation in How Classroom Communities Adapt the Practice of Scientific Argumentation

    ERIC Educational Resources Information Center

    Berland, Leema K.

    2011-01-01

    Research and practice has placed an increasing emphasis on aligning classroom practices with scientific practices such as scientific argumentation. In this paper, I explore 1 challenge associated with this goal by examining how existing classroom practices influence students' engagement in the practice of scientific argumentation. To do so, I…

  16. Spreading Chaos: The Role of Popularizations in the Diffusion of Scientific Ideas

    ERIC Educational Resources Information Center

    Paul, Danette

    2004-01-01

    Scientific popularizations are generally considered translations (often dubious ones) of scientific research for a lay audience. This study explores the role popularizations play within scientific discourse, specifically in the development of chaos theory. The methods included a review of the popular and the semipopular books on chaos theory from…

  17. The Harvard organic photovoltaic dataset

    DOE PAGES

    Lopez, Steven A.; Pyzer-Knapp, Edward O.; Simm, Gregor N.; ...

    2016-09-27

    Presented in this work is the Harvard Organic Photovoltaic Dataset (HOPV15), a collation of experimental photovoltaic data from the literature, and corresponding quantum-chemical calculations performed over a range of conformers, each with quantum chemical results using a variety of density functionals and basis sets. It is anticipated that this dataset will be of use in both relating electronic structure calculations to experimental observations through the generation of calibration schemes, as well as for the creation of new semi-empirical methods and the benchmarking of current and future model chemistries for organic electronic applications.

  18. The Harvard organic photovoltaic dataset.

    PubMed

    Lopez, Steven A; Pyzer-Knapp, Edward O; Simm, Gregor N; Lutzow, Trevor; Li, Kewei; Seress, Laszlo R; Hachmann, Johannes; Aspuru-Guzik, Alán

    2016-09-27

    The Harvard Organic Photovoltaic Dataset (HOPV15) presented in this work is a collation of experimental photovoltaic data from the literature, and corresponding quantum-chemical calculations performed over a range of conformers, each with quantum chemical results using a variety of density functionals and basis sets. It is anticipated that this dataset will be of use in both relating electronic structure calculations to experimental observations through the generation of calibration schemes, as well as for the creation of new semi-empirical methods and the benchmarking of current and future model chemistries for organic electronic applications.

  19. Scientific Visualization Tools for Enhancement of Undergraduate Research

    NASA Astrophysics Data System (ADS)

    Rodriguez, W. J.; Chaudhury, S. R.

    2001-05-01

    Undergraduate research projects that utilize remote sensing satellite instrument data to investigate atmospheric phenomena pose many challenges. A significant challenge is processing large amounts of multi-dimensional data. Remote sensing data initially require mining; filtering of undesirable spectral, instrumental, or environmental features; and subsequent sorting and reformatting into files for easy and quick access. The data must then be transformed according to the needs of the investigation(s) and displayed for interpretation. These multidimensional datasets require views that can range from two-dimensional plots to multivariable-multidimensional scientific visualizations with animations. Science undergraduate students generally find these data processing tasks daunting. Generally, researchers are required to fully understand the intricacies of the dataset and write computer programs or rely on commercially available software, which may not be trivial to use. In the time that undergraduate researchers have available for their research projects, learning the data formats, programming languages, and/or visualization packages is impractical. When dealing with large multi-dimensional datasets, appropriate scientific visualization tools are imperative in allowing students to have a meaningful and pleasant research experience, while producing valuable scientific research results. The BEST Lab at Norfolk State University has been creating tools for multivariable-multidimensional analysis of Earth Science data. EzSAGE and SAGE4D have been developed to sort, analyze and visualize SAGE II (Stratospheric Aerosol and Gas Experiment) data with ease. Three- and four-dimensional visualizations in interactive environments can be produced. EzSAGE provides atmospheric slices in three dimensions where the researcher can change the scales in the three dimensions, color tables and degree of smoothing interactively to focus on particular phenomena. SAGE4D provides a navigable

  20. NASA Scientific Balloon in Antarctica

    NASA Image and Video Library

    2017-12-08

    NASA image captured December 25, 2011. A NASA scientific balloon awaits launch in McMurdo, Antarctica. The balloon, carrying Indiana University's Cosmic Ray Electron Synchrotron Telescope (CREST), was launched on December 25. After a circum-navigational flight around the South Pole, the payload landed on January 5. The CREST payload is one of two scheduled as part of this season's annual NASA Antarctic balloon campaign, which is conducted in cooperation with the National Science Foundation's Office of Polar Programs. The campaign's second payload is the University of Arizona's Stratospheric Terahertz Observatory (STO). You can follow the flights at the Columbia Scientific Balloon Facility's web site at www.csbf.nasa.gov/antarctica/ice.htm Credit: NASA

  1. Scientific Assistant Virtual Laboratory (SAVL)

    NASA Astrophysics Data System (ADS)

    Alaghband, Gita; Fardi, Hamid; Gnabasik, David

    2007-03-01

    The Scientific Assistant Virtual Laboratory (SAVL) is a scientific discovery environment, an interactive simulated virtual laboratory, for learning physics and mathematics. The purpose of this computer-assisted intervention is to improve middle and high school student interest, insight and scores in physics and mathematics. SAVL develops scientific and mathematical imagination in a visual, symbolic, and experimental simulation environment. It directly addresses the issues of scientific and technological competency by providing critical thinking training through integrated modules. This on-going research provides a virtual laboratory environment in which the student directs the building of the experiment rather than observing a packaged simulation. SAVL: * Engages the persistent interest of young minds in physics and math by visually linking simulation objects and events with mathematical relations. * Teaches integrated concepts by the hands-on exploration and focused visualization of classic physics experiments within software. * Systematically and uniformly assesses and scores students by their ability to answer their own questions within the context of a Master Question Network. We will demonstrate how the Master Question Network uses polymorphic interfaces and C# lambda expressions to manage simulation objects.

  2. Graphical Visualization of Human Exploration Capabilities

    NASA Technical Reports Server (NTRS)

    Rodgers, Erica M.; Williams-Byrd, Julie; Arney, Dale C.; Simon, Matthew A.; Williams, Phillip A.; Barsoum, Christopher; Cowan, Tyler; Larman, Kevin T.; Hay, Jason; Burg, Alex

    2016-01-01

    NASA's pioneering space strategy will require advanced capabilities to expand the boundaries of human exploration on the Journey to Mars (J2M). The Evolvable Mars Campaign (EMC) architecture serves as a framework to identify critical capabilities that need to be developed and tested in order to enable a range of human exploration destinations and missions. Agency-wide System Maturation Teams (SMT) are responsible for the maturation of these critical exploration capabilities and help formulate, guide and resolve performance gaps associated with the EMC-identified capabilities. Systems Capability Organization Reporting Engine boards (SCOREboards) were developed to integrate the SMT data sets into cohesive human exploration capability stories that can be used to promote dialog and communicate NASA's exploration investments. Each SCOREboard provides a graphical visualization of SMT capability development needs that enable exploration missions, and presents a comprehensive overview of data that outlines a roadmap of system maturation needs critical for the J2M. SCOREboards are generated by a computer program that extracts data from a main repository, sorts the data based on a tiered data reduction structure, and then plots the data according to specified user inputs. The ability to sort and plot varying data categories provides the flexibility to present specific SCOREboard capability roadmaps based on customer requests. This paper presents the development of the SCOREboard computer program and shows multiple complementary, yet different datasets through a unified format designed to facilitate comparison between datasets. Example SCOREboard capability roadmaps are presented followed by a discussion of how the roadmaps are used to: 1) communicate capability developments and readiness of systems for future missions, and 2) influence the definition of NASA's human exploration investment portfolio through capability-driven processes. The paper concludes with a description

  3. A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay

    PubMed Central

    Bray, Mark-Anthony; Gustafsdottir, Sigrun M; Rohban, Mohammad H; Singh, Shantanu; Ljosa, Vebjorn; Sokolnicki, Katherine L; Bittker, Joshua A; Bodycombe, Nicole E; Dančík, Vlado; Hasaka, Thomas P; Hon, Cindy S; Kemp, Melissa M; Li, Kejie; Walpita, Deepika; Wawer, Mathias J; Golub, Todd R; Schreiber, Stuart L; Clemons, Paul A; Shamji, Alykhan F

    2017-01-01

    Abstract Background Large-scale image sets acquired by automated microscopy of perturbed samples enable a detailed comparison of cell states induced by each perturbation, such as a small molecule from a diverse library. Highly multiplexed measurements of cellular morphology can be extracted from each image and subsequently mined for a number of applications. Findings This microscopy dataset includes 919 265 five-channel fields of view, representing 30 616 tested compounds, available at “The Cell Image Library” (CIL) repository. It also includes data files containing morphological features derived from each cell in each image, both at the single-cell level and population-averaged (i.e., per-well) level; the image analysis workflows that generated the morphological features are also provided. Quality-control metrics are provided as metadata, indicating fields of view that are out-of-focus or containing highly fluorescent material or debris. Lastly, chemical annotations are supplied for the compound treatments applied. Conclusions Because computational algorithms and methods for handling single-cell morphological measurements are not yet routine, the dataset serves as a useful resource for the wider scientific community applying morphological (image-based) profiling. The dataset can be mined for many purposes, including small-molecule library enrichment and chemical mechanism-of-action studies, such as target identification. Integration with genetically perturbed datasets could enable identification of small-molecule mimetics of particular disease- or gene-related phenotypes that could be useful as probes or potential starting points for development of future therapeutics. PMID:28327978
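
    The population-averaged (per-well) profiles mentioned in the record above are essentially a group-by aggregation over single-cell features. The tiny pandas sketch below shows that step on made-up feature columns; the column names and values are hypothetical, not those of the released data files.

        import pandas as pd

        # Hypothetical single-cell morphological features (stand-ins for the real files).
        cells = pd.DataFrame({
            "plate": ["P1"] * 6,
            "well": ["A01", "A01", "A01", "B02", "B02", "B02"],
            "compound": ["cpd_1"] * 3 + ["cpd_2"] * 3,
            "nucleus_area": [212.0, 198.5, 205.1, 250.3, 247.9, 260.4],
            "cell_intensity": [0.41, 0.39, 0.44, 0.52, 0.49, 0.55],
        })

        # Population-averaged (per-well) profiles from the single-cell table.
        profiles = (cells
                    .groupby(["plate", "well", "compound"], as_index=False)
                    .mean(numeric_only=True))
        print(profiles)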

  4. Teaching Scientific Reasoning to Liberal Arts Students

    NASA Astrophysics Data System (ADS)

    Rubbo, Louis

    2014-03-01

    University courses in conceptual physics and astronomy typically serve as the terminal science experience for the liberal arts student. Within this population, significant content knowledge gains can be achieved by utilizing research-verified pedagogical methods. However, from the standpoint of the university, students are expected to complete these courses not necessarily for the content knowledge but instead for the development of scientific reasoning skills. Results from physics education studies indicate that unless scientific reasoning instruction is made explicit, students do not progress in their reasoning abilities. How do we complement the successful content-based pedagogical methods with instruction that explicitly focuses on the development of scientific reasoning skills? This talk will explore methodologies that actively engage non-science students with the explicit intent of fostering their scientific reasoning abilities.

  5. Workshop on Science and the Human Exploration of Mars

    NASA Technical Reports Server (NTRS)

    Duke, M. B. (Editor)

    2001-01-01

    The exploration of Mars will be a multi-decadal activity. Currently, a scientific program is underway, sponsored by NASA's Office of Space Science in the United States, in collaboration with international partners France, Italy, and the European Space Agency. Plans exist for the continuation of this robotic program through the first automated return of Martian samples in 2014. Mars is also a prime long-term objective for human exploration, and within NASA, efforts are being made to provide the best integration of the robotic program and future human exploration missions. From the perspective of human exploration missions, it is important to understand the scientific objectives of human missions, in order to design the appropriate systems, tools, and operational capabilities to maximize science on those missions. In addition, data from the robotic missions can provide critical environmental data - surface morphology, materials composition, evaluations of potential toxicity of surface materials, radiation, electrical and other physical properties of the Martian environment, and assessments of the probability that humans would encounter Martian life forms. Understanding of the data needs can lead to the definition of experiments that can be done in the near-term that will make the design of human missions more effective. This workshop was convened to begin a dialog between the scientific community that is central to the robotic exploration mission program and a set of experts in systems and technologies that are critical to human exploration missions. The charge to the workshop was to develop an understanding of the types of scientific exploration that would be best suited to the human exploration missions and the capabilities and limitations of human explorers in undertaking science on those missions.

  6. Exploring Scientific, Artistic, Moral and Technical Reflection in Teacher Action Research

    ERIC Educational Resources Information Center

    Luttenberg, Johan; Oolbekkink-Marchand, Helma; Meijer, Paulien

    2018-01-01

    Reflection in action research is a complicated matter because of the many domains of reflection and most significantly, the lack of understanding of these domains of reflection in action research and how these are supported. In this paper, we propose a framework based on four domains of reflection, namely, scientific, artistic, moral and technical…

  7. Publishing descriptions of non-public clinical datasets: proposed guidance for researchers, repositories, editors and funding organisations.

    PubMed

    Hrynaszkiewicz, Iain; Khodiyar, Varsha; Hufton, Andrew L; Sansone, Susanna-Assunta

    2016-01-01

    Sharing of experimental clinical research data usually happens between individuals or research groups rather than via public repositories, in part due to the need to protect research participant privacy. This approach to data sharing makes it difficult to connect journal articles with their underlying datasets and is often insufficient for ensuring access to data in the long term. Voluntary data sharing services such as the Yale Open Data Access (YODA) and Clinical Study Data Request (CSDR) projects have increased accessibility to clinical datasets for secondary uses while protecting patient privacy and the legitimacy of secondary analyses but these resources are generally disconnected from journal articles-where researchers typically search for reliable information to inform future research. New scholarly journal and article types dedicated to increasing accessibility of research data have emerged in recent years and, in general, journals are developing stronger links with data repositories. There is a need for increased collaboration between journals, data repositories, researchers, funders, and voluntary data sharing services to increase the visibility and reliability of clinical research. Using the journal Scientific Data as a case study, we propose and show examples of changes to the format and peer-review process for journal articles to more robustly link them to data that are only available on request. We also propose additional features for data repositories to better accommodate non-public clinical datasets, including Data Use Agreements (DUAs).

  8. Internationally coordinated glacier monitoring: strategy and datasets

    NASA Astrophysics Data System (ADS)

    Hoelzle, Martin; Armstrong, Richard; Fetterer, Florence; Gärtner-Roer, Isabelle; Haeberli, Wilfried; Kääb, Andreas; Kargel, Jeff; Nussbaumer, Samuel; Paul, Frank; Raup, Bruce; Zemp, Michael

    2014-05-01

    Internationally coordinated monitoring of long-term glacier changes provides key indicator data about global climate change and began in 1894 as an internationally coordinated effort to establish standardized observations. Today, world-wide monitoring of glaciers and ice caps is embedded within the Global Climate Observing System (GCOS) in support of the United Nations Framework Convention on Climate Change (UNFCCC) as an important Essential Climate Variable (ECV). The Global Terrestrial Network for Glaciers (GTN-G) was established in 1999 with the task of coordinating measurements and ensuring the continuous development and adaptation of the international strategies to the long-term needs of users in science and policy. The basic monitoring principles must be relevant, feasible, comprehensive and understandable to a wider scientific community as well as to policy makers and the general public. Data access has to be free and unrestricted, the quality of the standardized and calibrated data must be high, and a combination of detailed process studies at selected field sites with global coverage by satellite remote sensing is envisaged. Recently, a GTN-G Steering Committee was established to guide and advise the operational bodies responsible for international glacier monitoring, which are the World Glacier Monitoring Service (WGMS), the US National Snow and Ice Data Center (NSIDC), and the Global Land Ice Measurements from Space (GLIMS) initiative. Several online databases containing a wealth of diverse data types having different levels of detail and global coverage provide fast access to continuously updated information on glacier fluctuation and inventory data. For world-wide inventories, data are now available through (a) the World Glacier Inventory containing tabular information on about 130,000 glaciers covering an area of around 240,000 km2, (b) the GLIMS-database containing digital outlines of around 118,000 glaciers with different time stamps and

  9. Does using different modern climate datasets impact pollen-based paleoclimate reconstructions in North America during the past 2,000 years

    NASA Astrophysics Data System (ADS)

    Ladd, Matthew; Viau, Andre

    2013-04-01

    Paleoclimate reconstructions rely on the accuracy of modern climate datasets for calibration of fossil records under the assumption of climate normality through time, which means that the modern climate operates in a similar manner as over the past 2,000 years. In this study, we show how using different modern climate datasets has an impact on a pollen-based reconstruction of mean temperature of the warmest month (MTWA) during the past 2,000 years for North America. The modern climate datasets used to explore this research question include the Whitmore et al. (2005) modern climate dataset; North American Regional Reanalysis (NARR); National Centers for Environmental Prediction (NCEP); European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-40 reanalysis; WorldClim; Global Historical Climate Network (GHCN); and New et al., which is derived from the CRU dataset. Results show that some caution is advised in using the reanalysis data on large-scale reconstructions. Station data appears to dampen out the variability of the reconstruction produced using station-based datasets. The reanalysis or model-based datasets are not recommended for large-scale North American paleoclimate reconstructions as they appear to lack some of the dynamics observed in station datasets (CRU), which resulted in warm-biased reconstructions as compared to the station-based reconstructions. The Whitmore et al. (2005) modern climate dataset appears to be a compromise between CRU-based datasets and model-based datasets except for the ERA-40. In addition, an ultra-high resolution gridded climate dataset such as WorldClim may only be useful if the pollen calibration sites in North America have at least the same spatial precision. We reconstruct the MTWA to within +/-0.01°C by using an average of all curves derived from the different modern climate datasets, demonstrating the robustness of the procedure used. It may be that the use of an average of different modern datasets may reduce the

  10. The next scientific revolution.

    PubMed

    Hey, Tony

    2010-11-01

    For decades, computer scientists have tried to teach computers to think like human experts. Until recently, most of those efforts have failed to come close to generating the creative insights and solutions that seem to come naturally to the best researchers, doctors, and engineers. But now, Tony Hey, a VP of Microsoft Research, says we're witnessing the dawn of a new generation of powerful computer tools that can "mash up" vast quantities of data from many sources, analyze them, and help produce revolutionary scientific discoveries. Hey and his colleagues call this new method of scientific exploration "machine learning." At Microsoft, a team has already used it to develop a method that predicts with impressive accuracy whether a patient with congestive heart failure who is released from the hospital will be readmitted within 30 days. It was developed by directing a computer program to pore through hundreds of thousands of data points on 300,000 patients and "learn" the profiles of patients most likely to be rehospitalized. The economic impact of this prediction tool could be huge: If a hospital understands the likelihood that a patient will "bounce back," it can design programs to keep him stable and save thousands of dollars in health care costs. Similar efforts to uncover important correlations that could lead to scientific breakthroughs are under way in oceanography, conservation, and AIDS research. And in business, deep data exploration has the potential to unearth critical insights about customers, supply chains, advertising effectiveness, and more.

  11. TopoLens: Building a cyberGIS community data service for enhancing the usability of high-resolution National Topographic datasets

    USGS Publications Warehouse

    Hu, Hao; Hong, Xingchen; Terstriep, Jeff; Liu, Yan; Finn, Michael P.; Rush, Johnathan; Wendel, Jeffrey; Wang, Shaowen

    2016-01-01

    Geospatial data, often embedded with geographic references, are important to many application and science domains, and represent a major type of big data. The increased volume and diversity of geospatial data have caused serious usability issues for researchers in various scientific domains, which call for innovative cyberGIS solutions. To address these issues, this paper describes a cyberGIS community data service framework to facilitate geospatial big data access, processing, and sharing based on a hybrid supercomputer architecture. Through collaboration between the CyberGIS Center at the University of Illinois at Urbana-Champaign (UIUC) and the U.S. Geological Survey (USGS), a community data service for accessing, customizing, and sharing digital elevation model (DEM) data and derived datasets from the 10-meter national elevation dataset, named TopoLens, was created to demonstrate the workflow integration of geospatial big data sources, the computation and analysis needed to customize the original dataset for end-user needs, and a friendly online user environment. TopoLens provides online access to precomputed and on-demand computed high-resolution elevation data by exploiting the ROGER supercomputer. The usability of this prototype service has been acknowledged in community evaluation.

  12. FoodMicrobionet: A database for the visualisation and exploration of food bacterial communities based on network analysis.

    PubMed

    Parente, Eugenio; Cocolin, Luca; De Filippis, Francesca; Zotta, Teresa; Ferrocino, Ilario; O'Sullivan, Orla; Neviani, Erasmo; De Angelis, Maria; Cotter, Paul D; Ercolini, Danilo

    2016-02-16

    Amplicon targeted high-throughput sequencing has become a popular tool for the culture-independent analysis of microbial communities. Although the data obtained with this approach are portable and the number of sequences available in public databases is increasing, no tool has been developed yet for the analysis and presentation of data obtained in different studies. This work describes an approach for the development of a database for the rapid exploration and analysis of data on food microbial communities. Data from seventeen studies investigating the structure of bacterial communities in dairy, meat, sourdough and fermented vegetable products, obtained by 16S rRNA gene targeted high-throughput sequencing, were collated and analysed using Gephi, a network analysis software. The resulting database, which we named FoodMicrobionet, was used to analyse nodes and network properties and to build an interactive web-based visualisation. The latter allows the visual exploration of the relationships between Operational Taxonomic Units (OTUs) and samples and the identification of core- and sample-specific bacterial communities. It also provides additional search tools and hyperlinks for the rapid selection of food groups and OTUs and for rapid access to external resources (NCBI taxonomy, digital versions of the original articles). Microbial interaction network analysis was carried out using CoNet on datasets extracted from FoodMicrobionet: the complexity of interaction networks was much lower than that found for other bacterial communities (human microbiome, soil and other environments). This may reflect both a bias in the dataset (which was dominated by fermented foods and starter cultures) and the lower complexity of food bacterial communities. Although some technical challenges exist, and are discussed here, the net result is a valuable tool for the exploration of food bacterial communities by the scientific community and food industry. Copyright © 2015. Published by
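
    To make the OTU-sample network idea concrete, the following is a minimal sketch of building such a bipartite network with networkx rather than Gephi (the tool actually used for FoodMicrobionet); the sample names, OTU names and abundances are hypothetical placeholders, and the real database is far larger.

    ```python
    # Minimal sketch (not the FoodMicrobionet pipeline itself): an OTU-sample
    # bipartite network built from an abundance table with networkx.
    # Sample/OTU names and abundances below are hypothetical placeholders.
    import networkx as nx

    # rows: samples, columns: OTUs, values: relative abundances (hypothetical)
    abundance = {
        "cheese_sample_1": {"Lactococcus": 0.62, "Lactobacillus": 0.30},
        "sourdough_sample_1": {"Lactobacillus": 0.55, "Acetobacter": 0.10},
    }

    G = nx.Graph()
    for sample, otus in abundance.items():
        G.add_node(sample, kind="sample")           # one node per sample
        for otu, weight in otus.items():
            G.add_node(otu, kind="OTU")             # one node per OTU
            G.add_edge(sample, otu, weight=weight)  # edge weight = abundance

    # OTUs shared by many samples ("core" taxa) have high degree
    core_otus = [n for n, d in G.nodes(data=True)
                 if d["kind"] == "OTU" and G.degree(n) > 1]
    print(core_otus)  # ['Lactobacillus']
    ```

    In an interactive visualisation of such a graph, node degree and edge weight are what distinguish core taxa from sample-specific ones.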

  13. Evolution and convergence of the patterns of international scientific collaboration.

    PubMed

    Coccia, Mario; Wang, Lili

    2016-02-23

    International research collaboration plays an important role in the social construction and evolution of science. Studies of science increasingly analyze international collaboration across multiple organizations for its impetus in improving research quality, advancing the efficiency of scientific production, and fostering breakthroughs in a shorter time. However, long-run patterns of international research collaboration across scientific fields and their structural changes over time are hardly known. Here we show the convergence of international scientific collaboration across research fields over time. Our study uses a dataset from the National Science Foundation and computes the fraction of papers that have international institutional coauthorships for various fields of science. We compare our results with pioneering studies carried out in the 1970s and 1990s by applying a standardization method that transforms all fractions of internationally coauthored papers into a comparable framework. We find, over 1973-2012, that the evolution of collaboration patterns across scientific disciplines seems to generate a convergence between applied and basic sciences. We also show that the general architecture of international scientific collaboration, based on the ranking of fractions of international coauthorships for different scientific fields per year, has tended to be unchanged over time, at least until now. Overall, this study shows, to our knowledge for the first time, the evolution of the patterns of international scientific collaboration starting from initial results described in the literature in the 1970s and 1990s. We find a convergence of these long-run collaboration patterns between the applied and basic sciences. This convergence might be one of the contributing factors supporting the evolution of modern scientific fields.

  14. Performative Intra-Action of a Paper Plane and a Child: Exploring Scientific Concepts as Agentic Playmates

    NASA Astrophysics Data System (ADS)

    Haus, Jana Maria

    2018-05-01

    This work uses new materialist perspectives (Barad 2007; Lenz Taguchi 2014; Rautio in Children's Geographies, 11(4), 394-408, 2013) to examine an exploration of concepts as agents and the question of how the intra-action of human and non-human bodies leads to the investigation of scientific concepts, relying on an article by de Freitas and Palmer (Cultural Studies of Science Education, 11(4), 1201-1222, 2016). Through an analysis of video stills of a case study, a focus on classroom assemblages shows how the intra-actions of human and non-human bodies (one 5-year-old boy, a piece of paper that becomes a paper plane and the concepts of force and flight) lead to an intertwining and intersecting of play, learning, and becoming. Video recordings were used to qualitatively analyze three questions, which emerged through and resulted from the intra-action of researcher and data. This paper aims to address a prevalent gap in the research literature on science learning from a materialist view. Findings of the analysis show that human and non-human bodies together become through and for one another to jointly and agentically intra-act in exploring and learning about science. Implications for learning and teaching science are that teachers could attempt to focus on setting up the learning environment differently, so that children have time and access to materials that matter to them and that, as "Hultman (2011) claims […] 'whisper, answer, demand and offer'" (Areljung forthcoming, p. 77) themselves to children in the learning and teaching environment.

  15. Application of Huang-Hilbert Transforms to Geophysical Datasets

    NASA Technical Reports Server (NTRS)

    Duffy, Dean G.

    2003-01-01

    The Huang-Hilbert transform is a promising new method for analyzing nonstationary and nonlinear datasets. In this talk I will apply this technique to several important geophysical datasets. To understand the strengths and weaknesses of this method, multi-year, hourly datasets of the sea level heights and solar radiation will be analyzed. Then we will apply this transform to the analysis of gravity waves observed in a mesoscale observational net.
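
    As a rough illustration of the Hilbert step of this transform, the sketch below computes instantaneous amplitude and frequency of a single oscillatory component with scipy; it assumes the empirical mode decomposition that normally precedes this step has already produced an intrinsic mode function, and the synthetic "hourly" series is purely illustrative.

    ```python
    # Sketch of the Hilbert step of the Hilbert-Huang transform, applied to one
    # assumed intrinsic mode function (IMF). The synthetic IMF stands in for a
    # real geophysical record such as hourly sea level heights.
    import numpy as np
    from scipy.signal import hilbert

    t = np.arange(0, 30 * 24) / 24.0        # 30 days of hourly samples, in days
    imf = np.sin(2 * np.pi * 2 * t) * (1 + 0.3 * np.sin(2 * np.pi * 0.1 * t))

    analytic = hilbert(imf)                           # analytic signal
    amplitude = np.abs(analytic)                      # instantaneous amplitude
    phase = np.unwrap(np.angle(analytic))             # instantaneous phase
    inst_freq = np.diff(phase) / (2 * np.pi) * 24.0   # cycles per day (dt = 1/24 day)

    print(amplitude.mean(), inst_freq.mean())         # ~1.0 and ~2 cycles/day
    ```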

  16. Exploring Crossing Differential Item Functioning by Gender in Mathematics Assessment

    ERIC Educational Resources Information Center

    Ong, Yoke Mooi; Williams, Julian; Lamprianou, Iasonas

    2015-01-01

    The purpose of this article is to explore crossing differential item functioning (DIF) in a test drawn from a national examination of mathematics for 11-year-old pupils in England. An empirical dataset was analyzed to explore DIF by gender in a mathematics assessment. A two-step process involving the logistic regression (LR) procedure for…

  17. The Harvard organic photovoltaic dataset

    PubMed Central

    Lopez, Steven A.; Pyzer-Knapp, Edward O.; Simm, Gregor N.; Lutzow, Trevor; Li, Kewei; Seress, Laszlo R.; Hachmann, Johannes; Aspuru-Guzik, Alán

    2016-01-01

    The Harvard Organic Photovoltaic Dataset (HOPV15) presented in this work is a collation of experimental photovoltaic data from the literature, and corresponding quantum-chemical calculations performed over a range of conformers, each with quantum chemical results using a variety of density functionals and basis sets. It is anticipated that this dataset will be of use both in relating electronic structure calculations to experimental observations through the generation of calibration schemes, and in the creation of new semi-empirical methods and the benchmarking of current and future model chemistries for organic electronic applications. PMID:27676312

  18. International Planning for Subglacial Lake Exploration

    NASA Astrophysics Data System (ADS)

    Kennicutt, M.; Priscu, J.

    2003-04-01

    As one of the last unexplored frontiers on our planet, subglacial lakes offer a unique and exciting venue for exploration and research. Over the past several years, subglacial lakes have captured the imagination of the scientific community and public, evoking images of potential exotic life forms surviving under some of the most extreme conditions on earth. Various planning activities have recognized that, due to the remote and harsh conditions, a successful subglacial lake exploration program will entail a concerted effort over a number of years. It will also require an international commitment of major financial and human resources. To begin a detailed planning process, the Scientific Committee on Antarctic Research (SCAR) convened the Subglacial Antarctic Lake Exploration Group of Specialists (SALEGOS) in Tokyo in 2000. The group was asked to build on previous workshops and meetings to develop a plan to explore subglacial lake environments. Its mandate adopted the guiding principles as agreed in Cambridge in 1999 that the program would be interdisciplinary in scope, be designed for minimum contamination and disturbance of the subglacial lake environment, have as a goal lake entry and sample retrieval, and that the ultimate target of the program should be Lake Vostok exploration. Since its formation, SALEGOS has met three times and addressed some of the more intractable issues related to subglacial lake exploration. Topics under discussion include the current state of knowledge of subglacial environments, technological needs, international management and organizational strategies, a portfolio of scientific projects, "clean" requirements, and logistical considerations. In this presentation, the activities of SALEGOS will be summarized and recommendations for an international subglacial lake exploration program discussed.

  19. Evaluating science return in space exploration initiative architectures

    NASA Technical Reports Server (NTRS)

    Budden, Nancy Ann; Spudis, Paul D.

    1993-01-01

    Science is an important aspect of the Space Exploration Initiative, a program to explore the Moon and Mars with people and machines. Different SEI mission architectures are evaluated on the basis of three variables: access (to the planet's surface), capability (including the number of crew, equipment, and supporting infrastructure), and time (the total number of man-hours available for scientific activities). This technique allows us to estimate the scientific return to be expected from different architectures and from different implementations of the same architecture. Our methodology allows us to maximize the scientific return from the initiative by illuminating the different emphases and returns that result from the alternative architectural decisions.

  20. Interpolation of diffusion weighted imaging datasets.

    PubMed

    Dyrby, Tim B; Lundell, Henrik; Burke, Mark W; Reislev, Nina L; Paulson, Olaf B; Ptito, Maurice; Siebner, Hartwig R

    2014-12-01

    Diffusion weighted imaging (DWI) is used to study white-matter fibre organisation, orientation and structural connectivity by means of fibre reconstruction algorithms and tractography. In clinical settings, limited scan time compromises the possibility of achieving both the high image resolution needed for finer anatomical details and the signal-to-noise ratio needed for reliable fibre reconstruction. We assessed the potential benefits of interpolating DWI datasets to a higher image resolution before fibre reconstruction using a diffusion tensor model. Simulations of straight and curved crossing tracts smaller than or equal to the voxel size showed that conventional higher-order interpolation methods improved the geometrical representation of white-matter tracts with reduced partial-volume-effect (PVE), except at tract boundaries. Simulations and interpolation of ex-vivo monkey brain DWI datasets revealed that conventional interpolation methods fail to disentangle fine anatomical details if PVE is too pronounced in the original data. For validation, we used ex-vivo DWI datasets acquired at various image resolutions as well as Nissl-stained sections. Increasing the image resolution by a factor of eight yielded finer geometrical resolution and more anatomical details in complex regions such as tract boundaries and cortical layers, which are normally only visualized at higher image resolutions. Similar results were found with a typical clinical human DWI dataset. However, a possible bias in quantitative values imposed by the interpolation method used should be considered. The results indicate that conventional interpolation methods can be successfully applied to DWI datasets for mining anatomical details that are normally seen only at higher resolutions, which will aid in tractography and microstructural mapping of tissue compartments. Copyright © 2014. Published by Elsevier Inc.
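
    A minimal sketch of the kind of conventional higher-order interpolation the study evaluates is shown below, using cubic B-spline upsampling from scipy; the array sizes, the upsampling factor and the random volume are illustrative assumptions, not the paper's acquisition parameters.

    ```python
    # Illustrative upsampling of a DWI volume to higher image resolution before
    # fibre reconstruction, using cubic B-spline interpolation (scipy.ndimage.zoom).
    # The random array stands in for real DWI data; shapes are hypothetical.
    import numpy as np
    from scipy.ndimage import zoom

    rng = np.random.default_rng(0)
    dwi = rng.random((64, 64, 36, 6))   # x, y, z, diffusion-weighted volumes

    # Double the resolution in each spatial dimension (8x more voxels),
    # interpolating each diffusion-weighted volume independently.
    upsampled = np.stack(
        [zoom(dwi[..., g], zoom=2, order=3) for g in range(dwi.shape[-1])],
        axis=-1,
    )
    print(dwi.shape, "->", upsampled.shape)   # (64, 64, 36, 6) -> (128, 128, 72, 6)
    ```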

  1. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets.

    PubMed

    Bicer, Tekin; Gürsoy, Doğa; Andrade, Vincent De; Kettimuthu, Rajkumar; Scullin, William; Carlo, Francesco De; Foster, Ian T

    2017-01-01

    Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used imaging techniques that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments result in rapid data generation, the analysis and reconstruction of the collected data may require hours or even days of computation time on a medium-sized workstation, which hinders the scientific progress that relies on the results of analysis. We present Trace, a data-intensive computing engine that we have developed to enable high-performance implementation of iterative tomographic reconstruction algorithms for parallel computers. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called a replicated reconstruction object to maximize application performance. We also present the optimizations that we apply to the replicated reconstruction objects and evaluate them using tomography datasets collected at the Advanced Photon Source. Our experimental evaluations show that our optimizations and parallelization techniques can provide a 158× speedup using 32 compute nodes (384 cores) over a single-core configuration and decrease the end-to-end processing time of a large sinogram (with 4501 × 1 × 22,400 dimensions) from 12.5 h to <5 min per iteration. The proposed tomographic reconstruction engine can efficiently process large-scale tomographic data using many compute nodes and minimize reconstruction times.
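
    For readers unfamiliar with what such engines parallelize, the toy sketch below shows one iterative reconstruction kernel (a SIRT update) on a small dense system; it is only an illustration of the class of algorithm, not Trace's actual projector, data layout or replicated reconstruction objects.

    ```python
    # Toy SIRT (simultaneous iterative reconstruction technique) update, shown
    # only to illustrate the kind of iterative kernel that engines like Trace
    # parallelize over threads and processes.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.random((200, 100))        # stand-in projection matrix (rays x voxels)
    x_true = rng.random(100)
    b = A @ x_true                    # simulated sinogram measurements

    R = 1.0 / A.sum(axis=1)           # inverse row sums
    C = 1.0 / A.sum(axis=0)           # inverse column sums

    x = np.zeros(100)
    for _ in range(200):              # SIRT: x <- x + C * A^T * R * (b - A x)
        x += C * (A.T @ (R * (b - A @ x)))

    print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))  # residual shrinks toward 0
    ```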

  2. GeoVision Exploration Task Force Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Doughty, Christine; Dobson, Patrick F.; Wall, Anna

    The GeoVision study effort included ground-breaking, detailed research on current and future market conditions and geothermal technologies in order to forecast and quantify the electric and non-electric deployment potentials under a range of scenarios, in addition to their impacts on the Nation’s jobs, economy and environment. Coordinated by the U.S. Department of Energy’s (DOE’s) Geothermal Technologies Office (GTO), the GeoVision study development relied on the collection, modeling, and analysis of robust datasets through seven national laboratory partners, which were organized into eight technical Task Force groups. The purpose of this report is to provide a central repository for the research conducted by the Exploration Task Force. The Exploration Task Force consists of five individuals representing four national laboratories: Patrick Dobson (task lead) and Christine Doughty of Lawrence Berkeley National Laboratory, Anna Wall of National Renewable Energy Laboratory, Travis McLing of Idaho National Laboratory, and Chester Weiss of Sandia National Laboratories. As part of the GeoVision analysis, our team conducted extensive scientific and financial analyses on a number of topics related to current and future geothermal exploration methods. The GeoVision Exploration Task Force complements the drilling and resource technology investigations conducted as part of the Reservoir Maintenance and Development Task Force. The Exploration Task Force, however, has focused primarily on early-stage R&D technologies in exploration and confirmation drilling, along with an evaluation of geothermal financing challenges and assumptions, and innovative “blue-sky” technologies. This research was used to develop geothermal resource supply curves (through the use of GETEM) for use in the ReEDS capacity expansion modeling that determines geothermal technology deployment potential. It also catalogues and explores the large array of early-stage R&D technologies with the

  3. EEG datasets for motor imagery brain-computer interface.

    PubMed

    Cho, Hohyun; Ahn, Minkyu; Ahn, Sangtae; Kwon, Moonyoung; Jun, Sung Chan

    2017-07-01

    Most investigators of brain-computer interface (BCI) research believe that BCI can be achieved through induced neuronal activity from the cortex, but not by evoked neuronal activity. Motor imagery (MI)-based BCI is one of the standard concepts of BCI, in that the user can generate induced activity by imagining motor movements. However, variations in performance over sessions and subjects are too severe to overcome easily; therefore, a basic understanding and investigation of BCI performance variation is necessary to find critical evidence for its sources. Here we present not only EEG datasets for MI BCI from 52 subjects, but also the results of a psychological and physiological questionnaire, EMG datasets, the locations of 3D EEG electrodes, and EEGs for non-task-related states. We validated our EEG datasets by using the percentage of bad trials, event-related desynchronization/synchronization (ERD/ERS) analysis, and classification analysis. After conventional rejection of bad trials, we showed contralateral ERD and ipsilateral ERS in the somatosensory area, which are well-known patterns of MI. Finally, we showed that 73.08% of datasets (38 subjects) included reasonably discriminative information. Our EEG datasets included the information necessary to determine statistical significance; they consisted of well-discriminated datasets (38 subjects) and less-discriminative datasets. These may provide researchers with opportunities to investigate human factors related to MI BCI performance variation, and may also enable subject-to-subject transfer by using metadata, including a questionnaire, EEG coordinates, and EEGs for non-task-related states. © The Authors 2017. Published by Oxford University Press.
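
    To make the ERD/ERS validation idea concrete, the sketch below estimates event-related desynchronization as a percentage change in mu-band power relative to a pre-cue baseline for one channel; the sampling rate, window lengths, band edges and synthetic epoch are illustrative assumptions rather than the dataset's exact processing.

    ```python
    # Sketch of an ERD estimate for one EEG channel: percentage change of
    # mu-band (8-12 Hz) power during motor imagery relative to a pre-cue
    # baseline. Parameters and the random epoch are illustrative only.
    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 512                                   # assumed sampling rate (Hz)
    rng = np.random.default_rng(2)
    epoch = rng.standard_normal(7 * fs)        # one 7 s trial: 3 s baseline + 4 s imagery

    b, a = butter(4, [8 / (fs / 2), 12 / (fs / 2)], btype="band")
    mu = filtfilt(b, a, epoch)                 # mu-band filtered signal
    power = mu ** 2

    baseline = power[: 3 * fs].mean()          # pre-cue reference power
    task = power[3 * fs :].mean()              # power during imagery
    erd_percent = (task - baseline) / baseline * 100.0
    print(f"ERD: {erd_percent:.1f} %")         # with real MI data, negative values
                                               # indicate contralateral desynchronization
    ```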

  4. Cross-Dataset Analysis and Visualization Driven by Expressive Web Services

    NASA Astrophysics Data System (ADS)

    Alexandru Dumitru, Mircea; Catalin Merticariu, Vlad

    2015-04-01

    The deluge of data that is hitting us every day from satellite and airborne sensors is changing the workflow of environmental data analysts and modelers. Web geo-services now play a fundamental role: data no longer need to be downloaded and stored in advance, but are instead accessed in real time by GIS applications. Due to the very large amount of data that is curated and made available by web services, it is crucial to deploy smart solutions for optimizing network bandwidth, reducing duplication of data and moving the processing closer to the data. In this context we have created a visualization application for analysis and cross-comparison of aerosol optical thickness datasets. The application aims to help researchers identify and visualize discrepancies between datasets coming from various sources and having different spatial and time resolutions. It also acts as a proof of concept for the integration of OGC Web Services under a user-friendly interface that provides beautiful visualizations of the explored data. The tool was built on top of the World Wind engine, a Java-based virtual globe built by NASA and the open source community. For data retrieval and processing, we exploited the potential of the OGC Web Coverage Service (WCS), the most exciting aspect being its processing extension, the OGC Web Coverage Processing Service (WCPS) standard. A WCPS-compliant service allows a client to execute a processing query on any coverage offered by the server. By exploiting a full grammar, several different kinds of information can be retrieved from one or more datasets together: scalar condensers, cross-sectional profiles, comparison maps and plots, etc. This combination of technology made the application versatile and portable. As the processing is done on the server side, a minimal amount of data is transferred and the computation runs on a fully capable server, leaving the client hardware resources to be used for rendering the visualization
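
    As a rough illustration of the WCPS idea of pushing the processing to the server, the sketch below submits a query that differences two aerosol optical thickness coverages and returns the result as CSV; the endpoint URL, coverage names and axis labels are hypothetical placeholders, and only the general query shape (for ... in (...) return encode(..., format)) is taken from the WCPS grammar.

    ```python
    # Illustrative WCPS request comparing two coverages on the server side.
    # Endpoint, coverage names and axis labels are hypothetical placeholders.
    import requests

    wcps_query = """
    for $terra in (AOT_TERRA_MONTHLY),
        $aqua  in (AOT_AQUA_MONTHLY)
    return encode(
        $terra[ansi("2014-01":"2014-12")] - $aqua[ansi("2014-01":"2014-12")],
        "text/csv")
    """

    resp = requests.get(
        "https://example.org/rasdaman/ows",          # hypothetical WCPS endpoint
        params={"service": "WCS", "version": "2.0.1",
                "request": "ProcessCoverages", "query": wcps_query},
        timeout=60,
    )
    print(resp.status_code, resp.text[:200])         # difference values as CSV
    ```

    Only the small CSV result crosses the network; the subsetting and subtraction stay on the server, which is the design point the abstract emphasises.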

  5. Preparing Precipitation Data Access, Value-added Services and Scientific Exploration Tools for the Integrated Multi-satellitE Retrievals for GPM (IMERG)

    NASA Astrophysics Data System (ADS)

    Ostrenga, D.; Liu, Z.; Kempler, S. J.; Vollmer, B.; Teng, W. L.

    2013-12-01

    The Precipitation Data and Information Services Center (PDISC) (http://disc.gsfc.nasa.gov/precipitation or google: NASA PDISC), located at the NASA Goddard Space Flight Center (GSFC) Earth Sciences (GES) Data and Information Services Center (DISC), is home to the Tropical Rainfall Measuring Mission (TRMM) data archive. For over 15 years, the GES DISC has served not only TRMM data, but also other space-based, airborne, field-campaign and ground-based precipitation data products to the precipitation community and other disciplinary communities as well. The TRMM Multi-Satellite Precipitation Analysis (TMPA) products are the most popular products in the TRMM product family in terms of data download and access through Mirador, the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni) and other services. The next generation of TMPA, the Integrated Multi-satellitE Retrievals for GPM (IMERG), to be released in 2014 after the launch of GPM, will be significantly improved in terms of spatial and temporal resolution. To better serve the user community, we are preparing data services; examples are listed below. To enable scientific exploration of Earth science data products without going through complicated and often time-consuming processes, such as data downloading and processing, the GES DISC has developed Giovanni in consultation with members of the user community, who requested quick search, subset, analysis and display capabilities for their specific data of interest. For example, the TRMM Online Visualization and Analysis System (TOVAS, http://disc2.nascom.nasa.gov/Giovanni/tovas/) has proven extremely popular, especially as additional datasets have been added upon request. Giovanni will continue to evolve to accommodate GPM data and the multi-sensor data inter-comparisons that will be sure to follow. Additional PDISC tool and service capabilities being adapted for GPM data include: an online PDISC Portal (includes user guide, etc

  6. Lighting the Way through Scientific Discourse

    ERIC Educational Resources Information Center

    Yang, Li-hsuan

    2008-01-01

    This article describes a thought-provoking lesson that compares various arrangements of lamp-battery circuits to help students develop the motivation and competence to participate in scientific discourse for knowledge construction. Through experimentation and discourse, students explore concepts about voltage, current, resistance, and Ohm's law.…

  7. Answering the right question - integration of InSAR with other datasets

    NASA Astrophysics Data System (ADS)

    Holley, Rachel; McCormack, Harry; Burren, Richard

    2014-05-01

    The capabilities of satellite Interferometric Synthetic Aperture Radar (InSAR) are well known, and utilized across a wide range of academic and commercial applications. However, there is a tendency, particularly in commercial applications, for users to ask 'What can we study with InSAR?'. When establishing a new technique this approach is important, but InSAR has been possible for 20 years now and, even accounting for new and innovative algorithms, this ground has been thoroughly explored. Too many studies conclude 'We show the ground is moving here, by this much', and mention the wider context as an afterthought. The focus needs to shift towards first asking the right questions, in fields as diverse as hazard awareness, resource optimization, financial considerations and pure scientific enquiry, and then working out how to achieve the best possible answers. Depending on the question, InSAR (and ground deformation more generally) may provide a large or small contribution to the overall solution, and there are usually benefits to integrating a number of techniques to capitalize on the complementary capabilities and provide the most useful measurements. However, there is still a gap between measurements and answers, and unlocking the value of the data relies heavily on appropriate visualization, integrated analysis, communication between technique and application experts, and appropriate use of modelling. We present a number of application examples, and demonstrate how their usefulness can be transformed by moving from a focus on data to answers: integrating complementary geodetic, geophysical and geological datasets and geophysical modelling with appropriate visualization to enable comprehensive, solution-focused interpretation. We will also discuss how forthcoming developments are likely to further advance realisation of the full potential that satellite InSAR holds.

  8. A high-resolution European dataset for hydrologic modeling

    NASA Astrophysics Data System (ADS)

    Ntegeka, Victor; Salamon, Peter; Gomes, Goncalo; Sint, Hadewij; Lorini, Valerio; Thielen, Jutta

    2013-04-01

    There is an increasing demand for large-scale hydrological models, not only in the field of modeling the impact of climate change on water resources but also for disaster risk assessments and flood or drought early warning systems. These large-scale models need to be calibrated and verified against large amounts of observations in order to judge their capability to predict the future. However, the creation of large-scale datasets is challenging, for it requires the collection, harmonization, and quality checking of large amounts of observations. For this reason, only a limited number of such datasets exist. In this work, we present a pan-European, high-resolution gridded dataset of meteorological observations (EFAS-Meteo), which was designed to drive a large-scale hydrological model. Similar European and global gridded datasets already exist, such as the HadGHCND (Caesar et al., 2006), the JRC MARS-STAT database (van der Goot and Orlandi, 2003) and the E-OBS gridded dataset (Haylock et al., 2008). However, none of those provide similarly high spatial resolution and/or a complete set of variables to force a hydrologic model. EFAS-Meteo contains daily maps of precipitation, surface temperature (mean, minimum and maximum), wind speed and vapour pressure at a spatial grid resolution of 5 x 5 km for the time period 1 January 1990 - 31 December 2011. It furthermore contains radiation, calculated using a staggered approach depending on the availability of sunshine duration, cloud cover and minimum and maximum temperature, and evapotranspiration (potential evapotranspiration, bare soil and open water evapotranspiration). The potential evapotranspiration was calculated using the Penman-Monteith equation with the above-mentioned meteorological variables. The dataset was created as part of the development of the European Flood Awareness System (EFAS) and has been continuously updated over recent years. The dataset variables are used as
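
    For reference, the widely used FAO-56 form of the Penman-Monteith reference evapotranspiration equation is given below; the EFAS-Meteo processing may use a variant of this formulation.

    ```latex
    % FAO-56 Penman-Monteith reference evapotranspiration (mm day^-1); shown as
    % the standard form, which may differ in detail from the EFAS-Meteo variant.
    ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}
                {\Delta + \gamma\,(1 + 0.34\,u_2)}
    ```

    Here Δ is the slope of the saturation vapour-pressure curve, R_n the net radiation, G the soil heat flux, γ the psychrometric constant, T the mean air temperature, u_2 the wind speed at 2 m height, and e_s − e_a the vapour-pressure deficit.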

  9. Geographic information system datasets of regolith-thickness data, regolith-thickness contours, raster-based regolith thickness, and aquifer-test and specific-capacity data for the Lost Creek Designated Ground Water Basin, Weld, Adams, and Arapahoe Counties, Colorado

    USGS Publications Warehouse

    Arnold, L. Rick

    2010-01-01

    These datasets were compiled in support of U.S. Geological Survey Scientific-Investigations Report 2010-5082-Hydrogeology and Steady-State Numerical Simulation of Groundwater Flow in the Lost Creek Designated Ground Water Basin, Weld, Adams, and Arapahoe Counties, Colorado. The datasets were developed by the U.S. Geological Survey in cooperation with the Lost Creek Ground Water Management District and the Colorado Geological Survey. The four datasets are described as follows and methods used to develop the datasets are further described in Scientific-Investigations Report 2010-5082: (1) ds507_regolith_data: This point dataset contains geologic information concerning regolith (unconsolidated sediment) thickness and top-of-bedrock altitude at selected well and test-hole locations in and near the Lost Creek Designated Ground Water Basin, Weld, Adams, and Arapahoe Counties, Colorado. Data were compiled from published reports, consultant reports, and from lithologic logs of wells and test holes on file with the U.S. Geological Survey Colorado Water Science Center and the Colorado Division of Water Resources. (2) ds507_regthick_contours: This dataset consists of contours showing generalized lines of equal regolith thickness overlying bedrock in the Lost Creek Designated Ground Water Basin, Weld, Adams, and Arapahoe Counties, Colorado. Regolith thickness was contoured manually on the basis of information provided in the dataset ds507_regolith_data. (3) ds507_regthick_grid: This dataset consists of raster-based generalized thickness of regolith overlying bedrock in the Lost Creek Designated Ground Water Basin, Weld, Adams, and Arapahoe Counties, Colorado. Regolith thickness in this dataset was derived from contours presented in the dataset ds507_regthick_contours. (4) ds507_welltest_data: This point dataset contains estimates of aquifer transmissivity and hydraulic conductivity at selected well locations in the Lost Creek Designated Ground Water Basin, Weld, Adams, and

  10. Exploring Evolving Media Discourse Through Event Cueing.

    PubMed

    Lu, Yafeng; Steptoe, Michael; Burke, Sarah; Wang, Hong; Tsai, Jiun-Yi; Davulcu, Hasan; Montgomery, Douglas; Corman, Steven R; Maciejewski, Ross

    2016-01-01

    Online news, microblogs and other media documents all contain valuable insight regarding events and responses to events. Underlying these documents is the concept of framing, a process in which communicators act (consciously or unconsciously) to construct a point of view that encourages facts to be interpreted by others in a particular manner. As media discourse evolves, how topics and documents are framed can undergo change, shifting the discussion to different viewpoints or rhetoric. What causes these shifts can be difficult to determine directly; however, by linking secondary datasets and enabling visual exploration, we can enhance the hypothesis generation process. In this paper, we present a visual analytics framework for event cueing using media data. As discourse develops over time, our framework applies a time series intervention model that tests whether the level of framing differs before and after a given date. If the model indicates that the times before and after are statistically significantly different, this cues an analyst to explore related datasets to help enhance their understanding of what (if any) events may have triggered these changes in discourse. Our framework consists of entity extraction and sentiment analysis as lenses for data exploration and uses two different models for intervention analysis. To demonstrate the usage of our framework, we present a case study on exploring potential relationships between climate change framing and conflicts in Africa.
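
    A minimal sketch of one kind of intervention test is shown below: a regression of a framing-level series on a trend plus a step indicator that switches on after a candidate event date. The synthetic weekly series, the event index and the simple step-plus-trend specification are illustrative assumptions; the paper's framework uses its own two intervention models on media-derived measures.

    ```python
    # Step-intervention regression: does the "framing level" shift after an event?
    # Synthetic data and model specification are illustrative only.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n, event = 104, 60                         # two years of weekly data, event at week 60
    t = np.arange(n)
    step = (t >= event).astype(float)          # 0 before the event, 1 afterwards
    framing = 0.02 * t + 0.8 * step + rng.normal(scale=0.3, size=n)

    X = sm.add_constant(np.column_stack([t, step]))
    fit = sm.OLS(framing, X).fit()
    print(fit.params)                          # intercept, trend, step (level shift)
    print(fit.pvalues[2])                      # a small p-value cues further exploration
    ```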

  11. [The scientific entertainer in primary health care].

    PubMed

    Ortega-Calvo, Manuel; Santos, José Manuel; Lapetra, José

    2012-09-01

    The scientific method is capable of being applied in primary care. In this article, we defend the role of the "scientific entertainer" as strategic and necessary in achieving this goal. The task has to include playful and light-hearted content. We explore some words in English that may help us to understand the concept of "scientific entertainer" from a semantic point of view (showman, master of ceremonies, entrepreneur, go-between), as well as in Spanish (counsellor, mediator, methodologist), and finally in Latin and Greek (tripalium, negotium, chronos, kairos). We define the clinical, managerial or research health worker who is skilled in primary care as a "primarylogist". Copyright © 2011 Elsevier España, S.L. All rights reserved.

  12. ASSISTments Dataset from Multiple Randomized Controlled Experiments

    ERIC Educational Resources Information Center

    Selent, Douglas; Patikorn, Thanaporn; Heffernan, Neil

    2016-01-01

    In this paper, we present a dataset consisting of data generated from 22 previously and currently running randomized controlled experiments inside the ASSISTments online learning platform. This dataset provides data mining opportunities for researchers to analyze ASSISTments data in a convenient format across multiple experiments at the same time.…

  13. The NOAA Ship Okeanos Explorer Education Materials Collection: Bringing Ocean Exploration Alive for Teachers and Students

    NASA Astrophysics Data System (ADS)

    Haynes, S.

    2012-12-01

    The NOAA Ship Okeanos Explorer, America's first Federal ship dedicated to ocean exploration, is envisioned as the ship upon which learners of all ages embark together on scientific voyages of exploration to poorly-known or unexplored areas of the global ocean. Through a combination of lessons, web pages, a ship tracker and dynamic imagery and video, learners participate as ocean explorers in breakthrough discoveries leading to increased scientific understanding and enhanced literacy about our ocean world. The Okeanos Explorer Education Materials Collection was developed to encourage educators and students to become personally involved with the ship's voyages and discoveries. This collection is presented in two volumes: Volume 1: Why Do We Explore? (modern reasons for ocean exploration: climate change, energy, human health and ocean health) and Volume 2: How Do We Explore? (21st Century strategies and tools for ocean exploration, including telepresence, sonar mapping, water column exploration and remotely operated vehicles). These volumes have been developed into full-day professional development opportunities provided at NOAA OER Alliance Partner sites nationwide and include lessons for grades 5-12 designed to support the evolving science education needs currently articulated in the K-12 Framework for Science Education. Together, the lessons, web pages, ship tracker and videos provide a dynamic education package for teachers to share modern ocean exploration in the classroom and inspire the next generation of explorers. This presentation will share these two Volumes, highlights from current explorations of the Okeanos Explorer and how they are used in ocean explorer lessons, and methods for accessing ocean explorer resources and following along with expeditions.

  14. The Electrophysiological Correlates of Scientific Innovation Induced by Heuristic Information

    ERIC Educational Resources Information Center

    Luo, Junlong; Du, Xiumin; Tang, Xiaochen; Zhang, Entao; Li, Haijiang; Zhang, Qinglin

    2013-01-01

    In this study, novel and old scientific innovations (NSI and OSI) were selected as materials to explore the electrophysiological correlates of scientific innovation induced by heuristic information. Using event-related brain potentials (ERPs) to do so, college students solved NSI problems (for which they did not know the answers) and OSI problems…

  15. Estimating parameters for probabilistic linkage of privacy-preserved datasets.

    PubMed

    Brown, Adrian P; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Boyd, James H

    2017-07-10

    Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20%. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by the F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher
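
    To illustrate the kind of privacy-preserved field comparison that the estimated probabilities are applied to, the sketch below encodes a name field into a Bloom filter of character bigrams and scores similarity with the Dice coefficient, as is common in Bloom-filter linkage; the filter length, number of hash functions and example names are assumptions, and the EM-based probability and threshold estimation itself is not shown.

    ```python
    # Sketch of field-level Bloom-filter encoding plus Dice similarity, the kind
    # of encrypted comparison used in privacy-preserved record linkage.
    # Filter length, hash count and names are illustrative assumptions.
    import hashlib

    L, K = 256, 15                                     # filter length and hash count

    def bigrams(s: str):
        s = f"_{s.lower()}_"
        return {s[i:i + 2] for i in range(len(s) - 1)}

    def bloom(s: str) -> set:
        bits = set()
        for gram in bigrams(s):
            h1 = int(hashlib.md5(gram.encode()).hexdigest(), 16)
            h2 = int(hashlib.sha1(gram.encode()).hexdigest(), 16)
            for k in range(K):                         # double hashing: h1 + k*h2
                bits.add((h1 + k * h2) % L)
        return bits

    def dice(a: set, b: set) -> float:
        return 2 * len(a & b) / (len(a) + len(b))

    print(dice(bloom("catherine"), bloom("katherine")))  # high similarity
    print(dice(bloom("catherine"), bloom("smith")))      # low similarity
    ```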

  16. Viking Seismometer PDS Archive Dataset

    NASA Astrophysics Data System (ADS)

    Lorenz, R. D.

    2016-12-01

    The Viking Lander 2 seismometer operated successfully for over 500 Sols on the Martian surface, recording at least one likely candidate Marsquake. The Viking mission, flown in an era when data handling hardware (both on board and on the ground) was limited in capability, predated modern planetary data archiving; the ad-hoc repositories of the data, and the very low-level record at NSSDC, were neither convenient to process nor well known. In an effort supported by the NASA Mars Data Analysis Program, we have converted the bulk of the Viking dataset (namely the 49,000 and 270,000 records made in High and Event modes at 20 and 1 Hz, respectively) into a simple ASCII table format. Additionally, since wind-generated lander motion is a major component of the signal, contemporaneous meteorological data are included in summary records to facilitate correlation. These datasets are being archived at the PDS Geosciences Node. In addition to brief instrument and dataset descriptions, the archive includes code snippets in the freely available language R to demonstrate plotting and analysis. Further, we present examples of lander-generated noise, associated with the sampler arm, instrument dumps and other mechanical operations.

  17. Cloud-Based Mobile Application Development Tools and NASA Science Datasets

    NASA Astrophysics Data System (ADS)

    Oostra, D.; Lewis, P. M.; Chambers, L. H.; Moore, S. W.

    2011-12-01

    A number of cloud-based visual development tools have emerged that provide methods for developing mobile applications quickly and without previous programming experience. This paper will explore how our new and current data users can best combine these cloud-based mobile application tools and available NASA climate science datasets. Our vision is that users will create their own mobile applications for visualizing our data and will develop tools for their own needs. The approach we are documenting is based on two main ideas. The first is to provide training and information. Through examples, sharing experiences, and providing workshops, users can be shown how to use free online tools to easily create mobile applications that interact with NASA datasets. The second approach is to provide application programming interfaces (APIs), databases, and web applications to access data in a way that educators, students and scientists can quickly integrate it into their own mobile application development. This framework allows us to foster development activities and boost interaction with NASA's data while saving resources that would be required for a large internal application development staff. The findings of this work will include data gathered through meetings with local data providers, educators, libraries and individuals. From the very first queries into this topic, a high level of interest has been identified from our groups of users. This overt interest, combined with the marked popularity of mobile applications, has created a new channel for outreach and communications between the science and education communities. As a result, we would like to offer educators and other stakeholders some insight into the mobile application development arena, and provide some next steps and new approaches. Our hope is that, through our efforts, we will broaden the scope and usage of NASA's climate science data by providing new ways to access environmentally relevant datasets.

  18. Privacy-preserving GWAS analysis on federated genomic datasets.

    PubMed

    Constable, Scott D; Tang, Yuzhe; Wang, Shuang; Jiang, Xiaoqian; Chapin, Steve

    2015-01-01

    The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, a high-quality GWAS usually requires a large number of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical information confidentiality (as data are being exchanged across institutional boundaries), which becomes an inhibiting factor for practical use. We present a privacy-preserving GWAS framework on federated genomic datasets. Our method is to layer the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to mutually perform secure GWAS computations without exposing their private data. We demonstrate our technique by implementing a framework for minor allele frequency counting and χ2 statistics calculation, two typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF). Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations.
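
    The plaintext analogue of the statistics named in the abstract is sketched below: minor allele counting and a χ2 test on a 2x2 case/control allele table. The genotype counts are invented for illustration, and the secure multi-party layer (the garbled-circuit/PCF evaluation) that the paper actually contributes is not shown.

    ```python
    # Plaintext version of the GWAS statistics that the framework evaluates
    # under MPC: minor allele counts and a chi-squared test on the 2x2
    # case/control allele table. Counts are hypothetical.
    import numpy as np
    from scipy.stats import chi2_contingency

    # genotypes coded as minor-allele counts (0, 1, 2) per individual
    cases    = np.array([0, 1, 2, 1, 1, 2, 0, 1])
    controls = np.array([0, 0, 1, 0, 1, 0, 0, 1])

    def allele_counts(genotypes):
        minor = genotypes.sum()
        major = 2 * len(genotypes) - minor
        return minor, major

    table = np.array([allele_counts(cases), allele_counts(controls)])
    chi2, p, dof, _ = chi2_contingency(table)
    print(table, chi2, p)      # tests the minor-allele frequency difference
    ```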

  19. National Elevation Dataset

    USGS Publications Warehouse

    ,

    1999-01-01

    The National Elevation Dataset (NED) is a new raster product assembled by the U.S. Geological Survey (USGS). The NED is designed to provide national elevation data in a seamless form with a consistent datum, elevation unit, and projection. Data corrections were made in the NED assembly process to minimize artifacts, permit edge matching, and fill sliver areas of missing data.

  20. A Benchmark Dataset for SSVEP-Based Brain-Computer Interfaces.

    PubMed

    Wang, Yijun; Chen, Xiaogang; Gao, Xiaorong; Gao, Shangkai

    2017-10-01

    This paper presents a benchmark steady-state visual evoked potential (SSVEP) dataset acquired with a 40-target brain-computer interface (BCI) speller. The dataset consists of 64-channel Electroencephalogram (EEG) data from 35 healthy subjects (8 experienced and 27 naïve) while they performed a cue-guided target selecting task. The virtual keyboard of the speller was composed of 40 visual flickers, which were coded using a joint frequency and phase modulation (JFPM) approach. The stimulation frequencies ranged from 8 Hz to 15.8 Hz with an interval of 0.2 Hz. The phase difference between two adjacent frequencies was . For each subject, the data included six blocks of 40 trials corresponding to all 40 flickers indicated by a visual cue in a random order. The stimulation duration in each trial was five seconds. The dataset can be used as a benchmark dataset to compare the methods for stimulus coding and target identification in SSVEP-based BCIs. Through offline simulation, the dataset can be used to design new system diagrams and evaluate their BCI performance without collecting any new data. The dataset also provides high-quality data for computational modeling of SSVEPs. The dataset is freely available from http://bci.med.tsinghua.edu.cn/download.html.
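
    A commonly used baseline for target identification on SSVEP data of this kind is canonical correlation analysis (CCA) against sinusoidal reference signals; the sketch below shows that baseline, not necessarily the methods compared in the paper, and the channel count, sampling rate and synthetic EEG are illustrative assumptions.

    ```python
    # CCA-based SSVEP target identification baseline on synthetic data:
    # correlate multi-channel EEG with sine/cosine references at each
    # candidate stimulation frequency and pick the best match.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    fs, dur, n_ch = 250, 5.0, 9               # assumed sampling rate, 5 s trials, 9 channels
    t = np.arange(int(fs * dur)) / fs
    freqs = 8.0 + 0.2 * np.arange(40)          # 40 candidate frequencies, 8-15.8 Hz

    def references(f, harmonics=3):
        comps = []
        for h in range(1, harmonics + 1):
            comps += [np.sin(2 * np.pi * h * f * t), np.cos(2 * np.pi * h * f * t)]
        return np.column_stack(comps)

    rng = np.random.default_rng(4)
    target = 10.0                              # simulate a trial attending the 10 Hz flicker
    eeg = 0.5 * rng.standard_normal((len(t), n_ch)) + \
          np.outer(np.sin(2 * np.pi * target * t), np.ones(n_ch))

    def cca_corr(X, Y):
        u, v = CCA(n_components=1).fit(X, Y).transform(X, Y)
        return np.corrcoef(u[:, 0], v[:, 0])[0, 1]

    scores = [cca_corr(eeg, references(f)) for f in freqs]
    print(round(freqs[int(np.argmax(scores))], 1))   # should recover ~10.0 Hz
    ```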

  1. Dataset-Driven Research to Support Learning and Knowledge Analytics

    ERIC Educational Resources Information Center

    Verbert, Katrien; Manouselis, Nikos; Drachsler, Hendrik; Duval, Erik

    2012-01-01

    In various research areas, the availability of open datasets is considered as key for research and application purposes. These datasets are used as benchmarks to develop new algorithms and to compare them to other algorithms in given settings. Finding such available datasets for experimentation can be a challenging task in technology enhanced…

  2. Why We Explore: The Value of Space Exploration for Future Generations

    NASA Technical Reports Server (NTRS)

    Cook, Stephen A.; Armstrong, Robert C., Jr.

    2007-01-01

    The National Aeronautics and Space Administration (NASA) and its industry partners are making measurable progress toward delivering new human space transportation capabilities to serve as the catalyst for a new era of discovery, as directed by the U.S. Vision for Space Exploration. In the interest of ensuring prolonged support, the Agency encourages space advocates of all stripes to accurately portray both the tangible and intangible benefits of space exploration, especially its value for future generations. This may be done not only by emphasizing the nation's return on its aerospace investment, but also by highlighting enabling security features and by promoting the scientific and technological benefits that accrue from the human exploration of space. As America embarks on a new era of leadership and international partnership on the next frontier, we are poised to master space by living off-planet on the Moon to prepare astronauts for longer journeys to Mars. These and other relevant facts should be clearly in the view of influential decision-makers and the American taxpayers, and we must increasingly involve those on whom the long-term sustainability of space exploration ultimately depends: America's youth. This paper will examine three areas of concrete benefits for future generations: fundamental security, economic enterprise, and high-technology advancements spurred by the innovation that scientific discovery demands.

  3. Classroom Critters and the Scientific Method.

    ERIC Educational Resources Information Center

    Kneidel, Sally

    This resource book presents 37 behavioral experiments that can be performed with commonly-found classroom animals including hamsters, gerbils, mice, goldfish, guppies, anolis lizards, kittens, and puppies. Each experiment explores the five steps of the scientific method: (1) Question; (2) Hypothesis; (3) Methods; (4) Result; and (5) Conclusion.…

  4. The Scientific Field during Argentina's Latest Military Dictatorship (1976-1983): Contraction of Public Universities and Expansion of the National Council for Scientific and Technological Research (CONICET)

    ERIC Educational Resources Information Center

    Bekerman, Fabiana

    2013-01-01

    This study looks at some of the traits that characterized Argentina's scientific and university policies under the military regime that spanned from 1976 through 1983. To this end, it delves into a rarely explored empirical observation: financial resource transfers from national universities to the National Scientific and Technological Research…

  5. Method of generating features optimal to a dataset and classifier

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bruillard, Paul J.; Gosink, Luke J.; Jarman, Kenneth D.

    A method of generating features optimal to a particular dataset and classifier is disclosed. A dataset of messages is inputted and a classifier is selected. An algebra of features is encoded. Computable features that are capable of describing the dataset from the algebra of features are selected. Irredundant features that are optimal for the classifier and the dataset are selected.
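
    The pipeline named in this abstract (candidate features, a chosen classifier, selection of irredundant features that help that classifier on that dataset) can be illustrated with a generic greedy wrapper. This is only a sketch under assumed criteria; the helper name, the cross-validation gain threshold, and the logistic-regression classifier are illustrative choices, not the patented algorithm.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        def select_features(X, y, classifier, tol=1e-3):
            """Greedily keep candidate feature columns that improve CV accuracy."""
            selected, best = [], 0.0
            for j in range(X.shape[1]):                 # candidate (computable) features
                trial = selected + [j]
                score = cross_val_score(classifier, X[:, trial], y, cv=5).mean()
                if score > best + tol:                  # keep only non-redundant gains
                    selected, best = trial, score
            return selected, best

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 10))                  # toy "message" features
        y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)
        cols, acc = select_features(X, y, LogisticRegression(max_iter=1000))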

  6. Querying Patterns in High-Dimensional Heterogenous Datasets

    ERIC Educational Resources Information Center

    Singh, Vishwakarma

    2012-01-01

    The recent technological advancements have led to the availability of a plethora of heterogenous datasets, e.g., images tagged with geo-location and descriptive keywords. An object in these datasets is described by a set of high-dimensional feature vectors. For example, a keyword-tagged image is represented by a color-histogram and a…

  7. Explorer 1 60th Anniversary

    NASA Image and Video Library

    2018-01-31

    Attendees watch a short video on Explorer 1 during an event celebrating the 60th Anniversary of the Explorer 1 mission and the discovery of Earth's radiation belts, Wednesday, Jan. 31, 2018, at the National Academy of Sciences in Washington. The first U.S. satellite, Explorer 1, was launched from Cape Canaveral on January 31, 1958. The 30-pound satellite would yield a major scientific discovery, the Van Allen radiation belts circling our planet, and begin six decades of groundbreaking space science and human exploration. (NASA/Joel Kowsky)

  8. Explorer 1 60th Anniversary

    NASA Image and Video Library

    2018-01-31

    A replica of the Explorer 1 satellite is seen on display during an event celebrating the 60th Anniversary of the Explorer 1 mission and the discovery of Earth's radiation belts, Wednesday, Jan. 31, 2018, at the National Academy of Sciences in Washington. The first U.S. satellite, Explorer 1, was launched from Cape Canaveral on January 31, 1958. The 30-pound satellite would yield a major scientific discovery, the Van Allen radiation belts circling our planet, and begin six decades of groundbreaking space science and human exploration. (NASA/Joel Kowsky)

  9. Initial Efforts toward Mission-Representative Imaging Surveys from Aerial Explorers

    NASA Technical Reports Server (NTRS)

    Pisanich, Greg; Plice, Laura; Ippolito, Corey; Young, Larry A.; Lau, Benton; Lee, Pascal

    2004-01-01

    Numerous researchers have proposed the use of robotic aerial explorers to perform scientific investigation of planetary bodies in our solar system. One of the essential tasks for any aerial explorer is to be able to perform scientifically valuable imaging surveys. The focus of this paper is to discuss the challenges implicit in, and recent observations related to, acquiring mission-representative imaging data from a small fixed-wing UAV, acting as a surrogate planetary aerial explorer. This question of successfully performing aerial explorer surveys is also tied to other topics of technical investigation, including the development of unique bio-inspired technologies.

  10. Transgenes and transgressions: scientific dissent as heterogeneous practice.

    PubMed

    Delborne, Jason A

    2008-08-01

    Although scholars in science and technology studies have explored many dynamics and consequences of scientific controversy, no coherent theory of scientific dissent has emerged. This paper proposes the elements of such a framework, based on understanding scientific dissent as a set of heterogeneous practices. I use the controversy over the presence of transgenic DNA in Mexican maize in the early 2000s to point to a processual model of scientific dissent. 'Contrarian science' includes knowledge claims that challenge the dominant scientific trajectory, but need not necessarily lead to dissent. 'Impedance' represents efforts to undermine the credibility of contrarian science (or contrarian scientists) and may originate within or outside of the scientific community. In the face of impedance, contrarian scientists may become dissenters. The actions of the scientist at the center of the case study, Professor Ignacio Chapela of the University of California, Berkeley, demonstrate particular practices of scientific dissent, ranging from 'agonistic engagement' to 'dissident science'. These practices speak not only to functional strategies of winning scientific debate, but also to attempts to reconfigure relations among scientists, publics, institutions, and politics that order knowledge production.

  11. Developing, sharing and using large community datasets to evaluate regional hydrologic change in Northern Brazil

    NASA Astrophysics Data System (ADS)

    Thompson, S. E.; Levy, M. C.

    2016-12-01

    Quantifying regional water cycle changes resulting from the physical transformation of the earth's surface is essential for water security. Although hydrology has a rich legacy of "paired basin" experiments that identify water cycle responses to imposed land use or land cover change, (i) there is a deficit of such studies across many representative biomes worldwide, including the tropics, and (ii) the paired basins generally do not provide a representative sample of regional river systems in a way that can inform policy. Larger-sample, empirical analyses are needed for such policy-relevant understanding - and these analyses must be supported by regional data. Northern Brazil is a global agricultural and biodiversity center, where regional climate and hydrology are projected (through modeling) to have strong sensitivities to land cover change. Dramatic land cover change has occurred and continues to occur in this region. We used a causal statistical analysis framework to explore the effects of deforestation and land cover conversion on regional hydrology. First, we used a comparative approach to address the `data selection uncertainty' problem associated with rainfall datasets covering this sparsely monitored region. We compared 9 remotely sensed (RS) and in-situ (IS) rainfall datasets, demonstrating that rainfall characterization and trends were sensitive to the selected data sources, and identifying which of these datasets had the strongest fidelity to independently measured streamflow occurrence. Next, we employed a "differences-in-differences" regression technique to evaluate the effects of land use change on the quantiles of the flow duration curve between populations of basins experiencing different levels of land conversion. Regionally, controlling for climate and other variables, deforestation significantly increased flow in the lowest third of the flow duration curve. Addressing this problem required harmonizing 9 separate spatial datasets (in addition to the 9
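
    The differences-in-differences idea mentioned above can be shown with a minimal regression sketch. The synthetic data, variable names, and omission of climate covariates and basin effects are assumptions for illustration; the interaction coefficient is the diff-in-diff estimate of the land-conversion effect on a low-flow quantile.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 400
        df = pd.DataFrame({
            "treated": rng.integers(0, 2, n),    # basin experienced land conversion
            "post": rng.integers(0, 2, n),       # observation after conversion period
        })
        # Synthetic low-flow quantile (e.g., a lower quantile of the flow duration curve)
        df["q_low"] = (1.0 + 0.3 * df["treated"] + 0.1 * df["post"]
                       + 0.25 * df["treated"] * df["post"] + rng.normal(0, 0.2, n))

        # The coefficient on treated:post estimates the effect of conversion on low flows.
        model = smf.ols("q_low ~ treated * post", data=df).fit()
        print(model.params["treated:post"])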

  12. SSERVI: Merging Science and Human Exploration

    NASA Technical Reports Server (NTRS)

    Schmidt, Gregory; Gibbs, Kristina

    2017-01-01

    The NASA Solar System Exploration Research Virtual Institute (SSERVI) is a virtual institute focused on research at the intersection of science and exploration, training the next generation of lunar scientists, and community development. As part of the SSERVI mission, we act as a hub for opportunities that engage the larger scientific and exploration communities in order to form new interdisciplinary, research-focused collaborations.

  13. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset | Office of Cancer Genomics

    Cancer.gov

    Identifying genetic alterations that prime a cancer cell to respond to a particular therapeutic agent can facilitate the development of precision cancer medicines. Cancer cell-line (CCL) profiling of small-molecule sensitivity has emerged as an unbiased method to assess the relationships between genetic or cellular features of CCLs and small-molecule response. Here, we developed annotated cluster multidimensional enrichment analysis to explore the associations between groups of small molecules and groups of CCLs in a new, quantitative sensitivity dataset.

  14. A Novel Feature-Map Based ICA Model for Identifying the Individual, Intra/Inter-Group Brain Networks across Multiple fMRI Datasets.

    PubMed

    Wang, Nizhuan; Chang, Chunqi; Zeng, Weiming; Shi, Yuhu; Yan, Hongjie

    2017-01-01

    Independent component analysis (ICA) has been widely used in functional magnetic resonance imaging (fMRI) data analysis to evaluate functional connectivity of the brain; however, there are still limitations when ICA must simultaneously handle neuroimaging datasets with diverse acquisition parameters, e.g., different repetition times or different scanners. It is therefore difficult for the traditional ICA framework to handle ever-growing neuroimaging datasets effectively. In this research, a novel feature-map based ICA framework (FMICA) was proposed to address these deficiencies, aimed at exploring brain functional networks (BFNs) at different scales, e.g., the first level (individual subject level), second level (intragroup level of subjects within a certain dataset) and third level (intergroup level of subjects across different datasets), based only on the feature maps extracted from the fMRI datasets. FMICA was presented as a hierarchical framework that combines ICA and constrained ICA into a single procedure for identifying the BFNs from the feature maps. The simulated and real experimental results demonstrated that FMICA can identify intergroup BFNs and characterize subject-specific and group-specific differences in BFNs from the independent component feature maps, which sharply reduce the size of the fMRI data to be processed. Compared with traditional ICA, FMICA is a more generalized framework that can efficiently and simultaneously identify variant BFNs at the subject-specific, intragroup, intragroup-specific and intergroup levels, implying that FMICA is able to handle big neuroimaging datasets in neuroscience research.
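
    The core efficiency argument above, decomposing stacked subject-level feature maps rather than full 4-D time series, can be sketched with plain FastICA. This is a generic group-level decomposition under assumed shapes and simulated maps, not the authors' FMICA/constrained-ICA hierarchy.

        import numpy as np
        from sklearn.decomposition import FastICA

        n_subjects, n_maps_per_subject, n_voxels = 10, 20, 5000
        rng = np.random.default_rng(2)

        # First level: per-subject feature maps (e.g., subject-level IC maps), simulated here.
        subject_maps = [rng.normal(size=(n_maps_per_subject, n_voxels))
                        for _ in range(n_subjects)]

        # Higher levels: stack maps across subjects/datasets and decompose this much
        # smaller matrix instead of the raw fMRI volumes.
        stacked = np.vstack(subject_maps)                 # (subjects*maps, voxels)
        ica = FastICA(n_components=15, random_state=0, max_iter=500)
        group_sources = ica.fit_transform(stacked.T)      # (voxels, components)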

  15. Using the OOI Cabled Array HD Camera to Explore Geophysical and Oceanographic Problems at Axial Seamount

    NASA Astrophysics Data System (ADS)

    Crone, T. J.; Knuth, F.; Marburg, A.

    2016-12-01

    A broad array of Earth science problems can be investigated using high-definition video imagery from the seafloor, ranging from those that are geological and geophysical in nature, to those that are biological and water-column related. A high-definition video camera was installed as part of the Ocean Observatory Initiative's core instrument suite on the Cabled Array, a real-time fiber optic data and power system that stretches from the Oregon Coast to Axial Seamount on the Juan de Fuca Ridge. This camera runs a 14-minute pan-tilt-zoom routine 8 times per day, focusing on locations of scientific interest on and near the Mushroom vent in the ASHES hydrothermal field inside the Axial caldera. The system produces 13 GB of lossless HD video every 3 hours, and at the time of this writing it has generated 2100 recordings totaling 28.5 TB since it began streaming data into the OOI archive in August of 2015. Because of the large size of this dataset, downloading the entirety of the video for long timescale investigations is not practical. We are developing a set of user-side tools for downloading single frames and frame ranges from the OOI HD camera raw data archive to aid users interested in using these data for their research. We use these tools to download about one year's worth of partial frame sets to investigate several questions regarding the hydrothermal system at ASHES, including the variability of bacterial "floc" in the water-column, and changes in high temperature fluid fluxes using optical flow techniques. We show that while these user-side tools can facilitate rudimentary scientific investigations using the HD camera data, a server-side computing environment that allows users to explore this dataset without downloading any raw video will be required for more advanced investigations to flourish.
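
    The kind of user-side processing described above, pulling a short frame range and estimating apparent motion, can be sketched with OpenCV. The local file name is a placeholder and this is not the OOI archive API; dense Farneback optical flow stands in for the optical-flow techniques mentioned in the abstract.

        import cv2

        cap = cv2.VideoCapture("camhd_segment.mov")   # placeholder file name
        cap.set(cv2.CAP_PROP_POS_FRAMES, 1000)        # jump to a frame of interest

        ok, prev = cap.read()
        if not ok:
            raise SystemExit("could not read the video segment")
        prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

        for _ in range(30):                           # a short frame range
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # Dense optical flow as a rough proxy for plume/fluid motion.
            flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            prev_gray = gray
        cap.release()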

  16. A geologic and mineral exploration spatial database for the Stillwater Complex, Montana

    USGS Publications Warehouse

    Zientek, Michael L.; Parks, Heather L.

    2014-01-01

    This report provides essential spatially referenced datasets based on geologic mapping and mineral exploration activities conducted from the 1920s to the 1990s. This information will facilitate research on the complex and provide background material needed to explore for mineral resources and to develop sound land-management policy.

  17. NASA's Solar System Exploration Research Virtual Institute: Combining Science and Exploration

    NASA Astrophysics Data System (ADS)

    Bailey, B.; Schmidt, G.; Daou, D.; Pendleton, Y.

    2015-10-01

    The NASA Solar System Exploration Research Virtual Institute (SSERVI) is a virtual institute focused on research at the intersection of science andexploration, training the next generation of lunar scientists, and community development. As part of the SSERVI mission, we act as a hub for opportunities that engage the larger scientific and exploration communities in order to form new interdisciplinary, research-focused collaborations. This talk will describe the research efforts of the nine domestic teams that constitute the U.S. complement of the Institute and how we will engage the international science and exploration communities through workshops, conferences, online seminars and classes, student exchange programs and internships.

  18. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

    PubMed Central

    Hoo-Chang, Shin; Roth, Holger R.; Gao, Mingchen; Lu, Le; Xu, Ziyue; Nogues, Isabella; Yao, Jianhua; Mollura, Daniel

    2016-01-01

    Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets (i.e. ImageNet) and the revival of deep convolutional neural networks (CNN). CNNs enable learning data-driven, highly representative, layered hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs for medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained on a natural image dataset (supervised) for medical image tasks (although domain transfer between two medical image datasets is also possible). In this paper, we exploit three important, but previously understudied, factors in employing deep convolutional neural networks for computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters and vary in number of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve state-of-the-art performance on mediastinal LN detection, with 85% sensitivity at 3 false positives per patient, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high performance
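
    The transfer-learning recipe discussed above (ImageNet-pretrained backbone, task-specific head, fine-tuning) can be sketched in a few lines of PyTorch. The network choice, hyperparameters, and dummy batch are assumptions for illustration, not the paper's configuration.

        import torch
        import torch.nn as nn
        import torchvision.models as models

        # ImageNet-pretrained backbone (older torchvision uses pretrained=True
        # instead of the weights argument).
        model = models.resnet18(weights="DEFAULT")
        for p in model.parameters():                    # freeze the backbone
            p.requires_grad = False
        model.fc = nn.Linear(model.fc.in_features, 2)   # new two-class CADe head

        optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
        criterion = nn.CrossEntropyLoss()

        def train_step(images, labels):
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            return loss.item()

        # Dummy batch of 3-channel 224x224 crops stands in for a real CADe data loader.
        loss = train_step(torch.randn(4, 3, 224, 224), torch.tensor([0, 1, 0, 1]))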

  19. Accuracy assessment of seven global land cover datasets over China

    NASA Astrophysics Data System (ADS)

    Yang, Yongke; Xiao, Pengfeng; Feng, Xuezhi; Li, Haixing

    2017-03-01

    Land cover (LC) is a vital foundation of Earth science. To date, several global LC datasets have been produced through the efforts of many scientific communities. To provide guidelines for data usage over China, nine LC maps from seven global LC datasets (IGBP DISCover, UMD, GLC, MCD12Q1, GLCNMO, CCI-LC, and GlobeLand30) were evaluated in this study. First, we compared their similarities and discrepancies in both area and spatial patterns, and analysed their inherent relations to data sources and classification schemes and methods. Next, five sets of validation sample units (VSUs) were collected to calculate their accuracy quantitatively. Further, we built a spatial analysis model and depicted their spatial variation in accuracy based on the five sets of VSUs. The results show that there are evident discrepancies among these LC maps in both area and spatial patterns. For LC maps produced by different institutes, GLC 2000 and CCI-LC 2000 have the highest overall spatial agreement (53.8%). For LC maps produced by the same institutes, the overall spatial agreement of CCI-LC 2000 and 2010, and of MCD12Q1 2001 and 2010, reaches up to 99.8% and 73.2%, respectively; however, more effort is still needed if these LC maps are to be used as time-series data for model input, since both CCI-LC and MCD12Q1 fail to represent the rapidly changing trends of several key LC classes in the early 21st century, in particular urban and built-up, snow and ice, water bodies, and permanent wetlands. With the highest spatial resolution, the overall accuracy of GlobeLand30 2010 is 82.39%. For the other six LC datasets with coarse resolution, CCI-LC 2010/2000 has the highest overall accuracy, followed by MCD12Q1 2010/2001, GLC 2000, GLCNMO 2008, IGBP DISCover, and UMD in turn. Although all maps exhibit high accuracy in homogeneous regions, local accuracies in other regions are quite different, particularly in the Farming-Pastoral Zone of North China, the mountains of Northeast China, and the Southeast Hills. Special
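
    The two headline metrics above, overall accuracy against validation sample units and pairwise overall spatial agreement between maps, reduce to simple pixel-wise comparisons. A minimal sketch with toy label arrays (illustrative, not the study's data):

        import numpy as np

        def overall_accuracy(mapped, reference):
            """Fraction of validation sample units whose mapped class matches the reference."""
            mapped, reference = np.asarray(mapped), np.asarray(reference)
            return float(np.mean(mapped == reference))

        def spatial_agreement(map_a, map_b):
            """Fraction of co-located pixels assigned the same class by both maps."""
            map_a, map_b = np.asarray(map_a), np.asarray(map_b)
            return float(np.mean(map_a == map_b))

        vsu_reference = np.array([1, 1, 2, 3, 2, 1])
        vsu_mapped    = np.array([1, 2, 2, 3, 2, 1])
        print(overall_accuracy(vsu_mapped, vsu_reference))   # 0.833...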

  20. Variation in the Interpretation of Scientific Integrity in Community-based Participatory Health Research

    PubMed Central

    Kraemer Diaz, Anne E.; Spears Johnson, Chaya R.; Arcury, Thomas A.

    2013-01-01

    Community-based participatory research (CBPR) has become essential in health disparities and environmental justice research; however, the scientific integrity of CBPR projects has become a concern. Some concerns, such as the adequacy of research training and the lack of access to resources and finances, have been discussed as potentially limiting the scientific integrity of a project. Before understanding what threatens scientific integrity in CBPR, it is vital to understand what scientific integrity means for the professional and community investigators who are involved in CBPR. This analysis explores the interpretation of scientific integrity in CBPR among 74 professional and community research team members from 25 CBPR projects in nine states in the southeastern United States in 2012. It describes the basic definition of scientific integrity and then explores variations in the interpretation of scientific integrity in CBPR. Variations in the interpretations were associated with team member identity as professional or community investigators. Professional investigators understood scientific integrity in CBPR as either conceptually or logistically flexible, as challenging to balance with community needs, or as no different from traditional scientific integrity. Community investigators interpreted other factors, such as trust, accountability, and overall benefit to the community, as important in scientific integrity. This research demonstrates that the variations in the interpretation of scientific integrity in CBPR call for a new definition of scientific integrity in CBPR that takes into account the understanding and needs of all investigators. PMID:24161098

  1. Scientific and non-scientific challenges for Operational Earthquake Forecasting

    NASA Astrophysics Data System (ADS)

    Marzocchi, W.

    2015-12-01

    Tracking the time evolution of seismic hazard in time windows shorter than the usual 50 years of long-term hazard models may offer additional opportunities to reduce seismic risk. This is the target of operational earthquake forecasting (OEF). During the development of OEF in Italy we identified several challenges that range from pure science to the more practical interface of science with society. From a scientific point of view, although earthquake clustering is the clearest empirical evidence about earthquake occurrence, and OEF clustering models are the most (successfully) tested hazard models in seismology, we note that some seismologists are still reluctant to accept their scientific reliability. After exploring the motivations behind these scientific doubts, we also look into an issue that is often overlooked in this discussion, i.e., that in any kind of hazard analysis we do not use a model because it is the true one, but because it is better than anything else we can think of. The non-scientific aspects are mostly related to the fact that OEF usually provides weekly probabilities of large earthquakes smaller than 1%. These probabilities are considered by some seismologists to be too small to be of interest or use. However, in a recent collaboration with engineers we show that such earthquake probabilities may lead to intolerable individual risk of death. Interestingly, this debate calls for a better definition of the still fuzzy boundaries among the different kinds of expertise required for the whole risk mitigation process. The last and probably most pressing challenge is related to communication with the public. In fact, a wrong message could be useless or even counterproductive. Here we show some progress that we have made in this field, working with communication experts in Italy.

  2. Slash Writers and Guinea Pigs as Models for a Scientific Multiliteracy

    ERIC Educational Resources Information Center

    Weinstein, Matthew

    2006-01-01

    This paper explores alternative approaches to the conception of scientific literacy, drawing on cultural studies and emerging practices in language arts as its framework. The paper reviews historic tensions in the understanding of scientific literacy and then draws on the multiliteracies movement in language arts to suggest a scientific…

  3. Optimal Compressed Sensing and Reconstruction of Unstructured Mesh Datasets

    DOE PAGES

    Salloum, Maher; Fabian, Nathan D.; Hensinger, David M.; ...

    2017-08-09

    Exascale computing promises quantities of data too large to efficiently store and transfer across networks for analysis and visualization of the results. We investigate compressed sensing (CS) as an in situ method to reduce the size of the data as it is being generated during a large-scale simulation. CS works by sampling the data on the computational cluster within an alternative function space such as wavelet bases and then reconstructing back to the original space on visualization platforms. While much work has gone into exploring CS on structured datasets, such as image data, we investigate its usefulness for point clouds such as unstructured mesh datasets often found in finite element simulations. We sample using a technique that exhibits low coherence with tree wavelets found to be suitable for point clouds. We reconstruct using the stagewise orthogonal matching pursuit algorithm, which we improved to facilitate automated use in batch jobs. We analyze the achievable compression ratios and the quality and accuracy of reconstructed results at each compression ratio. In the considered case studies, we are able to achieve compression ratios of up to two orders of magnitude with reasonable reconstruction accuracy and minimal visual deterioration in the data. Finally, our results suggest that, compared to other compression techniques, CS is attractive in cases where the compression overhead must be minimized and where the reconstruction cost is not a significant concern.
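
    The sample-then-reconstruct workflow above can be illustrated on a toy sparse signal. In this sketch a random Gaussian matrix stands in for the low-coherence sampling operator and scikit-learn's orthogonal matching pursuit stands in for the stagewise (StOMP) variant used in the paper; sizes are toy values, not simulation-scale data.

        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        rng = np.random.default_rng(3)
        n, m, k = 256, 64, 8                  # signal length, measurements, sparsity

        x = np.zeros(n)                       # signal assumed sparse in some basis
        x[rng.choice(n, k, replace=False)] = rng.normal(size=k)

        Phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random sampling operator (in situ)
        y = Phi @ x                                  # compressed measurements

        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k)
        omp.fit(Phi, y)                              # reconstruction (on the viz platform)
        x_hat = omp.coef_
        print(np.linalg.norm(x - x_hat) / np.linalg.norm(x))   # relative error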

  4. Biological insight, high-throughput datasets and the nature of neuro-degenerative disorders.

    PubMed

    Valente, André X C N; Oliveira, Paulo J; Khaiboullina, Svetlana F; Palotás, András; Rizvanov, Albert A

    2013-09-01

    Life sciences are experiencing a historical shift towards a quantitative, data-rich regime. This transition has been associated with the advent of bioinformatics: mathematicians, physicists, computer scientists and statisticians are now commonplace in the field, working on the analysis of ever larger datasets. An open question remains regarding what should drive scientific progress in this new era: will biological insight become increasingly irrelevant in a world of hypothesis-free, unbiased data analysis? This piece offers a different perspective, pointing out that biological thought is more relevant than ever in a data-rich setting. Some of the novel high-throughput information being acquired in the field of neuro-degenerative disorders is highlighted here. As but one example of how theory and experiment can interact in this new reality, our efforts in developing an idiopathic neuro-degenerative disease hematopoietic stem-cell ageing theory are described.

  5. The LANDFIRE Refresh strategy: updating the national dataset

    USGS Publications Warehouse

    Nelson, Kurtis J.; Connot, Joel A.; Peterson, Birgit E.; Martin, Charley

    2013-01-01

    The LANDFIRE Program provides comprehensive vegetation and fuel datasets for the entire United States. As with many large-scale ecological datasets, vegetation and landscape conditions must be updated periodically to account for disturbances, growth, and natural succession. The LANDFIRE Refresh effort was the first attempt to consistently update these products nationwide. It incorporated a combination of specific systematic improvements to the original LANDFIRE National data, remote sensing based disturbance detection methods, field collected disturbance information, vegetation growth and succession modeling, and vegetation transition processes. This resulted in the creation of two complete datasets for all 50 states: LANDFIRE Refresh 2001, which includes the systematic improvements, and LANDFIRE Refresh 2008, which includes the disturbance and succession updates to the vegetation and fuel data. The new datasets are comparable for studying landscape changes in vegetation type and structure over a decadal period, and provide the most recent characterization of fuel conditions across the country. The applicability of the new layers is discussed and the effects of using the new fuel datasets are demonstrated through a fire behavior modeling exercise using the 2011 Wallow Fire in eastern Arizona as an example.

  6. Exploring the first scientific observations of lunar eclipses made in Siam

    NASA Astrophysics Data System (ADS)

    Orchiston, Wayne; Orchiston, Darunee Lingling; George, Martin; Soonthornthum, Boonrucksar

    2016-04-01

    The first great ruler to encourage the adoption of Western culture and technology throughout Siam (present-day Thailand) was King Narai, who also had a passion for astronomy. He showed this by encouraging French and other Jesuit missionaries, some with astronomical interests and training, to settle in Siam from the early 1660s. One of these was Father Antoine Thomas, and he was the first European known to have carried out scientific astronomical observations from Siam when he determined the latitude of Ayutthaya in 1681 and the following year observed the total lunar eclipse of 22 February. A later lunar eclipse also has an important place in the history of Thai astronomy. In 1685 a delegation of French missionary-astronomers settled in Ayutthaya, and on 10-11 December 1685 they joined King Narai and his court astrologers and observed a lunar eclipse from the King's 'country retreat' near Lop Buri. This event so impressed the King that he approved the erection of a large modern well-equipped astronomical observatory at Lop Buri. Construction of Wat San Paulo Observatory - as it was known - began in 1686 and was completed in 1687. In this paper we examine these two lunar eclipses and their association with the development of scientific astronomy in Siam.

  7. Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery.

    PubMed

    Zhao, Yongan; Wang, Xiaofeng; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu

    2015-01-01

    To propose a new approach to privacy-preserving data selection, which helps data users access human genomic datasets efficiently without undermining patients' privacy. Our idea is to let each data owner publish a set of differentially private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence. We evaluated our approach on real human genomic data using four popular association tests. Our study shows that the proposed approach can help data users make the right choices in most cases. Even though the pilot data cannot be directly used for scientific discovery, they provide a useful indication of which datasets are more likely to be useful to data users, who can therefore approach the appropriate data owners to gain access to the data. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association.
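
    As a generic illustration of the differential-privacy ingredient mentioned above, the sketch below releases noisy allele counts with the Laplace mechanism. This is not the paper's linkage-disequilibrium-aware pilot-data generator; it only shows how noise is calibrated to sensitivity and the privacy budget epsilon, with made-up counts.

        import numpy as np

        def dp_release_counts(counts, epsilon, sensitivity=1.0):
            """Add Laplace noise with scale sensitivity/epsilon to each count."""
            counts = np.asarray(counts, dtype=float)
            noise = np.random.laplace(0.0, sensitivity / epsilon, size=counts.shape)
            return counts + noise

        true_allele_counts = np.array([120, 87, 43, 250])    # illustrative values
        private_counts = dp_release_counts(true_allele_counts, epsilon=1.0)
        print(private_counts)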

  8. Outward bound: women translators and scientific travel writing, 1780-1800.

    PubMed

    Martin, Alison E

    2016-04-01

    As the Enlightenment drew to a close, translation had gradually acquired an increasingly important role in the international circulation and transmission of scientific knowledge. Yet comparatively little attention has been paid to the translators responsible for making such accounts accessible in other languages, some of whom were women. In this article I explore how European women cast themselves as intellectually enquiring, knowledgeable and authoritative figures in their translations. Focusing specifically on the genre of scientific travel writing, I investigate the narrative strategies deployed by women translators to mark their involvement in the process of scientific knowledge-making. These strategies ranged from rhetorical near-invisibility, driven by women's modest marginalization of their own public engagement in science, to the active advertisement of themselves as intellectually curious consumers of scientific knowledge. A detailed study of Elizabeth Helme's translation of the French ornithologist François le Vaillant's Voyage dans l'intérieur de l'Afrique [Voyage into the Interior of Africa] (1790) allows me to explore how her reworking of the original text for an Anglophone reading public enabled her to engage cautiously - or sometimes more openly - with questions regarding how scientific knowledge was constructed, for whom and with which aims in mind.

  9. Exploring the limits of classical physics: Planck, Einstein, and the structure of a scientific revolution

    NASA Astrophysics Data System (ADS)

    Büttner, Jochen; Renn, Jürgen; Schemmel, Matthias

    The emergence of the quantum theory in the beginning of the last century is generally seen as a scientific revolution par excellence. Although numerous studies have been dedicated to its historical analysis, there is so far only one major work available with an explicit historical theory of scientific revolutions in the background, Thomas Kuhn's Black-Body Theory and the Quantum Discontinuity of 1978.

  10. Multiple-Agent Air/Ground Autonomous Exploration Systems

    NASA Technical Reports Server (NTRS)

    Fink, Wolfgang; Chao, Tien-Hsin; Tarbell, Mark; Dohm, James M.

    2007-01-01

    Autonomous systems of multiple-agent air/ground robotic units for exploration of the surfaces of remote planets are undergoing development. Modified versions of these systems could be used on Earth to perform tasks in environments dangerous or inaccessible to humans: examples of tasks could include scientific exploration of remote regions of Antarctica, removal of land mines, cleanup of hazardous chemicals, and military reconnaissance. A basic system according to this concept (see figure) would include a unit, suspended by a balloon or a blimp, that would be in radio communication with multiple robotic ground vehicles (rovers) equipped with video cameras and possibly other sensors for scientific exploration. The airborne unit would be free-floating, controlled by thrusters, or tethered either to one of the rovers or to a stationary object in or on the ground. Each rover would contain a semi-autonomous control system for maneuvering and would function under the supervision of a control system in the airborne unit. The rover maneuvering control system would utilize imagery from the onboard camera to navigate around obstacles. Avoidance of obstacles would also be aided by readout from an onboard (e.g., ultrasonic) sensor. Together, the rover and airborne control systems would constitute an overarching closed-loop control system to coordinate scientific exploration by the rovers.

  11. Communication and the Social Representation of Scientific Knowledge.

    ERIC Educational Resources Information Center

    Lievrouw, Leah A.

    1990-01-01

    Examines the process of disseminating scientific information to the public. Explores the particular steps and strategies that scientists use in taking research findings to a popular audience. Examines the popularization of cold-fusion research. (RS)

  12. NetCDF-U - Uncertainty conventions for netCDF datasets

    NASA Astrophysics Data System (ADS)

    Bigagli, Lorenzo; Nativi, Stefano; Domenico, Ben

    2013-04-01

    To the best of our knowledge, no general convention on the encoding of uncertainty has been proposed to date. In particular, the netCDF Climate and Forecast Conventions (NetCDF-CF), a de-facto standard for a large amount of data in the Fluid Earth Sciences, mention the issue but provide only limited support for uncertainty representation. NetCDF-U is designed to be fully compatible with NetCDF-CF, where possible adopting the same mechanisms (e.g. using the same attribute names with compatible semantics). The rationale for this is that a probabilistic description of scientific quantities is a crosscutting aspect, which may be modularized (note that a netCDF dataset may be compliant with more than one convention). The scope of NetCDF-U is to extend and qualify the netCDF classic data model (also known as netCDF3) to capture the uncertainty related to geospatial information encoded in that format. In the future, a netCDF4 approach to uncertainty encoding will be investigated. The NetCDF-U Conventions have the following rationale: • Compatibility with the netCDF-CF Conventions 1.5. • Human-readability of the structure of conforming datasets. • Minimal difference between certain/agnostic and uncertain representations of data (e.g. with respect to dataset structure). NetCDF-U is based on a generic mechanism for annotating netCDF data variables with probability theory semantics. The Uncertainty Markup Language (UncertML) 2.0 is used as a controlled conceptual model and vocabulary for NetCDF-U annotations. The proposed mechanism anticipates generalized support for semantic annotations in netCDF. NetCDF-U defines syntactical conventions for encoding samples, summary statistics, and distributions, along with mechanisms for expressing dependency relationships among variables. The conventions were accepted as an Open Geospatial Consortium (OGC) Discussion Paper (OGC 11-163); related discussions are conducted on a public forum hosted by the OGC. NetCDF-U may have implications for future work directed at
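
    A hedged sketch of the kind of annotation described above: a mean field plus an ancillary variable carrying its standard deviation, each tagged with an UncertML concept URI. The attribute names ("ancillary_variables", "ref") and the specific URIs are illustrative assumptions in the spirit of NetCDF-U, not a verbatim copy of the specification.

        import numpy as np
        from netCDF4 import Dataset

        ds = Dataset("example_uncertain.nc", "w", format="NETCDF3_CLASSIC")
        ds.createDimension("x", 10)

        mean = ds.createVariable("temperature", "f4", ("x",))
        sd = ds.createVariable("temperature_sd", "f4", ("x",))

        mean[:] = np.linspace(280.0, 290.0, 10)   # mean of an assumed normal distribution
        sd[:] = 0.5                               # its standard deviation

        # Link the two variables and annotate them with probability-theory semantics.
        mean.ancillary_variables = "temperature_sd"
        mean.ref = "http://www.uncertml.org/distributions/normal"
        sd.ref = "http://www.uncertml.org/distributions/normal#standard-deviation"

        ds.close()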

  13. Opportunities for multivariate analysis of open spatial datasets to characterize urban flooding risks

    NASA Astrophysics Data System (ADS)

    Gaitan, S.; ten Veldhuis, J. A. E.

    2015-06-01

    Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to reduce flooding impacts. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors, such as spatially distributed rainfall, socioeconomic characteristics, and social sensing, may help to explain the probability and impacts of urban flooding. Several spatial datasets have recently been made available in the Netherlands, including rainfall-related incident reports made by citizens, spatially distributed rain depths, semidistributed socioeconomic information, and building age. The potential of these data to explain the occurrence of rainfall-related incidents has not yet been examined. Multivariate analysis tools for describing communities and environmental patterns have previously been developed and used in the field of ecology. The objective of this paper is to outline opportunities for these tools to explore urban flooding risk patterns in the mentioned datasets. To that end, a cluster analysis is performed. Results indicate that the incidence of rainfall-related impacts is higher in areas characterized by older infrastructure and higher population density.
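
    The cluster analysis outlined above amounts to merging several area-level indicators into one table and grouping similar areas. A minimal sketch with synthetic data follows; the column names, distributions, and k-means with four clusters are illustrative choices, not the paper's exact method.

        import numpy as np
        import pandas as pd
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(4)
        areas = pd.DataFrame({
            "incident_rate": rng.gamma(2.0, 1.0, 200),    # citizen reports per area
            "rain_depth_p95": rng.normal(25, 5, 200),     # heavy-rain indicator (mm)
            "building_age": rng.normal(60, 20, 200),      # proxy for infrastructure age
            "pop_density": rng.lognormal(8, 0.5, 200),
        })

        X = StandardScaler().fit_transform(areas)
        areas["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
        print(areas.groupby("cluster").mean())            # profile each cluster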

  14. Usefulness of DARPA dataset for intrusion detection system evaluation

    NASA Astrophysics Data System (ADS)

    Thomas, Ciza; Sharma, Vishwas; Balakrishnan, N.

    2008-03-01

    The MIT Lincoln Laboratory IDS evaluation methodology is a practical solution for evaluating the performance of Intrusion Detection Systems, and it has contributed tremendously to research progress in that field. The DARPA IDS evaluation dataset has been criticized and is considered by many to be a very outdated dataset, unable to accommodate the latest trends in attacks. The question then naturally arises as to whether detection systems have improved beyond detecting these older classes of attacks. If not, is it worth considering this dataset obsolete? The paper presented here tries to provide supporting facts for the use of the DARPA IDS evaluation dataset. Two commonly used signature-based IDSs, Snort and Cisco IDS, and two anomaly detectors, PHAD and ALAD, are used for this evaluation, and the results support the usefulness of the DARPA dataset for IDS evaluation.

  15. Understanding and Affecting Science Teacher Candidates' Scientific Reasoning in Introductory Astrophysics

    ERIC Educational Resources Information Center

    Steinberg, Richard; Cormier, Sebastien

    2013-01-01

    This study reports on a content course for science immersion teacher candidates that emphasized authentic practice of science and thinking scientifically in the context of introductory astrophysics. We explore how 122 science teacher candidates spanning three cohorts did and did not reason scientifically and how this evolved in our program. Our…

  16. Topological Landscapes: A Terrain Metaphor for Scientific Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weber, Gunther H.; Bremer, Peer-Timo; Pascucci, Valerio

    2007-08-01

    Scientific visualization and illustration tools are designed to help people understand the structure and complexity of scientific data with images that are as informative and intuitive as possible. In this context, the use of metaphors plays an important role, since they make complex information easily accessible by using commonly known concepts. In this paper we propose a new metaphor, called 'Topological Landscapes', which facilitates understanding the topological structure of scalar functions. The basic idea is to construct a terrain with the same topology as a given dataset and to display the terrain as an easily understood representation of the actual input data. In this projection from an n-dimensional scalar function to a two-dimensional (2D) model we preserve function values of critical points, the persistence (function span) of topological features, and one possible additional metric property (in our examples volume). By displaying this topologically equivalent landscape together with the original data we harness the natural human proficiency in understanding terrain topography and make complex topological information easily accessible.

  17. Poster Development and Presentation to Improve Scientific Inquiry and Broaden Effective Scientific Communication Skills.

    PubMed

    Rauschenbach, Ines; Keddis, Ramaydalis; Davis, Diane

    2018-01-01

    We have redesigned a tried-and-true laboratory exercise into an inquiry-based team activity exploring microbial growth control, and implemented this activity as the basis for preparing a scientific poster in a large, multi-section laboratory course. Spanning most of the semester, this project culminates in a poster presentation of data generated from a student-designed experiment. Students use and apply the scientific method and improve written and verbal communication skills. The guided inquiry format of this exercise provides the opportunity for student collaboration through cooperative learning. For each learning objective, a percentage score was tabulated (learning objective score = points awarded/total possible points). A score of 80% was our benchmark for achieving each objective. At least 76% of the student groups participating in this project over two semesters achieved each learning goal. Student perceptions of the project were evaluated using a survey. Nearly 90% of participating students felt they had learned a great deal in the areas of formulating a hypothesis, experimental design, and collecting and analyzing data; 72% of students felt this project had improved their scientific writing skills. In a separate survey, 84% of students who responded felt that peer review was valuable in improving their final poster submission. We designed this inquiry-based poster project to improve student scientific communication skills. This exercise is appropriate for any microbiology laboratory course whose learning outcomes include the development of scientific inquiry and literacy.

  18. Large-scale Labeled Datasets to Fuel Earth Science Deep Learning Applications

    NASA Astrophysics Data System (ADS)

    Maskey, M.; Ramachandran, R.; Miller, J.

    2017-12-01

    Deep learning has revolutionized computer vision and natural language processing with various algorithms scaled using high-performance computing. However, generic large-scale labeled datasets such as the ImageNet are the fuel that drives the impressive accuracy of deep learning results. Large-scale labeled datasets already exist in domains such as medical science, but creating them in the Earth science domain is a challenge. While there are ways to apply deep learning using limited labeled datasets, there is a need in the Earth sciences for creating large-scale labeled datasets for benchmarking and scaling deep learning applications. At the NASA Marshall Space Flight Center, we are using deep learning for a variety of Earth science applications where we have encountered the need for large-scale labeled datasets. We will discuss our approaches for creating such datasets and why these datasets are just as valuable as deep learning algorithms. We will also describe successful usage of these large-scale labeled datasets with our deep learning based applications.

  19. Exploring access to scientific literature using content-based image retrieval

    NASA Astrophysics Data System (ADS)

    Deserno, Thomas M.; Antani, Sameer; Long, Rodney

    2007-03-01

    The number of articles published in the scientific medical literature is continuously increasing, and Web access to the journals is becoming common. Databases such as SPIE Digital Library, IEEE Xplore, indices such as PubMed, and search engines such as Google provide the user with sophisticated full-text search capabilities. However, information in images and graphs within these articles is entirely disregarded. In this paper, we quantify the potential impact of using content-based image retrieval (CBIR) to access this non-text data. Based on the Journal Citations Report (JCR), the journal Radiology was selected for this study. In 2005, 734 articles were published electronically in this journal. This included 2,587 figures, which yields a rate of 3.52 figures per article. Furthermore, 56.4% of these figures are composed of several individual panels, i.e. the figure combines different images and/or graphs. According to the Image Cross-Language Evaluation Forum (ImageCLEF), the error rate of automatic identification of medical images is about 15%. Therefore, it is expected that, by applying ImageCLEF-like techniques, 95.5% of articles could already be retrieved by means of CBIR. The challenge for CBIR in scientific literature, however, is the use of local texture properties to analyze individual image panels in composite illustrations. Using local features for content-based image representation, 8.81 images per article are available, and the predicted correctness rate may increase to 98.3%. From this study, we conclude that CBIR may have a high impact in medical literature research and suggest that additional research in this area is warranted.

  20. Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Manipon, G.; Xing, Z.; Fetzer, E.

    2008-12-01

    NASA's Earth Observing System (EOS) is the world's most ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the A-Train platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the cloud scenes from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time matchups between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, and assemble merged datasets for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow. A scientist visually authors the graph of operation in the VizFlow GUI, or uses a
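
    One step named above, finding space/time matchups between instrument observations, can be illustrated with a simple temporal pairing; in the real workflow this would be a SciFlo operator handling both space and time. The columns, timestamps, and 3-minute tolerance below are made-up values for illustration only.

        import pandas as pd

        airs = pd.DataFrame({
            "time": pd.to_datetime(["2008-07-01 00:00", "2008-07-01 00:06",
                                    "2008-07-01 00:12"]),
            "airs_tsurf": [294.1, 293.8, 295.0],
        })
        modis = pd.DataFrame({
            "time": pd.to_datetime(["2008-07-01 00:01", "2008-07-01 00:13"]),
            "modis_tsurf": [294.4, 295.2],
        })

        # Pair each AIRS record with the nearest MODIS record within a tolerance.
        matched = pd.merge_asof(airs.sort_values("time"), modis.sort_values("time"),
                                on="time", direction="nearest",
                                tolerance=pd.Timedelta("3min"))
        print(matched)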