NASA Technical Reports Server (NTRS)
Todd, Nancy S.
2016-01-01
The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB (Petrological Database of the Ocean Floor). In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.
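A minimal sketch of the crosswalk-and-register pattern this abstract describes, in Python. The Lunar Curation Database field names, the SESAR endpoint, the namespace code, and the payload format are illustrative assumptions, not the actual NASA implementation.

```python
# Hypothetical sketch: crosswalk a Lunar Curation Database record to
# IGSN-style metadata and register it over HTTP. All field names, the
# namespace code, and the endpoint URL are assumptions.
import requests

SESAR_ENDPOINT = "https://app.geosamples.org/webservices/upload.php"  # assumed

def curation_to_igsn_metadata(rec: dict) -> dict:
    """Map curation-database fields to registration metadata."""
    return {
        "user_code": "NASA1",                   # assumed namespace code
        "name": rec["sample_number"],           # e.g. "70017,231"
        "sample_type": rec["sample_type"],      # e.g. "Rock"
        "material": rec["material"],            # e.g. "Basalt"
        "parent_igsn": rec.get("parent_igsn"),  # links subdivided samples
    }

def register(rec: dict) -> str:
    meta = {k: v for k, v in curation_to_igsn_metadata(rec).items() if v}
    resp = requests.post(SESAR_ENDPOINT, data=meta, timeout=30)
    resp.raise_for_status()
    return resp.text   # registration receipt containing the minted IGSN
```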
Utilizing the International GeoSample Number Concept during ICDP Expedition COSC
NASA Astrophysics Data System (ADS)
Conze, Ronald; Lorenz, Henning; Ulbricht, Damian; Gorgas, Thomas; Elger, Kirsten
2016-04-01
The concept of the International GeoSample Number (IGSN) was introduced to uniquely identify and register geo-related sample material and to make it retrievable via electronic media (e.g., SESAR - http://www.geosamples.org/igsnabout). The general aim of the IGSN concept is to improve access to stored sample material worldwide, to enable exact identification of a sample, its origin, and its provenance, and to support exact and complete citation of samples throughout the literature. The ICDP expedition COSC (Collisional Orogeny in the Scandinavian Caledonides, http://cosc.icdp-online.org) is the first in ICDP's history to assign and register IGSNs during an ongoing drilling campaign. ICDP drilling expeditions commonly use the Drilling Information System DIS (http://doi.org/10.2204/iodp.sd.4.07.2007) for the inventory of recovered sample material. During COSC, IGSNs were assigned to every drill hole, core run, core section, and sample taken from core material. The original IGSN specification has been extended to achieve the required uniqueness of IGSNs with our offline procedure. The ICDP name space indicator and the Expedition ID (5054) form an extended prefix (ICDP5054). For every type of sample material, an encoded sequence of characters follows. This sequence is derived from the DIS naming convention, which is unique from the outset, so every ICDP expedition has an unlimited name space for IGSN assignments. This direct derivation of IGSNs from the DIS database context ensures the distinct parent-child hierarchy of the IGSNs among each other. In the case of COSC, this inventory of all drill cores was kept routinely in the ExpeditionDIS during field work and the subsequent sampling party. After completion of the field campaign, all sample material was transferred to the "Nationales Bohrkernlager" in Berlin-Spandau, Germany, and the corresponding data were imported into the CurationDIS used at that core storage facility. The CurationDIS assigns IGSNs to samples newly taken in the repository in the same fashion as in the field, so the parent-child linkage of the IGSNs is ensured consistently throughout the entire sampling process. The only difference between ExpeditionDIS and CurationDIS sample curation is the name space used as part of the corresponding ID string: ICDP and BGRB, respectively. To prepare the IGSN registry, a set of metadata is generated for every assigned IGSN using the DIS and exported into one common XML file. The XML file is based on the SESAR schema and a proposal of IGSN e.V. (http://schema.igsn.org); this schema has recently been extended for drilling data to capture additional information for future retrieval options. The two allocating agents, GFZ Potsdam and PANGAEA, are currently involved in the registration of IGSNs for the COSC drilling campaigns. An example IGSN for the COSC-1 drill hole A (5054_1_A) is "ICDP5054EEW1001", which can be resolved using the URL http://hdl.handle.net/10273/ICDP5054EEW1001. The landing page for this hole displays the complete COSC core material graphically as a hierarchical tree entitled "Sample Family". An example of an IGSN citation associated with a COSC sample set is featured in an EGU 2016 poster presentation by Ulrich Harms, Johannes Hierold et al. (EGU2016-8646).
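The identifier scheme just described is simple enough to sketch in a few lines of Python. The ICDP5054 prefix, the example IGSN, and the handle resolver URL come from the abstract; the helper functions are only an illustration, and real DIS codes are generated by the DIS software rather than hand-assembled.

```python
# Sketch of COSC-style IGSN construction and resolution. The prefix
# convention (name space + expedition ID) and the example identifier
# are taken from the abstract; the rest is illustrative.
HANDLE_RESOLVER = "http://hdl.handle.net/10273/"   # IGSN handle prefix

def make_igsn(name_space: str, expedition_id: str, dis_code: str) -> str:
    """Extended prefix (e.g. ICDP5054) followed by the DIS-derived code."""
    return f"{name_space}{expedition_id}{dis_code}"

def landing_page(igsn: str) -> str:
    """URL at which a registered IGSN resolves."""
    return HANDLE_RESOLVER + igsn

igsn = make_igsn("ICDP", "5054", "EEW1001")  # COSC-1 drill hole A (5054_1_A)
print(igsn)            # ICDP5054EEW1001
print(landing_page(igsn))  # http://hdl.handle.net/10273/ICDP5054EEW1001
```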
IGSN e.V.: Registration and Identification Services for Physical Samples in the Digital Universe
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Klump, J.; Arko, R. A.; Bristol, S.; Buczkowski, B.; Chan, C.; Chan, S.; Conze, R.; Cox, S. J.; Habermann, T.; Hangsterfer, A.; Hsu, L.; Milan, A.; Miller, S. P.; Noren, A. J.; Richard, S. M.; Valentine, D. W.; Whitenack, T.; Wyborn, L. A.; Zaslavsky, I.
2011-12-01
The International Geo Sample Number (IGSN) is a unique identifier for samples and specimens collected from our natural environment. It was developed by the System for Earth Sample Registration (SESAR) to overcome the problem of ambiguous naming of samples that has limited the ability to share, link, and integrate data for samples across Geoscience data systems. Over the past 5 years, SESAR has made substantial progress in implementing the IGSN for sample and data management, working with Geoscience researchers, Geoinformatics specialists, and sample curators to establish metadata requirements, registration procedures, and best practices for the use of the IGSN. The IGSN is now recognized as the primary solution for sample identification and registration, and is supported by a growing user community of investigators, repositories, science programs, and data systems. In order to advance broad disciplinary and international implementation of the IGSN, SESAR organized a meeting of international leaders in Geoscience informatics in 2011 to develop a consensus strategy for the long-term operations of the registry, with approaches for sustainable operation, organizational structure, governance, and funding. The group endorsed an internationally unified approach for registration and discovery of physical specimens in the Geosciences, and refined the existing SESAR architecture into a modular and scalable design, separating the IGSN Registry from a central Sample Metadata Clearinghouse (SESAR), and introducing 'Local Registration Agents' that provide registration services to specific disciplinary or organizational communities, with tools for metadata submission, management, and archiving. Development and implementation of the new IGSN architecture is underway with funding provided by the US NSF Office of International Science and Engineering. A formal governance structure is being established for the IGSN model, consisting of (a) an international not-for-profit organization, the IGSN e.V. (e.V. = 'Eingetragener Verein', the legal status of a registered voluntary association in Germany), that defines the IGSN scope and syntax and maintains the IGSN Handle system, and (b) a Science Advisory Board that guides policies, technology, and best practices of the SESAR Sample Metadata Clearinghouse and Local Registration Agents. The IGSN e.V. is being incorporated in Germany at the GFZ Potsdam; a founding event is planned for the AGU Fall Meeting.
Sample Identification at Scale - Implementing IGSN in a Research Agency
NASA Astrophysics Data System (ADS)
Klump, J. F.; Golodoniuc, P.; Wyborn, L. A.; Devaraju, A.; Fraser, R.
2015-12-01
Earth sciences are largely observational and rely on natural samples, the types of which vary significantly between science disciplines. Sharing and referencing of samples in the scientific literature and across the Web requires globally unique identifiers, which are essential for disambiguation. This practice is very common in other fields, e.g., ISBNs in publishing and DOIs in the scientific literature. In the Earth sciences, however, samples are still often identified in an ad hoc manner without unique identifiers. The International Geo Sample Number (IGSN) system provides a persistent, globally unique label for identifying environmental samples. As an IGSN allocating agent, CSIRO implements the IGSN registration service at the organisational scale, with contributions from multiple research groups. The Capricorn Distal Footprints project is one of the pioneers and early adopters of the technology in Australia. For this project, IGSN provides a mechanism for identification of new and legacy samples, as well as derived sub-samples. It will ensure transparency and reproducibility in geochemical sampling campaigns that involve a diversity of sampling methods, so that diverse geochemical and isotopic results can be linked back to the parent sample, particularly where multiple children of that sample have also been analysed. The IGSN integration for this project is still at an early stage and requires further consultation on the governance mechanisms, such as naming conventions and service interfaces, needed for efficient collaboration between CSIRO and its project partners. In this work, we present the results of the initial implementation of IGSN in the context of the Capricorn Distal Footprints project. This study has so far demonstrated the effectiveness of the proposed approach, while maintaining the flexibility to adapt to various media types, which is critical in the context of a multi-disciplinary project.
The IGSN Experience: Successes and Challenges of Implementing Persistent Identifiers for Samples
NASA Astrophysics Data System (ADS)
Lehnert, Kerstin; Arko, Robert
2016-04-01
Physical samples collected and studied in the Earth sciences represent both a research resource and a research product. As such, they need to be properly managed, curated, documented, and cited to ensure re-usability and utility for future science, reproducibility of the data generated by their study, and credit for the funding agencies and researchers who invested substantial resources and intellectual effort into their collection and curation. Use of persistent, unique identifiers and deposition of metadata in a persistent registry are therefore as important for physical samples as they are for digital data. The International Geo Sample Number (IGSN) is such a persistent, globally unique identifier. Its adoption by individual investigators, repository curators, publishers, and data managers is growing rapidly worldwide. This presentation will provide an analysis of the development and implementation path of the IGSN and relevant insights and experiences gained along the way. Development of the IGSN started in 2004 as part of a US NSF-funded project to establish a registry for sample metadata, the System for Earth Sample Registration (SESAR). The initial system provided a centralized solution for users to submit information about their samples and obtain IGSNs and bar codes. Challenges encountered during this initial phase related to defining the scope of the registry, the granularity of registered objects, the responsibilities of relevant actors and their workflows, and designing the registry's metadata schema, its user interfaces, and the identifier itself, including its syntax. The most challenging task, though, was to make the IGSN an integral part of personal and institutional sample management, digital management of sample-based data, and data publication on a global scale. Besides convincing individual researchers, curators, editors and publishers, as well as data managers in US and non-US academia and state and federal agencies, the PIs of the SESAR project needed to identify ways to organize, operate, and govern the global registry in the short and long term. A major breakthrough was achieved at an international workshop in February 2011, at which participants designed a new distributed and scalable architecture for the IGSN, with international governance by a membership organization modeled after the DataCite consortium. The founding of the international governing body and implementation organization for the IGSN, the IGSN e.V., took place at the AGU Fall Meeting 2011. Recent progress came at a workshop in September 2015, where stakeholders from both geoscience and life science disciplines drafted a standard IGSN metadata schema for describing samples, to complement the existing schema for registering samples. Consensus was achieved on an essential set of properties to describe a sample's origin and classification, creating a "birth certificate" for the sample. Further consensus was achieved in clarifying that an IGSN may represent exactly one physical sample, sampling feature, or collection of samples, and in aligning the IGSN schema with the existing Observations Data Model (ODM2). The resulting schema was published online at schema.igsn.org and presented at the AGU Fall Meeting 2015.
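As a rough illustration of what such a "birth certificate" record might contain, the sketch below builds a minimal registration document in Python. The element names are invented for illustration; the authoritative schema is the one published at schema.igsn.org.

```python
# Minimal "birth certificate" for a sample: origin and classification
# properties serialized as XML. Element names here are illustrative
# assumptions, not the published schema.
import xml.etree.ElementTree as ET

def birth_certificate(igsn: str, fields: dict) -> str:
    root = ET.Element("sample")
    ET.SubElement(root, "igsn").text = igsn
    for name in ("sampleType", "material", "collectionMethod",
                 "latitude", "longitude", "collector", "collectionDate"):
        if name in fields:
            ET.SubElement(root, name).text = str(fields[name])
    return ET.tostring(root, encoding="unicode")

print(birth_certificate("IEXXX000001", {
    "sampleType": "Core Section",
    "material": "Rock",
    "latitude": 63.4,
    "longitude": 13.7,
}))
```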
Building an Internet of Samples: The Australian Contribution
NASA Astrophysics Data System (ADS)
Wyborn, Lesley; Klump, Jens; Bastrakova, Irina; Devaraju, Anusuriya; McInnes, Brent; Cox, Simon; Karssies, Linda; Martin, Julia; Ross, Shawn; Morrissey, John; Fraser, Ryan
2017-04-01
Physical samples are often the ground truth for research reported in the scientific literature across multiple domains. They are collected by many different entities (individual researchers, laboratories, government agencies, mining companies, citizens, museums, etc.). Samples must be curated over the long term, both so that their existence is known and so that any data derived from them through laboratory and field tests can be linked back to the physical samples. For example, unique identifiers that link back to ground-truth data on the original sample help calibrate large volumes of remotely sensed data. Access to catalogues of reliably identified samples from several collections promotes collaboration across all Earth Science disciplines. It also increases the cost effectiveness of research by reducing the need to re-collect samples in the field. The assignment of web identifiers to the digital representations of these physical objects allows us to link to data, literature, investigators, and institutions, thus creating an "Internet of Samples". An Australian implementation of the "Internet of Samples" uses the IGSN (International Geo Sample Number, http://igsn.github.io) to identify samples in a globally unique and persistent way. IGSN was developed in the solid earth science community and is recommended for sample identification by the Coalition for Publishing Data in the Earth and Space Sciences (COPDESS). IGSN is interoperable with other persistent identifier systems such as DataCite. Furthermore, the basic IGSN description metadata schema is compatible with existing schemas such as OGC Observations and Measurements (O&M) and the DataCite Metadata Schema, which makes crosswalks to other metadata schemas easy. IGSN metadata is disseminated through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), allowing it to be aggregated in other applications such as portals (e.g., the Australian IGSN catalogue, http://igsn2.csiro.au). The metadata is available in more than one format. The software for the IGSN web services is based on components developed for DataCite and adapted to the specific requirements of IGSN. This cooperation in open source development ensures sustainable implementation and faster turnaround times for updates. IGSN, in particular in its Australian implementation, is characterised by a federated approach to system architecture and organisational governance, giving it the necessary flexibility to adapt to particular local practices within multiple domains whilst maintaining an overarching international standard. The three current IGSN allocating agents in Australia, Geoscience Australia, CSIRO, and Curtin University, represent different sectors. Through funding from the Australian Research Data Services Program they have combined to develop a common web portal that allows discovery of physical samples and sample collections at a national level. International governance then ensures we can link to an international community while acting locally to keep the services relevant to the needs of Australian researchers. This flexibility aids the integration of new disciplines into a global physical-samples information network.
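The OAI-PMH dissemination mentioned above follows the standard protocol, so a harvester is straightforward to sketch. The verbs and resumption-token flow below are standard OAI-PMH; the endpoint URL and metadataPrefix are assumptions and would need to come from the actual allocating agent's documentation.

```python
# Sketch of harvesting IGSN description metadata over OAI-PMH.
# The protocol mechanics (ListRecords, resumptionToken) are standard;
# the endpoint and metadataPrefix are assumed.
import requests
import xml.etree.ElementTree as ET

OAI = "https://igsn.example.org/oai"   # assumed allocating-agent endpoint
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def harvest(metadata_prefix: str = "igsn"):
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        resp = requests.get(OAI, params=params, timeout=30)
        root = ET.fromstring(resp.content)
        for rec in root.iterfind(".//oai:record", NS):
            yield rec
        token = root.find(".//oai:resumptionToken", NS)
        if token is None or not (token.text or "").strip():
            break   # no more pages
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

for record in harvest():
    header = record.find("oai:header/oai:identifier", NS)
    print(header.text if header is not None else "(no identifier)")
```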
NASA Astrophysics Data System (ADS)
Bastrakova, I.; Klump, J. F.; McInnes, B.; Wyborn, L. A.; Brown, A.
2015-12-01
The International Geo Sample Number (IGSN) provides a globally unique identifier for physical samples used to generate analytical data. This unique identifier makes it possible to link each physical sample to any analytical data generated from it, as well as to any publications based on those data. The IGSN is particularly important for geochemical and geochronological data, where numerous analytical techniques can be applied at multiple analytical facilities, not only to the parent rock sample itself but also to derived sample splits and mineral separates. Australia now has three agencies implementing IGSN: Geoscience Australia, CSIRO, and Curtin University. The three have combined into a single project, funded by the Australian Research Data Services program, to better coordinate the implementation of IGSN in Australia, in particular how these agencies allocate IGSN identifiers. The project will register samples from pilot applications in each agency, including the CSIRO National Collection of Mineral Spectra database, the Geoscience Australia sample collection, and the Digital Mineral Library of the John de Laeter Centre for Isotope Research at Curtin University. These local agency catalogues will then be aggregated into an Australian portal, which will ultimately be expanded to cover all geoscience specimens. The development of this portal will also involve developing a common core metadata schema for the description of Australian geoscience specimens, as well as formulating agreed governance models for registering Australian samples. These developments aim to enable a common approach across Australian academic institutions, research organisations, and government agencies for the unique identification of geoscience specimens and of any analytical data and/or publications derived from them. The emerging pattern of governance and technical collaboration established in Australia may also serve as a blueprint for similar collaborations internationally.
NASA Astrophysics Data System (ADS)
Devaraju, Anusuriya; Klump, Jens; Tey, Victor; Fraser, Ryan
2016-04-01
Physical samples such as minerals, soil, rocks, water, air, and plants are important observational units for understanding the complexity of our environment and its resources. They are usually collected and curated by different entities, e.g., individual researchers, laboratories, state agencies, or museums. Persistent identifiers may facilitate access to physical samples that are scattered across various repositories. They are essential for locating samples unambiguously and for sharing their associated metadata and data systematically across the Web. The International Geo Sample Number (IGSN) is a persistent, globally unique label for identifying physical samples. The IGSNs of physical samples are registered by end users (e.g., individual researchers, data centers, and projects) through allocating agents, the institutions acting on behalf of the implementing organization (IGSN e.V.). The Commonwealth Scientific and Industrial Research Organisation (CSIRO) is one of the allocating agents in Australia. To implement IGSN in our organisation, we developed a RESTful service and a metadata model. The web service enables a client to register sub-namespaces and multiple samples, and to retrieve sample metadata programmatically. The metadata model provides a framework in which different types of samples may be represented. It is generic and extensible, and may therefore be applied in the context of multi-disciplinary projects. The metadata model has been implemented as an XML schema and a PostgreSQL database: the schema is used to handle sample registration requests and to disseminate their metadata, whereas the relational database is used to preserve the metadata records. The metadata schema leverages existing controlled vocabularies to minimize the scope for error and incorporates some simplifications to reduce the complexity of the schema implementation. The solutions developed have been applied and tested in the context of two sample repositories in CSIRO, the Capricorn Distal Footprints project and the Rock Store.
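The rough shape of a client for such a RESTful registration service is sketched below. The base URL, paths, payloads, and authentication scheme are all invented for illustration; the actual CSIRO service interface may look quite different.

```python
# Illustrative client for a RESTful IGSN registration service of the
# kind described above. Endpoint paths and payload shapes are assumed.
import requests

BASE = "https://igsn.example.csiro.au/api"     # assumed base URL

session = requests.Session()
session.auth = ("username", "password")        # assumed HTTP basic auth

def register_subnamespace(code: str) -> None:
    """Reserve a sub-namespace under the agent's namespace."""
    r = session.post(f"{BASE}/subnamespaces", json={"code": code}, timeout=30)
    r.raise_for_status()

def register_samples(subns: str, samples: list[dict]) -> list[str]:
    """Register several samples in one call; returns the minted IGSNs."""
    r = session.post(f"{BASE}/subnamespaces/{subns}/samples",
                     json={"samples": samples}, timeout=30)
    r.raise_for_status()
    return [s["igsn"] for s in r.json()["samples"]]

def sample_metadata(igsn: str) -> dict:
    """Retrieve the stored metadata record for one IGSN."""
    r = session.get(f"{BASE}/samples/{igsn}", timeout=30)
    r.raise_for_status()
    return r.json()
```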
Beyond 10 Years of Evolving the IGSN Architecture: What's Next?
NASA Astrophysics Data System (ADS)
Lehnert, K.; Arko, R. A.
2016-12-01
The IGSN was developed as part of a US NSF-funded project, started in 2004, to establish a registry for sample metadata, the System for Earth Sample Registration (SESAR). The initial version of the system provided a centralized solution for users to submit information about their samples and obtain IGSNs and bar codes. A new distributed architecture for the IGSN, designed at a workshop in 2011, aimed to advance the global implementation of the IGSN. The workshop led to the founding of an international non-profit organization, the IGSN e.V., which adopted the governance model of the DataCite consortium as a non-profit membership organization, as well as its architecture of a central registry with a network of distributed Allocating Agents that provide registration services to users. Recent progress came at a workshop in 2015, where stakeholders from both geoscience and life science disciplines drafted a standard IGSN metadata schema for describing samples with an essential set of properties about a sample's origin and classification, creating a "birth certificate" for the sample. Consensus was also reached that the IGSN should be used to identify sampling features and collections of samples. The IGSN e.V. global network has grown steadily, with members now on four continents and five Allocating Agents operational in the US, Australia, and Europe. A Central Catalog has been established at the IGSN Management Office that harvests "birth certificate" metadata records from Allocating Agents via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and publishes them as a Linked Open Data graph using the Resource Description Framework (RDF), queryable via SPARQL, for reuse by Semantic Web clients. Next developments will include a web-based validation service that allows journal editors to check the validity of IGSNs and their compliance with metadata requirements, and the use of community-recommended vocabularies for specific disciplines.
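To make the Linked Open Data publication concrete, here is what a query against such a catalog's SPARQL endpoint could look like. The endpoint URL and the vocabulary (predicate names) are assumptions for illustration; the actual catalog may use different terms.

```python
# Illustrative query against a SPARQL endpoint publishing the harvested
# sample graph. Endpoint URL and predicate vocabulary are assumed.
import requests

ENDPOINT = "https://catalog.igsn.example.org/sparql"   # assumed

QUERY = """
PREFIX ex: <http://example.org/igsn#>
SELECT ?igsn ?material WHERE {
  ?sample ex:igsn ?igsn ;
          ex:material ?material .
} LIMIT 10
"""

resp = requests.get(ENDPOINT, params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json"},
                    timeout=30)
for row in resp.json()["results"]["bindings"]:
    print(row["igsn"]["value"], row["material"]["value"])
```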
The Geoscience Internet of Things
NASA Astrophysics Data System (ADS)
Lehnert, K.; Klump, J.
2012-04-01
"Internet of Things" is a term that refers to "uniquely identifiable objects (things) and their virtual representations in an Internet-like structure" (Wikipedia). We use the term here to describe new and innovative ways to integrate physical samples in the Earth Sciences into the emerging digital infrastructures that are being developed to support research and education in the Geosciences. Many Earth Science data are acquired on solid earth samples through observations and experiments conducted in the field or in the lab. The application and long-term utility of sample-based data for science are critically dependent on (a) the availability of information (metadata) about the samples, such as the geographical location where the sample was collected, the time of sampling, and the sampling method; (b) links between the different data types available for individual samples that are dispersed in the literature and in digital data repositories; and (c) access to the samples themselves. None of these requirements could be achieved in the past due to incomplete documentation of samples in publications, the use of ambiguous sample names, and the lack of a central catalog that allows researchers to find a sample's archiving location. New internet-based capabilities for the registration and unique identification of samples, developed over the past few years, make it possible to overcome these problems. Registration and unique identification services are provided by the System for Earth Sample Registration SESAR (www.geosamples.org). SESAR developed the International Geo Sample Number, or IGSN, as a unique identifier for samples and specimens collected from our natural environment. Since December 2011, the IGSN is governed by an international organization, the IGSN e.V. (www.igsn.org), which endorses and promotes an internationally unified approach for registration and discovery of physical specimens in the Geoscience community and is establishing a new modular and scalable architecture for the IGSN to advance global implementation. Use of the IGSN will, for the first time, make it possible to establish links between samples (or their digital representations), the data acquired on these samples, and the publications that report these data. Samples can be linked to a dataset by including IGSNs in the metadata record of the dataset's DOI when the dataset is registered with the DOI system for unique identification. Links between datasets and publications, based on dataset DOIs, have already been implemented between some Geoscience journals and data centers that are publication agents in the DataCite consortium (www.datacite.org). Links between IGSNs, dataset DOIs, and publication DOIs will in the future allow researchers to find and access, with a single query and without ambiguity, all data acquired on a specific sample across the entire literature.
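The linking pattern described here, sample IGSNs embedded in a dataset's DOI metadata, can be sketched as below. The record mirrors the general shape of a DataCite-style metadata document, but the exact field names and the chosen relationType are assumptions to be checked against the current DataCite schema.

```python
# Sketch of embedding sample IGSNs in a dataset's DOI metadata record.
# Field names and relationType are illustrative assumptions.
def dataset_doi_metadata(doi: str, title: str, igsns: list[str]) -> dict:
    return {
        "doi": doi,
        "titles": [{"title": title}],
        "relatedIdentifiers": [
            {
                "relatedIdentifier": igsn,
                "relatedIdentifierType": "IGSN",   # links dataset -> sample
                "relationType": "IsDerivedFrom",
            }
            for igsn in igsns
        ],
    }

record = dataset_doi_metadata("10.1234/example", "Dredge geochemistry",
                              ["IEABC0001", "IEABC0002"])
print(record["relatedIdentifiers"][0])
```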
The Digital Sample: Metadata, Unique Identification, and Links to Data and Publications
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Vinayagamoorthy, S.; Djapic, B.; Klump, J.
2006-12-01
A significant part of digital data in the Geosciences refers to physical samples of Earth materials, from igneous rocks to sediment cores to water or gas samples. The application and long-term utility of these sample-based data in research are critically dependent on (a) the availability of information (metadata) about the samples, such as geographical location, time of sampling, or sampling method; (b) links between the different data types available for individual samples that are dispersed in the literature and in digital data repositories; and (c) access to the samples themselves. Major obstacles to achieving this include incomplete documentation of samples in publications, the use of ambiguous sample names, and the lack of a central catalog for finding a sample's archiving location. The International Geo Sample Number (IGSN), managed by the System for Earth Sample Registration (SESAR), provides solutions for these problems. The IGSN is a unique persistent identifier for samples and other GeoObjects that can be obtained by submitting sample metadata to SESAR (www.geosamples.org). If data in a publication are referenced to an IGSN (rather than an ambiguous sample name), sample metadata can readily be extracted from the SESAR database, which is evolving into a Global Sample Catalog that also makes it possible to locate the owner or curator of a sample. Use of the IGSN in digital data systems allows linkages to be built between distributed data. SESAR is contributing to the development of sample metadata standards. SESAR will integrate the IGSN into persistent, resolvable identifiers based on the handle.net service to advance direct linkages between the digital representation of samples in SESAR (sample profiles) and their related data in the literature and in web-accessible digital data repositories. Technologies outlined by Klump et al. (this session), such as the automatic creation of ontologies by text mining applications, will be explored for harvesting identifiers of publications and datasets that contain information about a specific sample, in order to establish comprehensive data profiles for samples.
Metadata, Identifiers, and Physical Samples
NASA Astrophysics Data System (ADS)
Arctur, D. K.; Lenhardt, W. C.; Hills, D. J.; Jenkyns, R.; Stroker, K. J.; Todd, N. S.; Dassie, E. P.; Bowring, J. F.
2016-12-01
Physical samples are integral to much of the research conducted by geoscientists. The samples used in this research are often obtained at significant cost and represent an important investment for future research. However, making information about samples - whether considered data or metadata - available for researchers to enable discovery is difficult: key elements such as classification, location, sample type, sampling method, repository information, subsample distribution, and instrumentation are hard to characterize in common ways because they differ from one domain to the next. Unifying these elements or developing metadata crosswalks is needed. The iSamples (Internet of Samples) NSF-funded Research Coordination Network (RCN) is investigating ways to develop these types of interoperability and crosswalks. Within the iSamples RCN, one of its working groups, WG1, has focused on the metadata related to physical samples. This includes identifying existing metadata standards and systems and how they might interoperate with the International Geo Sample Number (IGSN) schema (schema.igsn.org), in order to help inform leading practices for metadata. For example, we are examining lifecycle metadata beyond the IGSN "birth certificate." As a first step, this working group is developing a list of relevant standards and comparing their various attributes. In addition, the working group is looking toward technical solutions to facilitate developing a linked set of registries to build the web of samples. Finally, the group is developing a comparison of sample identifiers and locators. This paper will provide an overview and comparison of the standards identified thus far, as well as an update on the technical solutions examined for integration. We will discuss how various sample identifiers might work in complementary fashion with the IGSN to more completely describe samples, facilitate retrieval of contextual information, and provide access to research on related samples. Finally, we welcome suggestions and community input to move unique identifiers for physical samples forward.
Opportunities and Challenges of Linking Scientific Core Samples to the Geoscience Data Ecosystem
NASA Astrophysics Data System (ADS)
Noren, A. J.
2016-12-01
Core samples generated in scientific drilling and coring are critical for the advancement of the Earth Sciences. The scientific themes enabled by analysis of these samples are diverse, including plate tectonics, ocean circulation, Earth-life system interactions (paleoclimate, paleobiology, paleoanthropology), Critical Zone processes, geothermal systems, the deep biosphere, and many others, and substantial resources are invested in their collection and analysis. Linking core samples to researchers, datasets, publications, and funding agencies through the registration of globally unique identifiers such as International Geo Sample Numbers (IGSNs) offers great potential for advancing several frontiers: maximizing sample discoverability, access, reuse, and return on investment; providing a means of credit to researchers; and documenting project outputs to funding agencies. Thousands of kilometers of core samples and billions of derivative subsamples have been generated through thousands of investigators' projects, yet the vast majority of these samples are curated at only a small number of facilities. These numbers, combined with the substantial similarity in sample types, make core samples a compelling target for IGSN implementation. However, differences between core sample communities and other geoscience disciplines continue to create barriers to implementation. Core samples involve parent-child relationships spanning 8 or more generations, an exponential increase in sample numbers between levels of the hierarchy, concepts related to depth/position within the sample, requirements for associating data derived from core scanning and lithologic description with data derived from subsample analysis, and publications based on tens of thousands of co-registered scan data points and thousands of analyses of subsamples. These characteristics require specialized resources for accurate and consistent assignment of IGSNs, and a community of practice to establish norms, workflows, and infrastructure to support implementation.
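As a toy illustration of the multi-generation parent-child structure and depth tracking described above, the sketch below models a hole -> core -> section -> split chain in Python. It is purely illustrative; real curation systems track far more state, and IGSNs are minted by a registry rather than by hand.

```python
# Toy model of a core-sample hierarchy with depth intervals and IGSNs.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Sample:
    igsn: str
    kind: str                        # e.g. "hole", "core", "section", "split"
    top_m: float                     # depth interval of this sample
    bottom_m: float
    parent: Optional["Sample"] = None
    children: list["Sample"] = field(default_factory=list)

    def subsample(self, igsn: str, kind: str,
                  top_m: float, bottom_m: float) -> "Sample":
        child = Sample(igsn, kind, top_m, bottom_m, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """IGSNs from this sample up to the root: the parent-child chain."""
        node, chain = self, []
        while node:
            chain.append(node.igsn)
            node = node.parent
        return chain

hole = Sample("IEXXX00001", "hole", 0.0, 120.0)
core = hole.subsample("IEXXX00002", "core", 10.0, 13.0)
split = core.subsample("IEXXX00003", "split", 10.4, 10.6)
print(split.lineage())   # ['IEXXX00003', 'IEXXX00002', 'IEXXX00001']
```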
What is Next? Linking all Samples of Planet Earth.
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Lehnert, K.; Klump, J. F.; Arko, R. A.; Cox, S. J. D.; Devaraju, A.; Elger, K.; Murphy, F.; Fleischer, D.
2016-12-01
The process of sampling, observing, and analyzing physical samples is not unique to the geosciences. Physical sampling (taking specimens) is a fundamental strategy in many natural sciences, typically to support ex-situ observations in laboratories with the goal of characterizing real-world entities or populations. Observations and measurements are made on individual specimens and their derived samples in various ways, with results reported in research publications. Research on an individual sample is often published in numerous articles, based on multiple, potentially unrelated research programs conducted over many years. Even high-volume Earth observation datasets are proxies of real-world phenomena and require calibration by measurements made on position-located, well-described physical samples. Unique, persistent, web-compatible identifiers for physical objects and related sampling features are required to ensure their unambiguous citation and their connection to related datasets on the web. Identifier systems have been established within specific domains (e.g., bio, geo, hydro) or different sectors (e.g., museums, government agencies, universities), including the International Geo Sample Number (IGSN) in the geosciences, which has been used for rock, fossil, mineral, soil, regolith, fluid, plant, and synthetic materials. IGSNs are issued through a governance system that ensures they are globally unique. Each IGSN resolves to a digital representation of the physical object via the Handle.net global resolver system, the same system used to resolve DOIs. To enable the unique identification of all samples on Planet Earth and of data derived from them, the next step is to ensure IGSNs can either be integrated with comparable identifier systems in other domains and sectors, or introduced into domains that do not have a viable system. A registry of persistent identifier systems for physical samples would allow users to choose which system best suits their needs. Such a registry may also facilitate unifying best practice across these systems, enabling consistent referencing of physical samples and of the methods used to link digital data to its sources. IGSNs could be extended into other domains, but additional methodologies of sample collection, curation, and processing may need to be considered.
NASA Astrophysics Data System (ADS)
Stall, S.
2016-12-01
The story of a sample starts with a proposal, a data management plan, and funded research. The sample is created, given a unique identifier (IGSN), and properly cared for during its journey to an appropriate storage location. Through its metadata and publication information, the sample can become well known and be shared with other researchers. Ultimately, a valuable sample can tell its entire story through its IGSN, associated ORCIDs, associated publication DOIs, and the DOIs of data generated from sample analysis. This journey, or workflow, is in many ways still manual. Tools exist to generate IGSNs for the sample and its subsamples. Publishers are committed to making IGSNs machine readable in their journals, but the connection back to the IGSN management system, specifically the System for Earth Sample Registration (SESAR), is not fully complete. Through the encouragement of publishers, like AGU, and improved data management practices, such as those promoted by AGU's Data Management Assessment program, the complete lifecycle of a sample can and will be told through the journey it takes from creation, through documentation (metadata), analysis, subsampling, and publication, to sharing. Publishers and data facilities are using efforts like the Coalition for Publishing Data in the Earth and Space Sciences (COPDESS) to "implement and promote common policies and procedures for the publication and citation of data across Earth Science journals", including IGSNs. As our community improves its data management practices and publishers adopt and enforce machine-readable use of unique sample identifiers, the ability to tell the entire story of a sample is close at hand. Better Data Management results in Better Science.
Forensic Tools to Track and Connect Physical Samples to Related Data
NASA Astrophysics Data System (ADS)
Molineux, A.; Thompson, A. C.; Baumgardner, R. W.
2016-12-01
Identifiers, such as local sample numbers, are critical to successfully connecting physical samples and related data; however, identifiers must be globally unique. The International Geo Sample Number (IGSN), generated when a sample is registered in the System for Earth Sample Registration (SESAR), provides a globally unique alphanumeric code associated with basic metadata, related samples, and the sample's current physical storage location. When registered samples are published, users can link the figured samples to the basic metadata held at SESAR. The use cases we discuss include plant specimens from a Permian core, Holocene corals and derived powders, and thin sections with SEM stubs. Much of this material is now published. The plant taxonomic study from the core is available as a digital PDF, and samples can be linked directly from the figure captions to the SESAR record. The study of stable isotopes from the corals is not yet digitally available, but individual samples are accessible. Full data and media records for both studies are located in our database, where higher-quality images, field notes, and section diagrams may exist. Georeferences permit mapping in current and deep-time plate configurations. Several lessons emerged during this study. First, ensure that adequate and consistent details are registered with SESAR. Second, educate and encourage researchers to obtain IGSNs. Third, publish the archive numbers, assigned prior to publication, alongside the IGSN; this provides access to further data through an Integrated Publishing Toolkit (IPT), aggregators, or online repository databases, placing the initial sample in a much richer context for future studies. Fourth, encourage software developers to customize community software to extract data from a database and use it to register samples in bulk, which would improve workflow and provide a path for the registration of large legacy collections.
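The bulk-registration path suggested in the last point might look something like the sketch below: pull unregistered specimen records from a local collections database and write them into a batch template for upload to the registry. The table, column names, and template columns are invented; the actual SESAR batch format should be taken from its documentation.

```python
# Sketch: export unregistered specimens from a local collections
# database into a batch-registration CSV. Table, columns, and the
# template layout are illustrative assumptions.
import csv
import sqlite3

def export_batch(db_path: str, out_csv: str, user_code: str = "IEXXX"):
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT catalog_no, sample_type, material, latitude, longitude "
        "FROM specimens WHERE igsn IS NULL"   # only unregistered material
    )
    with open(out_csv, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["user_code", "name", "sample_type",
                    "material", "latitude", "longitude"])
        for cat, styp, mat, lat, lon in rows:
            w.writerow([user_code, cat, styp, mat, lat, lon])
    con.close()
```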
NASA Astrophysics Data System (ADS)
Hsu, L.; Lehnert, K. A.; Carbotte, S. M.; Arko, R. A.; Ferrini, V.; O'Hara, S. H.; Walker, J. D.
2012-12-01
The Integrated Earth Data Applications (IEDA) facility maintains multiple data systems with a wide range of solid earth data types from the marine, terrestrial, and polar environments. Examples of the different data types include syntheses of ultra-high resolution seafloor bathymetry collected on large collaborative cruises and analytical geochemistry measurements collected by single investigators in small, unique projects. These different data types have historically been channeled into separate, discipline-specific databases with search and retrieval tailored to the specific data type. However, a current major goal is to integrate data from different systems to allow interdisciplinary data discovery and scientific analysis. To increase discovery and access across these heterogeneous systems, IEDA employs several unique IDs, including sample IDs (International Geo Sample Number, IGSN), person IDs (GeoPass ID), funding award IDs (NSF Award Number), cruise IDs (from the Marine Geoscience Data System Expedition Metadata Catalog), dataset IDs (DOIs), and publication IDs (DOIs). These IDs allow linking of a sample registry (System for Earth SAmple Registration), data libraries and repositories (e.g., Geochemical Research Library, Marine Geoscience Data System), integrated synthesis databases (e.g., EarthChem Portal, PetDB), and investigator services (IEDA Data Compliance Tool). The linked systems allow efficient discovery of related data across different levels of granularity. In addition, IEDA data systems maintain links with several external data systems, including digital journal publishers. Links have been established between the EarthChem Portal and ScienceDirect through publication DOIs, returning sample-level objects and geochemical analyses for a particular publication. Linking IEDA-hosted data to digital publications with IGSNs at the sample level and with IEDA-allocated dataset DOIs is under development. As an example, an individual investigator could sign up for a GeoPass account ID, write a proposal to NSF, and create a data plan using the IEDA Data Management Plan Tool. Having received the grant, the investigator then collects rock samples from dredges on a scientific cruise and registers the samples with IGSNs. The investigator then performs analytical geochemistry on the samples and submits the full dataset to the Geochemical Resource Library for a dataset DOI. Finally, the investigator writes an article that is published in ScienceDirect. Knowing any of the following IDs - investigator GeoPass ID, NSF Award Number, cruise ID, sample IGSNs, dataset DOI, or publication DOI - a user would be able to navigate to all samples, datasets, and publications in IEDA and external systems. The use of persistent identifiers to link heterogeneous data systems in IEDA thus increases access, discovery, and proper citation of hard-earned investigator datasets.
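A toy sketch of the cross-system navigation described above: treat the identifiers as nodes in a link graph and walk outward from any starting ID. The graph contents below are invented for illustration.

```python
# Starting from any one identifier (GeoPass ID, award number, cruise ID,
# IGSN, dataset DOI, or publication DOI), walk the link graph to find
# everything connected. Example data is fictional.
from collections import deque

links = {   # undirected edges between identifiers of different kinds
    "GeoPass:jdoe": {"NSF:OCE-0000000"},
    "NSF:OCE-0000000": {"Cruise:RR1234", "GeoPass:jdoe"},
    "Cruise:RR1234": {"IGSN:IEJD00001", "NSF:OCE-0000000"},
    "IGSN:IEJD00001": {"Cruise:RR1234", "DOI:10.1234/dataset"},
    "DOI:10.1234/dataset": {"IGSN:IEJD00001", "DOI:10.5678/paper"},
    "DOI:10.5678/paper": {"DOI:10.1234/dataset"},
}

def connected(start: str) -> set[str]:
    """Breadth-first traversal over the identifier graph."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in links.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(connected("DOI:10.5678/paper")))  # reaches sample, cruise, award
```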
Rock and Core Repository Coming Digital
NASA Astrophysics Data System (ADS)
Maicher, Doris; Fleischer, Dirk; Czerniak, Andreas
2016-04-01
At a time when whole city centres are available at a mouse click to walk through virtually in 3D, physical reality sometimes gets neglected. That scientific sample collections have not been digitised down to the essence of molecules, isotopes, and electrons seems unbelievable to the rising generation of scientists. Like any other geological institute, the Helmholtz Centre for Ocean Research GEOMAR has accumulated thousands of specimens. The samples, collected mainly during marine expeditions, date back as far as 1964. Today GEOMAR houses a central geological sample collection of at least 17,000 m of sediment core and more than 4,500 boxes of hard rock samples and refined sample specimens. This repository, having been dormant, missed the onset of the interconnected digital age; physical samples without barcodes, QR codes, or RFID tags urgently need to be migrated and reconnected. In our use case, GEOMAR opted for the International Geo Sample Number (IGSN) as the persistent identifier. Consequently, the software CurationDIS by smartcube GmbH was selected as the central component of this project. The software is designed to handle the acquisition and administration of sample material and sample archiving in storage places, and it allows direct embedding of IGSNs. We plan to adopt the IGSN as a future asset, while for the initial inventory of our sample material, simple but unique QR codes act as "bridging identifiers" during the process. We are currently compiling an overview of the broad variety of sample types and their associated data. QR-coding of the boxes of rock samples and sediment cores is near completion, delineating their location in the repository and linking a particular sample to any information available about the object. Planning is in progress to streamline the flow from receiving new samples, to their curation, to sharing samples and information publicly. Additionally, interface planning for linkage to the GEOMAR databases OceanRep (publications) and OSIS (expeditions), as well as for external data retrieval, is in the pipeline. Looking ahead, implementing the IGSN while taking on board lessons learned by earlier adopters will enable us to comply with our institute's open science policy. It will also allow newly collected samples to be registered during ship expeditions, so that they receive their "birth certificate" right away in this ever-faster-moving scientific world.
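The QR "bridging identifier" step could be scripted roughly as below, labelling each box with a code that points at the local catalogue record until an IGSN landing page can take its place. This uses the third-party qrcode package (pip install qrcode[pil]); the URL pattern and identifier are assumptions.

```python
# Sketch: generate a QR label per storage box that encodes the local
# catalogue URL, acting as a "bridging identifier" until IGSN
# registration. URL pattern and ID scheme are assumed.
import qrcode

def label_box(local_id: str, out_png: str) -> None:
    url = f"https://repository.example.geomar.de/objects/{local_id}"  # assumed
    img = qrcode.make(url)   # encodes the catalogue URL as a QR image
    img.save(out_png)

label_box("CORE-1964-0001-A", "CORE-1964-0001-A.png")
```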
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Goldstein, S. L.; Vinayagamoorthy, S.; Lenhardt, W. C.
2005-12-01
Data on samples represent a primary foundation of Geoscience research across disciplines, ranging from the study of climate change, to biogeochemical cycles, to mantle and continental dynamics, and are key to our knowledge of the Earth's dynamical systems and evolution. Different data types are generated for individual samples by different research groups, published in different papers, and stored in different databases on a global scale. The utility of these data is critically dependent on their integration. Such integration can be achieved within a Geoscience Cyberinfrastructure, but requires unambiguous identification of samples. Currently, the naming of samples is arbitrary and inconsistent, which severely limits our ability to share, link, and integrate sample-based data. Major problems include name duplication and the changing of names as a sample is passed along to different investigators over many years. SESAR, the System for Earth Sample Registration (http://www.geosamples.org), addresses this problem by building a registry that generates and administers globally unique identifiers for Geoscience samples: the International Geo Sample Number (IGSN). Implementation of the IGSN in data publication and digital data management will dramatically advance interoperability among information systems for sample-based data, opening an extensive range of new opportunities for discovery and for interdisciplinary approaches in research. The IGSN will also make it easier for investigators to build on previously collected sample data as new measurements are made or new techniques are developed. With potentially broad application to all types of Geoscience samples, SESAR is global in scope. It is a web-based system that can be easily accessed by individual users through an interactive web interface and by distributed client systems via standard web services. Samples can be registered individually or in batches, and at various levels of granularity: from entire cores, dredges, or sample suites, to individual samples, to sub-samples such as splits and separates. Relationships between 'parent' and 'child' samples are tracked. The system generates bar codes that users can download as images for labeling purposes. SESAR released a beta version of the registry in April 2005 that allows users to register a limited range of sample types. Identifiers generated by the beta version will remain valid when SESAR moves into its operational stage. Since then, more than 3,700 samples have been registered in SESAR. Registration of samples at a central clearinghouse will automatically build a global catalog of Geoscience samples, a hugely valuable resource for the Geoscience community. It will allow more efficient planning of field and laboratory projects and facilitate the sharing of samples, which will help build more comprehensive data sets for individual samples. The SESAR catalog will also provide links to sample profiles on external systems that hold data about samples, enabling users to easily obtain complete information about samples.
NASA Astrophysics Data System (ADS)
Averett, A.; DeJarnett, B. B.
2016-12-01
The University of Texas Bureau of Economic Geology (BEG) serves as the geological survey for Texas and operates three geological sample repositories that house well over 2 million boxes of geological samples (cores and cuttings) and an abundance of geoscience data (geophysical logs, thin sections, geochemical analyses, etc.). The material is accessible and searchable online, and it is publicly available to the geological community for research and education. Patrons access information about our collection using our online core and log database (SQL format). BEG is currently undertaking a large project to: 1) improve the internal accuracy of metadata associated with the collection; 2) enhance the capabilities of the database for BEG curators and researchers as well as our external patrons; and 3) ensure easy and efficient navigation for patrons through our online portal. As part of this project, BEG is in the early stages of planning to export the metadata for its collection into SESAR (System for Earth Sample Registration) and to have IGSNs (International GeoSample Numbers) assigned to its samples. Education regarding the value of IGSNs and an external registry (SESAR) has been crucial to gaining management support for the project, because the concept and potential benefits of registering samples in a registry outside the institution were not well known prior to this project. Potential benefits such as increased discoverability, repository recognition in publications, and interoperability were presented. The project was well received by management, and BEG fully supports the effort to register our physical samples with SESAR. Since BEG is only in the initial phase of this project, any stumbling blocks, workflow issues, and successes or failures can only be predicted at this point, but by mid-December BEG expects to have several concrete issues to present in the session. Currently, our most pressing issue is establishing the most efficient workflow for exporting large amounts of metadata in a format that SESAR can easily ingest, and determining how best to accomplish this with the very few BEG staff assigned to the project.
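One plausible shape for that export workflow is to stream the SQL catalogue out in registry-sized chunks rather than as one giant file, as sketched below. The chunk size, table, and column names are illustrative assumptions.

```python
# Sketch: export a large sample catalogue in fixed-size chunks so each
# file stays within a registry's batch-upload limits. Table, columns,
# and the chunk cap are assumed.
import csv
import itertools
import sqlite3

CHUNK = 1000   # assumed per-batch row cap

def export_chunks(db_path: str, prefix: str) -> None:
    con = sqlite3.connect(db_path)
    cur = con.execute("SELECT box_id, sample_kind, county, well_name "
                      "FROM core_catalog ORDER BY box_id")
    for i in itertools.count():
        rows = cur.fetchmany(CHUNK)
        if not rows:
            break
        with open(f"{prefix}_{i:04d}.csv", "w", newline="") as f:
            w = csv.writer(f)
            w.writerow(["name", "sample_type", "county", "well_name"])
            w.writerows(rows)
    con.close()
```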
Data Management and Rescue at a State Geological Survey
NASA Astrophysics Data System (ADS)
Hills, D. J.; McIntyre-Redden, M. R.
2015-12-01
As new technologies are developed to utilize data more fully, and as shrinking budgets mean more needs to be done with less, well-documented and discoverable legacy data is vital for continued research and economic growth. Many governmental agencies are mandated to maintain scientific data, and the Geological Survey of Alabama (GSA) is no different. As part of the mandate to explore for, characterize, and report Alabama's mineral, energy, water, and biological resources for the betterment of Alabama's citizens, communities, and businesses, the GSA has increasingly been called upon to make our data (including samples) more accessible to stakeholders. The GSA has been involved in several data management, preservation, and rescue projects, including the National Geothermal Data System and the National Geological and Geophysical Data Preservation Program. GSA staff utilizes accepted standards for metadata, such as those found at the US Geoscience Information Network (USGIN). Through the use of semi-automated workflows, these standards can be applied to legacy data records. As demand for more detailed information on samples increases, especially so that a researcher can do a preliminary assessment prior to a site visit, it has become critical for the efficiency of the GSA to have better systems in place for sample tracking and data management. Thus, GSA is in the process of registering cores and related samples for International Geo Sample Numbers (IGSNs) through the System for Earth Sample Registration. IGSNs allow the GSA to use asset management software to better curate the physical samples and provide more accurate information to stakeholders. Working with other initiatives, such as EarthCube's iSamples project, will ensure that GSA continues to use best practices and standards for sample identification, documentation, citation, curation, and sharing.
NASA Astrophysics Data System (ADS)
Chan, S.; Lehnert, K. A.; Coleman, R. J.
2011-12-01
SESAR, the System for Earth Sample Registration, is an online registry for physical samples collected for Earth and environmental studies. SESAR generates and administers the International Geo Sample Number (IGSN), a unique identifier for samples that is dramatically advancing interoperability among information systems for sample-based data. SESAR was developed to provide the complete range of registry services, including definition of the IGSN syntax and metadata profiles, registration and validation of name spaces requested by users, tools for users to submit and manage sample metadata, validation of submitted metadata, generation and validation of the unique identifiers, archiving of sample metadata, and public or private access to the sample metadata catalog. With the development of SESAR v3, we placed particular emphasis on creating enhanced tools that make metadata submission easier and more efficient for users, and that provide superior functionality for users to manage the metadata of their samples in their private workspace, MySESAR. For example, SESAR v3 includes a module where users can generate custom spreadsheet templates to enter metadata for their samples, then upload these templates online for sample registration. Once the content of the template is uploaded, it is displayed online in an editable grid format, and validation rules are executed in real time on the grid data to ensure data integrity. Other new features of SESAR v3 include the capability to transfer ownership of samples to other SESAR users, the ability to upload and store images and other files in a sample metadata profile, and the tracking of changes to sample metadata profiles. In the next version of SESAR (v3.5), we will further improve the discovery, sharing, and registration of samples. For example, we are developing a more comprehensive suite of web services that will allow discovery and registration access to SESAR from external systems; both batch and individual registrations will be possible through these web services. Based on valuable feedback from the user community, we will introduce enhancements that add greater flexibility to the system to accommodate the vast diversity of metadata that users want to store: users will be able to create custom metadata fields and use them for the samples they register, and to group samples into 'collections' to make retrieval for research projects or publications easier. An improved interface design will allow for better workflow transitions and navigation throughout the application. To keep up with the demands of a growing community, SESAR has also made process changes to ensure efficiency in system development. For example, we have implemented a release cycle to better track enhancements and fixes to the system, and an API library that facilitates the reusability of code. Usage tracking, metrics, and surveys capture information to guide the direction of future developments, and a new set of administrative tools allows greater control of system management.
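The real-time grid validation described above amounts to running per-row rules as the user edits. A minimal sketch, with invented field names and rules:

```python
# Minimal per-row validation in the spirit of the grid checks described
# above. Required fields and rules are illustrative assumptions.
REQUIRED = ("user_code", "name", "sample_type")

def validate_row(row: dict) -> list[str]:
    """Return a list of human-readable errors for one grid row."""
    errors = [f"missing {f}" for f in REQUIRED if not row.get(f)]
    lat = row.get("latitude")
    if lat not in (None, "") and not -90 <= float(lat) <= 90:
        errors.append("latitude out of range")
    return errors

rows = [
    {"user_code": "IEXXX", "name": "DR-01", "sample_type": "Dredge"},
    {"user_code": "IEXXX", "name": "", "sample_type": "Core",
     "latitude": "123"},
]
for i, row in enumerate(rows, start=1):
    for err in validate_row(row):
        print(f"row {i}: {err}")   # row 2: missing name; latitude out of range
```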
Content Model Use and Development to Redeem Thin Section Records
NASA Astrophysics Data System (ADS)
Hills, D. J.
2014-12-01
The National Geothermal Data System (NGDS) is a catalog of documents and datasets that provide information about geothermal resources located primarily within the United States. The goal of NGDS is to make large quantities of geothermal-relevant geoscience data available to the public by creating a national, sustainable, distributed, and interoperable network of data providers. The Geological Survey of Alabama (GSA) has been a data provider in the initial phase of NGDS. One method by which NGDS facilitates interoperability is through the use of content models. Content models provide a schema (structure) for submitted data. Schemas dictate where and how data should be entered. Content models use templates that simplify data formatting to expedite use by data providers. These methodologies implemented by NGDS can extend beyond geothermal data to all geoscience data. The GSA, using the NGDS physical samples content model, has tested and refined a content model for thin sections and thin section photos. Countless thin sections have been taken from oil and gas well cores housed at the GSA, and many of those thin sections have related photomicrographs. Record keeping for these thin sections has been scattered at best, and it is critical to capture their metadata while the content creators are still available. A next step will be to register the GSA's thin sections with SESAR (System for Earth Sample Registration) and assign an IGSN (International Geo Sample Number) to each thin section. Additionally, the thin section records will be linked to the GSA's online record database. When complete, the GSA's thin sections will be more readily discoverable and have greater interoperability. Moving forward, the GSA is implementing the use of NGDS-like content models and registration with SESAR and IGSN to improve collection maintenance and management of additional physical samples.
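In practice, a content model reduces to a set of required fields and types that every submitted record must satisfy. The sketch below shows the idea with hypothetical thin-section fields; the real NGDS content models define their own field lists.

    # Illustrative sketch of a content-model check for a thin section record.
    # Field names are hypothetical stand-ins for NGDS-style content model fields.
    REQUIRED_FIELDS = {
        "thin_section_id": str,
        "parent_core_id": str,       # links the thin section back to its well core
        "depth_m": float,
        "photomicrograph_url": str,
    }

    def validate_record(record: dict) -> list:
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for field, ftype in REQUIRED_FIELDS.items():
            if field not in record:
                problems.append(f"missing field: {field}")
            elif not isinstance(record[field], ftype):
                problems.append(f"wrong type for {field}: expected {ftype.__name__}")
        return problems

    print(validate_record({"thin_section_id": "TS-042", "depth_m": 1203.5}))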
EarthChem and SESAR: Data Resources and Interoperability for EarthScope Cyberinfrastructure
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Walker, D.; Block, K.; Vinay, S.; Ash, J.
2008-12-01
Data management within the EarthScope Cyberinfrastructure needs to pursue two goals in order to advance and maximize the broad scientific application and impact of the large volumes of observational data acquired by EarthScope facilities: (a) to provide access to all data acquired by EarthScope facilities, and to promote their use by broad audiences, and (b) to facilitate discovery of, access to, and integration of multi-disciplinary data sets that complement EarthScope data in support of EarthScope science. EarthChem and SESAR, the System for Earth Sample Registration, are two projects within the Geoinformatics for Geochemistry program that offer resources for EarthScope CI. EarthChem operates a data portal that currently provides access to >13 million analytical values for >600,000 samples, more than half of which are from North America, including data from the USGS and all data from the NAVDAT database, a web-accessible repository for age, chemical and isotopic data from Mesozoic and younger igneous rocks in western North America. The new EarthChem GEOCHRON database will house data collected in association with GeoEarthScope, storing and serving geochronological data submitted by participating facilities. The EarthChem Deep Lithosphere Dataset is a compilation of petrological data for mantle xenoliths, initiated in collaboration with GeoFrame to complement geophysical endeavors within EarthScope science. The EarthChem Geochemical Resource Library provides a home for geochemical and petrological data products and data sets. Parts of the digital data in EarthScope CI refer to physical samples such as drill cores, igneous rocks, or water and gas samples, collected, for example, by SAFOD or by EarthScope science projects and acquired through lab-based analysis. Management of sample-based data requires the use of globally unique identifiers for samples, so that distributed data for individual samples generated in different labs and published in different papers can be unambiguously linked and integrated. SESAR operates a registry for Earth samples that assigns and administers the International GeoSample Number (IGSN) as a globally unique identifier for samples. Registration of EarthScope samples with SESAR and use of the IGSN will ensure their unique identification in publications and data systems, thus facilitating interoperability among sample-based data relevant to EarthScope CI and globally. It will also make these samples visible to global audiences via the SESAR Global Sample Catalog.
NASA Astrophysics Data System (ADS)
Hsu, L.; Bristol, S.; Lehnert, K. A.; Arko, R. A.; Peters, S. E.; Uhen, M. D.; Song, L.
2014-12-01
The U.S. Geological Survey (USGS) is an exemplar of the need for improved cyberinfrastructure for its vast holdings of invaluable physical geoscience data. Millions of discrete paleobiological and geological specimens lie in USGS warehouses and at the Smithsonian Institution. These specimens serve as the basis for many geologic maps and geochemical databases, and are a potential treasure trove of new scientific knowledge. This treasure is virtually unknown and inaccessible outside a small group of paleogeoscientists and geochemists. A team from the USGS, the Integrated Earth Data Applications (IEDA) facility, and the Paleobiology Database (PBDB) is working to expose information on paleontological and geochemical specimens for discovery by scientists and citizens. This project uses the existing infrastructure of the System for Earth Sample Registration (SESAR) and PBDB, which already provides many of the fundamental data schemas necessary to accommodate USGS records. The project is also developing a new Linked Data interface for the USGS National Geochemical Database (NGDB). The International Geo Sample Number (IGSN) is the identifier that links samples between all systems. For paleontological specimens, SESAR and PBDB will be the primary repositories for USGS records, with a data syncing process to archive records within the USGS ScienceBase system. The process began with mapping the metadata fields necessary for USGS collections to the existing SESAR and PBDB data structures, while aligning them with the Observations & Measurements and Darwin Core standards. New functionality needed in SESAR included links to a USGS locality registry, fossil classifications, a spatial qualifier attribution for samples with sensitive locations, and acknowledgement of data and metadata licensing. The team is developing a harvesting mechanism to periodically transfer USGS records from within PBDB and SESAR to ScienceBase. For the NGDB, the samples are being registered with IGSNs in SESAR and the geochemical data are being published as Linked Data. This system allows the USGS collections to benefit from the disciplinary and institutional strengths of the participating resources, while simultaneously increasing the discovery, accessibility, and citation of USGS physical collection holdings.
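The metadata-mapping step can be pictured as a simple crosswalk from legacy field names to Darwin Core terms, with the IGSN carried along as the cross-system key. In the sketch below the legacy field names are hypothetical; only the Darwin Core terms are standard vocabulary.

    # Sketch of a legacy-to-Darwin-Core crosswalk; legacy field names are invented.
    LEGACY_TO_DWC = {
        "usgs_field_no": "dwc:catalogNumber",
        "taxon": "dwc:scientificName",
        "collector": "dwc:recordedBy",
        "collection_date": "dwc:eventDate",
        "latitude": "dwc:decimalLatitude",
        "longitude": "dwc:decimalLongitude",
    }

    def to_darwin_core(legacy: dict, igsn: str, sensitive: bool = False) -> dict:
        mapped = {dwc: legacy[k] for k, dwc in LEGACY_TO_DWC.items() if k in legacy}
        mapped["dwc:materialSampleID"] = igsn  # the IGSN links the record across systems
        if sensitive:
            # spatial qualifier idea: withhold precise coordinates for protected sites
            mapped.pop("dwc:decimalLatitude", None)
            mapped.pop("dwc:decimalLongitude", None)
        return mapped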
NASA Astrophysics Data System (ADS)
Nettles, J. J.; Bowring, J. F.
2014-12-01
NSF requires data management plans as part of funding proposals, and geochronologists, among other scientists, are archiving their data and results in the public cloud archives managed by the NSF-funded Integrated Earth Data Applications facility, or IEDA. GeoChron is a database for geochronology housed within IEDA. The software application U-Pb_Redux, developed at the Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) at the College of Charleston, provides seamless connectivity to GeoChron so that uranium-lead (U-Pb) geochronologists can automatically upload and retrieve their data and results. U-Pb_Redux also manages publication-quality documents, including report tables and graphs. CHRONI is a lightweight mobile application for Android devices that provides easy access to these archived data and results. With CHRONI, U-Pb geochronologists can view archived data and analyses downloaded from the GeoChron database, or any other location, in a customizable format. CHRONI uses the same extensible markup language (XML) schema and documents used by U-Pb_Redux and GeoChron. Report Settings are special XML files that can be customized in U-Pb_Redux, stored in the cloud, and then accessed and used in CHRONI to create the same customized data display on the mobile device. In addition to providing geologists effortless and mobile access to archived data and analyses, CHRONI allows users to manage their GeoChron credentials, quickly download private and public files via a specified IEDA International Geo Sample Number (IGSN) or URL, and view specialized graphics associated with particular IGSNs. Future versions of CHRONI will be developed to support iOS-compatible devices. CHRONI is an open source project under the Apache 2 license and is hosted at https://github.com/CIRDLES/CHRONI. We encourage community participation in its continued development.
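Retrieval by IGSN, as CHRONI performs it, amounts to an HTTP GET keyed on the identifier. The sketch below assumes a hypothetical URL pattern; the actual GeoChron service address and parameters should be taken from its documentation.

    # Hedged sketch: download an archived analysis document by IGSN.
    import requests

    def fetch_analysis(igsn: str) -> bytes:
        url = "https://www.geochron.org/getxml"        # assumed service location
        resp = requests.get(url, params={"igsn": igsn}, timeout=30)
        resp.raise_for_status()
        return resp.content                            # XML document for this IGSN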
NASA Astrophysics Data System (ADS)
Gorgas, Thomas; Conze, Ronald; Lorenz, Henning; Elger, Kirsten; Ulbricht, Damian; Wilkens, Roy; Lyle, Mitchell; Westerhold, Thomas; Drury, Anna Joy; Tian, Jun; Hahn, Annette
2017-04-01
Scientific ocean drilling over the past 40+ years, and corresponding efforts on land for more than 20 years, have led to the accumulation of an enormous amount of valuable petrophysical, geochemical, biological and geophysical data obtained through laboratory and field experiments across a multitude of scale and time dimensions. Such data can be utilized comprehensively in a holistic fashion, providing a basis for enhanced "core-log integration" and for modeling processes from small-scale basins to large-scale Earth phenomena, while all relevant information is stored and managed in an open-access fashion. Since the early 1990s, members of our team have acquired and measured a large dataset of physical and geochemical properties representing both terrestrial and marine geological environments. This dataset covers a variety of macro- to microscale dimensions, allowing this type of interdisciplinary data examination. Over time, data management and processing tools have been developed and were recently merged with modern data publishing methods, which allow data and associated publications to be identified and tracked in a concise manner. Our presentation summarizes an important part of the value chain in geosciences, comprising: 1) the state-of-the-art in data management for continental and lake drilling projects, performed with and through ICDP's Drilling Information System (DIS); 2) the CODD (Code for Ocean Drilling Data), a numerically based, programmable data processing toolbox applicable to both continental and marine drilling projects; 3) the implementation of persistent identifiers, such as the International Geo Sample Number (IGSN), to identify and track sample material as part of Digital Object Identifier (DOI)-tagged operation reports and research publications; and 4) a list of contacts for scientists interested in learning and applying the methods and techniques we offer in basic and advanced training courses at our respective research institutions and facilities around the world.
Applications for unique identifiers in the geological sciences
NASA Astrophysics Data System (ADS)
Klump, J.; Lehnert, K. A.
2012-12-01
Even though geology has in many respects always been a generalist discipline, approaches to questions about Earth's past have become increasingly interdisciplinary. At the same time, a wealth of samples has been collected, the resulting data have been stored in disciplinary databases, and the interpretations published in the scientific literature. In the past these resources have existed alongside each other, semantically linked only by the knowledge of the researcher and their peers. One of the main drivers towards the inception of the world wide web was the ability to link scientific sources over the internet. The Uniform Resource Locator (URL) used to locate resources on the web soon turned out to be ephemeral in nature. A more reliable way of addressing objects was needed, a way of persistent identification to make digital objects, or digital representations of objects, part of the record of science. With their high degree of centralisation, the scientific publishing houses were quick to implement and adopt a system for unique and persistent identification, the Digital Object Identifier (DOI®). Other identifier systems exist alongside the DOI, e.g. URN, ARK, Handle®, and others. There are many uses for persistent identification in science other than the identification of journal articles. DOIs are already used for the identification of data, thus making data citable. There are several initiatives to assign identifiers to authors and institutions to allow their unique identification. A recent development is the application of persistent identifiers to geological samples. As most data in the geosciences are derived from samples, it is crucial to be able to uniquely identify the samples from which a set of data were derived. Incomplete documentation of samples in publications and the use of ambiguous sample names are major obstacles for synthesis studies and the re-use of data. Access to samples for re-analysis and re-appraisal is limited by the lack of a central catalogue that allows finding a sample's archiving location. The International Geo Sample Number (IGSN) provides solutions to the questions of unique sample identification and discovery. Use of the IGSN in digital data systems allows linkages to be built between the digital representation of samples in sample registries, e.g. SESAR, and their related data in the literature and in web-accessible digital data repositories. Persistent identifiers are now available for literature, data, samples, and authors. More applications, e.g. the identification of methods or instruments, will follow. In conjunction with semantic web technology, the application of unique and persistent identifiers in the geosciences will aid discovery through systematic data mining, exploratory data analysis, and serendipity. This talk will discuss existing and emerging applications for persistent identifiers in the geological sciences.
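Because IGSNs are resolvable identifiers, the link between a sample's digital representation and its archiving location can be followed mechanically. A minimal sketch, assuming the Handle prefix 10273 commonly used for IGSN resolution and an illustrative sample value:

    # Sketch: resolve an IGSN through the Handle system to its landing page.
    import requests

    def resolve_igsn(igsn: str) -> str:
        url = f"https://hdl.handle.net/10273/{igsn}"   # 10273: IGSN handle prefix
        resp = requests.get(url, allow_redirects=True, timeout=30)
        resp.raise_for_status()
        return resp.url                                # landing page after redirects

    print(resolve_igsn("IEDA000001"))                  # illustrative IGSN value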
Persistent Identifiers, Discoverability and Open Science (Communication)
NASA Astrophysics Data System (ADS)
Murphy, Fiona; Lehnert, Kerstin; Hanson, Brooks
2016-04-01
Early in 2016, the American Geophysical Union announced it was incorporating ORCIDs into its submission workflows. This was accompanied by a strong statement supporting the use of other persistent identifiers, such as IGSNs and the CrossRef open registry 'funding data'. This was partly in response to funders' desire to track and manage their outputs. However, the more compelling argument, and the reason why the AGU has also signed up to the Center for Open Science's Transparency and Openness Promotion (TOP) Guidelines (http://cos.io/top), is that ultimately science and scientists will be the richer for these initiatives due to increased opportunities for interoperability, reproducibility and accreditation. The AGU has appealed to the wider community to engage with these initiatives, recognising that - unlike the introduction of Digital Object Identifiers (DOIs) for articles by CrossRef - full, enriched use of persistent identifiers throughout the scientific process requires buy-in from a range of scholarly communications stakeholders. At the same time, across the general research landscape, initiatives such as Project CRediT (contributor roles taxonomy), Publons (reviewer acknowledgements) and the forthcoming CrossRef DOI Event Tracker are contributing to our understanding and accreditation of contributions and impact. More specifically for earth science and scientists, the cross-functional Coalition for Publishing Data in the Earth and Space Sciences (COPDESS) was formed in October 2014 and is working to 'provide an organizational framework for Earth and space science publishers and data facilities to jointly implement and promote common policies and procedures for the publication and citation of data across Earth Science journals'. Clearly, the judicious integration of standards, registries and persistent identifiers such as ORCIDs and International Geo Sample Numbers (IGSNs) into the research and research output processes is key to the success of this venture. However, these also give rise to a number of logistical, technological and cultural challenges. This poster seeks to identify and advance our understanding of these challenges. The authors are keen to build knowledge from the gathering of case studies (successful or otherwise) and to hear from potential collaborators in order to develop a robust structure that will empower both earth science and earth scientists and enable more nuanced, trustworthy, interoperable research in the near future.
Discovering Physical Samples Through Identifiers, Metadata, and Brokering
NASA Astrophysics Data System (ADS)
Arctur, D. K.; Hills, D. J.; Jenkyns, R.
2015-12-01
Physical samples, particularly in the geosciences, are key to understanding the Earth system, its history, and its evolution. Our record of the Earth as captured by physical samples is difficult to explain and mine for understanding, due to incomplete, disconnected, and evolving metadata content. This is further complicated by differing ways of classifying, cataloguing, publishing, and searching the metadata, especially when specimens do not fit neatly into a single domain—for example, fossils cross disciplinary boundaries (mineral and biological). Sometimes even the fundamental classification systems evolve, such as the geological time scale, triggering daunting processes to update existing specimen databases. Increasingly, we need to consider ways of leveraging permanent, unique identifiers, as well as advancements in metadata publishing that link digital records with physical samples in a robust, adaptive way. An NSF EarthCube Research Coordination Network (RCN) called the Internet of Samples (iSamples) is now working to bridge the metadata schemas of the biological and geological domains. We are leveraging the International Geo Sample Number (IGSN), which provides a versatile system for registering physical samples, and working to harmonize it with the DataCite schema for Digital Object Identifiers (DOIs). A brokering approach for linking disparate catalogues and classification systems could help scale discovery and access to the many large collections now being managed (sometimes millions of specimens per collection). This presentation is about our community-building efforts, research directions, and insights to date.
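Harmonizing the IGSN registration schema with the DataCite schema is, at its simplest, a field crosswalk. The sketch below is illustrative only: the IGSN-side field names are simplified assumptions, while the target keys follow the general shape of DataCite kernel metadata.

    # Hedged sketch of an IGSN-to-DataCite-style metadata crosswalk.
    def igsn_to_datacite(sample: dict) -> dict:
        return {
            "identifier": {"identifier": sample["igsn"], "identifierType": "IGSN"},
            "creators": [{"creatorName": sample.get("collector", "unknown")}],
            "titles": [{"title": sample.get("name", sample["igsn"])}],
            "resourceType": {"resourceTypeGeneral": "PhysicalObject",
                             "resourceType": sample.get("sample_type", "Specimen")},
        }

    print(igsn_to_datacite({"igsn": "IEDA000001", "name": "Demo dredge sample",
                            "collector": "A. Scientist", "sample_type": "Dredge"}))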
NASA Astrophysics Data System (ADS)
Hsu, L.; Lehnert, K. A.; Walker, J. D.; Chan, C.; Ash, J.; Johansson, A. K.; Rivera, T. A.
2011-12-01
Sample-based measurements in geochemistry are highly diverse, due to the large variety of sample types, measured properties, and idiosyncratic analytical procedures. In order to ensure the utility of sample-based data for re-use in research or education, they must be associated with a high quality and quantity of descriptive, discipline-specific metadata. Without an adequate level of documentation, it is not possible to reproduce scientific results or have confidence in using the data for new research inquiries. The required detail in data documentation makes it challenging to aggregate large sets of data from different investigators and disciplines. One solution to this challenge is to build data systems with several tiers of intricacy, where the less detailed tiers are geared toward discovery and interoperability, and the more detailed tiers have higher value for data analysis. The Geoinformatics for Geochemistry (GfG) group, which is part of the Integrated Earth Data Applications facility (http://www.iedadata.org), has taken this approach to provide services for the discovery, access, and analysis of sample-based geochemical data for a diverse user community, ranging from the highly informed geochemist to non-domain scientists and undergraduate students. GfG builds and maintains three tiers in its sample-based data systems, from a simple data catalog (Geochemical Resource Library), to a substantially richer data model for the EarthChem Portal (EarthChem XML), and finally to detailed discipline-specific data models for petrologic (PetDB), sedimentary (SedDB), hydrothermal spring (VentDB), and geochronological (GeoChron) samples. The data catalog, the lowest level in the hierarchy, contains the sample data values plus metadata only about the dataset itself (Dublin Core metadata such as dataset title and author), and therefore can accommodate the widest diversity of data holdings. The second level includes measured data values from the sample, basic information about the analytical method, and metadata about the samples such as geospatial information and sample type. The third and highest level includes detailed data quality documentation and more specific information about the scientific context of the sample. The three tiers are linked to allow users to quickly navigate to their desired level of metadata detail. Links are based on the use of unique identifiers: (a) the DOI at the granularity of datasets, and (b) the International Geo Sample Number (IGSN) at the granularity of samples. Current developments in the GfG sample-based systems include a new registry architecture for the IGSN to advance international implementation, growth and modification of EarthChemXML to include geochemical data for new sample types such as soils and liquids, and the construction of a hydrothermal vent data system. This flexible, tiered model provides a solution for offering varying levels of detail in order to aggregate a large quantity of data and serve the largest user group of both disciplinary novices and experts.
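The three tiers can be thought of as progressively richer metadata envelopes around the same measurements. A toy illustration, with all field names invented for the sketch:

    # Toy illustration of tiered metadata around one set of sample measurements.
    tier1 = {  # catalog level: Dublin Core-style dataset metadata only
        "dc:title": "Basalt glass analyses, Demo Ridge",
        "dc:creator": "A. Geochemist",
    }
    tier2 = {  # portal level: sample-level context added
        **tier1,
        "igsn": "IEDA000001",               # illustrative IGSN
        "sample_type": "volcanic glass",
        "latitude": -9.5, "longitude": -104.2,
        "method": "electron microprobe",    # basic analytical method info
    }
    tier3 = {  # discipline level: data quality and scientific context added
        **tier2,
        "reference_standard": "BCR-2G",
        "uncertainty_1sd": 0.02,
        "geological_context": "axial lava flow",
    }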
NASA Astrophysics Data System (ADS)
Hallett, B. W.; Dere, A. L. D.; Lehnert, K.; Carter, M.
2016-12-01
Vast numbers of physical samples are routinely collected by geoscientists to probe key scientific questions related to global climate change, biogeochemical cycles, magmatic processes, mantle dynamics, etc. Despite their value as irreplaceable records of nature, the majority of these samples remain undiscoverable by the broader scientific community because they lack a digital presence or are not documented well enough to facilitate their discovery and reuse in future science and education. The NSF EarthCube iSamples Research Coordination Network seeks to develop a unified approach across all Earth Science disciplines for the registration, description, identification, and citation of physical specimens in order to take advantage of the new opportunities that cyberinfrastructure offers. Even as consensus around best practices begins to emerge, such as the use of the International Geo Sample Number (IGSN), more work is needed to communicate these practices to investigators to encourage widespread adoption. Recognizing the importance of students and early career scientists in particular to transforming data and sample management practices, the iSamples Education and Training Working Group is developing training modules for sample collection, documentation, and management workflows. These training materials are made available to educators and research supervisors online at http://earthcube.org/group/isamples and can be modularized for supervisors to create a customized research workflow. This study details the design and development of several sample management tutorials, created by early career scientists and documented in collaboration with undergraduate research students in field and lab settings. Modules under development focus on rock outcrops, rock cores, soil cores, and coral samples, with an emphasis on sample management throughout the collection, analysis, and archiving process. We invite others to share their sample management and registration workflows and to develop training modules. This educational approach, with evolving digital materials, can help prepare future scientists to perform research in a way that will contribute to EarthCube data integration and discovery.
From Field to the Web: Management and Publication of Geoscience Samples in CSIRO Mineral Resources
NASA Astrophysics Data System (ADS)
Devaraju, A.; Klump, J. F.; Tey, V.; Fraser, R.; Reid, N.; Brown, A.; Golodoniuc, P.
2016-12-01
Inaccessible samples are an obstacle to the reproducibility of research and may waste time and resources through duplicated sample collection and management. Within Commonwealth Scientific and Industrial Research Organisation (CSIRO) Mineral Resources there are various research communities who collect or generate physical samples as part of their field studies and analytical processes. Materials vary widely and include rock, soil, plant material, water, and even synthetic materials. Given the wide range of applications in CSIRO, each researcher or project may follow their own method of collecting, curating and documenting samples. In many cases samples and their documentation are available only to the sample collector. For example, the Australian Resources Research Centre stores rock samples and research collections dating as far back as the 1970s. Collecting these samples again would be prohibitively expensive and in some cases impossible because the site has been mined out. These samples would not be easily discoverable by others without an online sample catalog. We identify some of the organizational and technical challenges in providing unambiguous and systematic access to geoscience samples, and present solutions (e.g., workflows, persistent identifiers, and tools). We present the workflow from field sampling to sample publication on the Web, and describe how the International Geo Sample Number (IGSN) can be applied to identify samples along the way. In our test case, geoscientific samples are collected as part of the Capricorn Distal Footprints project, a collaboration between CSIRO, the Geological Survey of Western Australia, academic institutions, and industry partners. We conclude by summarizing the value of our solutions for sample management and publication.
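Carrying identifiers along the workflow means every subsample keeps a link to its parent. A minimal sketch of that bookkeeping, with an invented naming scheme (real IGSN name spaces and syntax are governed by the registrar):

    # Sketch of parent-child bookkeeping when subsampling; naming is illustrative.
    import itertools

    _suffix = itertools.count(1)

    def register_child(parent_igsn: str, registry: dict) -> str:
        child = f"{parent_igsn}-{next(_suffix):03d}"   # e.g. CSRWA0001-001
        registry[child] = {"parent": parent_igsn}
        return child

    registry = {"CSRWA0001": {"parent": None}}         # field sample (invented ID)
    split = register_child("CSRWA0001", registry)      # lab split keeps its lineage
    print(split, "->", registry[split]["parent"])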
NASA Astrophysics Data System (ADS)
Block, K. A.; Randel, C.; Ismail, A.; Palumbo, R. V.; Cai, Y.; Carter, M.; Lehnert, K.
2016-12-01
Most geologic samples of New York City (NYC) have been collected during city construction projects. Studies of these samples are essential for our understanding of the local geology as well as the tectonic processes that shaped the entire Appalachian region. Among these is a suite of rare high-grade granulite samples, collected during the construction of the Brooklyn-Queens section of NYC Water Tunnel #3, that has been resting dormant in the basement of the City College of New York (CCNY), studied by a small group of investigators with institutional knowledge but largely undiscoverable and inaccessible to the broader scientific community. Data derived from these samples remain in disparate places, at best in analog format in publications or theses or, at worst, in spreadsheets stored on local machines or on old media, such as CDs and even floppy disks. As part of the Interdisciplinary Earth Data Alliance - CCNY joint internship program, three undergraduate students inventoried hundreds of samples and archived sample metadata in the System for Earth Sample Registration (SESAR), a sample metadata registry. Upon registration, each sample was assigned an International GeoSample Number (IGSN), a globally unique and persistent identifier that allows unambiguous citation of samples and linking of disparate analytical data across the literature. The students also compiled geochemical analyses, thin-section images, and associated analytical metadata for publication in the EarthChem Library, where the dataset will be openly and persistently accessible and citable via a DOI (Digital Object Identifier). Not only did the internship result in the illumination of countless dark samples and data values, but it also provided the students with valuable lessons in responsible sample and data management, training that should serve them well in their future scientific endeavors.
Constructing an integrated gene similarity network for the identification of disease genes.
Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin
2017-09-20
Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We propose a novel method, named RWRB, to infer the causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN, and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
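The core ranking step described here, a random walk with restart, can be sketched generically on a toy network; this is a generic illustration, not the authors' implementation:

    # Generic random-walk-with-restart sketch on a toy 3-node similarity network.
    import numpy as np

    def rwr(W: np.ndarray, seeds: np.ndarray, restart: float = 0.5, tol: float = 1e-8):
        """W: column-normalized adjacency matrix; seeds: restart distribution."""
        p = seeds.copy()
        while True:
            p_next = (1 - restart) * W @ p + restart * seeds
            if np.linalg.norm(p_next - p, 1) < tol:
                return p_next          # steady-state scores rank the nodes
            p = p_next

    W = np.array([[0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])     # toy network; columns sum to 1
    print(rwr(W, np.array([1.0, 0.0, 0.0])))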
NASA Astrophysics Data System (ADS)
Pignol, C.; Arnaud, F.; Godinho, E.; Galabertier, B.; Caillo, A.; Billy, I.; Augustin, L.; Calzas, M.; Rousseau, D. D.; Crosta, X.
2016-12-01
Managing scientific data is probably one of the most challenging issues in modern science. In the paleosciences the question is made even more sensitive by the need to preserve and manage high-value, fragile geological samples: cores. Large international scientific programs, such as IODP and ICDP, have led intense efforts to solve this problem and have proposed detailed, high-standard workflows and dataflows for core handling and curation. However, many paleoscience results derive from small-scale research programs in which data and sample management is too often handled only locally, when it is handled at all. In this paper we present a national effort led in France to develop an integrated system to curate ice and sediment cores. Under the umbrella of the national excellence equipment program CLIMCOR, we launched a reflection on core curation and the management of associated fieldwork data. Our aim was to conserve all fieldwork data in an integrated cyber-environment which will evolve toward laboratory-acquired data storage in the near future. To do so, we worked in close collaboration with field operators as well as laboratory core curators in order to propose user-oriented solutions. The national core curating initiative provides a single web portal in which all teams can store their fieldwork data. This portal is used as a national hub to attribute IGSNs. For legacy samples, this requires the establishment of a dedicated core list with associated metadata. For forthcoming core data, however, we developed a mobile application to capture technical and scientific data directly in the field. This application is linked with a unique coring-tools library and is adapted to most coring devices (gravity, drilling, percussion, etc.), including coring operations with multiple sections and holes. These field data can be uploaded automatically to the national portal, but are also referenced through international standards (IGSN and INSPIRE) and displayed in international portals (currently, NOAA's IMLGS). In this paper, we present the architecture of the integrated system, future perspectives, and the approach we adopted to reach our goals. We will also present our mobile application through didactic examples.
Towards Making Data Bases Practical for use in the Field
NASA Astrophysics Data System (ADS)
Fischer, T. P.; Lehnert, K. A.; Chiodini, G.; McCormick, B.; Cardellini, C.; Clor, L. E.; Cottrell, E.
2014-12-01
Geological, geochemical, and geophysical research is often field-based, with travel to remote areas and collection of samples and data under challenging environmental conditions. Cross-disciplinary investigations would greatly benefit from near real-time data access and visualisation within the existing framework of databases and GIS tools. An example of complex, interdisciplinary, field-based and data-intensive investigations is that of volcanologists and gas geochemists, who sample gases from fumaroles, hot springs, dry gas vents, hydrothermal vents and wells. Compositions of volcanic gas plumes are measured directly or by remote sensing. Soil gas fluxes from volcanic areas are measured by accumulation chamber and involve hundreds of measurements to calculate the total emission of a region. Many investigators also collect rock samples from recent or ancient volcanic eruptions. Structural, geochronological, and geophysical data collected during the same or related field campaigns complement these emissions data. All samples and data collected in the field require a set of metadata including date, time, location, sample or measurement ID, and descriptive comments. Currently, most of these metadata are written in field notebooks and later transferred into a digital format. Final results such as laboratory analyses of samples and calculated flux data are tabulated for plotting, correlation with other types of data, modeling, and finally publication and presentation. Data handling, organization and interpretation could be greatly streamlined by using digital tools available in the field to record metadata, assign an International Geo Sample Number (IGSN), upload measurements directly from field instruments, and arrange sample curation. Available data display tools such as GeoMapApp and existing data sets (PetDB, IRIS, UNAVCO) could be integrated to direct locations for additional measurements during a field campaign. Near-live display of sampling locations, pictures, and comments could be used as an educational and outreach tool during sampling expeditions. Achieving these goals requires the integration of existing online data resources, with common access through a dedicated web portal.
Publishing Linked Open Data for Physical Samples - Lessons Learned
NASA Astrophysics Data System (ADS)
Ji, P.; Arko, R. A.; Lehnert, K.; Bristol, S.
2016-12-01
Most data and information about physical samples and associated sampling features currently reside in relational databases. Integrating common concepts from various databases has motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), the SPARQL query language, and the Web Ontology Language (OWL). The goal of our work is threefold: to evaluate and select ontologies of different granularities for common concepts; to establish best practices and develop a generic methodology for publishing physical sample data stored in relational databases as Linked Open Data; and to reuse standard community vocabularies from the International Commission on Stratigraphy (ICS), the Global Volcanism Program (GVP), the General Bathymetric Chart of the Oceans (GEBCO), and others. Our work leverages developments in the EarthCube GeoLink project and the Interdisciplinary Earth Data Alliance (IEDA) facility for modeling and extracting physical sample data stored in relational databases. Reusing ontologies developed by GeoLink and IEDA has facilitated discovery and integration of data and information across multiple collections including the USGS National Geochemical Database (NGDB), the System for Earth Sample Registration (SESAR), and the Index to Marine & Lacustrine Geological Samples (IMLGS). We have evaluated, tested, and deployed Linked Open Data tools including Morph, Virtuoso Server, LodView, LodLive, and YASGUI for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Using persistent identifiers such as Open Researcher & Contributor IDs (ORCIDs) and International Geo Sample Numbers (IGSNs) at the record level makes it possible for other repositories to link related resources such as persons, datasets, documents, expeditions, awards, etc. to samples, features, and collections. This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
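Once sample records are exposed in a triplestore, retrieval is a SPARQL query away. A minimal sketch against a hypothetical endpoint (the URL and IGSN value are invented; real endpoints and graph layouts will differ):

    # Hedged sketch of querying a sample knowledge base by IGSN over SPARQL.
    import requests

    ENDPOINT = "http://example.org/sparql"   # hypothetical triplestore endpoint
    QUERY = """
    SELECT ?property ?value WHERE {
      ?sample ?property ?value .
      FILTER(CONTAINS(STR(?sample), "IEDA000001"))   # illustrative IGSN
    }
    LIMIT 50
    """

    resp = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"}, timeout=30)
    for row in resp.json()["results"]["bindings"]:
        print(row["property"]["value"], "=", row["value"]["value"])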
A Mobile App for Geochemical Field Data Acquisition
NASA Astrophysics Data System (ADS)
Klump, J. F.; Reid, N.; Ballsun-Stanton, B.; White, A.; Sobotkova, A.
2015-12-01
We have developed a geochemical sampling application for use on Android tablets. This app was developed together with the Federated Archaeological Information Management Systems (FAIMS) project at Macquarie University and is based on the open source FAIMS mobile platform, which was originally designed for archaeological field data collection. The FAIMS mobile platform has proved valuable for hydrogeochemical, biogeochemical, soil and rock sample collection due to the ability to customise data collection methodologies for any field research. The module we commissioned allows the use of inbuilt or external GPS to locate sample points, and it incorporates standard and incremental sampling names which can be easily fed into the International Geo-Sample Number (IGSN). Sampling can be documented not only in metadata, but also accompanied by photographic documentation and sketches. The module is augmented by dropdown menus for fields specific to each sample type and by user-defined tags. It also provides users with an overview of all records from a field campaign in a records viewer. Basic mapping functionality shows the current location and sampled points overlaid on preloaded rasters, and allows points and simple polygons to be drawn and later exported as shapefiles. A particular challenge is the remoteness of the sampling locations, hundreds of kilometres away from network access. The first trial raised the issue of backup without access to the internet, so in collaboration with the FAIMS team and Solutions First we commissioned a vehicle-mounted portable server. This server box constantly syncs with the tablets in the field via Wi-Fi; it has an uninterruptible power supply that can run for up to 45 minutes when the vehicle is turned off, and a 1 TB hard drive for storage of all data and photographs. The server can be logged into from any of the field tablets or a laptop to download all the data collected to date or simply to view it on the server.
Physical Samples Linked Data in Action
NASA Astrophysics Data System (ADS)
Ji, P.; Arko, R. A.; Lehnert, K.; Bristol, S.
2017-12-01
Most data and metadata related to physical samples currently reside in isolated relational databases driven by diverse data models. The challenge of sharing, interchanging, and integrating data from these different relational databases motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), the SPARQL query language, and the Web Ontology Language (OWL). In the last few years, we have released four knowledge graphs centered on physical samples, covering the System for Earth Sample Registration (SESAR), the USGS National Geochemical Database (NGDB), the Ocean Biogeographic Information System (OBIS), and the EarthChem Database. Currently the four knowledge graphs contain over 12 million facts (triples) about objects of interest to the geoscience domain. Choosing appropriate domain ontologies for representing the context of the data is at the core of this work. The GeoLink ontology, developed by the EarthCube GeoLink project, was used at the top level to represent common concepts like person, organization, cruise, etc. The physical sample ontology developed by the Interdisciplinary Earth Data Alliance (IEDA) and the Darwin Core vocabulary were used at the second level to describe details of geological samples and biological diversity. We also focused on finding and building the best tool chains to support the whole life cycle of publishing our linked data, including information retrieval, linked data browsing, and data visualization. Currently, Morph, Virtuoso Server, LodView, LodLive, and YASGUI are employed for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Persistent identifiers are another main point of concentration. Open Researcher & Contributor IDs (ORCIDs), International Geo Sample Numbers (IGSNs), the Global Research Identifier Database (GRID), and other persistent identifiers are used to link resources from the various graphs, such as persons, samples, organizations, and cruises. This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
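The linking pattern is easiest to see in miniature: a few RDF triples tying a sample URI to a person via ORCID. The vocabulary namespace below is an invented stand-in for the GeoLink/IEDA ontologies named above, and the identifiers are placeholders (the ORCID shown is the registry's well-known example value).

    # Sketch of sample-to-person linking triples with rdflib; vocabulary invented.
    from rdflib import Graph, Namespace, URIRef, Literal

    GL = Namespace("http://example.org/geolink/")    # stand-in vocabulary namespace
    g = Graph()

    sample = URIRef("https://hdl.handle.net/10273/IEDA000001")  # illustrative IGSN URI
    person = URIRef("https://orcid.org/0000-0002-1825-0097")    # example ORCID

    g.add((sample, GL.collectedBy, person))
    g.add((sample, GL.sampleType, Literal("dredge")))
    print(g.serialize(format="turtle"))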
Persistent Identifiers for Field Expeditions: A Next Step for the US Oceanographic Research Fleet
NASA Astrophysics Data System (ADS)
Arko, Robert; Carbotte, Suzanne; Chandler, Cynthia; Smith, Shawn; Stocks, Karen
2016-04-01
Oceanographic research cruises are complex affairs, typically requiring an extensive effort to secure the funding, plan the experiment, and mobilize the field party. Yet cruises are not typically published online as first-class digital objects with persistent, citable identifiers linked to the scientific literature. The Rolling Deck to Repository (R2R; info@rvdata.us) program maintains a master catalog of oceanographic cruises for the United States research fleet, currently documenting over 6,000 expeditions on 37 active and retired vessels. In 2015, R2R started routinely publishing a Digital Object Identifier (DOI) for each completed cruise. Cruise DOIs, in turn, are linked to related persistent identifiers where available including the Open Researcher and Contributor ID (ORCID) for members of the science party, the International Geo Sample Number (IGSN) for physical specimens collected during the cruise, the Open Funder Registry (FundRef) codes that supported the experiment, and additional DOIs for datasets, journal articles, and other products resulting from the cruise. Publishing a persistent identifier for each field expedition will facilitate interoperability between the many different repositories that hold research products from cruises; will provide credit to the investigators who secured the funding and carried out the experiment; and will facilitate the gathering of fleet-wide altmetrics that demonstrate the broad impact of oceanographic research.
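The identifier links described here amount to a small record per cruise. A sketch with placeholder values (the DOI, ORCID, and IGSNs are illustrative; the funder ID shown is the Open Funder Registry entry commonly used for NSF):

    # Sketch of a cruise record tying together the persistent identifiers named
    # in the abstract; all values are placeholders.
    cruise = {
        "doi": "10.7284/900000",              # placeholder cruise DOI
        "vessel": "R/V Example",
        "science_party": ["https://orcid.org/0000-0002-1825-0097"],  # ORCIDs
        "samples": ["IEDA000001", "IEDA000002"],   # IGSNs collected on the cruise
        "funding": ["10.13039/100000001"],    # Open Funder Registry ID (NSF)
        "datasets": [],                       # DOIs of resulting data products
    }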
NASA Astrophysics Data System (ADS)
Bowring, J. F.; McLean, N. M.; Walker, J. D.; Gehrels, G. E.; Rubin, K. H.; Dutton, A.; Bowring, S. A.; Rioux, M. E.
2015-12-01
The Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) has worked collaboratively for the last decade with geochronologists from EARTHTIME and EarthChem to build cyberinfrastructure geared to ensuring transparency and reproducibility in geoscience workflows and is engaged in refining and extending that work to serve additional geochronology domains during the next decade. ET_Redux (formerly U-Pb_Redux) is a free open-source software system that provides end-to-end support for the analysis of U-Pb geochronological data. The system reduces raw mass spectrometer (TIMS and LA-ICPMS) data to U-Pb dates, allows users to interpret ages from these data, and then facilitates the seamless federation of the results from one or more labs into a community web-accessible database using standard and open techniques. This EarthChem database - GeoChron.org - depends on keyed references to the System for Earth Sample Registration (SESAR) database that stores metadata about registered samples. These keys are each a unique International Geo Sample Number (IGSN) assigned to a sample and to its derivatives. ET_Redux provides for interaction with this archive, allowing analysts to store, maintain, retrieve, and share their data and analytical results electronically with whomever they choose. This initiative has created an open standard for the data elements of a complete reduction and analysis of U-Pb data, and is currently working to complete the same for U-series geochronology. We have demonstrated the utility of interdisciplinary collaboration between computer scientists and geoscientists in achieving a working and useful system that provides transparency and supports reproducibility, allowing geochemists to focus on their specialties. The software engineering community also benefits by acquiring research opportunities to improve development process methodologies used in the design, implementation, and sustainability of domain-specific software.
Reviving legacy clay mineralogy data and metadata through the IEDA-CCNY Data Internship Program
NASA Astrophysics Data System (ADS)
Palumbo, R. V.; Randel, C.; Ismail, A.; Block, K. A.; Cai, Y.; Carter, M.; Hemming, S. R.; Lehnert, K.
2016-12-01
Reconstruction of past climate and ocean circulation using ocean sediment cores relies on the use of multiple climate proxies measured on well-studied cores. Preserving all the information collected on a sediment core is crucial for the success of future studies using these unique and important samples. Clay mineralogy is a powerful tool to study weathering processes and sedimentary provenance. In his pioneering dissertation, Pierre Biscaye (1964, Yale University) established the X-Ray Diffraction (XRD) method for quantitative clay mineralogy analyses in ocean sediments and presented data for 500 core-top samples throughout the Atlantic Ocean and its neighboring seas. Unfortunately, the data exist only in analog format, which has discouraged scientists from reusing them, apart from replication of the published maps. Archiving and preserving this dataset and making it publicly available in a digital format, linked with the metadata from the core repository, will allow the scientific community to use these data to generate new findings. Under the supervision of Sidney Hemming and members of the Interdisciplinary Earth Data Alliance (IEDA) team, IEDA-CCNY interns digitized the data and metadata from Biscaye's dissertation and linked them with additional sample metadata using the IGSN (International Geo-Sample Number). After compilation and proper documentation, the dataset was published in the EarthChem Library, where it will be openly accessible and citable with a persistent DOI (Digital Object Identifier). During this internship, the students read peer-reviewed articles, interacted with active scientists in the field, and acquired knowledge about XRD methods and the data generated, as well as its applications. They also learned about existing and emerging best practices in data publication and preservation. Data rescue projects are a fun and interactive way for students to become engaged in the field.
Digital Curation of Earth Science Samples Starts in the Field
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Hsu, L.; Song, L.; Carter, M. R.
2014-12-01
Collection of physical samples in the field is an essential part of research in the Earth Sciences. Samples provide a basis for progress across many disciplines, from the study of global climate change now and over the Earth's history, to present and past biogeochemical cycles, to magmatic processes and mantle dynamics. The types of samples, methods of collection, and scope and scale of sampling campaigns are highly diverse, ranging from large-scale programs to drill rock and sediment cores on land, in lakes, and in the ocean, to environmental observation networks with continuous sampling, to single-investigator or small-team expeditions to remote areas around the globe or trips to local outcrops. Cyberinfrastructure for sample-related fieldwork must cater to the different needs of these diverse sampling activities, aligning with specific workflows, regional constraints such as connectivity or climate, and the processing of samples. In general, digital tools should assist with capture and management of metadata about the sampling process (location, time, method) and the sample itself (type, dimension, context, images, etc.), management of the physical objects (e.g., sample labels with QR codes), and the seamless transfer of sample metadata to data systems and software relevant to post-sampling data acquisition, data processing, and sample curation. In order to optimize CI capabilities for samples, tools and workflows need to adopt community-based standards and best practices for sample metadata, classification, identification, and registration. This presentation will provide an overview and updates of several ongoing efforts that are relevant to the development of standards for digital sample management: the ODM2 project, which has generated an information model for spatially-discrete, feature-based earth observations resulting from in-situ sensors and environmental samples, aligned with OGC's Observations & Measurements model (Horsburgh et al., AGU FM 2014); implementation of the IGSN (International Geo Sample Number) as a globally unique sample identifier via a distributed system of allocating agents and a central registry; and the EarthCube Research Coordination Network iSamplES (Internet of Samples in the Earth Sciences), which aims to improve the sharing and curation of samples through the use of CI.
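Sample labels with QR codes, as mentioned above, are straightforward to generate once an identifier exists. A sketch using the third-party Python "qrcode" package, with an invented IGSN and a Handle-style resolver URL:

    # Sketch: generate a printable QR label that encodes an IGSN's resolvable URL.
    # Requires the "qrcode" package (pip install qrcode[pil]); IGSN is illustrative.
    import qrcode

    igsn = "IEDA000001"
    img = qrcode.make(f"https://hdl.handle.net/10273/{igsn}")
    img.save(f"label_{igsn}.png")   # print and attach to the sample bag or box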
1980-12-01
Gravimeter measurements made by the Air Force 1281st Geodetic Squadron (Whalen, unpublished) at most of the same sites in 1965 are shown relative to the IGSN 71 values. [The remainder of this scanned excerpt, a station-by-station table comparing Woollard and IGSN 71 gravity values at United States sites such as Woods Hole and Detroit, is not reliably recoverable.]
NASA Astrophysics Data System (ADS)
Stroker, K. J.; Jencks, J. H.; Eakins, B.
2016-12-01
The Index to Marine and Lacustrine Geological Samples (IMLGS) is a community-designed and maintained resource enabling researchers to locate and request seafloor and lakebed geologic samples curated by partner institutions. The Index was conceived at the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center, now the National Centers for Environmental Information (NCEI), at a 1977 meeting convened by the National Science Foundation (NSF). The Index is based on core concepts of community oversight, common vocabularies, consistent metadata, and a shared interface. The Curators Consortium, international in scope, meets biennially to share ideas and discuss best practices. NCEI serves the group by providing database access and maintenance, a list server, digitizing support, and long-term archival of sample metadata, data, and imagery. Over three decades, participating curators have performed the laborious task of creating and contributing metadata for over 205,000 sea-floor and lake-bed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase the exposure of more in-depth institutional systems. The IMLGS has a persistent URL/Digital Object Identifier (DOI), as well as DOIs assigned to partner collections for citation and to provide a persistent link to curator collections. The Index is currently a geospatially-enabled relational database, publicly accessible via Web Feature and Web Map Services, and text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information, and images: 1) at participating institutions, 2) in the NCEI archive, and 3) through a Linked Data interface maintained by the Rolling Deck to Repository (R2R) program. Over 43,000 International GeoSample Numbers (IGSNs) linking to the System for Earth Sample Registration (SESAR) are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. The paper will discuss the database with the goal of increasing the connections and links to related data at partner institutions.
Scalable persistent identifier systems for dynamic datasets
NASA Astrophysics Data System (ADS)
Golodoniuc, P.; Cox, S. J. D.; Klump, J. F.
2016-12-01
Reliable and persistent identification of objects, whether tangible or not, is essential in information management. Many Internet-based systems have been developed to identify digital data objects, e.g., PURL, LSID, Handle, ARK. These were largely designed for the identification of static digital objects. The amount of data made available online has grown exponentially over the last two decades, and fine-grained identification of dynamically generated data objects within large datasets using conventional systems (e.g., PURL) has become impractical. We have compared the capabilities of various technological solutions to enable resolvability of data objects in dynamic datasets, and developed a dataset-centric approach to the resolution of identifiers. This is particularly important in Semantic Linked Data environments where dynamic, frequently changing data is delivered live via web services, so registration of individual data objects to obtain identifiers is impractical. We use identifier patterns and pattern hierarchies for the identification of data objects, which allows relationships between identifiers to be expressed, and also provides a means for resolving a single identifier into multiple forms (i.e. views or representations of an object). The latter can be implemented through (a) HTTP content negotiation, or (b) use of URI query-string parameters. The pattern and hierarchy approach has been implemented in the Linked Data API supporting the United Nations Spatial Data Infrastructure (UNSDI) initiative and later in the implementation of geoscientific data delivery for the Capricorn Distal Footprints project using International Geo Sample Numbers (IGSN). This enables flexible resolution of multi-view persistent identifiers and provides a scalable solution for large heterogeneous datasets.
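The pattern approach replaces per-object registration with a rule: any identifier matching a dataset-level pattern is rewritten to a service URL, and the requested representation selects among templates. A minimal sketch, with invented patterns and target URLs:

    # Sketch of pattern-based identifier resolution with content negotiation.
    import re

    PATTERNS = [
        # (identifier pattern, representation, redirect template) - all invented
        (r"^DEMO/(?P<id>\w+)$", "text/html", "https://example.org/sample/{id}"),
        (r"^DEMO/(?P<id>\w+)$", "application/xml", "https://example.org/sample/{id}.xml"),
    ]

    def resolve(identifier: str, accept: str = "text/html") -> str:
        for pattern, representation, template in PATTERNS:
            m = re.match(pattern, identifier)
            if m and representation == accept:      # negotiate on the Accept value
                return template.format(**m.groupdict())
        raise KeyError(f"no pattern matches {identifier} with {accept}")

    print(resolve("DEMO/WA001", accept="application/xml"))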
Geoscience Australia Publishes Sample Descriptions using W3C standards
NASA Astrophysics Data System (ADS)
Car, N. J.; Cox, S. J. D.; Bastrakova, I.; Wyborn, L. A.
2017-12-01
The recent revision of the W3C Semantic Sensor Network Ontology (SSN) has focused on three key concerns: (1) extending the scope of the ontology to include sampling and actuation as well as observation and sensing; (2) modularizing the ontology into a simple core with few classes and properties and little formal axiomatization, supplemented by additional modules that formalize the semantics and extend the scope; and (3) alignments with several existing applications and upper ontologies. These enhancements mean that SSN can now be used as the basis for publishing descriptions of geologic samples as Linked Data. Geoscience Australia (GA) maintains a database of about three million samples, collected over 50 years through projects ranging from ocean core and terrestrial rock to hydrochemistry borehole projects, almost all of which are held in the special-purpose GA samples repository. Access to descriptions of these samples as Linked Data has recently been enabled. The sample descriptions can be viewed in various machine-readable formalizations, including IGSN (XML & RDF), Dublin Core (XML & RDF) and SSN (RDF), as well as web landing pages for people. Of particular importance is the support for encoding relationships between samples, and between samples and the surveys, boreholes, and traverses to which they are related, as well as between samples processed for analytical purposes and their parents and siblings, back to the original field samples. The SSN extension for Sample Relationships provides an extensible, semantically rich mechanism to capture any relationship necessary to explain the provenance of observation results obtained from samples. Sample citation is facilitated through the use of URI-based persistent identifiers which resolve to samples' landing pages. The sample system also accepts PROV pingbacks, allowing users of samples to record provenance for their actions.
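The sample-to-sample relationships can be expressed directly with SOSA terms from the revised SSN. A minimal rdflib sketch, with placeholder identifiers (the namespace is the W3C SOSA namespace; the sample URIs are invented):

    # Sketch of an SSN/SOSA description of a lab split and its parent field sample.
    from rdflib import Graph, Namespace, URIRef, RDF

    SOSA = Namespace("http://www.w3.org/ns/sosa/")
    g = Graph()
    g.bind("sosa", SOSA)

    field_sample = URIRef("http://pid.geoscience.gov.au/sample/AU999001")  # placeholder
    split = URIRef("http://pid.geoscience.gov.au/sample/AU999001-A")       # placeholder

    g.add((field_sample, RDF.type, SOSA.Sample))
    g.add((split, RDF.type, SOSA.Sample))
    g.add((split, SOSA.isSampleOf, field_sample))  # provenance link between samples
    print(g.serialize(format="turtle"))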
NASA Astrophysics Data System (ADS)
Howe, Michael
2014-05-01
Much of the digital geological information on the composition, properties and dynamics of the subsurface is based ultimately on physical samples, many of which are archived to provide a basis for the information. Online metadata catalogues of these collections have now been available for many years. Many of these are institutional and tightly focussed, with UK examples including the British Geological Survey's (BGS) palaeontological samples database, PalaeoSaurus (http://www.bgs.ac.uk/palaeosaurus/), and its mineralogical and petrological sample database, Britrocks (http://www.bgs.ac.uk/data/britrocks.html). There is now a growing number of international sample metadata databases, including The Paleobiology Database (http://paleobiodb.org/) and SESAR, the IGSN (International Geo Sample Number) database (http://www.geosamples.org/catalogsearch/). More recently the emphasis has moved beyond metadata (locality, identification, age, citations, etc.) to digital imagery, with the intention of providing the user with at least enough information to determine whether viewing the sample would be worthwhile. Recent BGS examples include high-resolution (e.g. 7216 x 5412 pixel) hydrocarbon well core images (http://www.bgs.ac.uk/data/offshoreWells/wells.cfc?method=searchWells), high-resolution rock thin section images (e.g. http://www.largeimages.bgs.ac.uk/iip/britrocks.html?id=290000/291739) and building stone images (http://geoscenic.bgs.ac.uk/asset-bank/action/browseItems?categoryId=1547&categoryTypeId=1). This has been developed further with high-resolution stereo images: the Jisc-funded GB3D type fossils online project delivers these as red-cyan anaglyphs (http://www.3d-fossils.ac.uk/). More innovatively, the GB3D type fossils project has laser-scanned several thousand type fossils, and the resulting 3D digital models are now being delivered through the online portal. Importantly, this project also represents a collaboration between the BGS, Oxford and Cambridge Universities, the National Museums of Wales, and numerous other national, local and regional museums. The lack of currently accepted international standards and infrastructures for the delivery of high-resolution images and 3D digital models has required the BGS to develop or select its own. Most high-resolution images have been delivered using the JPEG 2000 format because of its quality and speed. Digital models have been made available in both .PLY and .OBJ formats because of their efficient file size and flexibility, respectively. Consideration must now be given to European and international standards and infrastructures for the delivery of high-resolution images and 3D digital models.
The International Gravity Standardization Net 1971 (I.G.S.N.71)
1972-05-31
Comparison of Individual Adjustments: discussions at the May 1971 meeting of the Working Sub-Group in Ottawa were concerned mainly with the differences... [Scanned-report fragment; legible headings include "3.2.1. Comparison of Scale Factor Determination", and cited works include Pubbl. CGI, Mem. n. 12 (on pendulum measurements on the European gravimeter calibration base) and C. Morelli, 1946a, "Per un sistema di riferimento..."]
Geosamples.org: Shared Cyberinfrastructure for Geoscience Samples
NASA Astrophysics Data System (ADS)
Lehnert, Kerstin; Allison, Lee; Arctur, David; Klump, Jens; Lenhardt, Christopher
2014-05-01
Many scientific domains, specifically in the geosciences, rely on physical samples as basic elements for study and experimentation. Samples are collected to analyze properties of natural materials and features that are key to our knowledge of Earth's dynamical systems and evolution, and to preserve a record of our environment over time. Huge volumes of samples have been acquired over decades or even centuries and stored in a large number and variety of institutions including museums, universities and colleges, state geological surveys, federal agencies, and industry. All of these collections represent highly valuable, often irreplaceable records of nature that need to be accessible so that they can be re-used in future research and for educational purposes. Many sample repositories are keen to use cyberinfrastructure capabilities to enhance access to their collections on the internet and to support and streamline collection management (accessioning of new samples, labeling, handling sample requests, etc.), but encounter substantial challenges and barriers to integrating digital sample management into their daily routine. They lack the resources (staff, funding) and infrastructure (hardware, software, IT support) to develop and operate web-enabled databases, to migrate analog sample records into digital data management systems, and to transfer paper- or spreadsheet-based workflows to electronic systems. Use of commercial software is often not an option, as it incurs high costs for licenses, requires IT expertise for installation and maintenance, and often does not match the needs of the smaller repositories, being designed for large museums or different types of collections (art, archeological, biological). Geosamples.org is an alliance of sample repositories (academic, US federal and state surveys, industry) and data facilities that aims to develop a cyberinfrastructure that will dramatically advance access to physical samples for the research community, government agencies, students, educators, and the general public, while supporting, simplifying, and standardizing the work of curators in repositories, museums, and universities, and even of individual investigators who manage personal or project-based sample collections in their labs. Geosamples.org builds upon best practices and cyberinfrastructure for sample identification, registration, and documentation developed by the IGSN e.V., an international organization that governs the International Geosample Number, a persistent unique identifier for physical samples. Geosamples.org will develop a Digital Environment for Sample Curation (DESC) that will facilitate the creation, identification, and registration of 'virtual samples' and network them into an 'Internet of Samples', allowing users to discover, access, and track online both physical samples and the data derived from their study, as well as the publications that contain these data. DESC will provide easy-to-use software tools for curators to maintain digital catalogs of their collections, to provide online access to those catalogs for searching and requesting samples, to manage sample requests and users, and to track collection usage and impact. Geosamples.org will also work toward joint practices for the recognition of intellectual property, build mechanisms to create sustainable business models for the continuing maintenance and evolution of sample resources, and integrate the sample management life-cycle into the professional and cultural practice of science.
Persistent Identifiers for Field Deployments: A Missing Link in the Provenance Chain
NASA Astrophysics Data System (ADS)
Arko, R. A.; Ji, P.; Fils, D.; Shepherd, A.; Chandler, C. L.; Lehnert, K.
2016-12-01
Research in the geosciences is characterized by a wide range of complex and costly field deployments including oceanographic cruises, submersible dives, drilling expeditions, seismic networks, geodetic campaigns, moored arrays, aircraft flights, and satellite missions. Each deployment typically produces a mix of sensor and sample data, spanning a period from hours to decades, that ultimately yields a long tail of post-field products and publications. Publishing persistent, citable identifiers for field deployments will facilitate 1) preservation and reuse of the original field data, 2) reproducibility of the resulting publications, and 3) recognition for both the facilities that operate the platforms and the investigators who secure funding for the experiments. In the ocean domain, sharing unique identifiers for field deployments is a familiar practice. For example, the Biological and Chemical Oceanography Data Management Office (BCO-DMO) routinely links datasets to cruise identifiers published by the Rolling Deck to Repository (R2R) program. In recent years, facilities have started to publish formal/persistent identifiers, typically Digital Object Identifiers (DOIs), for field deployments including seismic networks, oceanographic cruises, and moored arrays. For example, the EarthChem Library (ECL) publishes a DOI for each dataset which, if it derived from an oceanographic research cruise on a US vessel, is linked to a DOI for the cruise published by R2R. Work is underway to create similar links for the IODP JOIDES Resolution Science Operator (JRSO) and the Continental Scientific Drilling Coordination Office (CSDCO). We present results and lessons learned including a draft schema for publishing field deployments as DataCite DOI records; current practice for linking these DOIs with related identifiers such as Open Researcher and Contributor IDs (ORCIDs), Open Funder Registry (OFR) codes, and International Geo Sample Numbers (IGSNs); and consideration of other identifier types for field deployments such as UUIDs and Handles.
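The sketch below shows what a DataCite-style record for a cruise with related identifiers might look like; the DOI, IGSN, ORCID, and resourceType values are invented, and the draft schema mentioned in the abstract may differ.

    # Sketch of a DataCite-style metadata record for a field deployment (an
    # oceanographic cruise), linking related identifiers as the abstract
    # describes. Values and the resourceType are illustrative only.
    import json

    record = {
        "doi": "10.1234/example-cruise-AB1701",          # hypothetical DOI
        "creators": [{"name": "Example, Investigator",
                      "nameIdentifier": "https://orcid.org/0000-0000-0000-0000"}],
        "titles": [{"title": "Cruise AB1701, R/V Example"}],
        "publisher": "Rolling Deck to Repository (R2R)",
        "publicationYear": 2016,
        "types": {"resourceTypeGeneral": "Other", "resourceType": "Cruise"},
        "relatedIdentifiers": [
            # dataset derived from the cruise
            {"relatedIdentifier": "10.1234/example-dataset",
             "relatedIdentifierType": "DOI", "relationType": "HasPart"},
            # sample collected during the cruise
            {"relatedIdentifier": "IEXYZ000001",
             "relatedIdentifierType": "IGSN", "relationType": "HasPart"},
        ],
    }
    print(json.dumps(record, indent=2))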
NASA Astrophysics Data System (ADS)
McInnes, B.; Brown, A.; Liffers, M.
2015-12-01
Publicly funded laboratories have a responsibility to generate, archive, and disseminate analytical data to the research community. Laboratory managers know, however, that a long tail of analytical effort never escapes researchers' thumb drives once they leave the lab. This work reports on a research data management project, the Digital Mineralogy Library (DML), in which integrated hardware and software systems automatically archive and deliver analytical data and metadata to institutional and community data portals. The scientific objective of the DML project was to quantify the modal abundance of heavy minerals extracted from key lithological units in Western Australia. The selected analytical platform was a TESCAN Integrated Mineral Analyser (TIMA) that uses EDS-based mineral classification software to image and quantify mineral abundance and grain size at micron-scale resolution. The analytical workflow used a bespoke laboratory information management system (LIMS) to orchestrate: (1) the preparation of grain mounts with embedded QR codes that serve as enduring links between physical samples and analytical data, (2) the assignment of an International Geo Sample Number (IGSN) and Digital Object Identifier (DOI) to each grain mount via the System for Earth Sample Registration (SESAR), (3) the assignment of a DOI to instrument metadata via Research Data Australia, (4) the delivery of TIMA analytical outputs, including spatially registered mineralogy images and mineral abundance data, to an institutionally based data management server, and (5) the downstream delivery of a final data product via a Google Maps interface such as the AuScope Discovery Portal. The modular design of the system permits the networking of multiple instruments within a single site or across multiple collaborating research institutions. Although sharing analytical data provides new opportunities for the geochemistry community, the creation of an open data network requires: (1) adopting open data reporting standards and conventions, (2) requiring instrument manufacturers and software developers to deliver and process data in formats compatible with open standards, and (3) public funding agencies to incentivise researchers, laboratories and institutions to make their data open and accessible to consumers.
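As a toy illustration of step (1) of the workflow above, the payload of a grain-mount QR code can simply be a resolvable link binding the physical mount to its identifiers; the resolver URL pattern and identifier values here are hypothetical, not the project's actual encoding.

    # Sketch: compose the text embedded in a grain-mount QR code so a scan
    # resolves to the sample's landing page. The IGSN value and resolver URL
    # pattern are invented for illustration.
    def qr_payload(igsn, doi=None):
        """Return the text encoded in the mount's QR code: a resolvable link
        binding the physical object to its persistent identifiers."""
        lines = ["https://example.org/igsn/" + igsn]
        if doi:
            lines.append("doi:" + doi)
        return "\n".join(lines)

    print(qr_payload("IEXYZ000002", doi="10.1234/tima-mount-0002"))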
Bouguer gravity map of Indonesia
NASA Astrophysics Data System (ADS)
Green, R.; Adkins, J. S.; Harrington, H. J.; Untung, M.
1981-01-01
A Bouguer gravity map of Indonesia on Mercator projection at a scale of 1:5,000,000 and with a contour interval of 20 mGal has been prepared over the past few years as part of a joint research program of the Geological Survey of Indonesia and the University of New England, Armidale. A new base station network was set up throughout Indonesia and tied to the IGSN stations at Sydney and Singapore. A discussion of the gravity features and their tectonic implications is given. The map is obtainable, in folded form only, from the Publications Department, University of New England, Armidale, N.S.W., Australia 2351, for A$5 plus postage.
Publication of sensor data in the long-term environmental monitoring infrastructure TERENO
NASA Astrophysics Data System (ADS)
Stender, V.; Schroeder, M.; Klump, J. F.
2014-12-01
Terrestrial Environmental Observatories (TERENO) is an interdisciplinary and long-term research project spanning an Earth observation network across Germany. It includes four test sites, from the North German lowlands to the Bavarian Alps, and is operated by six research centers of the Helmholtz Association. TERENO Northeast is one of the sub-observatories of TERENO and is operated by the German Research Centre for Geosciences (GFZ) in Potsdam. This observatory investigates geoecological processes in the northeastern lowland of Germany by collecting large amounts of environmentally relevant data. The success of long-term projects like TERENO depends on well-organized data management, data exchange between the partners involved, and the availability of the captured data. Data discovery and dissemination are facilitated not only through the data portals of the regional TERENO observatories but also through a common spatial data infrastructure, TEODOOR (TEreno Online Data repOsitORry). TEODOOR bundles the data provided by the different web services of the single observatories and provides tools for data discovery, visualization and data access. The TERENO Northeast data infrastructure integrates data from more than 200 instruments and makes the data available through standard web services. TEODOOR accesses the OGC Sensor Web Enablement (SWE) interfaces offered by the regional observatories. In addition to the SWE interface, TERENO Northeast also publishes time series of environmental sensor data through the online research data publication platform DataCite. The metadata required by DataCite are created in an automated process that extracts information from SWE SensorML documents to generate ISO 19115-compliant metadata. The GFZ data management toolkit panMetaDocs is used to register Digital Object Identifiers (DOIs) and preserve file-based datasets. In addition to DOIs, International Geo Sample Numbers (IGSNs) are used to uniquely identify research specimens.
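As a rough illustration of the automated metadata pipeline described above, this sketch pulls a name, identifier, and position from a simplified, invented SensorML-like snippet and maps them onto a minimal DataCite-style record; real SWE SensorML is namespaced and far more deeply nested.

    # Sketch: derive a minimal DataCite-style record from a simplified,
    # hypothetical SensorML-like snippet. Element names are illustrative.
    import xml.etree.ElementTree as ET

    SENSORML = """
    <system>
      <identifier>TERENO-NE-station-042</identifier>
      <name>Soil moisture probe</name>
      <position lat="53.1" lon="13.0"/>
    </system>
    """

    root = ET.fromstring(SENSORML)
    record = {
        "titles": [{"title": root.findtext("name")}],
        "alternateIdentifiers": [{"alternateIdentifier": root.findtext("identifier"),
                                  "alternateIdentifierType": "Local"}],
        "geoLocations": [{"geoLocationPoint": {
            "pointLatitude": float(root.find("position").get("lat")),
            "pointLongitude": float(root.find("position").get("lon"))}}],
    }
    print(record)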
Absolute Gravity Datum in the Age of Cold Atom Gravimeters
NASA Astrophysics Data System (ADS)
Childers, V. A.; Eckl, M. C.
2014-12-01
The international gravity datum is defined today by the International Gravity Standardization Net of 1971 (IGSN-71). The data supporting this network were measured in the 1950s and 60s using pendulum and spring-based gravimeter ties (plus some new ballistic absolute meters) to replace the prior protocol of referencing all gravity values to the earlier Potsdam value. Since then, gravimeter technology has advanced significantly with the development and refinement of the FG-5 (the current standard of the industry) and again with the soon-to-be-available cold atom interferometric absolute gravimeters. This latest development is anticipated to provide an improvement of roughly two orders of magnitude over the measurement accuracy of the technology used to develop IGSN-71. In this presentation, we will explore how IGSN-71 might best be "modernized" given today's requirements and available instruments and resources. The National Geodetic Survey (NGS), along with other relevant US Government agencies, is concerned with establishing gravity control to support and maintain high-order geodetic networks as part of the nation's essential infrastructure. The need to modernize the nation's geodetic infrastructure was highlighted in "Precise Geodetic Infrastructure: National Requirements for a Shared Resource" (National Academy of Sciences, 2010). The NGS mission, as dictated by Congress, is to establish and maintain the National Spatial Reference System, which includes gravity measurements. Absolute gravimeters measure the total gravity field directly and do not involve ties to other measurements. Periodic "intercomparisons" of multiple absolute gravimeters at reference gravity sites are used to constrain the behavior of the instruments and to ensure that each would yield reasonably similar measurements at the same location (i.e., yield a sufficiently consistent datum when measured in disparate locales). New atomic interferometric gravimeters promise a significant increase in accuracy. Our presentation will also explore the impact of such an instrument on how we constrain the gravity datum and on how we ensure stability, repeatability, and reproducibility across different absolute gravimeter systems.
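A toy numeric sketch of such an intercomparison: each meter's offset from an uncertainty-weighted site mean indicates how consistently the instruments would realize a common datum. All values are invented.

    # Sketch: a simple intercomparison of absolute gravimeters at one
    # reference site. Offsets from the weighted mean flag instruments that
    # would not realize a consistent datum. Numbers are invented.
    measurements = {
        # instrument: (observed g in microGal above a site constant, 1-sigma)
        "FG5-A": (532.0, 2.0),
        "FG5-B": (527.5, 2.5),
        "Cold-atom-1": (529.8, 0.5),
    }

    weights = {k: 1.0 / s**2 for k, (g, s) in measurements.items()}
    wmean = sum(weights[k] * measurements[k][0] for k in measurements) / sum(weights.values())

    for name, (g, s) in measurements.items():
        print(f"{name}: offset from weighted mean = {g - wmean:+.1f} uGal")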
NASA Astrophysics Data System (ADS)
Galkin, A.; Klump, J.; Wiedenbeck, M.
2012-04-01
Secondary Ion Mass Spectrometry (SIMS) is a highly sensitive technique for analyzing the surfaces of solids and thin-film samples, but it has the major drawback that such instruments are both rare and expensive. The Virtual SIMS project aims to design, develop and operate the IT infrastructure around the CAMECA IMS 1280-HR SIMS at GFZ Potsdam. The system will cover the whole spectrum of procedures in the lab, from the online application for measurement time, to remote access to the instrument, and finally the maintenance of the data for publishing and future re-use. A virtual lab infrastructure around the IMS 1280 will enable remote access to the instrument and make measurement time available to the broadest possible user community. The envisioned IT infrastructure consists of the following: a web portal, a data repository, a sample repository, project management software, communication arrangements between the lab staff and the distant researcher, and remote access to the instruments. The web portal will handle online applications for measurement time. The data from the experiments, the monitoring sensor logs, and the lab logbook entries are to be stored and archived. Researchers will be able to access their data remotely in real time, which requires a user rights management structure. It is also planned that all samples and standards will be assigned a unique International GeoSample Number (IGSN) and that images of the samples will be stored and made accessible, along with any additional documents uploaded by the researcher. The project management application will schedule the application process, the measurement times, notifications and alerts. A video conference capability is foreseen for communication between the Potsdam staff and the remote researcher. Remote access to the instruments requires a sophisticated client-server solution. This highly sensitive instrument has to be controlled in real time with latencies kept to a minimum. Failures and shortages of the internet connection, as well as possible outages on the client side, also have to be considered, and safe fallbacks for such events must be provided. The level of skill of the researcher remotely operating the instrument will define the scope of control given during an operating session. An important aspect of the project is the design of the virtual lab system in collaboration with the laboratory operators and the researchers who will use the instrument and its peripherals. Different approaches to the IT solutions will be tested and evaluated, so that improved guidelines can evolve from observed operating performance.
NASA Astrophysics Data System (ADS)
Moore, C.
2011-12-01
The Index to Marine and Lacustrine Geological Samples is a community designed and maintained resource enabling researchers to locate and request sea floor and lakebed geologic samples archived by partner institutions. Conceived in the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center (NGDC) at a 1977 meeting convened by the National Science Foundation (NSF), the Index is based on core concepts of community oversight, common vocabularies, consistent metadata and a shared interface. Form and content of the underlying vocabularies and metadata continue to evolve according to the needs of the community, as do supporting technologies and access methodologies. The Curators Consortium, now international in scope, meets at partner institutions biennially to share ideas and discuss best practices. NGDC serves the group by providing database access and maintenance, a list server, digitizing support and long-term archival of sample metadata, data and imagery. Over three decades, participating curators have performed the herculean task of creating and contributing metadata for over 195,000 sea floor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The Index is currently a geospatially enabled relational database, publicly accessible via Web Feature and Web Map Services, and through text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information and images: 1) at participating institutions, 2) in the NGDC archive, and 3) at sites such as the Rolling Deck to Repository (R2R) and the System for Earth Sample Registration (SESAR). Over 34,000 International GeoSample Numbers (IGSNs) linking to SESAR are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. To promote interoperability and broaden exposure via the semantic web, NGDC is publishing the lithologic classification schemes and terminology used in the Index as Simple Knowledge Organization System (SKOS) vocabularies, coordinating with R2R and the Consortium for Ocean Leadership for consistency. Availability in SKOS form will also facilitate use of the vocabularies in International Standards Organization (ISO) 19115-2 compliant metadata records. NGDC provides stewardship for the Index on behalf of U.S. repositories as the NSF-designated "appropriate National Data Center" for data and metadata pertaining to sea floor samples, as specified in the 2011 Division of Ocean Sciences Sample and Data Policy, and on behalf of international partners via a collocated World Data Center. NGDC operates on the Open Archival Information System (OAIS) reference model. Active partners: Antarctic Marine Geology Research Facility, Florida State University; British Ocean Sediment Core Research Facility; Geological Survey of Canada; Integrated Ocean Drilling Program; Lamont-Doherty Earth Observatory; National Lacustrine Core Repository, University of Minnesota; Oregon State University; Scripps Institution of Oceanography; University of Rhode Island; U.S. Geological Survey; Woods Hole Oceanographic Institution.
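A minimal sketch of publishing one lithologic term as a SKOS concept with rdflib; the vocabulary URIs and labels are placeholders, not the actual NGDC vocabularies.

    # Sketch: one lithologic classification term as a SKOS concept in a
    # concept scheme. URIs, notation, and labels are illustrative only.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    LITH = Namespace("https://example.org/lithology/")

    g = Graph()
    g.bind("skos", SKOS)

    g.add((LITH["scheme"], RDF.type, SKOS.ConceptScheme))
    g.add((LITH["turbidite"], RDF.type, SKOS.Concept))
    g.add((LITH["turbidite"], SKOS.prefLabel, Literal("turbidite", lang="en")))
    g.add((LITH["turbidite"], SKOS.broader, LITH["clastic_sediment"]))
    g.add((LITH["turbidite"], SKOS.inScheme, LITH["scheme"]))

    print(g.serialize(format="turtle"))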
The French initiative for scientific cores virtual curating : a user-oriented integrated approach
NASA Astrophysics Data System (ADS)
Pignol, Cécile; Godinho, Elodie; Galabertier, Bruno; Caillo, Arnaud; Bernardet, Karim; Augustin, Laurent; Crouzet, Christian; Billy, Isabelle; Teste, Gregory; Moreno, Eva; Tosello, Vanessa; Crosta, Xavier; Chappellaz, Jérome; Calzas, Michel; Rousseau, Denis-Didier; Arnaud, Fabien
2016-04-01
Managing scientific data is probably one of the most challenging issues in modern science. The question is made even more sensitive by the need to preserve and manage high-value, fragile geological samples: cores. Large international scientific programs, such as IODP or ICDP, are leading an intense effort to solve this problem and propose detailed, high-standard work- and dataflows through core handling and curating. However, most results derive from rather small-scale research programs in which data and sample management is generally handled only locally, when it is handled at all. The national excellence equipment program (Equipex) CLIMCOR aims at developing French facilities for coring and drilling investigations. It concerns ice, marine and continental samples alike. As part of this initiative, we began a reflection on core curating and the management of associated coring data. The aim of the project is to conserve all metadata from fieldwork in an integrated cyber-environment which will evolve toward laboratory-acquired data storage in the near future. To that end, we worked in close relationship with field operators as well as laboratory core curators in order to propose user-oriented solutions. The national core curating initiative currently proposes a single web portal in which all scientific teams can store their field data. For legacy samples, this will require the establishment of dedicated core lists with associated metadata. For forthcoming samples, we propose a mobile application, running under Android, to capture technical and scientific metadata in the field. This application is linked with a unique coring-tools library and is adapted to most coring devices (gravity, drilling, percussion, etc.), including multi-section and multi-hole coring operations. These field data can be uploaded automatically to the national portal, but also referenced through international standards or persistent identifiers (IGSN, ORCID and INSPIRE) and displayed in international portals (currently, NOAA's IMLGS). In this paper, we present the architecture of the integrated system, future perspectives, and the approach we adopted to reach our goals. At our poster we will also present one of the three mobile applications, dedicated in particular to continental drilling operations.
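A minimal sketch of the hole/core/section hierarchy such a field application might capture, with parent links that later support IGSN parent-child registration; the field names and identifier strings are illustrative assumptions, not the CLIMCOR data model.

    # Sketch: a record structure for field coring metadata with the
    # hole -> core -> section hierarchy. Names and IGSN strings are invented.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class CoringObject:
        igsn: str                       # identifier derived from the hierarchy
        kind: str                       # "hole", "core", "section", "sample"
        parent: Optional[str] = None    # IGSN of the parent object, if any
        device: Optional[str] = None    # entry from the coring-tools library
        children: List["CoringObject"] = field(default_factory=list)

    hole = CoringObject(igsn="EX0001A", kind="hole", device="gravity corer")
    core = CoringObject(igsn="EX0001A1", kind="core", parent=hole.igsn)
    hole.children.append(core)
    core.children.append(
        CoringObject(igsn="EX0001A1S1", kind="section", parent=core.igsn))

    print(hole)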
High-precision gravimetric survey in support of lunar laser ranging at Haleakala, Maui, 1976 - 1978
NASA Technical Reports Server (NTRS)
Schenck, B. E.; Laurila, S. H.
1978-01-01
The planning, observations, and adjustment of high-precision gravity survey networks established on the islands of Maui and Oahu as part of the geodetic-geophysical program in support of lunar laser ranging at Haleakala, Maui, Hawaii, are described. The gravity survey networks include 43 independently measured gravity differences along the gravity calibration line from Kahului Airport to the summit of Mt. Haleakala, together with some key points close to tidal gauges on Maui, and 40 gravity differences within metropolitan Honolulu. The results of the 1976-1978 survey are compared with surveys made in 1961 and in 1964-1965. All final gravity values are given in the system of the International Gravity Standardization Net 1971 (IGSN 71); values are obtained by subtracting 14.57 mgal from the Potsdam value at the gravity base station at Hickam Air Force Base, Honolulu.
Working with Specify in a Paleo-Geological Context
NASA Astrophysics Data System (ADS)
Molineux, A.; Thompson, A. C.; Appleton, L.
2014-12-01
For geological collections with limited funding, an open-source relational database provides an opportunity to digitize specimens and related data. At the Non-vertebrate Paleontology Lab, a large mixed paleontological and geological repository on a restricted budget, we opted for one such database, Specify. Initially created at the University of Kansas for neontological collections and designed to run on a single computer, Specify has moved into the networked scene and will soon be web-based as Specify 7. We currently use the server version of Specify 6, networked to all computers in the lab, each running a desktop client, often with six users at any one time. Along with improved access there have been great efforts to broaden the applicability of this database to other disciplines. Current developments are of great importance to us because they focus on the geological aspects of lithostratigraphy and chronostratigraphy and their relationship to other variables. Adoption of this software has required constant change as we move to take advantage of the great improvements. We enjoy the interaction with the developers and their willingness to listen and consider our issues. Here we discuss some of the ways in which we have fashioned Specify into a database that provides us with the flexibility that we need without removing the ability to share our data with aggregators through accepted protocols. We discuss the customization of forms, the attachment of media and tracking of original media files, our efforts to incorporate geological specimens, and our plans to link individual specimen record GUIDs to IGSNs and thence to data derived from our specimens.
Isostatic gravity map of the Monterey 30 x 60 minute quadrangle and adjacent areas, California
Langenheim, V.E.; Stiles, S.R.; Jachens, R.C.
2002-01-01
The digital dataset consists of one file (monterey_100k.iso) containing 2,385 gravity stations. The file, monterey_100k.iso, contains the principal facts of the gravity stations, with one point coded per line. The format of the data is described below. Each gravity station has a station name, location (latitude and longitude, NAD27 projection), elevation, and an observed gravity reading. The data are on the IGSN71 datum and the reference ellipsoid is the Geodetic Reference System 1967 (GRS67). The free-air gravity anomalies were calculated using standard formulas (Telford and others, 1976). The Bouguer, curvature, and terrain corrections were applied to the free-air anomaly at each station to determine the complete Bouguer gravity anomalies at a reduction density of 2.67 g/cc. An isostatic correction was then applied to remove the long-wavelength effect of deep crustal and/or upper mantle masses that isostatically support regional topography.
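For readers unfamiliar with the reduction chain named here, the sketch below computes theoretical gravity on the GRS67 ellipsoid, a free-air anomaly, and a simple Bouguer slab correction at 2.67 g/cc using the usual textbook constants (e.g. Telford); the station values are invented, and the curvature, terrain, and isostatic terms applied in the dataset are omitted for brevity.

    # Sketch of the standard gravity reduction chain: GRS67 theoretical
    # gravity, free-air anomaly, and simple Bouguer (infinite-slab) correction.
    import math

    def grs67_gravity(lat_deg):
        """Theoretical gravity (mGal) on the GRS67 reference ellipsoid."""
        s2 = math.sin(math.radians(lat_deg)) ** 2
        return 978031.846 * (1.0 + 0.005278895 * s2 + 0.000023462 * s2 * s2)

    def free_air_anomaly(g_obs, lat_deg, elev_m):
        # 0.3086 mGal/m is the standard free-air gradient
        return g_obs - grs67_gravity(lat_deg) + 0.3086 * elev_m

    def simple_bouguer(faa, elev_m, density=2.67):
        # 0.04193 * density (g/cc) mGal/m is the Bouguer slab term
        return faa - 0.04193 * density * elev_m

    faa = free_air_anomaly(g_obs=979800.0, lat_deg=36.6, elev_m=250.0)
    print(f"FAA = {faa:.2f} mGal, SBA = {simple_bouguer(faa, 250.0):.2f} mGal")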
20 Years of persistent identifiers - Which systems are here to stay?
NASA Astrophysics Data System (ADS)
Klump, Jens; Huber, Robert; Lehnert, Kerstin
2016-04-01
Web-based persistent identifiers have been around for more than 20 years, a period long enough to start observing patterns of success and failure. Persistent identifiers were invented to address challenges arising from the distributed and disorganised nature of the internet, which not only allowed new technologies to emerge but also made it difficult to maintain a persistent record of science. Persistent identifiers now allow unambiguous identification of resources on the net. The expectation was that persistent identifiers would lead to greater accessibility, transparency and reproducibility of research results. Over the past two decades a number of persistent identifier systems have been built, one of them being Digital Object Identifiers (DOIs). While DOIs were originally invented by the publishing industry, they quickly became an established way to identify research resources. At first, these resources were scholarly literature and related materials. Other identifier systems, some of them using DOI as an example, were developed as grass-roots efforts by the scientific community. The concept of using persistent identifiers has since been expanded to other, non-textual resources, like datasets (DOI, EPIC) and geological specimens (IGSN), and more recently to authors and contributors of scholarly works (ORCID), and to software and instruments. A common witticism states that "a great thing about standards is that there are so many to choose from." Setting up identifier systems is technically trivial; the real challenge lies in creating a governance system for the respective identifiers. Which systems will stand the test of time? Drawing on data from the Registry of Research Data Repositories (re3data.org) and our own experience in the field, this presentation looks at the history and adoption of existing identifier systems and how this gives us some indications of the factors influencing their sustainability.
ODM2 Admin Pilot Project- a Data Management Application for Observations of the Critical Zone.
NASA Astrophysics Data System (ADS)
Leon, M.; McDowell, W. H.; Mayorga, E.; Setiawan, L.; Hooper, R. P.
2017-12-01
ODM2 Admin is a tool to manage data stored in a relational database using the Observation Data Model 2 (ODM2) information model. Originally developed by the Luquillo Critical Zone Observatory (CZO) to manage a wide range of Earth observations, it has now been deployed in six projects: the Catalina Jemez CZO, the Dry Creek Experimental Forest, the Au Sable and Manistee River sites managed by Michigan State, the Tropical Response to Altered Climate Experiment (TRACE), and the Critical Zone Integrative Microbial Ecology Activity (CZIMEA) EarthCube project; most of these deployments are hosted on a Microsoft Azure cloud server managed by CUAHSI. ODM2 Admin is a web application built on the open-source Python Django framework and available for download from GitHub and DockerHub. It provides tools for data ingestion, editing, QA/QC, data visualization, browsing, mapping, and documentation of equipment deployments, methods, and citations. Additional features include the ability to generate derived data values, automatically or manually create data annotations, and create datasets from arbitrary groupings of results. Over 22 million time series values for more than 600 time series are being managed with ODM2 Admin across the six projects, as well as more than 12,000 soil profiles and other measurements. ODM2 Admin links with external identifier systems through DOIs, ORCIDs and IGSNs, so cited works, details about researchers, and Earth sample metadata can be accessed directly from ODM2 Admin. This application is part of a growing open-source ODM2 application ecosystem under active development. ODM2 Admin can be deployed alongside other tools from the ODM2 ecosystem, including ODM2API and WOFpy, which provide access to the underlying ODM2 data through a Python API and Water One Flow web services.
Open Core Data: Connecting scientific drilling data to scientists and community data resources
NASA Astrophysics Data System (ADS)
Fils, D.; Noren, A. J.; Lehnert, K.; Diver, P.
2016-12-01
Open Core Data (OCD) is an innovative, efficient, and scalable infrastructure for data generated by scientific drilling and coring, designed to improve discoverability, accessibility, citability, and preservation of data from the oceans and continents. OCD builds on existing community data resources that manage, store, publish, and preserve scientific drilling data, filling a critical void that currently prevents linkages between these and other data systems and tools and so realizing the full potential of data generated through drilling and coring. We are developing this functionality through Linked Open Data (LOD) and semantic patterns that enable data access through the use of community ontologies such as GeoLink (geolink.org, an EarthCube Building Block), a collection of protocols, formats and vocabularies from a set of participating geoscience repositories. Common shared classes such as cruise, dataset, and person allow easier resolution of common references through shared resource IDs. These graphs are made available via SPARQL as well as incorporated into web pages following schema.org approaches. Additionally, the W3C PROV vocabulary is under evaluation for the documentation of provenance. Further, the application of persistent identifiers for samples (IGSNs); datasets, expeditions, and projects (DOIs); and people (ORCIDs), combined with LOD approaches, provides methods to resolve and incorporate metadata and datasets. Application Programming Interfaces (APIs) complement these semantic approaches to the OCD data holdings. APIs are exposed following the Swagger guidelines (swagger.io) and will evolve toward the OpenAPI (openapis.org) approach. Currently APIs are in development for the NSF-funded Flyover Country mobile geoscience app (fc.umn.edu), the Neotoma Paleoecology Database (neotomadb.org), the Magnetics Information Consortium (MagIC; earthref.org/MagIC), and other community tools and data systems, as well as for internal OCD use.
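A minimal sketch of the schema.org-style markup that pages like OCD's can embed for a dataset; all identifiers are examples, and the GeoLink classes referenced above are richer than this plain Dataset record.

    # Sketch: schema.org JSON-LD for a drilling dataset. Identifiers are
    # invented; real OCD landing pages carry fuller descriptions.
    import json

    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": "Core description, Expedition 999 Site X001",  # hypothetical
        "identifier": "https://doi.org/10.1234/example",
        "creator": {"@type": "Person",
                    "identifier": "https://orcid.org/0000-0000-0000-0000"},
        "isBasedOn": "igsn:EX0001A1",   # sample the data derive from
    }
    print(json.dumps(dataset, indent=2))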
Gravimetric investigations on the North American Datum (1972 - 1973)
NASA Technical Reports Server (NTRS)
Mather, R. S.
1975-01-01
All the available unclassified gravity data on the North American Datum (NAD) and in the surrounding oceans were assembled late in 1972 for an investigation of the gravity field in North America and its relation to the North American Datum 1927 (NAD 27). The gravity data in Canada and the United States were compiled on a common datum compatible with the International Gravity Standardization Net 1971 (IGSN 71). The variation in the error of representation in the region is studied, along with the correlation characteristics of gravity anomalies with elevation. A free-air geoid (FAG 73) was computed from a combination of surface gravity data and Goddard Earth Model (GEM) 4, and this was used as the basis for the computation of the non-Stokesian contributions to the height anomaly. The geocentric orientation parameters obtained by this astrogravimetric method are compared with those obtained by satellite techniques. The differences are found to be no greater than those between individual satellite solutions. The differences between the astrogravimetric solution and satellite solutions GSFC 73 and GEM 6 are studied in detail with a view to obtaining a better understanding of these discrepancies.
Architecture for the Interdisciplinary Earth Data Alliance
NASA Astrophysics Data System (ADS)
Richard, S. M.
2016-12-01
The Interdisciplinary Earth Data Alliance (IEDA) is leading an EarthCube (EC) Integrative Activity to develop a governance structure and technology framework that enables partner data systems to share technology, infrastructure, and practice for documenting, curating, and accessing heterogeneous geoscience data. The IEDA data facility provides capabilities in an extensible framework that enables domain-specific requirements for each partner system in the Alliance to be integrated into standardized cross-domain workflows. The shared technology infrastructure includes a data submission hub, a domain-agnostic file-based repository, an integrated Alliance catalog and a Data Browser for data discovery across all partner holdings, as well as services for registering identifiers for datasets (DOI) and samples (IGSN). The submission hub will be a platform that facilitates acquisition of cross-domain resource documentation and channels users into domain- and resource-specific workflows tailored for each partner community. We are exploring an event-based message bus architecture with a standardized plug-in interface for adding capabilities. This architecture builds on the EC CINERGI metadata pipeline as well as the message-based architecture of the SEAD project. Plug-in components will perform file introspection to match entities to a data type registry (extending EC Digital Crust and Research Data Alliance work), extract standardized keywords (using CINERGI components), and extract location, cruise, personnel, and other metadata linkage information (building on GeoLink and existing IEDA partner components). The submission hub will feed submissions to appropriate partner repositories and service endpoints targeted by domain and resource type for distribution. The Alliance governance will adopt patterns (vocabularies, operations, resource types) for self-describing data services using standard HTTP protocol for simplified data access (building on EC GeoWS and other 'RESTful' approaches). Exposure of resource descriptions (datasets and service distributions) for harvesting by commercial search engines as well as geoscience-data-focused crawlers (like the EC B-Cube crawler) will increase discoverability of IEDA resources with minimal effort by curators.
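A minimal sketch of an event-based message bus with a standardized plug-in interface of the kind described above; the topic names and plug-in API are invented for illustration, not IEDA's design.

    # Sketch: a tiny publish/subscribe bus with a plug-in interface.
    from collections import defaultdict

    class MessageBus:
        def __init__(self):
            self._subscribers = defaultdict(list)

        def subscribe(self, topic, plugin):
            self._subscribers[topic].append(plugin)

        def publish(self, topic, payload):
            # deliver the event to every plug-in registered for this topic
            for plugin in self._subscribers[topic]:
                plugin.handle(topic, payload)

    class KeywordExtractor:
        """Example plug-in: attach standardized keywords to a submission."""
        def handle(self, topic, payload):
            payload.setdefault("keywords", []).append("geochemistry")
            print("keywords now:", payload["keywords"])

    bus = MessageBus()
    bus.subscribe("submission.received", KeywordExtractor())
    bus.publish("submission.received", {"file": "data.csv"})

The appeal of this shape is that new capabilities (identifier minting, keyword extraction, location parsing) can be added by registering another plug-in rather than by modifying the hub itself.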
DOIDB: Reusing DataCite's search software as metadata portal for GFZ Data Services
NASA Astrophysics Data System (ADS)
Elger, K.; Ulbricht, D.; Bertelmann, R.
2016-12-01
GFZ Data Services is the central service point for the publication of research data at the Helmholtz Centre Potsdam GFZ German Research Centre for Geosciences (GFZ). It provides data publishing services to scientists of GFZ, associated projects, and associated institutions. The publishing services aim to make research data and physical samples visible and citable by assigning persistent identifiers (DOI, IGSN) and by complementing existing IT infrastructure. To integrate several research domains, a modular software stack made of free software components has been created to manage data and metadata as well as to register persistent identifiers [1]. The pivotal component for the registration of DOIs is the DOIDB. It has been derived from three software components provided by DataCite [2] that moderate the registration of DOIs and the deposition of metadata, allow the dissemination of metadata, and provide a user interface to navigate and discover datasets. The DOIDB acts as a proxy to the DataCite infrastructure and, in addition to the DataCite metadata schema, allows metadata to be deposited and disseminated following the ISO 19139 and NASA GCMD DIF schemas. The search component has been modified to meet the requirements of a geosciences metadata portal. In particular, it has been altered to make use of Apache Solr's capability to index and query spatial coordinates. Furthermore, the user interface has been adjusted to provide a first impression of the data by showing a map, summary information and subjects. DOIDB and its components are available on GitHub [3]. We present a software solution for the registration of DOIs that allows the integration of existing data systems, keeps track of registered DOIs, and provides a metadata portal to discover datasets [4]. [1] Ulbricht, D.; Elger, K.; Bertelmann, R.; Klump, J. panMetaDocs, eSciDoc, and DOIDB - An Infrastructure for the Curation and Publication of File-Based Datasets for GFZ Data Services. ISPRS Int. J. Geo-Inf. 2016, 5, 25. http://doi.org/10.3390/ijgi5030025 [2] https://github.com/datacite [3] https://github.com/ulbricht/search/tree/doidb, https://github.com/ulbricht/mds/tree/doidb, https://github.com/ulbricht/oaip/tree/doidb [4] http://doidb.wdc-terra.org
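The proxy role described above might look roughly like the following two-step deposit-then-register exchange, modeled on DataCite's MDS-style services; the endpoint, credentials, and DOI are placeholders, not the actual DOIDB interface.

    # Sketch: register a DOI through a hypothetical MDS-style proxy:
    # (1) deposit the metadata record, (2) point the DOI at a landing page.
    import requests

    BASE = "https://doidb.example.org"     # placeholder proxy endpoint
    AUTH = ("datacentre", "password")      # placeholder credentials

    def register(doi, url, datacite_xml):
        # 1) deposit the metadata record
        r = requests.post(BASE + "/metadata", data=datacite_xml.encode("utf-8"),
                          headers={"Content-Type": "application/xml"}, auth=AUTH)
        r.raise_for_status()
        # 2) mint/point the DOI at its landing page
        body = f"doi={doi}\nurl={url}"
        r = requests.put(BASE + "/doi/" + doi, data=body,
                         headers={"Content-Type": "text/plain"}, auth=AUTH)
        r.raise_for_status()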
NASA Astrophysics Data System (ADS)
Walker, J. D.; Ash, J. M.; Bowring, J.; Bowring, S. A.; Deino, A. L.; Kislitsyn, R.; Koppers, A. A.
2009-12-01
One of the most onerous tasks in the rigorous development of data reporting and databases for geochronological and thermochronological studies is to fully capture all of the metadata needed to completely document both the analytical work and the interpretation effort. This information is available in the data reduction programs used by researchers, but has proven difficult to harvest into either publications or databases. For this reason, the EarthChem and EARTHTIME efforts are collaborating to foster the next generation of data management and discovery for age information by integrating data reporting with data reduction. EarthChem is a community-driven effort to facilitate the discovery, access, and preservation of geochemical data of all types and to support research and enable new and better science. EARTHTIME is also a community-initiated project, whose aim is to foster the next generation of high-precision geochronology and thermochronology. In addition, collaboration with the CRONUS effort for cosmogenic radionuclides is in progress. EarthChem workers have met with groups working on the Ar-Ar, U-Pb, and (U-Th)/He systems to establish data reporting requirements as well as XML schemas to be used for transferring data from reduction programs to the database. At present, we have prototype systems working for the U-Pb_Redux, ArArCalc, MassSpec, and Helios programs. In each program, the user can choose to upload data and metadata to the GEOCHRON system hosted at EarthChem. There are two additional requirements for upload. The first is a unique identifier (IGSN), obtained from the SESAR system either manually or via web services built into the reduction program. The second is that the user selects whether the sample is to be available for discovery (public) or remain hidden (private). Data at the GEOCHRON portal can be searched using age, method, mineral, or location parameters. Data can be downloaded in the full XML format for ingestion back into the reduction program or as abbreviated tables.
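A sketch of the kind of XML payload a reduction program might upload, carrying the IGSN and the public/private flag discussed above; the element names are invented and do not reproduce the actual GEOCHRON/EARTHTIME schemas.

    # Sketch: build an illustrative age-record XML payload for upload.
    import xml.etree.ElementTree as ET

    rec = ET.Element("ageRecord")
    ET.SubElement(rec, "igsn").text = "EX0001A1S1"     # identifier from SESAR
    ET.SubElement(rec, "method").text = "U-Pb"
    ET.SubElement(rec, "mineral").text = "zircon"
    ET.SubElement(rec, "age", units="Ma", uncertainty="0.12").text = "251.90"
    ET.SubElement(rec, "visibility").text = "private"  # hidden until released

    print(ET.tostring(rec, encoding="unicode"))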
NASA Astrophysics Data System (ADS)
Klump, J. F.; Ulbricht, D.; Conze, R.
2014-12-01
The Continental Deep Drilling Programme (KTB) was a scientific drilling project from 1987 to 1995 near Windischeschenbach, Bavaria. The main super-deep borehole reached a depth of 9,101 meters into the Earth's continental crust. The project used the most current equipment for data capture and processing. After the end of the project, key data were disseminated through the web portal of the International Continental Scientific Drilling Program (ICDP), and the scientific reports were published as printed volumes. As similar projects have also experienced, it becomes increasingly difficult to maintain a data portal over a long time; changes in software and underlying hardware make a migration of the entire system inevitable. Around 2009, the data presented on the ICDP web portal were migrated to the Scientific Drilling Database (SDDB) and published through DataCite using Digital Object Identifiers (DOIs) as persistent identifiers. The SDDB portal used a relational database with a complex data model to store data and metadata, and a PHP-based content management system with custom modifications made it possible to navigate and browse datasets using the metadata and then download them. The data repository software eSciDoc allows self-contained packages consistent with the OAIS reference model to be stored; each package consists of binary data files and XML metadata. Using a REST API, the packages can be stored in the eSciDoc repository and searched using the XML metadata. During the last maintenance cycle of the SDDB, the data and metadata were migrated into the eSciDoc repository. Discovery metadata were generated following the GCMD-DIF, ISO 19115 and DataCite schemas. The eSciDoc repository allows an arbitrary number of XML metadata records to be stored with each data object. In addition to descriptive metadata, each data object may contain pointers to related materials, such as IGSN metadata to link datasets to physical specimens, or identifiers of literature interpreting the data. Datasets are presented by XSLT stylesheet transformation using the stored metadata. The presentation shows several migration cycles of data and metadata, which were driven by aging software systems. Currently the datasets reside as self-contained entities in a repository system that is ready for digital preservation.
Gravity and isostatic anomaly maps of Greece produced
NASA Astrophysics Data System (ADS)
Lagios, E.; Chailas, S.; Hipkin, R. G.
A gravity anomaly map of Greece was first compiled in the early 1970s [Makris and Stavrou, 1984] from all available gravity data collected by different Hellenic institutions. However, to compose this map the data had to be smoothed to the point that many of the smaller-wavelength gravity anomalies were lost. New work begun in 1987 has resulted in the publication of an updated map [Lagios et al., 1994] and an isostatic anomaly map derived from it. The gravity data cover the area between east longitudes 19° and 27° and north latitudes 32° and 42°, organized in files of 100-km squares and grouped in 10-km squares using UTM zone 34 coordinates. Most of the data on land come from the gravity observations of Makris and Stavrou [1984], with additional data from the Institute of Geology and Mining Exploration, the Public Oil Corporation of Greece, and Athens University. These data were checked using techniques similar to those used in compiling the gravity anomaly map of the United States, but the horizontal gradient was used as a check rather than the gravity difference. Marine data were digitized from the maps of Morelli et al. [1975a, 1975b]. All gravity anomaly values are referred to the IGSN-71 system, reduced with the standard Bouguer density of 2.67 Mg/m3. We estimate the errors of the anomalies in the continental part of Greece to be ±0.9 mGal; this is expected to be smaller over fairly flat regions. For stations whose height has been determined by leveling, the error is only ±0.3 mGal. For the marine areas, the errors are about ±5 mGal [Morelli, 1990].
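A small sketch of the horizontal-gradient screening mentioned above: station pairs implying an unphysically large gradient are flagged for checking. The threshold and data are invented for illustration.

    # Sketch: flag station pairs whose implied horizontal gravity gradient
    # exceeds a screening threshold. Values are invented.
    import numpy as np

    # columns: easting (km), northing (km), Bouguer anomaly (mGal)
    stations = np.array([[0.0, 0.0, -35.2],
                         [1.0, 0.2, -34.8],
                         [2.1, 0.1, -12.0],    # suspect value
                         [3.0, 0.0, -34.1]])

    MAX_GRADIENT = 10.0   # mGal/km, illustrative screening threshold
    for i, (x, y, g) in enumerate(stations):
        for j, (x2, y2, g2) in enumerate(stations):
            if j <= i:
                continue
            dist = np.hypot(x2 - x, y2 - y)
            if dist > 0 and abs(g2 - g) / dist > MAX_GRADIENT:
                print(f"check stations {i} and {j}: "
                      f"{abs(g2 - g) / dist:.1f} mGal/km over {dist:.1f} km")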
7 CFR 58.244 - Number of samples.
Code of Federal Regulations, 2013 CFR
2013-01-01
... 7 Agriculture 3 2013-01-01 2013-01-01 false Number of samples. 58.244 Section 58.244 Agriculture... Procedures § 58.244 Number of samples. As many samples shall be taken from each dryer production lot as is necessary to assure proper composition and quality control. A sufficient number of representative samples...
7 CFR 58.244 - Number of samples.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 3 2010-01-01 2010-01-01 false Number of samples. 58.244 Section 58.244 Agriculture... Procedures § 58.244 Number of samples. As many samples shall be taken from each dryer production lot as is necessary to assure proper composition and quality control. A sufficient number of representative samples...
Effect of finite particle number sampling on baryon number fluctuations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steinheimer, Jan; Koch, Volker
The effects of finite particle number sampling on the net baryon number cumulants, extracted from fluid dynamical simulations, are studied. The commonly used finite particle number sampling procedure introduces an additional Poissonian (or multinomial if global baryon number conservation is enforced) contribution which increases the extracted moments of the baryon number distribution. If this procedure is applied to a fluctuating fluid dynamics framework, one severely overestimates the actual cumulants. We show that the sampling of so-called test particles suppresses the additional contribution to the moments by at least one power of the number of test particles. We demonstrate this method in a numerical fluid dynamics simulation that includes the effects of spinodal decomposition due to a first-order phase transition. Furthermore, in the limit where antibaryons can be ignored, we derive analytic formulas which capture exactly the effect of particle sampling on the baryon number cumulants. These formulas may be used to test the various numerical particle sampling algorithms.
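The effect is easy to reproduce numerically. In the sketch below, Poisson sampling of event-wise means inflates the variance by the mean multiplicity, and sampling test particles (then rescaling) suppresses the extra term by one power of the test-particle number, as the abstract states; the distribution parameters are arbitrary inventions.

    # Sketch: Poisson sampling adds <N> to the variance of event-wise means;
    # n_test test particles suppress the extra term by a factor 1/n_test.
    import numpy as np

    rng = np.random.default_rng(0)
    events = 200_000
    means = rng.normal(20.0, 2.0, events).clip(min=0)  # fluid-dynamical means

    direct = rng.poisson(means)                  # one particle ensemble/event
    n_test = 50
    test = rng.poisson(means * n_test) / n_test  # test-particle sampling

    print("true variance of means     :", means.var())
    print("variance, direct sampling  :", direct.var())   # ~ var + <N>
    print("variance, 50 test particles:", test.var())     # ~ var + <N>/50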
Subrandom methods for multidimensional nonuniform sampling.
Worley, Bradley
2016-08-01
Methods of nonuniform sampling that utilize pseudorandom number sequences to select points from a weighted Nyquist grid are commonplace in biomolecular NMR studies, due to the beneficial incoherence introduced by pseudorandom sampling. However, these methods require the specification of a non-arbitrary seed number in order to initialize a pseudorandom number generator. Because the performance of pseudorandom sampling schedules can substantially vary based on seed number, this can complicate the task of routine data collection. Approaches such as jittered sampling and stochastic gap sampling are effective at reducing random seed dependence of nonuniform sampling schedules, but still require the specification of a seed number. This work formalizes the use of subrandom number sequences in nonuniform sampling as a means of seed-independent sampling, and compares the performance of three subrandom methods to their pseudorandom counterparts using commonly applied schedule performance metrics. Reconstruction results using experimental datasets are also provided to validate claims made using these performance metrics.
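One way to build a seed-free schedule of the general kind discussed here is an additive-recurrence (golden-ratio) subrandom sequence mapped through a weighted density via the inverse CDF; this sketch illustrates the idea and is not the paper's exact method.

    # Sketch: a deterministic, seed-free nonuniform sampling schedule from a
    # golden-ratio subrandom sequence and an exponential sampling density.
    import numpy as np

    def subrandom_schedule(grid_size=128, n_points=32, decay=2.0):
        # additive-recurrence (Kronecker) sequence on [0, 1): no seed needed
        u = (np.arange(1, n_points + 1) * 0.6180339887498949) % 1.0
        # inverse-CDF map onto an exponentially decaying density on [0, 1]
        x = -np.log(1.0 - u * (1.0 - np.exp(-decay))) / decay
        idx = np.unique(np.minimum((x * grid_size).astype(int), grid_size - 1))
        return idx

    print(subrandom_schedule())

Because the sequence is deterministic, two spectrometers running this schedule generator produce identical point sets, removing the seed-to-seed performance variability the abstract describes.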
Condition Survey and Paver Implementation Davis-Monthan Air Force Base, Arizona
1991-02-01
questions that the program asks, and then analysis results are produced based on those responses. The analysis reports can only be generated using the... [Scanned-report fragment: per-section pavement condition summaries, e.g. survey date 09/01/89, PCI = 36, rating = poor, with fields for riding, safety, drainage, shoulders, and overall condition, plus the total number of samples in each section, the number of samples surveyed, and the recommended samples to be surveyed.]
NASA Astrophysics Data System (ADS)
KIM, H.; Suk, M. K.; Jung, S. A.; Park, J. S.; Ko, J. S.
2016-12-01
The data quality of dual-polarimetric weather radar depends on radar scanning strategies such as pulse length, pulse repetition frequency (PRF), antenna scan speed, and sampling number. The quality of radar moment data increases with increasing sampling number at a given PRF and pulse length, while the feasible number of elevation angles for a given time decreases (equivalently, the time required for a radar volume scan increases) at relatively high sampling numbers. For operational weather radar, the sampling number is determined subjectively by an experienced radar operator, and choosing a suitable sampling number remains challenging for operational dual-polarimetric weather radar. In this study, we analyzed the sensitivity of polarimetric measurements to sampling number in a dedicated radar experiment for rainfall and snowfall events using the S-band dual-polarimetric radar (YIT) at the Yong-In test bed. For this experiment, the YIT radar transmitted simultaneously in horizontal and vertical polarization with a pulse length of 1.0 μs and a single PRF of 600 Hz. The beam width and gate size were 1.0° and 250 m, respectively. The volume scan was composed of three PPI scans with sampling numbers (antenna scan speeds) of 40 (15°s-1), 60 (10°s-1), and 85 (7°s-1) at the same elevation angle (0.2°). We first investigated the spatial fluctuation of the polarimetric measurements for the three sampling numbers using radial texture; as the sampling number increases, the radial fluctuations of the polarimetric measurements decrease. Second, we examined the sensitivity of a fuzzy-logic-based quality control algorithm for dual-polarimetric radar (Ye et al. 2015). The probability density functions (PDFs) of the fuzzy logic feature parameters were compared between ground clutter and meteorological echo areas. The overlap between the two PDFs increases with decreasing sampling number, and as the overlap increases, the classification of ground clutter (or meteorological echo) by the fuzzy logic classifier becomes more difficult because ground clutter and meteorological echoes show increasingly similar characteristics.
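A minimal sketch of one common formulation of radial texture, the mean absolute gate-to-gate difference along a ray; the study's exact statistic may differ, and the values are invented.

    # Sketch: radial texture as the mean absolute difference between
    # adjacent range gates along a single ray. Values are invented.
    import numpy as np

    def radial_texture(ray):
        """Mean absolute difference between adjacent gates along one ray."""
        ray = np.asarray(ray, dtype=float)
        return np.nanmean(np.abs(np.diff(ray)))

    smooth_ray = [30.1, 30.4, 30.2, 30.6, 30.3]   # e.g. high sampling number
    noisy_ray = [28.0, 33.5, 27.1, 34.0, 29.2]    # e.g. low sampling number
    print(radial_texture(smooth_ray), radial_texture(noisy_ray))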
Ooi, Delicia Shu Qin; Tan, Verena Ming Hui; Ong, Siong Gim; Chan, Yiong Huak; Heng, Chew Kiat; Lee, Yung Seng
2017-01-01
The human salivary amylase (AMY1) gene, encoding salivary α-amylase, shows copy number variation (CNV) in the human genome. We aimed to determine whether real-time quantitative polymerase chain reaction (qPCR) and the more recently available Droplet Digital PCR (ddPCR) can provide a precise quantification of the AMY1 gene copy number in blood, buccal cell and saliva samples derived from the same individual. Seven participants were recruited and DNA was extracted from the blood, buccal cell and saliva samples provided by each participant. TaqMan assay real-time qPCR and ddPCR were conducted to quantify AMY1 gene copy numbers. Statistical analysis was carried out to determine the difference in AMY1 gene copy number between the different biological specimens and different assay methods. We found a significant within-individual difference (p<0.01) in AMY1 gene copy number between different biological samples as determined by qPCR. However, there was no significant within-individual difference in AMY1 gene copy number between different biological samples as determined by ddPCR. We also found that AMY1 gene copy numbers of blood samples were comparable between qPCR and ddPCR, while there was a significant difference (p<0.01) between AMY1 gene copy numbers measured by qPCR and ddPCR for both buccal swab and saliva samples. Although buccal cells and saliva samples are possible sources of DNA, it is pertinent that ddPCR or a single biological sample, preferably a blood sample, be used for determining highly polymorphic gene copy numbers like AMY1, due to the large within-individual variability between different biological samples if real-time qPCR is employed.
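For context, ddPCR copy-number calls rest on simple Poisson statistics: the positive-droplet fraction gives the mean number of target molecules per droplet, and the copy number is twice the target/reference ratio for a two-copy reference gene. The droplet counts below are invented.

    # Sketch: estimate a copy number from ddPCR droplet counts via Poisson
    # statistics. Counts are invented for illustration.
    import math

    def ddpcr_lambda(positive, total):
        """Mean target molecules per droplet from the positive fraction."""
        return -math.log(1.0 - positive / total)

    amy1 = ddpcr_lambda(positive=9000, total=15000)   # AMY1 droplets
    ref = ddpcr_lambda(positive=3000, total=15000)    # 2-copy reference gene

    copy_number = 2.0 * amy1 / ref
    print(f"estimated AMY1 copy number: {copy_number:.1f}")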
40 CFR 257.23 - Ground-water sampling and analysis requirements.
Code of Federal Regulations, 2014 CFR
2014-07-01
... parameters shall be determined after considering the number of samples in the background data base, the data distribution, and the range of the ... of § 257.22(a)(1). (f) The number of samples collected to establish ground-water quality data must be...
40 CFR 257.23 - Ground-water sampling and analysis requirements.
Code of Federal Regulations, 2012 CFR
2012-07-01
... parameters shall be determined after considering the number of samples in the background data base, the data distribution, and the range of the ... of § 257.22(a)(1). (f) The number of samples collected to establish ground-water quality data must be...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-04
... personnel and law enforcement officers. For Confidential Informants: Photograph of the individual; Fingerprints; Handwriting sample; Identifying numbers, such as Social Security Number, Alien...
NASA Technical Reports Server (NTRS)
Rao, R. G. S.; Ulaby, F. T.
1977-01-01
The paper examines optimal sampling techniques for obtaining accurate spatial averages of soil moisture, at various depths and for cell sizes in the range 2.5-40 acres, with a minimum number of samples. Both simple random sampling and stratified sampling procedures are used to reach a set of recommended sample sizes for each depth and for each cell size. Major conclusions from statistical sampling test results are that (1) the number of samples required decreases with increasing depth; (2) when the total number of samples cannot be prespecified or the moisture in only a single layer is of interest, a simple random sampling procedure should be used, based on the observed mean and SD for data from a single field; (3) when the total number of samples can be prespecified and the objective is to measure the soil moisture profile with depth, stratified random sampling based on optimal allocation should be used; and (4) decreasing the sensor resolution cell size leads to fairly large decreases in sample sizes with stratified sampling procedures, whereas only a moderate decrease is obtained with simple random sampling procedures.
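Conclusion (3) refers to stratified sampling with optimal (Neyman) allocation, which assigns samples to strata in proportion to stratum size times stratum standard deviation. A minimal sketch, with placeholder stratum values rather than the paper's data:

```python
import numpy as np

def neyman_allocation(n_total, stratum_sizes, stratum_sds):
    """Allocate n_total samples across strata: n_h proportional to N_h * S_h.

    Note: rounding can shift the realized total by one or two samples.
    """
    weights = np.asarray(stratum_sizes, float) * np.asarray(stratum_sds, float)
    return np.round(n_total * weights / weights.sum()).astype(int)

# Hypothetical depth strata (e.g., 0-2, 2-5, 5-9 cm layers): equal sizes,
# soil-moisture variability decreasing with depth
sizes = [100, 100, 100]
sds   = [6.0, 4.0, 2.5]
print(neyman_allocation(30, sizes, sds))   # more samples go to the noisier layer
```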
Recursive algorithms for phylogenetic tree counting.
Gavryushkina, Alexandra; Welch, David; Drummond, Alexei J
2013-10-28
In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information is available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree, or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. We describe an algorithm, polynomial in the number of sampled individuals, for counting resolutions of a constraint tree, assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
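For the baseline case the authors call easy (all tips sampled at the same time point and no constraints), the number of fully ranked binary trees on n tips is the classical labeled-histories product, prod_{k=2..n} C(k,2) = n!(n-1)!/2^(n-1). A short sketch of that count (the paper's recursive algorithms for constrained and serially sampled trees are more involved):

```python
from math import comb

def ranked_tree_count(n):
    """Number of ranked binary trees (labeled histories) on n contemporaneous tips."""
    count = 1
    for k in range(2, n + 1):
        count *= comb(k, 2)   # at each coalescence event, choose which pair merges
    return count

for n in (3, 4, 5, 10):
    print(n, ranked_tree_count(n))   # 3, 18, 180, ...
```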
21 CFR 203.38 - Sample lot or control numbers; labeling of sample units.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 4 2010-04-01. Sample lot or control numbers; labeling of sample units. Section 203.38, Food and Drugs, FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED), DRUGS: GENERAL PRESCRIPTION DRUG MARKETING, Samples. § 203.38 Sample lot or control...
Preparation of Chemical Samples On Relevant Surfaces Using Inkjet Technology
2013-04-01
Subject terms: surface detection; inkjet; simulant deposition.
[Effects of sampling plot number on tree species distribution prediction under climate change].
Liang, Yu; He, Hong-Shi; Wu, Zhi-Wei; Li, Xiao-Na; Luo, Xu
2013-05-01
Based on neutral landscapes with different degrees of landscape fragmentation, this paper studied the effects of sampling plot number on the prediction of tree species distribution at the landscape scale under climate change. The tree species distribution was predicted with a coupled modeling approach that linked an ecosystem process model with a forest landscape model, and three contingent scenarios and one reference scenario of sampling plot numbers were assumed. The differences between the three scenarios and the reference scenario under different degrees of landscape fragmentation were tested. The results indicated that the effects of sampling plot number on the prediction of tree species distribution depended on the tree species' life history attributes. For generalist species, predicting their distribution at the landscape scale required more plots. Except for the extreme specialist, the degree of landscape fragmentation also affected the influence of sampling plot number on the prediction. With increasing simulation period, the effects of sampling plot number on the prediction of tree species distribution at the landscape scale could change. For generalist species, more plots are needed for long-term simulation.
Monitoring of Cryptosporidium and Giardia in Czech drinking water sources.
Dolejs, P; Ditrich, O; Machula, T; Kalousková, N; Puzová, G
2000-01-01
In Czech raw water sources for drinking water supply, Cryptosporidium was found in numbers from 0 to 7400 per 100 liters and Giardia from 0 to 485 per 100 liters. The summer floods of 1997 probably brought the highest numbers of Cryptosporidium oocysts into one of the reservoirs sampled; since then, these numbers have decreased steadily. A relatively high number of Cryptosporidium oocysts was found in one sample of treated water; repeated sampling demonstrated that this was a sporadic event. The reason for the presence of Cryptosporidium in a sample of treated drinking water is unclear and requires further study.
Considerations for throughfall chemistry sample-size determination
Pamela J. Edwards; Paul Mohai; Howard G. Halverson; David R. DeWalle
1989-01-01
Both the number of trees sampled per species and the number of sampling points under each tree are important throughfall sampling considerations. Chemical loadings obtained from an urban throughfall study were used to evaluate the relative importance of both of these sampling factors in tests for determining species' differences. Power curves for detecting...
Technical note: Alternatives to reduce adipose tissue sampling bias.
Cruz, G D; Wang, Y; Fadel, J G
2014-10-01
Understanding the mechanisms by which nutritional and pharmaceutical factors can manipulate adipose tissue growth and development in production animals has direct and indirect effects on the profitability of an enterprise. Adipocyte cellularity (number and size) is a key biological response that is commonly measured in animal science research. The variability and sampling of adipocyte cellularity within a muscle have been addressed in previous studies, but these issues have not been critically investigated in the literature. The present study evaluated two sampling techniques (random and systematic) in an attempt to minimize sampling bias and to determine the minimum number of samples, from 1 to 15, needed to represent the overall adipose tissue in the muscle. Both sampling procedures were applied to adipose tissue samples dissected from 30 longissimus muscles from cattle finished either on grass or grain. Briefly, adipose tissue samples were fixed with osmium tetroxide, and the size and number of adipocytes were determined with a Coulter Counter. These results were then fit with a finite mixture model to obtain distribution parameters for each sample. To evaluate the benefits of increasing the number of samples and the advantage of the new sampling technique, the concept of acceptance ratio was used; simply stated, the higher the acceptance ratio, the better the representation of the overall population. As expected, estimation of the overall adipocyte cellularity parameters improved greatly with both sampling techniques as the sample number increased from 1 to 15, with both techniques' acceptance ratios increasing from approximately 3 to 25%. When comparing sampling techniques, the systematic procedure slightly improved parameter estimation. The results suggest that more detailed research using other sampling techniques may provide better estimates of the minimum sampling required.
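The two sampling techniques compared can be sketched as index-selection rules: random sampling draws positions anywhere in the tissue, while systematic sampling takes a random start and a fixed stride, spreading samples evenly. An illustration with hypothetical section indices (not the study's actual protocol):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_sample(n_positions, k):
    """k positions drawn uniformly without replacement."""
    return np.sort(rng.choice(n_positions, size=k, replace=False))

def systematic_sample(n_positions, k):
    """Random start, then a fixed stride, for evenly spread coverage."""
    stride = n_positions // k
    start = rng.integers(stride)
    return start + stride * np.arange(k)

print(random_sample(150, 15))      # clustered by chance
print(systematic_sample(150, 15))  # evenly spaced
```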
Optimal sampling and quantization of synthetic aperture radar signals
NASA Technical Reports Server (NTRS)
Wu, C.
1978-01-01
Some theoretical and experimental results on optimal sampling and quantization of synthetic aperture radar (SAR) signals are presented. These include a derived theoretical relationship between the pixel signal-to-noise ratio of processed SAR images and the number of quantization bits per sampled signal, assuming homogeneous extended targets. With this relationship known, the problem of optimally allocating a fixed data bit-volume (for a specified surface area and resolution criterion) between the number of samples and the number of bits per sample can be solved. The results indicate that to achieve the best possible image quality for a fixed bit rate and a given resolution criterion, one should quantize individual samples coarsely and thereby maximize the number of multiple looks. The theoretical results are then compared with simulation results obtained by processing aircraft SAR data.
Phylogenetic Copy-Number Factorization of Multiple Tumor Samples.
Zaccaria, Simone; El-Kebir, Mohammed; Klau, Gunnar W; Raphael, Benjamin J
2018-04-16
Cancer is an evolutionary process driven by somatic mutations. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the many types of mutations in cancer and the fact that nearly all cancer sequencing is of a bulk tumor, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy-number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy-number data from multiple samples of a tumor. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that either perform deconvolution/factorization of mixed tumor samples or build phylogenetic trees assuming homogeneous tumor samples. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher resolution view of copy-number evolution of this cancer than published analyses.
2015-03-01
Algorithm: Eigenvalue Estimation of Hyperspectral Wishart Covariance Matrices from a Limited Number of Samples (ECBC-TN-067). Ben-David, Avishai (ECBC); Davidson, Charles E. (STC).
Effect of different sampling schemes on the spatial placement of conservation reserves in Utah, USA
Bassett, S.D.; Edwards, T.C.
2003-01-01
We evaluated the effect that three different sampling schemes, used to organize spatially explicit biological information, had on the spatial placement of conservation reserves in Utah, USA. The three sampling schemes consisted of a hexagon representation developed by the EPA/EMAP program (statistical basis), watershed boundaries (ecological), and the current county boundaries of Utah (socio-political). Four decision criteria were used to estimate effects: amount of area, length of edge, lowest number of contiguous reserves, and greatest number of terrestrial vertebrate species covered. A fifth evaluation criterion was the effect each sampling scheme had on the ability of the modeled conservation reserves to cover the six major ecoregions found in Utah. Of the three sampling schemes, county boundaries covered the greatest number of species, but also created the longest length of edge and the greatest number of reserves. Watersheds maximized species coverage using the least amount of area. Hexagons and watersheds provided the least amount of edge and the fewest reserves. Although there were differences in area, edge, and number of reserves among the sampling schemes, all three schemes covered all the major ecoregions in Utah and their inclusive biodiversity. © 2003 Elsevier Science Ltd. All rights reserved.
2003-01-01
... phase microextraction coupled with gas chromatography/mass spectrometry as a rapid method for field sampling and analysis of chemical warfare agents and toxic industrial chemicals.
NASA Astrophysics Data System (ADS)
Istiningrum, Reni Banowati; Saepuloh, Azis; Jannah, Wirdatul; Aji, Didit Waskito
2017-03-01
Yogyakarta is one of the patchouli oil distillation centers in Indonesia. The quality of patchouli oil greatly affects its market price; therefore, testing patchouli oil quality parameters is an important concern, in part through determination of the measurement uncertainty. This study determines the measurement uncertainty of the ester number, the acid number, and the patchouli alcohol content through a bottom-up approach. The contributors to the measurement uncertainty of the ester number are the sample mass, the blank and sample titration volumes, the molar mass of KOH, the HCl normality, and replication, while the contributors for the acid number are the sample mass, the sample titration volume, the relative molecular mass and normality of KOH, and repetition. The determination of patchouli alcohol by gas chromatography considers only repeatability as a source of measurement uncertainty, because reference materials are not available.
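In the bottom-up approach described, each contribution is expressed as a standard uncertainty and combined in quadrature. A sketch for an ester-number-style titration result of the form E = 56.1 (Vblank - Vsample) N / m, where 56.1 g/mol is the molar mass of KOH; all numeric inputs are placeholders, not the study's measurements:

```python
import math

# Hypothetical inputs as (value, standard uncertainty) pairs
V_blank  = (25.40, 0.03)     # mL, blank titration volume
V_sample = (12.15, 0.03)     # mL, sample titration volume
N_HCl    = (0.5000, 0.0008)  # normality of HCl
mass     = (2.0012, 0.0002)  # g, sample mass
u_rep_rel = 0.004            # relative uncertainty from replication

M_KOH = 56.1                 # g/mol, treated as exact here

dV = V_blank[0] - V_sample[0]
u_dV = math.hypot(V_blank[1], V_sample[1])   # difference: add in quadrature

ester_number = M_KOH * dV * N_HCl[0] / mass[0]

# Multiplicative model: combine relative standard uncertainties in quadrature
u_rel = math.sqrt((u_dV / dV) ** 2 +
                  (N_HCl[1] / N_HCl[0]) ** 2 +
                  (mass[1] / mass[0]) ** 2 +
                  u_rep_rel ** 2)

print(f"ester number = {ester_number:.1f} +/- {2 * ester_number * u_rel:.1f} (k = 2)")
```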
Effects of low sampling rate in the digital data-transition tracking loop
NASA Technical Reports Server (NTRS)
Mileant, A.; Million, S.; Hinedi, S.
1994-01-01
This article describes the performance of the all-digital data-transition tracking loop (DTTL) with coherent and noncoherent sampling using nonlinear theory. The effects of few samples per symbol and of noncommensurate sampling and symbol rates are addressed and analyzed. Their impact on the probability density and variance of the phase error are quantified through computer simulations. It is shown that the performance of the all-digital DTTL approaches its analog counterpart when the sampling and symbol rates are noncommensurate (i.e., the number of samples per symbol is an irrational number). The loop signal-to-noise ratio (SNR) (inverse of phase error variance) degrades when the number of samples per symbol is an odd integer but degrades even further for even integers.
Yang, Ning-Yan; Zhang, Quan; Li, Jin-Lu; Yang, Sheng-Hui; Shi, Qing
2014-05-01
This study aimed to evaluate the changes in related subgingival periodontopathogens among different stages of gingivitis in adolescents and to assess the relationship between periodontopathogens and the progression of periodontal inflammation. A total of 77 subgingival plaque samples from 35 adolescent individuals were divided into three groups: a gingivitis group (mild, 15 samples; moderate, 16 samples; severe, 15 samples), a chronic periodontitis group (15 samples), and a healthy group (15 samples). Real-time PCR was used to quantitate Porphyromonas gingivalis, Prevotella intermedia, Tannerella forsythensis, and Fusobacterium nucleatum in the subgingival plaque samples. All species, except for F. nucleatum, were detected in samples from the gingivitis and periodontitis groups in significantly greater numbers than in those from the healthy group (P < 0.05). Within the gingivitis groups, the numbers of P. gingivalis, T. forsythensis, and F. nucleatum in the moderate and severe gingivitis groups were significantly higher than in the mild gingivitis group (P < 0.05). After merging the moderate and severe gingivitis groups into a moderate-to-severe gingivitis group, the four periodontopathogens were detected in samples from the periodontitis group in significantly greater numbers than in those from the moderate-to-severe gingivitis group (P < 0.05). The numbers of P. gingivalis, P. intermedia, T. forsythensis, and F. nucleatum in subgingival plaque increase with the progression of periodontal inflammation in adolescents. Examination of periodontopathogen numbers in adolescents may be of some value for monitoring periodontal disease development. © 2013 BSPD, IAPD and John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The purpose of this SOP is to indicate the proper method for assigning unique Identification Numbers for all samples taken and forms used in the collection of NHEXAS Pilot Studies. All data tracking procedures were built upon these ID numbers. Inspection of these ID numbers pro...
Sadílek, David; Šťáhlavský, František; Vilímová, Jitka; Zima, Jan
2013-01-01
Variation in the number of chromosomes was revealed in 61 samples of Cimex lectularius Linnaeus, 1758 from the Czech Republic and other European countries, hosted on Myotis Kaup, 1829 (4) and Homo sapiens Linnaeus, 1758 (57). The karyotype of all the specimens of Cimex lectularius analysed contained 26 autosomes and a varying number of sex chromosomes. The number of sex chromosomes showed extensive variation, and up to 20 fragments were recorded. Altogether, 12 distinct karyotypes were distinguished. The male karyotypes consisted of 29, 30, 31, 32, 33, 34, 35, 36, 37, 40, 42 and 47 chromosomes. The females usually exhibited a number of chromosomes complementary to the number established in the males from the same sample; however, 11 polymorphic samples were revealed in which the karyotypes of females and males were not complementary to each other. The complement with 2n = 26+X1X2Y was found in 44% of the specimens and 57.4% of the bed bug samples studied. Karyotypes with higher chromosome numbers, as well as individuals with chromosomal mosaics, were usually found within samples exhibiting particularly extensive variation between individuals, and such complements were not found within samples containing a few specimens or a single specimen. The occurrence of chromosomal mosaics, with the karyotype constitution varying between cells of a single individual, was observed in five specimens (4.3%) from five samples. We assume that polymorphism caused by fragmentation of the X chromosome may result in meiotic problems, and non-disjunction can produce unbalanced gametes and result in lowered fitness of individuals carrying higher numbers of X chromosome fragments. This effect should be enhanced with an increasing number of fragments, which may explain the observed distribution pattern of individual karyotypes in the studied samples and the rarity of individuals with extremely high chromosome numbers. The assumed lowering of the fitness of individuals carrying higher numbers of X chromosome fragments could affect the population dynamics of variable populations. PMID:24455100
Improving the accuracy of livestock distribution estimates through spatial interpolation.
Bryssinckx, Ward; Ducheyne, Els; Muhwezi, Bernard; Godfrey, Sunday; Mintiens, Koen; Leirs, Herwig; Hendrickx, Guy
2012-11-01
Animal distribution maps serve many purposes, such as estimating the transmission risk of zoonotic pathogens to both animals and humans. The reliability and usability of such maps depend highly on the quality of the input data. However, decisions on how to perform livestock surveys are often based on previous work without considering the possible consequences. A better understanding of the impact of different sample designs and processing steps on the accuracy of livestock distribution estimates was acquired through iterative experiments using detailed survey data. The importance of sample size, sample design, and aggregation is demonstrated, and spatial interpolation is presented as a potential way to improve cattle number estimates. As expected, results show that an increasing sample size increased the precision of cattle number estimates, but these improvements were mainly seen when the initial sample size was relatively low (e.g. a median relative error decrease of 0.04% per sampled parish for sample sizes below 500 parishes). For higher sample sizes, the added value of further increasing the number of samples declined rapidly (e.g. a median relative error decrease of 0.01% per sampled parish for sample sizes above 500 parishes). When a two-stage stratified sample design was applied to yield more evenly distributed samples, accuracy levels were higher for low sample densities and stabilised at lower sample sizes compared to one-stage stratified sampling. Aggregating the resulting cattle number estimates yielded significantly more accurate results because under- and over-estimates average out (e.g. when aggregating cattle number estimates from subcounty to district level, P < 0.009 based on a sample of 2,077 parishes using one-stage stratified samples). During aggregation, area-weighted mean values were assigned to higher administrative unit levels. However, when this step is preceded by a spatial interpolation to fill in missing values in non-sampled areas, accuracy improves remarkably. This holds especially for low sample sizes and spatially evenly distributed samples (e.g. P < 0.001 for a sample of 170 parishes using one-stage stratified sampling and aggregation at the district level). Whether the same observations apply at a lower spatial scale should be investigated further.
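The interpolation step (filling non-sampled administrative units from nearby sampled ones before aggregating) can be illustrated with inverse-distance weighting; the abstract does not name a specific interpolator, so treat this as a stand-in:

```python
import numpy as np

def idw_fill(known_xy, known_vals, query_xy, power=2.0):
    """Inverse-distance-weighted estimate at each query point."""
    known_xy, query_xy = np.atleast_2d(known_xy), np.atleast_2d(query_xy)
    d = np.linalg.norm(query_xy[:, None, :] - known_xy[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9) ** power     # guard against zero distance
    return (w @ np.asarray(known_vals)) / w.sum(axis=1)

# Sampled parish centroids and cattle densities (hypothetical)
xy   = [(0, 0), (10, 0), (0, 10), (10, 10)]
vals = [120.0, 80.0, 150.0, 60.0]
print(idw_fill(xy, vals, [(5, 5), (1, 9)]))    # estimates for unsampled parishes
```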
De Jong, G D; Hoback, W W
2006-06-01
Carrion insect succession studies have historically used repeated sampling of single or a few carcasses to produce data, either weighing the carcasses, removing a qualitative subsample of the fauna present, or both, on every visit over the course of decomposition and succession. This study, conducted as a set of related experiments with two trials in a single season, investigated the effect that repeated sampling has on insect succession, as measured by the number of taxa collected on each visit and by community composition. Each trial lasted at least 21 days, with daily visits on the first 14 days. Rat carcasses used in this study were all placed in the field on the same day, but were then either sampled qualitatively on every visit (similar to most succession studies) or ignored until a given day of succession, when they were sampled qualitatively (a subsample) and then destructively sampled in their entirety. Carcasses sampled on every visit were in two groups: those from which only a sample of the fauna was taken, and those from which a sample of fauna was taken and the carcass was weighed for biomass determination. For the carcasses visited only once, the number of taxa in subsamples was compared to the actual number of taxa present when the carcass was destructively sampled, to determine whether the subsamples adequately represented the total carcass fauna. Data from the qualitative subsamples of those carcasses visited only once were also compared to data collected from carcasses that were sampled on every visit, to investigate the effect of repeated sampling. A total of 39 taxa were collected from carcasses during the study, and the component taxa are discussed individually in relation to their roles in succession. The number of taxa differed on only one visit between the qualitative subsamples and the actual number of taxa present, primarily because the organisms missed by the qualitative sampling were cryptic (hidden deep within body cavities) or rare (represented by only a few specimens). No differences were found between the number of taxa in qualitative subsamples from carcasses sampled repeatedly (with or without biomass determinations) and those sampled only a single time. Community composition differed considerably in later stages of decomposition, with disparate communities due primarily to small numbers of rare taxa. These results indicate that the methods used historically for determining community composition in experimental forensic entomology are generally adequate.
Sampling for area estimation: A comparison of full-frame sampling with the sample segment approach
NASA Technical Reports Server (NTRS)
Hixson, M.; Bauer, M. E.; Davis, B. J. (Principal Investigator)
1979-01-01
The author has identified the following significant results. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Evaluation of four sampling schemes involving different numbers of samples and different size sampling units shows that the precision of the wheat estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling size unit.
NASA Technical Reports Server (NTRS)
Kalayeh, H. M.; Landgrebe, D. A.
1983-01-01
A criterion which measures the quality of the estimate of the covariance matrix of a multivariate normal distribution is developed. Based on this criterion, the necessary number of training samples is predicted. Experimental results which are used as a guide for determining the number of training samples are included. Previously announced in STAR as N82-28109
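The qualitative tradeoff the note addresses is easy to reproduce in miniature: estimate a covariance matrix from n training samples and watch the estimation error shrink as n grows relative to the dimensionality. A sketch using Frobenius-norm error (the report's own criterion is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10
true_cov = np.diag(np.linspace(1.0, 3.0, p))   # assumed ground-truth covariance

for n in (15, 30, 100, 1000):
    errs = []
    for _ in range(200):  # Monte Carlo repetitions
        x = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
        sample_cov = np.cov(x, rowvar=False)
        errs.append(np.linalg.norm(sample_cov - true_cov, "fro"))
    print(f"n = {n:5d}: mean Frobenius error = {np.mean(errs):.3f}")
```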
Estimating numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population
Keating, K.A.; Schwartz, C.C.; Haroldson, M.A.; Moody, D.
2001-01-01
For grizzly bears (Ursus arctos horribilis) in the Greater Yellowstone Ecosystem (GYE), minimum population size and allowable numbers of human-caused mortalities have been calculated as a function of the number of unique females with cubs-of-the-year (FCUB) seen during a 3- year period. This approach underestimates the total number of FCUB, thereby biasing estimates of population size and sustainable mortality. Also, it does not permit calculation of valid confidence bounds. Many statistical methods can resolve or mitigate these problems, but there is no universal best method. Instead, relative performances of different methods can vary with population size, sample size, and degree of heterogeneity among sighting probabilities for individual animals. We compared 7 nonparametric estimators, using Monte Carlo techniques to assess performances over the range of sampling conditions deemed plausible for the Yellowstone population. Our goal was to estimate the number of FCUB present in the population each year. Our evaluation differed from previous comparisons of such estimators by including sample coverage methods and by treating individual sightings, rather than sample periods, as the sample unit. Consequently, our conclusions also differ from earlier studies. Recommendations regarding estimators and necessary sample sizes are presented, together with estimates of annual numbers of FCUB in the Yellowstone population with bootstrap confidence bounds.
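Among the standard nonparametric estimators of the kind compared in such studies is Chao's lower-bound estimator, which corrects the observed count of unique individuals using the numbers seen exactly once and twice. The abstract does not list the seven estimators evaluated, so the following is purely illustrative:

```python
from collections import Counter

def chao1(sightings):
    """Chao's lower-bound estimate of total population size.

    sightings: list of individual IDs, one entry per sighting.
    """
    counts = Counter(sightings)
    s_obs = len(counts)                             # unique individuals seen
    f1 = sum(1 for c in counts.values() if c == 1)  # seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)  # seen exactly twice
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0          # bias-corrected form
    return s_obs + f1 * f1 / (2.0 * f2)

# Hypothetical sighting record of females with cubs-of-the-year
print(chao1(["A", "B", "A", "C", "D", "B", "E", "F", "F", "G"]))  # ~9.7
```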
Shen, You-xin; Liu, Wei-li; Li, Yu-hui; Guan, Hui-lin
2014-01-01
A large number of small-sized samples invariably shows that woody species are absent from forest soil seed banks, leading to a large discrepancy with the seedling bank on the forest floor. We ask: 1) Does this conventional sampling strategy limit the detection of seeds of woody species? 2) Are larger sample areas and sample sizes needed for higher recovery of seeds of woody species? We collected 100 samples of 10 cm (length) × 10 cm (width) × 10 cm (depth), referred to as a larger number of small-sized samples (LNSS), in a 1 ha forest plot and placed them to germinate in a greenhouse, and we collected 30 samples of 1 m × 1 m × 10 cm, referred to as a small number of large-sized samples (SNLS), and placed them (10 each) in a nearby secondary forest, shrubland, and grassland. Only 15.7% of the woody plant species of the forest stand were detected by the 100 LNSS, contrasting with 22.9%, 37.3%, and 20.5% of woody plant species detected by SNLS in the secondary forest, shrubland, and grassland, respectively. The increase in the number of species with sampled area confirmed power-law relationships for the forest stand and for LNSS and SNLS at all three recipient sites. Our results, although based on one forest, indicate that conventional LNSS did not yield a high percentage of detection for woody species, whereas the SNLS strategy yielded a higher percentage of detection in the seed bank when samples were exposed to a better field germination environment. A 4 m2 minimum sample area derived from the power equations is larger than the area sampled in most studies in the literature. An increased sample size is also needed to obtain an increased sample area if the number of samples is to remain relatively low.
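The species-area power law S = cA^z invoked here can be fitted by linear regression in log-log space, and the minimum sample area for a target species count then follows by inversion, A = (S/c)^(1/z). A sketch with made-up area/species pairs:

```python
import numpy as np

# Hypothetical cumulative sampled areas (m^2) and species detected
area    = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 3.0])
species = np.array([3,   5,    7,   10,  14,  17])

z, log_c = np.polyfit(np.log(area), np.log(species), 1)  # fit S = c * A**z
c = np.exp(log_c)
print(f"S = {c:.1f} * A^{z:.2f}")

target_s = 25                       # species the sample should capture
min_area = (target_s / c) ** (1 / z)
print(f"minimum sample area ~ {min_area:.1f} m^2")
```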
ERIC Educational Resources Information Center
Guo, Ling-Yu; Eisenberg, Sarita
2015-01-01
Purpose: The goal of this study was to investigate the extent to which sample length affected the reliability of total number of words (TNW), number of different words (NDW), and mean length of C-units in morphemes (MLCUm) in parent-elicited conversational samples for 3-year-olds. Method: Participants were sixty 3-year-olds. A 22-min language…
Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, İrem Ersöz
2013-01-01
Objective: The aim of this study is to introduce the method of Soft Independent Modeling of Class Analogy (SIMCA) and to assess whether the method is affected by the number of independent variables, the relationships between variables, and the sample size. Study Design: Simulation study. Material and Methods: The SIMCA model is performed in two stages. Simulations were run to determine whether the method is influenced by the number of independent variables, the relationships between variables, and the sample size. Conditions were considered in which the sample sizes in both groups are equal, with 30, 100, and 1000 samples; in which the number of variables is 2, 3, 5, 10, 50, or 100; and in which the relationships between variables are strong, moderate, or weak. Results: Average classification accuracies of the simulations, each run 1000 times for every condition of the trial plan, are given as tables. Conclusion: Diagnostic accuracy increases as the number of independent variables increases. SIMCA is suited to settings where the relationships between variables are strong, the independent variables are many, and the data contain outlier values. PMID:25207065
Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, Irem Ersöz
2013-03-01
The aim of this study is to introduce the method of Soft Independent Modeling of Class Analogy (SIMCA) and to assess whether the method is affected by the number of independent variables, the relationships between variables, and the sample size. Simulation study. The SIMCA model is performed in two stages. Simulations were run to determine whether the method is influenced by the number of independent variables, the relationships between variables, and the sample size. Conditions were considered in which the sample sizes in both groups are equal, with 30, 100, and 1000 samples; in which the number of variables is 2, 3, 5, 10, 50, or 100; and in which the relationships between variables are strong, moderate, or weak. Average classification accuracies of the simulations, each run 1000 times for every condition of the trial plan, are given as tables. Diagnostic accuracy increases as the number of independent variables increases. SIMCA is suited to settings where the relationships between variables are strong, the independent variables are many, and the data contain outlier values.
Optimal number of features as a function of sample size for various classification rules.
Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R
2005-04-15
Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/ e-dougherty@ee.tamu.edu.
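The peaking phenomenon described (error falling and then rising as features are added at fixed sample size) is straightforward to reproduce for linear discriminant analysis on synthetic Gaussian data. A sketch using scikit-learn, which is an assumption here; the study's own simulation machinery is far larger:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_train, n_test, max_d = 30, 2000, 40

def make_data(n, d):
    """Two Gaussian classes; only the first 5 features carry signal."""
    y = rng.integers(0, 2, n)
    x = rng.normal(size=(n, d))
    x[:, :5] += y[:, None] * 0.8
    return x, y

x_tr, y_tr = make_data(n_train, max_d)
x_te, y_te = make_data(n_test, max_d)

for d in (2, 5, 10, 20, 40):
    clf = LinearDiscriminantAnalysis().fit(x_tr[:, :d], y_tr)
    err = 1.0 - clf.score(x_te[:, :d], y_te)
    print(f"{d:2d} features: test error = {err:.3f}")  # falls, then rises again
```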
Cryopreservation of Circulating Tumor Cells for Enumeration and Characterization.
Nejlund, Sarah; Smith, Julie; Kraan, Jaco; Stender, Henrik; Van, Mai N; Langkjer, Sven T; Nielsen, Mikkel T; Sölétormos, György; Hillig, Thore
2016-08-01
A blood sample containing circulating tumor cells (CTCs) may serve as a surrogate for metastasis in invasive cancer. Cryopreservation would provide new opportunities for managing clinical samples in the laboratory and allow collection of samples over time for future analysis of existing and upcoming cancer biomarkers. Blood samples from healthy volunteers were spiked with high (∼500) and low (∼50) numbers of tumor cells from culture. The samples were stored at -80°C with the cryopreservative dimethyl sulfoxide mixed with Roswell Park Memorial Institute 1640 medium. Flow cytometry was used to test whether cryopreservation affected specific biomarkers regularly used to detect CTCs, i.e., cytokeratin (CK) and epithelial cell adhesion molecule (EpCAM), and the white-blood-cell-specific lymphocyte common antigen (CD45). After various time intervals (up to 6 months), samples were thawed and tumor cell recovery (enumeration) was examined. Clinical samples may differ from cell line studies, so the cryopreservation protocol was also tested on 17 patients with invasive breast cancer, with two blood samples drawn from each patient. The biomarkers CK, CD45, and EpCAM were not affected by the freezing and thawing procedures. Cryopreserved samples (n = 2) spiked with a high number of tumor cells (∼500) had a ∼90% recovery compared with the spiked fresh samples. In samples spiked with lower numbers of tumor cells (median = 43 in n = 5 samples), the recovery was 63% after cryopreservation (median 27 tumor cells), p = 0.03. With an even lower number of spiked tumor cells (median = 3 in n = 8 samples), the recovery rate after cryopreservation did not seem to be affected (median = 8), p = 0.09. Time of cryopreservation did not affect recovery. When testing the effect of cryopreservation on enumeration in clinical samples, no difference was observed in the number of CTCs between the fresh and the cryopreserved samples, based on n = 17 pairs, p = 0.83; however, the variation was large. This large variation was confirmed by clinically paired fresh samples (n = 64 pairs), where 95% of the samples (<30 CTCs) varied in number by up to ±15 CTCs, p = 0.18. A small loss of CTCs after cryopreservation may be expected; however, cryopreservation of CTCs for biomarker characterization for clinical applications seems promising.
Mandal, Abhishek; Boatz, Jennifer C.; Wheeler, Travis; van der Wel, Patrick C. A.
2017-01-01
A number of recent advances in the field of magic-angle-spinning (MAS) solid-state NMR have enabled its application to a range of biological systems of ever increasing complexity. To retain biological relevance, these samples are increasingly studied in a hydrated state. At the same time, experimental feasibility requires the sample preparation process to attain a high sample concentration within the final MAS rotor. We discuss these considerations, and how they have led to a number of different approaches to MAS NMR sample preparation. We describe our experience of how custom-made (or commercially available) ultracentrifugal devices can facilitate a simple, fast and reliable sample preparation process. A number of groups have since adopted such tools, in some cases to prepare samples for sedimentation-style MAS NMR experiments. Here we argue for a more widespread adoption of their use for routine MAS NMR sample preparation. PMID:28229262
Wood, Henry M; Belvedere, Ornella; Conway, Caroline; Daly, Catherine; Chalkley, Rebecca; Bickerdike, Melissa; McKinley, Claire; Egan, Phil; Ross, Lisa; Hayward, Bruce; Morgan, Joanne; Davidson, Leslie; MacLennan, Ken; Ong, Thian K; Papagiannopoulos, Kostas; Cook, Ian; Adams, David J; Taylor, Graham R; Rabbitts, Pamela
2010-08-01
The use of next-generation sequencing technologies to produce genomic copy number data has recently been described. Most approaches, however, rely on optimal starting DNA and are therefore unsuitable for the analysis of formalin-fixed paraffin-embedded (FFPE) samples, which largely precludes the analysis of many tumour series. We have sought to challenge the limits of this technique with regard to the quality and quantity of starting material and the depth of sequencing required. We confirm that the technique can be used to interrogate DNA from cell lines, fresh frozen material, and FFPE samples to assess copy number variation. We show that as little as 5 ng of DNA is needed to generate a copy number karyogram, and follow this up with data from a series of FFPE biopsies and surgical samples. We have used various levels of sample multiplexing to demonstrate the adjustable resolution of the methodology, depending on the number of samples and available resources. We also demonstrate reproducibility by the use of replicate samples and comparison with microarray-based comparative genomic hybridization (aCGH) and digital PCR. This technique can be valuable in both the analysis of routine diagnostic samples and the examination of large repositories of fixed archival material.
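At its core, read-count-based copy-number analysis of this kind reduces to binning aligned reads, normalizing against a reference, and taking log2 ratios. A toy sketch of that core step (bin counts are invented; real pipelines add GC correction, segmentation, and more):

```python
import numpy as np

def copy_number_log2(tumor_counts, normal_counts):
    """Per-bin log2 copy-number ratio from binned read counts."""
    t = np.asarray(tumor_counts, float)
    n = np.asarray(normal_counts, float)
    t, n = t / t.sum(), n / n.sum()        # library-size normalization
    return np.log2((t + 1e-9) / (n + 1e-9))

# Hypothetical 8 genomic bins; bins 4-6 carry a gain in the tumor
tumor  = [100, 95, 105, 160, 150, 155, 98, 102]
normal = [100, 100, 100, 100, 100, 100, 100, 100]
print(np.round(copy_number_log2(tumor, normal), 2))   # ~ +0.4 in gained bins
```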
Albasan, Hasan; Lulich, Jody P; Osborne, Carl A; Lekcharoensuk, Chalermpol; Ulrich, Lisa K; Carpenter, Kathleen A
2003-01-15
To determine effects of storage temperature and time on pH and specific gravity of and number and size of crystals in urine samples from dogs and cats. Randomized complete block design. 31 dogs and 8 cats. Aliquots of each urine sample were analyzed within 60 minutes of collection or after storage at room or refrigeration temperatures (20 vs 6 degrees C [68 vs 43 degrees F]) for 6 or 24 hours. Crystals formed in samples from 11 of 39 (28%) animals. Calcium oxalate (CaOx) crystals formed in vitro in samples from 1 cat and 8 dogs. Magnesium ammonium phosphate (MAP) crystals formed in vitro in samples from 2 dogs. Compared with aliquots stored at room temperature, refrigeration increased the number and size of crystals that formed in vitro; however, the increase in number and size of MAP crystals in stored urine samples was not significant. Increased storage time and decreased storage temperature were associated with a significant increase in number of CaOx crystals formed. Greater numbers of crystals formed in urine aliquots stored for 24 hours than in aliquots stored for 6 hours. Storage time and temperature did not have a significant effect on pH or specific gravity. Urine samples should be analyzed within 60 minutes of collection to minimize temperature- and time-dependent effects on in vitro crystal formation. Presence of crystals observed in stored samples should be validated by reevaluation of fresh urine.
Generating Random Samples of a Given Size Using Social Security Numbers.
ERIC Educational Resources Information Center
Erickson, Richard C.; Brauchle, Paul E.
1984-01-01
The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)
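The article's procedure is not detailed in the abstract, but a common way to draw clusters from identification numbers is to treat the last two SSN digits as approximately uniform and select every record whose number ends in one of k randomly chosen two-digit values, for an expected sampling fraction of k/100. A hypothetical sketch along those lines:

```python
import random

def ssn_cluster_sample(records, fraction, seed=42):
    """Select records whose SSN ends in one of k random two-digit values.

    Assumes the last two digits are roughly uniform, so the expected
    fraction sampled is k/100; each chosen value defines one cluster.
    """
    k = max(1, round(fraction * 100))
    rng = random.Random(seed)
    chosen = set(rng.sample(range(100), k))
    return [r for r in records if int(r["ssn"][-2:]) in chosen]

people = [{"name": f"person{i}", "ssn": f"123-45-{i:04d}"} for i in range(1000)]
sample = ssn_cluster_sample(people, fraction=0.05)
print(len(sample))   # ~50 expected
```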
Occurrence of oral deformities in larval anurans
Drake, D.L.; Altig, R.; Grace, J.B.; Walls, S.C.
2007-01-01
We quantified deformities in the marginal papillae, tooth rows, and jaw sheaths of tadpoles from 13 population samples representing three families and 11 sites in the southeastern United States. Oral deformities were observed in all samples and in 13.5-98% of the specimens per sample. Batrachochytrium dendrobatidis (chytrid) infections were detected in three samples. There was high variability among samples in the pattern and number of discovered deformities. Pairwise associations between oral structures containing deformities were nonrandom for several populations, especially those with B. dendrobatidis infections or high total numbers of deformities. Comparisons of deformities among samples using multivariate analyses revealed that tadpole samples grouped together by family. Analyses of ordination indicated that three variables (the number of deformities, the number of significant associations among deformity types within populations, and whether populations were infected with B. dendrobatidis) were significantly correlated with the pattern of deformities. Our data indicate that the incidence of oral deformities can be high in natural populations and that phylogeny and B. dendrobatidis infection exert a strong influence on the occurrence and type of oral deformities in tadpoles. © by the American Society of Ichthyologists and Herpetologists.
Burkness, Eric C; Hutchison, W D
2009-10-01
Populations of cabbage looper, Trichoplusia ni (Lepidoptera: Noctuidae), were sampled in experimental plots and commercial fields of cabbage (Brassica spp.) in Minnesota during 1998-1999 as part of a larger effort to implement an integrated pest management program. Using a resampling approach and Wald's sequential probability ratio test, sampling plans with different sampling parameters were evaluated using independent presence/absence and enumerative data. Evaluations and comparisons of the different sampling plans were made based on the operating characteristic and average sample number functions generated for each plan, and through the use of a decision probability matrix. Values for the upper and lower decision boundaries, sequential error rates (alpha, beta), and tally threshold were modified to determine each parameter's influence on the operating characteristic and average sample number functions. The following parameters resulted in the most desirable operating characteristic and average sample number functions: an action threshold of 0.1 proportion of plants infested, a tally threshold of 1, alpha = beta = 0.1, an upper boundary of 0.15, a lower boundary of 0.05, and resampling with replacement. We found that sampling parameters can be modified and evaluated using resampling software to achieve desirable operating characteristic and average sample number functions. Moreover, management of T. ni using binomial sequential sampling should provide a good balance between cost and reliability by minimizing sample size and maintaining a high level of correct decisions (>95%) to treat or not treat.
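Wald's sequential probability ratio test for a binomial proportion yields two parallel decision lines in the (plants inspected, cumulative infested count) plane; sampling stops when the running count crosses either line. A sketch using the parameters reported above (p0 = 0.05, p1 = 0.15, alpha = beta = 0.1):

```python
import math

p0, p1 = 0.05, 0.15          # lower and upper decision boundaries
alpha = beta = 0.10          # sequential error rates

g = math.log(p1 / p0) + math.log((1 - p0) / (1 - p1))
slope = math.log((1 - p0) / (1 - p1)) / g       # shared slope of both lines
h_upper = math.log((1 - beta) / alpha) / g      # intercept of the "treat" line
h_lower = math.log(beta / (1 - alpha)) / g      # intercept of the "no treat" line

def sprt_decision(infested_flags):
    """Inspect plants one by one; stop when a decision line is crossed.

    infested_flags: 1 if a plant is infested (tally threshold of 1), else 0.
    """
    d = 0
    for n, hit in enumerate(infested_flags, start=1):
        d += hit
        if d >= h_upper + slope * n:
            return "treat", n
        if d <= h_lower + slope * n:
            return "no treatment needed", n
    return "keep sampling", len(infested_flags)

print(sprt_decision([1] + [0] * 30))   # low infestation: stop with "no treatment"
```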
7 CFR 42.143 - Operating Characteristic (OC) curves for on-line sampling and inspection.
Code of Federal Regulations, 2010 CFR
2010-01-01
ng = Number of sample units in a subgroup. T = Subgroup tolerance. L = Acceptance limit. S = Starting value. [OC-curve figures (EC02SE91) not reproduced.]
Investigating the Randomness of Numbers
ERIC Educational Resources Information Center
Pendleton, Kenn L.
2009-01-01
The use of random numbers is pervasive in today's world. Random numbers have practical applications in such far-flung arenas as computer simulations, cryptography, gambling, the legal system, statistical sampling, and even the war on terrorism. Evaluating the randomness of extremely large samples is a complex, intricate process. However, the…
78 FR 43002 - Proposed Collection; Comment Request for Revenue Procedure 2004-29
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-18
... comments concerning statistical sampling in Sec. 274 context. DATES: Written comments should be received on... INFORMATION: Title: Statistical Sampling in Sec. 274 Context. OMB Number: 1545-1847. Revenue Procedure Number: Revenue Procedure 2004-29. Abstract: Revenue Procedure 2004-29 prescribes the statistical sampling...
33. DETAILS OF SAMPLE SUPPORT FRAME ASSEMBLY, LIFTING LUG, AND ...
33. DETAILS OF SAMPLE SUPPORT FRAME ASSEMBLY, LIFTING LUG, AND SAMPLE CARRIER ROD. F.C. TORKELSON DRAWING NUMBER 842-ARVFS-701-S-5. INEL INDEX CODE NUMBER: 075 0701 60 851 151979. - Idaho National Engineering Laboratory, Advanced Reentry Vehicle Fusing System, Scoville, Butte County, ID
Performance of Copan WASP for Routine Urine Microbiology
Quiblier, Chantal; Jetter, Marion; Rominski, Mark; Mouttet, Forouhar; Böttger, Erik C.; Keller, Peter M.
2015-01-01
This study compared a manual workup of urine clinical samples with fully automated WASPLab processing. As a first step, two different inocula (1 and 10 μl) and different streaking patterns were compared using WASP and InoqulA BT instrumentation. Significantly more single colonies were produced with the 10-μl inoculum than with the 1-μl inoculum, and automated streaking yielded significantly more single colonies than manual streaking on whole plates (P < 0.001). In a second step, 379 clinical urine samples were evaluated using WASP and the manual workup. Average numbers of detected morphologies, recovered species, and CFUs per milliliter of all 379 urine samples showed excellent agreement between WASPLab and the manual workup. The percentage of urine samples clinically categorized as positive or negative did not differ between the automated and manual workflows, but within the positive samples, automated processing by WASPLab resulted in the detection of more potential pathogens. In summary, the present study demonstrates that (i) the streaking pattern, i.e., primarily the number of zigzags/length of streaking lines, is critical for optimizing the number of single colonies yielded from primary cultures of urine samples; (ii) automated streaking by the WASP instrument is superior to manual streaking regarding the number of single colonies yielded (for 32.2% of the samples); and (iii) automated streaking leads to higher numbers of detected morphologies (for 47.5% of the samples), species (for 17.4% of the samples), and pathogens (for 3.4% of the samples). The results of this study point to an improved quality of microbiological analyses and laboratory reports when using automated sample processing by WASP and WASPLab. PMID:26677255
Identifying Immune Drivers of Gulf War Illness Using a Novel Daily Sampling Approach
2017-10-01
Award number: W81XWH-12-1-0557. Title: Identifying Immune Drivers of Gulf War Illness Using a Novel Daily Sampling Approach. Introduction: The major aim of this research project is to identify aspects of the immune system that are dysregulated in veterans with Gulf War Illness.
NASA Astrophysics Data System (ADS)
Jamaluddin; Darwis, A.; Massinai, M. A.
2018-02-01
Asbuton, a natural rock asphalt, consists of granular material, usually limestone or sandstone. In its natural state it contains bitumen intimately dispersed throughout its mass, while the remainder of the material is solid mineral matter. This research was conducted in Sorowalio, Buton Regency, Southeast Sulawesi province, Indonesia. The study aims to determine the content and percentage of minerals in the rocks using X-ray fluorescence (XRF). The research method comprised a preliminary survey, sampling, and laboratory analysis. XRF reports chemical composition, including Si (quartz) and Ca (calcite). The results indicate that the elements dominating the rock samples are Fe2O3, MgO, CaO, and SiO2, i.e., four dominant metal oxides. Hematite (Fe2O3) is dominant at all sampling locations. Magnesium oxide (MgO) has its highest levels in sample number six and its lowest in sample number five. Silica (SiO2) has its highest levels in sample number six and its lowest in sample number seven. Calcium oxide (CaO) is dominant at all sampling locations. The asbuton sample contains 37.90% asphalt, 43.28% carbonate, and 18.82% other minerals.
DataSync - sharing data via filesystem
NASA Astrophysics Data System (ADS)
Ulbricht, Damian; Klump, Jens
2014-05-01
Usually, research work is a cycle of hypothesizing, collecting data, corroborating the hypothesis, and finally publishing the results. At several points in this sequence, one's own work can build on the work of others. Perhaps suitable physical samples are already listed in the IGSN registry, removing the need to go on an excursion to acquire new ones; perhaps the DataCite catalogue already lists metadata of datasets that meet the constraints of the hypothesis and are open for reappraisal. Working with the measured data to corroborate the hypothesis involves new as well as proven methods and different software tools, and this creates a body of intermediate data that can be shared with colleagues to discuss the research progress and receive a first evaluation. Consequently, the intermediate data should be versioned, so that it is easy to get back to valid intermediate data when you notice you are on the wrong track. Things are different for project managers: they want to know what is currently being done, what has been done, and what the last valid data are in case somebody has to continue the work. To make life easier for members of small science projects, we developed DataSync [1], a software tool for sharing and versioning data. DataSync is designed to synchronize directory trees between different computers of a research team over the internet. It is developed as a Java application and watches a local directory tree for changes, which are replicated as eSciDoc objects into an eSciDoc infrastructure [2] using the eSciDoc REST API. Modifications to the local filesystem automatically create a new version of an eSciDoc object inside the eSciDoc infrastructure. In this way, individual folders can be shared between team members, while project managers can get an overview of the current status by synchronizing whole project inventories. Additionally, XML metadata from separate files can be managed together with data files inside the eSciDoc objects. While DataSync's major task is to distribute directory trees, we complement its functionality with the PHP-based application panMetaDocs [3]. panMetaDocs is the successor to panMetaWorks [4] and inherits most of its functionality. Through an internet browser, panMetaDocs provides a web-based overview of the datasets inside the eSciDoc infrastructure. The software allows users to upload further data, add and edit metadata using the metadata editor, and disseminate metadata through various channels. In addition, previous versions of a file can be downloaded, and access rights can be defined on files and folders to control the visibility of files for users of both panMetaDocs and DataSync. panMetaDocs also serves as a publication agent for datasets and as a registration agent for dataset DOIs. The application stack presented here allows sharing, versioning, and central storage of data from the very beginning of project activities by using the file synchronization service DataSync, complemented by the web application panMetaDocs, which provides a dataset publication agent and other tools for administrative tasks on the data. [1] http://github.com/ulbricht/datasync [2] http://github.com/escidoc [3] http://panmetadocs.sf.net [4] http://metaworks.pangaea.de
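DataSync itself is a Java application; purely as an illustration of its watch-and-replicate pattern, here is a minimal Python sketch using the watchdog library, with the upload reduced to a stub (no real eSciDoc REST call is shown):

```python
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class ReplicateHandler(FileSystemEventHandler):
    """React to local changes; each change would become a new object version."""

    def on_created(self, event):
        if not event.is_directory:
            self.replicate(event.src_path)

    def on_modified(self, event):
        if not event.is_directory:
            self.replicate(event.src_path)

    def replicate(self, path):
        # Stub: a real client would PUT the file to the eSciDoc REST API,
        # creating a new version of the corresponding eSciDoc object.
        print(f"would replicate {path}")

# "shared_project_dir" is a hypothetical local directory that must exist
observer = Observer()
observer.schedule(ReplicateHandler(), path="shared_project_dir", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```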
NASA Technical Reports Server (NTRS)
Hixson, M. M.; Bauer, M. E.; Davis, B. J.
1979-01-01
The effect of sampling on the accuracy (precision and bias) of crop area estimates made from classifications of LANDSAT MSS data was investigated. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Four sampling schemes involving different numbers of samples and different-sized sampling units were evaluated. The precision of the wheat area estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling unit size.
Winston Paul Smith; Daniel J. Twedt; David A. Wiedenfeld; Paul B. Hamel; Robert P. Ford; Robert J. Cooper
1993-01-01
To compare efficacy of point count sampling in bottomland hardwood forests, duration of point count, number of point counts, number of visits to each point during a breeding season, and minimum sample size are examined.
Sampling Using a Fixed Number of Trees Per Plot
Hans T. Schreuder
2004-01-01
The fixed number of trees sample design proposed by Jonsson and others (1992) may be dangerous in applications if a probabilistic framework of sampling is desired. The procedure can be seriously biased. Examples are given here. Publication Web Site: http://www.fs.fed.us/rm/pubs/rmrs_rn017.html
Estimates of the Average Number of Times Students Say They Cheated
ERIC Educational Resources Information Center
Liebler, Robert
2017-01-01
Data from published studies are used to recover information about the sample mean self-reported number of times college students cheated. The sample means were estimated by fitting distributions to the reported data. The few estimated sample means thus recovered were roughly 2 or less.
Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G
2012-10-01
Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires that the presence of species in a sample be assessed, while counts of the number of individuals per species are required for only a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample, at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.
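The classical Poisson log-normal distribution that the authors modify has no closed-form probability mass function, but it is easy to evaluate numerically. The sketch below (Python; the parameter values are illustrative) computes P(K = k) by integrating the Poisson likelihood against a log-normal abundance density, and shows how the incidence probability 1 - P(K = 0), the quantity relevant for presence/absence data, falls out of the same model. This is the unmodified textbook distribution, not the authors' extended likelihood.

    import math
    from scipy import integrate

    def pln_pmf(k, mu, sigma):
        """P(K = k) for a Poisson log-normal: K ~ Poisson(L), log L ~ N(mu, sigma^2)."""
        def integrand(lam):
            if lam <= 0.0:
                return 0.0
            poisson = math.exp(-lam) * lam ** k / math.factorial(k)
            lognormal = math.exp(-(math.log(lam) - mu) ** 2 / (2 * sigma ** 2)) / (
                lam * sigma * math.sqrt(2 * math.pi))
            return poisson * lognormal
        value, _ = integrate.quad(integrand, 0.0, math.inf)
        return value

    p_absent = pln_pmf(0, mu=1.0, sigma=1.5)
    p_incidence = 1.0 - p_absent  # chance the species shows up in the sample at all
    print(round(p_incidence, 3))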
Long-Term Study of Vibrio parahaemolyticus Prevalence and Distribution in New Zealand Shellfish
Hedderley, D.; Fletcher, G. C.
2015-01-01
The food-borne pathogen Vibrio parahaemolyticus has been reported as being present in New Zealand (NZ) seawaters, but there have been no reported outbreaks of food-borne infection from commercially grown NZ seafood. Our study determined the current incidence of V. parahaemolyticus in NZ oysters and Greenshell mussels and the prevalence of V. parahaemolyticus tdh and trh strains. Pacific (235) and dredge (21) oyster samples and mussel samples (55) were obtained from commercial shellfish-growing areas between December 2009 and June 2012. Total V. parahaemolyticus numbers and the presence of pathogenic genes tdh and trh were determined using the FDA most-probable-number (MPN) method and confirmed using PCR analysis. In samples from the North Island of NZ, V. parahaemolyticus was detected in 81% of Pacific oysters and 34% of mussel samples, while the numbers of V. parahaemolyticus tdh and trh strains were low, with just 3/215 Pacific oyster samples carrying the tdh gene. V. parahaemolyticus organisms carrying tdh and trh were not detected in South Island samples, and V. parahaemolyticus was detected in just 1/21 dredge oyster and 2/16 mussel samples. Numbers of V. parahaemolyticus organisms increased when seawater temperatures were high, the season when most commercial shellfish-growing areas are not harvested. The numbers of V. parahaemolyticus organisms in samples exceeded 1,000 MPN/g only when the seawater temperatures exceeded 19°C, so this environmental parameter could be used as a trigger warning of potential hazard. There is some evidence that the total V. parahaemolyticus numbers increased compared with those reported from a previous 1981 to 1984 study, but the analytical methods differed significantly. PMID:25616790
Long-term study of Vibrio parahaemolyticus prevalence and distribution in New Zealand shellfish.
Cruz, C D; Hedderley, D; Fletcher, G C
2015-04-01
The food-borne pathogen Vibrio parahaemolyticus has been reported as being present in New Zealand (NZ) seawaters, but there have been no reported outbreaks of food-borne infection from commercially grown NZ seafood. Our study determined the current incidence of V. parahaemolyticus in NZ oysters and Greenshell mussels and the prevalence of V. parahaemolyticus tdh and trh strains. Pacific (235) and dredge (21) oyster samples and mussel samples (55) were obtained from commercial shellfish-growing areas between December 2009 and June 2012. Total V. parahaemolyticus numbers and the presence of pathogenic genes tdh and trh were determined using the FDA most-probable-number (MPN) method and confirmed using PCR analysis. In samples from the North Island of NZ, V. parahaemolyticus was detected in 81% of Pacific oysters and 34% of mussel samples, while the numbers of V. parahaemolyticus tdh and trh strains were low, with just 3/215 Pacific oyster samples carrying the tdh gene. V. parahaemolyticus organisms carrying tdh and trh were not detected in South Island samples, and V. parahaemolyticus was detected in just 1/21 dredge oyster and 2/16 mussel samples. Numbers of V. parahaemolyticus organisms increased when seawater temperatures were high, the season when most commercial shellfish-growing areas are not harvested. The numbers of V. parahaemolyticus organisms in samples exceeded 1,000 MPN/g only when the seawater temperatures exceeded 19°C, so this environmental parameter could be used as a trigger warning of potential hazard. There is some evidence that the total V. parahaemolyticus numbers increased compared with those reported from a previous 1981 to 1984 study, but the analytical methods differed significantly. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
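The FDA most-probable-number method cited in both records uses several dilution levels and published tables, but its logic is easiest to see in the single-dilution special case, where each tube is positive with probability 1 - exp(-lambda*V) and the maximum-likelihood estimate has a closed form. The Python sketch below covers only that special case, with illustrative tube counts; it is not the full FDA procedure.

    import math

    def mpn_single_dilution(n_tubes, n_negative, volume_g):
        """Maximum-likelihood MPN per gram from one dilution level."""
        if n_negative == 0:
            raise ValueError("all tubes positive: a single dilution gives no upper bound")
        return -math.log(n_negative / n_tubes) / volume_g

    # e.g. 10 tubes, each inoculated with 1 g of sample, 3 tubes remain negative:
    print(round(mpn_single_dilution(10, 3, 1.0), 2), "MPN/g")  # ~1.2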
Code of Federal Regulations, 2011 CFR
2011-01-01
[Grading tables for 1 through 20 and for 21 through 40 fifty-count samples, with columns for Factor, Grades, AL, and Number of 50-count samples; the numeric table entries could not be recovered from the source. Footnotes: 2 AL—Absolute limit permitted in individual 33-count sample. 3 Sample size—33-count.]
Code of Federal Regulations, 2012 CFR
2012-01-01
[Grading tables for 1 through 20 and for 21 through 40 fifty-count samples, with columns for Factor, Grades, AL, and Number of 50-count samples; the numeric table entries could not be recovered from the source. Footnotes: 2 AL—Absolute limit permitted in individual 33-count sample. 3 Sample size—33-count.]
Park, J-H; Sulyok, M; Lemons, A R; Green, B J; Cox-Ganser, J M
2018-05-04
Recent developments in molecular and chemical methods have enabled the analysis of fungal DNA and secondary metabolites, often produced during fungal growth, in environmental samples. We compared 3 fungal analytical methods by analysing floor dust samples collected from an office building: fungi were assessed using viable culture and internal transcribed spacer (ITS) sequencing, and secondary metabolites using liquid chromatography-tandem mass spectrometry. Of the 32 metabolites identified, 29 had a potential link to fungi, with levels ranging from 0.04 ng/g (minimum, for alternariol monomethylether) to 5700 ng/g (maximum, for neoechinulin A). The number of fungal metabolites quantified per sample ranged from 8 to 16 (average = 13/sample). We identified 216 fungal operational taxonomic units (OTUs), with the number per sample ranging from 6 to 29 (average = 18/sample). We identified 37 fungal species using culture, and the number per sample ranged from 2 to 13 (average = 8/sample). Agreement in identification between ITS sequencing and culturing was weak (kappa = -0.12 to 0.27). The number of cultured fungal species correlated poorly with the number of OTUs, which in turn did not correlate with the number of metabolites. These findings suggest that using multiple measurement methods may provide an improved understanding of fungal exposures in indoor environments, and that secondary metabolites may be considered an additional source of exposure. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
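The kappa values quoted above measure chance-corrected agreement between two detection methods. As an illustration only (the vectors below are invented, not the study's data), the following Python sketch computes Cohen's kappa for two binary detection vectors over the same set of taxa.

    def cohens_kappa(a, b):
        """Cohen's kappa for two binary detection vectors (1 = taxon detected)."""
        n = len(a)
        p_obs = sum(x == y for x, y in zip(a, b)) / n
        p_a, p_b = sum(a) / n, sum(b) / n
        p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
        return (p_obs - p_exp) / (1 - p_exp)

    culture = [1, 0, 1, 1, 0, 0, 0, 1]  # detections by viable culture
    its_seq = [1, 1, 0, 1, 0, 1, 0, 0]  # detections by ITS sequencing
    print(round(cohens_kappa(culture, its_seq), 2))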
Analysis of Retention of First-Term Enlisted Personnel in the Selected Reserves.
1988-06-01
composed of questions related to education and training. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.887 and the number of cases was ... status. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.901 and the number of cases was 1,889. These three factors were also used as ... drills. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.841 and the number of cases was 2,507. These two factors were also used as
The cost of large numbers of hypothesis tests on power, effect size and sample size.
Lazzeroni, L C; Ray, A
2012-01-01
Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands, which can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level with comparatively small increases in the effect size or sample size. For example, at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
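The 13% and 70% figures above can be reproduced, at least approximately, with the textbook normal-approximation power calculation under a Bonferroni correction; the Python sketch below is a stand-in under that assumption, not the authors' Excel calculator. For a two-sided z-test of a standardized effect delta, the required sample size scales with the squared sum of two normal quantiles.

    from scipy.stats import norm

    def n_required(m_tests, delta, alpha=0.05, power=0.80):
        """Per-test sample size for a two-sided z-test of standardized effect
        delta at a Bonferroni-corrected significance level alpha / m_tests."""
        z_alpha = norm.ppf(1 - alpha / (2 * m_tests))
        z_power = norm.ppf(power)
        return ((z_alpha + z_power) / delta) ** 2

    print(n_required(1e7, 0.1) / n_required(1e6, 0.1))  # ~1.13: a 13% increase
    print(n_required(10, 0.1) / n_required(1, 0.1))     # ~1.70: a 70% increase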
Klinkenberg, Don; Thomas, Ekelijn; Artavia, Francisco F Calvo; Bouma, Annemarie
2011-08-01
Design of surveillance programs to detect infections could benefit from more insight into sampling schemes. We address the effect of sampling schemes for Salmonella Enteritidis surveillance in laying hens. Based on experimental estimates for the transmission rate in flocks, and the characteristics of an egg immunological test, we have simulated outbreaks with various sampling schemes, and with the current boot swab program with a 15-week sampling interval. Declaring a flock infected based on a single positive egg was not possible because test specificity was too low. Thus, a threshold number of positive eggs was defined to declare a flock infected, and, for small sample sizes, eggs from previous samplings had to be included in a cumulative sample to guarantee a minimum flock level specificity. Effectiveness of surveillance was measured by the proportion of outbreaks detected, and by the number of contaminated table eggs brought on the market. The boot swab program detected 90% of the outbreaks, with 75% fewer contaminated eggs compared to no surveillance, whereas the baseline egg program (30 eggs each 15 weeks) detected 86%, with 73% fewer contaminated eggs. We conclude that a larger sample size results in more detected outbreaks, whereas a smaller sampling interval decreases the number of contaminated eggs. Decreasing sample size and interval simultaneously reduces the number of contaminated eggs, but not indefinitely: the advantage of more frequent sampling is counterbalanced by the cumulative sample including less recently laid eggs. Apparently, optimizing surveillance has its limits when test specificity is taken into account. © 2011 Society for Risk Analysis.
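The threshold rule described above has a simple binomial core: a flock is declared infected when at least a threshold number of eggs in the sample test positive. The Python sketch below computes that detection probability for a single sampling occasion, treating eggs as independent; the prevalence, sample sizes and threshold are illustrative, and the sketch deliberately ignores the within-flock transmission dynamics the authors simulated.

    from scipy.stats import binom

    def flock_detection_prob(n_eggs, p_positive, threshold):
        """P(at least `threshold` test-positive eggs among n_eggs sampled)."""
        return binom.sf(threshold - 1, n_eggs, p_positive)

    print(flock_detection_prob(30, 0.15, 3))  # a baseline-like 30-egg scheme
    print(flock_detection_prob(60, 0.15, 3))  # doubling the sample size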
Gounder, Shakti; Tayler-Smith, Katherine; Khogali, Mohammed; Raikabula, Maopa; Harries, Anthony D
2013-07-01
In Fiji, patients with suspected pulmonary tuberculosis (PTB) currently submit three sputum specimens for smear microscopy for acid-fast bacilli, but there is little information about how well this practice is carried out. A cross-sectional retrospective review was carried out in all four TB diagnostic laboratories in Fiji to determine among new patients presenting with suspected PTB in 2011: the quality of submitted sputum; the number of sputum samples submitted; the relationship between quality and number of submitted samples to smear-positivity; and positive yield from first, second and third samples. Of 1940 patients with suspected PTB, 3522 sputum samples were submitted: 997 (51.4%) patients submitted one sample, 304 (15.7%) patients submitted two samples and 639 (32.9%) submitted three samples. Sputum quality was recorded in 2528 (71.8%) of samples, of which 1046 (41.4%) were of poor quality. Poor quality sputum was more frequent in females, inpatients and children (0-14 years). Good quality sputum and a higher number of submitted samples positively correlated with smear-positivity for acid-fast bacilli. There were 122 (6.3%) patients with suspected PTB who were sputum smear positive. Of those, 89 had submitted three sputum samples: 79 (89%) were diagnosed based on the first sputum sample, 6 (7%) on the second sample and 4 (4%) on the third sample. This study shows that there are deficiencies in the practice of sputum smear examination in Fiji with respect to sputum quality and recommended number of submitted samples, although the results support the continued use of three sputum samples for TB diagnosis. Ways to improve sputum quality and adherence to recommended guidelines are needed.
Questioning the utility of pooling samples in microarray experiments with cell lines.
Lusa, L; Cappelletti, V; Gariboldi, M; Ferrario, C; De Cecco, L; Reid, J F; Toffanin, S; Gallus, G; McShane, L M; Daidone, M G; Pierotti, M A
2006-01-01
We describe a microarray experiment using the MCF-7 breast cancer cell line in two different experimental conditions for which the same number of independent pools as the number of individual samples was hybridized on Affymetrix GeneChips. Unexpectedly, when using individual samples, the number of probe sets found to be differentially expressed between treated and untreated cells was about three times greater than that found using pools. These findings indicate that pooling samples in microarray experiments where the biological variability is expected to be small might not be helpful and could even decrease one's ability to identify differentially expressed genes.
Nagy, Bálint; Bán, Zoltán; Papp, Zoltán
2005-10-01
The quality and the quantity of isolated DNA have an effect on PCR amplifications. The authors studied the effect of three DNA isolation protocols (a resin binding method using fresh and frozen amniotic fluid samples, and a silica adsorption method using fresh samples) on the quantity and quality of the isolated DNA. Amniotic fluid samples were obtained from 20 pregnant women. The isolated DNA concentrations were determined on a real-time fluorimeter using the SYBRGreen I method. Each sample was studied for the presence of 8 STR markers. The authors compared the number of detected alleles, the electrophoretograms and the peak areas. There were significant differences in the concentration of the obtained DNA and in the peak areas among the three isolation protocols. The numbers of detected alleles also differed: the most allele dropouts occurred with the resin binding protocol on fresh samples (182 alleles detected), followed by the resin binding protocol on frozen samples (243 alleles detected) and the silica adsorption method (264 alleles detected). The authors demonstrated that the DNA isolation method has an effect on the quantity and quality of the isolated DNA, and on subsequent PCR amplifications.
Evaluation of process errors in bed load sampling using a Dune Model
Gomez, Basil; Troutman, Brent M.
1997-01-01
Reliable estimates of the streamwide bed load discharge obtained using sampling devices are dependent upon good at-a-point knowledge across the full width of the channel. Using field data and information derived from a model that describes the geometric features of a dune train in terms of a spatial process observed at a fixed point in time, we show that sampling errors decrease as the number of samples collected increases, and the number of traverses of the channel over which the samples are collected increases. It also is preferable that bed load sampling be conducted at a pace which allows a number of bed forms to pass through the sampling cross section. The situations we analyze and simulate pertain to moderate transport conditions in small rivers. In such circumstances, bed load sampling schemes typically should involve four or five traverses of a river, and the collection of 20–40 samples at a rate of five or six samples per hour. By ensuring that spatial and temporal variability in the transport process is accounted for, such a sampling design reduces both random and systematic errors and hence minimizes the total error involved in the sampling process.
Code of Federal Regulations, 2014 CFR
2014-04-01
... of the PHA's quality control sample is as follows: Universe Minimum number of files or records to be... universe is: the number of admissions in the last year for each of the two quality control samples under...
Code of Federal Regulations, 2013 CFR
2013-04-01
... of the PHA's quality control sample is as follows: Universe Minimum number of files or records to be... universe is: the number of admissions in the last year for each of the two quality control samples under...
Code of Federal Regulations, 2012 CFR
2012-04-01
... of the PHA's quality control sample is as follows: Universe Minimum number of files or records to be... universe is: the number of admissions in the last year for each of the two quality control samples under...
Code of Federal Regulations, 2012 CFR
2012-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
Code of Federal Regulations, 2013 CFR
2013-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
Code of Federal Regulations, 2014 CFR
2014-10-01
... Correction (FPC). The State agency must increase the resulting number by 30 percent to allow for attrition... 30 percent to allow for attrition, but the sample size must not be larger than the number of youth...
Evans, T M; LeChevallier, M W; Waarvick, C E; Seidler, R J
1981-01-01
The species of total coliform bacteria isolated from drinking water and untreated surface water by the membrane filter (MF), the standard most-probable-number (S-MPN), and modified most-probable-number (M-MPN) techniques were compared. Each coliform detection technique selected for a different profile of coliform species from both types of water samples. The MF technique indicated that Citrobacter freundii was the most common coliform species in water samples. However, the fermentation tube techniques displayed selectivity towards the isolation of Escherichia coli and Klebsiella. The M-MPN technique selected for more C. freundii and Enterobacter spp. from untreated surface water samples and for more Enterobacter and Klebsiella spp. from drinking water samples than did the S-MPN technique. The lack of agreement between the number of coliforms detected in a water sample by the S-MPN, M-MPN, and MF techniques was a result of the selection for different coliform species by the various techniques. PMID:7013706
Sample design effects in landscape genetics
Oyler-McCance, Sara J.; Fedy, Bradley C.; Landguth, Erin L.
2012-01-01
An important research gap in landscape genetics is the impact of different field sampling designs on the ability to detect the effects of landscape pattern on gene flow. We evaluated how five different sampling regimes (random, linear, systematic, cluster, and single study site) affected the probability of correctly identifying the generating landscape process of population structure. Sampling regimes were chosen to represent a suite of designs common in field studies. We used genetic data generated from a spatially-explicit, individual-based program and simulated gene flow in a continuous population across a landscape with gradual spatial changes in resistance to movement. Additionally, we evaluated the sampling regimes using realistic and obtainable numbers of loci (10 and 20), numbers of alleles per locus (5 and 10), numbers of individuals sampled (10-300), and generational times after the landscape was introduced (20 and 400). For a simulated continuously distributed species, we found that random, linear, and systematic sampling regimes performed well with high sample sizes (>200), levels of polymorphism (10 alleles per locus), and numbers of molecular markers (20). The cluster and single study site sampling regimes were not able to correctly identify the generating process under any conditions and thus are not advisable strategies for scenarios similar to our simulations. Our research emphasizes the importance of sampling data at ecologically appropriate spatial and temporal scales and suggests careful consideration for sampling near landscape components that are likely to most influence the genetic structure of the species. In addition, simulating sampling designs a priori could help guide field data collection efforts.
Effects of the number of people on efficient capture and sample collection: a lion case study.
Ferreira, Sam M; Maruping, Nkabeng T; Schoultz, Darius; Smit, Travis R
2013-05-24
Certain carnivore research projects and approaches depend on successful capture of individuals of interest. The number of people present at a capture site may determine success of a capture. In this study 36 lion capture cases in the Kruger National Park were used to evaluate whether the number of people present at a capture site influenced lion response rates and whether the number of people at a sampling site influenced the time it took to process the collected samples. The analyses suggest that when nine or fewer people were present, lions appeared faster at a call-up locality compared with when there were more than nine people. The number of people, however, did not influence the time it took to process the lions. It is proposed that efficient lion capturing should spatially separate capture and processing sites and minimise the number of people at a capture site.
(I Can't Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research.
van Rijnsoever, Frank J
2017-01-01
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.
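The "random chance" scenario above is straightforward to simulate. The Python sketch below draws information sources until every code has been observed at least once and reports the number of draws needed; the code probabilities are illustrative. Comparing runs with the same number of codes but different mean probabilities illustrates the paper's point that saturation depends more on the mean probability of observing codes than on how many codes exist.

    import random

    def samples_to_saturation(code_probs, seed=42):
        """Sample sources until every code is observed once ('random chance')."""
        rng = random.Random(seed)
        unseen = set(range(len(code_probs)))
        steps = 0
        while unseen:
            steps += 1
            observed = {i for i, p in enumerate(code_probs) if rng.random() < p}
            unseen -= observed
        return steps

    print(samples_to_saturation([0.30] * 20))  # common codes: saturates quickly
    print(samples_to_saturation([0.05] * 20))  # rare codes: many more sources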
Development of rotation sample designs for the estimation of crop acreages
NASA Technical Reports Server (NTRS)
Lycthuan-Lee, T. G. (Principal Investigator)
1981-01-01
The idea behind the use of rotation sample designs is that the variation of the crop acreage of a particular sample unit from year to year is usually less than the variation of crop acreage between units within a particular year. The estimation theory is based on an additive mixed analysis of variance model with years as fixed effects (a_t) and sample units as a variable factor. The rotation patterns are decided upon according to: (1) the number of sample units in the design each year; (2) the number of units retained in the following years; and (3) the number of years to complete the rotation pattern. Different analytic formulae are given for the variance of (a_t), and variance comparisons are made against the use of a complete survey of the rotation patterns.
Schulz, Vincent; Chen, Min; Tuck, David
2010-01-01
Background Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance including loss-of-heterozygosity (LOH) beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH making use of the allele information on SNP arrays. However heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data. Methods We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells. Conclusions We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM. Availability The MixHMM is available as a Python package provided with some other useful tools at http://genecube.med.yale.edu:8080/MixHMM. PMID:20532221
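The effect of stromal contamination that MixHMM models can be written down directly for a single locus. As a sketch of the generic mixture algebra only (not of MixHMM's hidden Markov machinery), the expected B-allele frequency at a heterozygous germline locus is a copy-number-weighted average of the tumour and stromal contributions:

    def expected_baf(n_a, n_b, stromal_fraction):
        """Expected B-allele frequency at an AB germline locus when tumour
        cells carry n_a/n_b allele copies and the remainder is diploid stroma."""
        s = stromal_fraction
        t = 1.0 - s
        b_copies = s * 1 + t * n_b               # stroma contributes one B copy
        total_copies = s * 2 + t * (n_a + n_b)   # stroma contributes two copies
        return b_copies / total_copies

    # Copy-neutral LOH (2 A copies, 0 B copies) at 80% stromal contamination:
    print(round(expected_baf(2, 0, 0.8), 3))  # 0.4 -- barely below the normal 0.5

The example shows why heavy contamination is hard: at 80% stroma, a copy-neutral LOH locus sits at an expected BAF of 0.4 instead of the uncontaminated 0.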
Selim, S A; Cullor, J S
1997-10-15
To assess the number of bacteria and presumptive antibiotic residues in milk fed to calves and to identify those bacteria and the antibiotic susceptibility of selected bacterial strains. Cross-sectional prospective study. 189 samples obtained from 12 local dairies. Samples of waste milk and milk-based fluids (eg, milk replacer, colostrum, bulk-tank milk) were obtained. Cumulative number of viable bacteria was determined. Bacteria were cultured aerobically, and antibiotic susceptibility testing of selected strains was performed. Presumptive antibiotic residues were detected by use of test kits. Geometric mean of the cumulative number of bacteria for waste milk samples was significantly higher than for other types of milk or milk-based products. Streptococcus sp (84/165 samples) and Enterobacteriaceae (83/165 samples) were the predominant bacteria identified, followed by Staphylococcus sp (68/165 samples). Escherichia coli was the gram-negative species most commonly isolated (52/165 samples; 32%); however, none were strain O157. Salmonella sp or Mycoplasma sp were not isolated. Of 189 samples, 119 (63%) were positive when tested for beta-lactams or tetracycline by use of 2 commercially available assays. In vitro, some bacteria were resistant to commonly used antibiotics. Waste milk that has not been effectively treated (eg, pasteurization) to reduce microbial load prior to use as calf feed should be used with caution, because it may contain a high number of bacteria that may be pathogenic to cattle and human beings. Antibiotic residues that would constitute violative amounts and existence of multiple antibiotic resistant bacterial strains are concerns in calf health management and dairy food safety.
Code of Federal Regulations, 2011 CFR
2011-04-01
... of the PHA's quality control sample is as follows: Universe Minimum number of files or records to be... universe is: the number of admissions in the last year for each of the two quality control samples under...
Code of Federal Regulations, 2010 CFR
2010-04-01
... of the PHA's quality control sample is as follows: Universe Minimum number of files or records to be... universe is: the number of admissions in the last year for each of the two quality control samples under...
[Sampling in quality control of medicinal materials-A case of Epimedium].
Wang, Chuanyi; Cao, Jinyi; Liang, Yun; Huang, Wenhua; Guo, Baolin
2009-04-01
To investigate the effect of the number of individuals sampled on the assay results for medicinal materials, Epimedium pubescens and E. brevicornu were used as samples. Six sampling levels were formulated: 1 individual and mixtures of 5, 10, 20, 30, and 50 individuals, each level with 3 parallels, except the 1-individual level, which had 5 parallels. The contents of epimedin C and icariin, and the peak areas of epimedin A, epimedin B, rhamnosyl icarisid II and icarisid II in all samples were analyzed by HPLC. The degree of variation varied with species and chemical constituent, but for a given constituent the RSD and the deviation from the true value decreased as the number of individuals increased. The sampling number should be more than 10 individuals in quality control of Epimedium, and 50 or more individuals would better represent the quality of the medicinal materials.
Sample size and allocation of effort in point count sampling of birds in bottomland hardwood forests
Smith, W.P.; Twedt, D.J.; Cooper, R.J.; Wiedenfeld, D.A.; Hamel, P.B.; Ford, R.P.; Ralph, C. John; Sauer, John R.; Droege, Sam
1995-01-01
To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect of increasing the number of points or visits by comparing results of 150 four-minute point counts obtained from each of four stands on Delta Experimental Forest (DEF) during May 8-May 21, 1991 and May 30-June 12, 1992. For each stand, we obtained bootstrap estimates of mean cumulative number of species each year from all possible combinations of six points and six visits. ANOVA was used to model cumulative species as a function of number of points visited, number of visits to each point, and interaction of points and visits. There was significant variation in numbers of birds and species between regions and localities (nested within region); neither habitat, nor the interaction between region and habitat, was significant. For α = 0.05 and α = 0.10, minimum sample size estimates (per factor level) varied by orders of magnitude depending upon the observed or specified range of desired detectable difference. For observed regional variation, 20 and 40 point counts were required to accommodate variability in total individuals (MSE = 9.28) and species (MSE = 3.79), respectively, whereas a detectable difference of ±25 percent of the mean could be achieved with five counts per factor level. Sample size sufficient to detect actual differences of Wood Thrush (Hylocichla mustelina) was >200, whereas the Prothonotary Warbler (Protonotaria citrea) required <10 counts. Differences in mean cumulative species were detected among number of points visited and among number of visits to a point. In the lower MAV, mean cumulative species increased with each added point through five points and with each additional visit through four visits. Although no interaction was detected between number of points and number of visits, when paired reciprocals were compared, more points invariably yielded a significantly greater cumulative number of species than more visits to a point. Still, 36 point counts per stand during each of two breeding seasons detected only 52 percent of the known available species pool in DEF.
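Minimum sample sizes of the kind discussed above follow the usual n ∝ s²/d² pattern: halving the detectable difference quadruples the required number of counts. The Python sketch below applies the standard two-sample normal-approximation formula to the MSE for total individuals quoted in the abstract; the formula choice, significance level and power are assumptions, not necessarily the authors' exact computation.

    from scipy.stats import norm

    def min_counts_per_level(mse, detectable_diff, alpha=0.05, power=0.80):
        """Textbook per-level sample size for detecting a difference between
        two factor-level means, given within-level variance mse."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return 2 * mse * (z / detectable_diff) ** 2

    for d in (1.0, 2.0, 4.0):  # desired detectable difference in mean count
        print(d, round(min_counts_per_level(9.28, d)))  # ~146, ~36, ~9 counts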
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2012 CFR
2012-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2011 CFR
2011-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2013 CFR
2013-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2010 CFR
2010-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
40 CFR 761.355 - Third level of sample selection.
Code of Federal Regulations, 2014 CFR
2014-07-01
... of sample selection further reduces the size of the subsample to 100 grams which is suitable for the... procedures in § 761.353 of this part into 100 gram portions. (b) Use a random number generator or random number table to select one 100 gram size portion as a sample for a procedure used to simulate leachate...
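The selection step in this regulation reduces to drawing one portion uniformly at random. The Python sketch below stands in for the random number generator or random number table the regulation allows; the portion labels and the seed are illustrative.

    import random

    def select_third_level(portions, seed=None):
        """Pick one 100 gram portion of the composited subsample at random."""
        rng = random.Random(seed)
        return rng.randrange(len(portions))

    portions = ["portion_%02d" % i for i in range(10)]  # ten 100 gram portions
    print(portions[select_third_level(portions, seed=7)])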
Spreadsheet Simulation of the Law of Large Numbers
ERIC Educational Resources Information Center
Boger, George
2005-01-01
If larger and larger samples are successively drawn from a population and a running average calculated after each sample has been drawn, the sequence of averages will converge to the mean, [mu], of the population. This remarkable fact, known as the law of large numbers, holds true if samples are drawn from a population of discrete or continuous…
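The simulation described in this record is easy to reproduce outside a spreadsheet. The Python sketch below draws from a Uniform(0, 1) population (an illustrative choice; its mean is 0.5) and tracks the running average, which settles toward the population mean as the sample grows.

    import random

    def running_means(n_draws, seed=1):
        """Running averages of independent Uniform(0, 1) draws."""
        rng = random.Random(seed)
        total, means = 0.0, []
        for n in range(1, n_draws + 1):
            total += rng.random()
            means.append(total / n)
        return means

    m = running_means(100000)
    print(m[9], m[999], m[99999])  # successive averages approach 0.5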
Direct Determination of Activities for Microorganisms of Chesapeake Bay Populations
Tabor, Paul S.; Neihof, Rex A.
1984-01-01
We used three methods in determination of the metabolically active individual microorganisms for Chesapeake Bay surface and near-bottom populations over a period of a year. Synthetically active bacteria were recognized as enlarged cells in samples amended with nalidixic acid and yeast extract and incubated for 6 h. Microorganisms with active electron transport systems were identified by the reduction of a tetrazolium salt electron acceptor. Microorganisms active in uptake of amino acids, thymidine, and acetate were determined by microautoradiography. In conjunction with enumeration of active organisms, a total direct count was made for each sample preparation by epifluorescence microscopy. For the majority of samples, numbers of amino acid uptake-active organisms were greater than numbers of organisms determined to be active by other direct measurements. Within a sample, the numbers of uptake-active organisms (amino acids or thymidine) and electron transport system-active organisms were significantly different for 68% of the samples. Numbers of synthetically active bacteria were generally less than numbers determined by the other direct activity measurements. The distribution of total counts in the 11 samplings showed a seasonal pattern, with significant dependence on in situ water temperature, increasing from March to September and then decreasing through February. Synthetically active bacteria and amino acid uptake-active organisms showed a significant dependence on in situ temperature, independent of the function of temperature on total counts. Numbers of active organisms determined by at least one of the methods used exceeded 25% of the total population of all samplings, and from June through September, >85% of the total population was found to be active by at least one direct activity measurement. Thus, active rather than dormant organisms compose a major portion of the microbial population in this region of Chesapeake Bay. PMID:16346659
Direct determination of activities for microorganisms of chesapeake bay populations.
Tabor, P S; Neihof, R A
1984-11-01
We used three methods in determination of the metabolically active individual microorganisms for Chesapeake Bay surface and near-bottom populations over a period of a year. Synthetically active bacteria were recognized as enlarged cells in samples amended with nalidixic acid and yeast extract and incubated for 6 h. Microorganisms with active electron transport systems were identified by the reduction of a tetrazolium salt electron acceptor. Microorganisms active in uptake of amino acids, thymidine, and acetate were determined by microautoradiography. In conjunction with enumeration of active organisms, a total direct count was made for each sample preparation by epifluorescence microscopy. For the majority of samples, numbers of amino acid uptake-active organisms were greater than numbers of organisms determined to be active by other direct measurements. Within a sample, the numbers of uptake-active organisms (amino acids or thymidine) and electron transport system-active organisms were significantly different for 68% of the samples. Numbers of synthetically active bacteria were generally less than numbers determined by the other direct activity measurements. The distribution of total counts in the 11 samplings showed a seasonal pattern, with significant dependence on in situ water temperature, increasing from March to September and then decreasing through February. Synthetically active bacteria and amino acid uptake-active organisms showed a significant dependence on in situ temperature, independent of the function of temperature on total counts. Numbers of active organisms determined by at least one of the methods used exceeded 25% of the total population of all samplings, and from June through September, >85% of the total population was found to be active by at least one direct activity measurement. Thus, active rather than dormant organisms compose a major portion of the microbial population in this region of Chesapeake Bay.
Emery, Sherry; Lee, Jungwha; Curry, Susan J; Johnson, Tim; Sporer, Amy K; Mermelstein, Robin; Flay, Brian; Warnecke, Richard
2010-02-01
Surveys of community-based programs are difficult to conduct when there is virtually no information about the number or locations of the programs of interest. This article describes the methodology used by the Helping Young Smokers Quit (HYSQ) initiative to identify and profile community-based youth smoking cessation programs in the absence of a defined sample frame. We developed a two-stage sampling design, with counties as the first-stage probability sampling units. The second stage used snowball sampling to saturation, to identify individuals who administered youth smoking cessation programs across three economic sectors in each county. Multivariate analyses modeled the relationship between program screening, eligibility, and response rates and economic sector and stratification criteria. Cumulative logit models analyzed the relationship between the number of contacts in a county and the number of programs screened, eligible, or profiled in a county. The snowball process yielded 9,983 unique and traceable contacts. Urban and high-income counties yielded significantly more screened program administrators; urban counties produced significantly more eligible programs, but there was no significant association between the county characteristics and program response rate. There is a positive relationship between the number of informants initially located and the number of programs screened, eligible, and profiled in a county. Our strategy to identify youth tobacco cessation programs could be used to create a sample frame for other nonprofit organizations that are difficult to identify due to a lack of existing directories, lists, or other traditional sample frames.
Hammerstrom, Kamille K; Ranasinghe, J Ananda; Weisberg, Stephen B; Oliver, John S; Fairey, W Russell; Slattery, Peter N; Oakden, James M
2012-10-01
Benthic macrofauna are used extensively for environmental assessment, but the area sampled and sieve sizes used to capture animals often differ among studies. Here, we sampled 80 sites using 3 different sized sampling areas (0.1, 0.05, 0.0071 m²) and sieved those sediments through each of 2 screen sizes (0.5, 1 mm) to evaluate their effect on number of individuals, number of species, dominance, nonmetric multidimensional scaling (MDS) ordination, and benthic community condition indices that are used to assess sediment quality in California. Sample area had little effect on abundance but substantially affected numbers of species, which are not easily scaled to a standard area. Sieve size had a substantial effect on both measures, with the 1-mm screen capturing only 74% of the species and 68% of the individuals collected in the 0.5-mm screen. These differences, though, had little effect on the ability to differentiate samples along gradients in ordination space. Benthic indices generally ranked sample condition in the same order regardless of gear, although the absolute scoring of condition was affected by gear type. The largest differences in condition assessment were observed for the 0.0071-m² gear. Benthic indices based on numbers of species were more affected than those based on relative abundance, primarily because we were unable to scale species number to a common area as we did for abundance. Copyright © 2010 SETAC.
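The difficulty noted above, that numbers of species cannot simply be scaled to a standard area or sample size, is conventionally handled by rarefaction. As an illustration (rarefaction is the standard tool for this problem, not necessarily the authors' method, and the counts below are invented), the Python sketch computes Hurlbert's expected species richness for a random subsample of n individuals.

    from math import comb

    def rarefied_richness(counts, n):
        """Expected species number in a random subsample of n individuals."""
        total = sum(counts)
        return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)

    counts = [50, 20, 10, 5, 5, 3, 2, 1, 1, 1]  # abundances in a large sample
    print(round(rarefied_richness(counts, 30), 2))  # species expected in 30 draws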
Keiter, David A.; Cunningham, Fred L.; Rhodes, Olin E.; Irwin, Brian J.; Beasley, James
2016-01-01
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
Keiter, David A; Cunningham, Fred L; Rhodes, Olin E; Irwin, Brian J; Beasley, James C
2016-01-01
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.; ...
2016-05-25
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
Ancestral inference from haplotypes and mutations.
Griffiths, Robert C; Tavaré, Simon
2018-04-25
We consider inference about the history of a sample of DNA sequences, conditional upon the haplotype counts and the number of segregating sites observed at the present time. After deriving some theoretical results in the coalescent setting, we implement rejection sampling and importance sampling schemes to perform the inference. The importance sampling scheme addresses an extension of the Ewens Sampling Formula for a configuration of haplotypes and the number of segregating sites in the sample. The implementations include both constant and variable population size models. The methods are illustrated by two human Y chromosome datasets. Copyright © 2018. Published by Elsevier Inc.
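The importance sampling scheme builds on the Ewens Sampling Formula, whose classical form gives the probability of a configuration of haplotype multiplicities under the neutral coalescent with scaled mutation rate theta. The Python sketch below evaluates that classical formula, not the authors' extension to haplotypes plus segregating sites; the configuration and theta are illustrative.

    from math import factorial

    def esf_probability(config, theta):
        """Ewens Sampling Formula: config[j] is the number of distinct
        haplotypes observed exactly j+1 times in the sample."""
        n = sum((j + 1) * a for j, a in enumerate(config))
        rising = 1.0
        for i in range(n):                 # theta^(n), the rising factorial
            rising *= theta + i
        prob = factorial(n) / rising
        for j, a in enumerate(config):
            prob *= theta ** a / ((j + 1) ** a * factorial(a))
        return prob

    # Five sequences: one haplotype seen twice, another seen three times.
    print(esf_probability([0, 1, 1], theta=1.0))  # 1/6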
Recommended protocols for sampling macrofungi
Gregory M. Mueller; John Paul Schmit; Sabine M. Hubndorf Leif Ryvarden; Thomas E. O' Dell; D. Jean Lodge; Patrick R. Leacock; Milagro Mata; Loengrin Umania; Qiuxin (Florence) Wu; Daniel L. Czederpiltz
2004-01-01
This chapter discusses several issues regarding recommended protocols for sampling macrofungi: opportunistic sampling of macrofungi, sampling conspicuous macrofungi using fixed-size plots, sampling small Ascomycetes using microplots, and sampling a fixed number of downed logs.
Lindblad, M
2007-09-15
Swab sample data from a 13-month microbiological baseline study of swine carcasses at Swedish abattoirs were combined with excision sample data collected routinely at five abattoirs. The aim was to compare the numbers of total aerobic counts, Enterobacteriaceae, and Escherichia coli, recovered by swabbing four carcass sites with gauze (total area 400 cm2) with those obtained by excision at equivalent sites (total area 20 cm2). The results are considered in relation to the process hygiene criteria that are stated in Commission Regulation (EC) No 2073/2005. These criteria apply only to destructive sampling of total aerobic counts and Enterobacteriaceae, but alternative sampling schemes, as well as alternative indicator organisms such as E. coli, are allowed if equivalent guarantees of food safety can be provided. Swab sampling resulted in higher mean log numbers of total aerobic counts at four of the five abattoirs, compared with excision, and lower or equal standard deviations at all abattoirs. The percentage of swab and excision samples positive for Enterobacteriaceae at the different abattoirs ranged from 68 to 100% and 15 to 24%, respectively. Similarly, the percentages of swab samples that were positive for E. coli were higher than the percentages of positive excision samples (range 52 to 84% and 3 to 14%, respectively). Due to the low percentage of positive excision results, the mean log numbers of Enterobacteriaceae and E. coli were only compared at two and one abattoirs, respectively, using log probability regression to substitute censored observations. Higher mean log numbers of Enterobacteriaceae were recovered by swabbing compared with excision at one abattoir, whereas the numbers of Enterobacteriaceae and E. coli did not differ significantly between sampling methods at one abattoir. This study suggests that the same process hygiene criteria as those stipulated for excision can be used for swabbing with gauze without compromising food safety. For monitoring of low numbers of Enterobacteriaceae and E. coli, like those found on swine carcasses at Swedish abattoirs, the results also show that swabbing of a relatively large area is superior to excision of a smaller area.
Gradient-free MCMC methods for dynamic causal modelling
Sengupta, Biswa; Friston, Karl J.; Penny, Will D.
2015-03-14
Here, we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with the random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density -- albeit at an almost 1000% increase in computational time in comparison to the most efficient algorithm (i.e., the adaptive MCMC sampler).
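Of the four samplers compared, random walk Metropolis is the simplest to write down. The Python sketch below targets a standard normal density rather than a neural mass model posterior (an illustrative simplification) and shows the accept/reject core; the proposal step size controls the trade-off between acceptance rate and mixing that drives the efficiency differences reported above.

    import math
    import random

    def rw_metropolis(log_target, x0, step, n_samples, seed=0):
        """Random walk Metropolis: propose x' = x + N(0, step^2), accept
        with probability min(1, target(x') / target(x))."""
        rng = random.Random(seed)
        x, samples = x0, []
        for _ in range(n_samples):
            proposal = x + rng.gauss(0.0, step)
            if math.log(rng.random()) < log_target(proposal) - log_target(x):
                x = proposal
            samples.append(x)
        return samples

    draws = rw_metropolis(lambda x: -0.5 * x * x, x0=0.0, step=1.0, n_samples=10000)
    print(sum(draws) / len(draws))  # close to the target mean of 0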
Airborne Bacteria in an Urban Environment
Mancinelli, Rocco L.; Shulls, Wells A.
1978-01-01
Samples were taken at random intervals over a 2-year period from urban air and tested for viable bacteria. The number of bacteria in each sample was determined, and each organism isolated was identified by its morphological and biochemical characteristics. The number of bacteria found ranged from 0.013 to 1.88 organisms per liter of air sampled. Representatives of 19 different genera were found in 21 samples. The most frequently isolated organisms and their percent of occurrence were Micrococcus (41%), Staphylococcus (11%), and Aerococcus (8%). The bacteria isolated were correlated with various weather and air pollution parameters using the Pearson product-moment correlation coefficient method. Statistically significant correlations were found between the number of viable bacteria isolated and the concentrations of nitric oxide (−0.45), nitrogen dioxide (+0.43), and suspended particulate pollutants (+0.56). Calculated individually, the total number of Micrococcus, Aerococcus, and Staphylococcus, number of rods, and number of cocci isolated showed negative correlations with nitric oxide and positive correlations with nitrogen dioxide and particulates. Statistically significant positive correlations were found between the total number of rods isolated and the concentration of nitrogen dioxide (+0.54) and the percent relative humidity (+0.43). The other parameters tested, sulfur dioxide, hydrocarbons, and temperature, showed no significant correlations. PMID:677875
2007-05-01
[Extraction-garbled fragment of a grant report cover form (award number W81XWH-05-1-0245; author: Yi Yan; topic: a signaling pathway). Recoverable abstract text:] A negative control was used to identify proteins non-specifically precipitated. Tandem mass spectrometric analysis of the immunoprecipitated sample and the negative control identified the specifically precipitated proteins; notably, SRAP was present among the remaining 87 specifically precipitated proteins.
Pierce, C.L.; Sexton, M.D.; Pelham, M.E.; Larscheid, J.G.
2001-01-01
We assessed short-term variability and long-term change in the composition of the littoral fish community in Spirit Lake, Iowa. Fish were sampled in several locations at night with large beach seines during spring, summer and fall of 1995-1998. Long-term changes were inferred from comparison with a similar study conducted over 70 y earlier in Spirit Lake. We found 26 species in the littoral zone. The number of species per sample ranged from 4 to 18, averaging 11.8. The average number of species per sample was higher at stations with greater vegetation density. A distinct seasonal pattern was evident in the number of species collected per sample in most years, increasing steadily from spring to fall. Patterns of variability within our 1995-1998 study period suggest that: (1) numerous samples are necessary to adequately characterize a littoral fish community, (2) sampling should be done when vegetation and young-of-year densities are highest and (3) sampling during a single year is inadequate to reveal the full community. The number of native species has declined by approximately 25% over the last 70 y. A coincident decline in littoral vegetation and associated habitat changes during the same period are likely causes of the long-term community change.
Methods for estimating the amount of vernal pool habitat in the northeastern United States
Van Meter, R.; Bailey, L.L.; Grant, E.H.C.
2008-01-01
The loss of small, seasonal wetlands is a major concern for a variety of state, local, and federal organizations in the northeastern U.S. Identifying and estimating the number of vernal pools within a given region is critical to developing long-term conservation and management strategies for these unique habitats and their faunal communities. We use three probabilistic sampling methods (simple random sampling, adaptive cluster sampling, and the dual frame method) to estimate the number of vernal pools on protected, forested lands. Overall, these methods yielded similar values of vernal pool abundance for each study area, and suggest that photographic interpretation alone may grossly underestimate the number of vernal pools in forested habitats. We compare the relative efficiency of each method and discuss ways of improving precision. Acknowledging that the objectives of a study or monitoring program ultimately determine which sampling designs are most appropriate, we recommend that some type of probabilistic sampling method be applied. We view the dual-frame method as an especially useful way of combining incomplete remote sensing methods, such as aerial photograph interpretation, with a probabilistic sample of the entire area of interest to provide more robust estimates of the number of vernal pools and a more representative sample of existing vernal pool habitats.
Effects of 16S rDNA sampling on estimates of the number of endosymbiont lineages in sucking lice
Burleigh, J. Gordon; Light, Jessica E.; Reed, David L.
2016-01-01
Phylogenetic trees can reveal the origins of endosymbiotic lineages of bacteria and detect patterns of co-evolution with their hosts. Although taxon sampling can greatly affect phylogenetic and co-evolutionary inference, most hypotheses of endosymbiont relationships are based on few available bacterial sequences. Here we examined how different sampling strategies of Gammaproteobacteria sequences affect estimates of the number of endosymbiont lineages in parasitic sucking lice (Insecta: Phthiraptera: Anoplura). We estimated the number of louse endosymbiont lineages using both newly obtained and previously sequenced 16S rDNA bacterial sequences and more than 42,000 16S rDNA sequences from other Gammaproteobacteria. We also performed parametric and nonparametric bootstrapping experiments to examine the effects of phylogenetic error and uncertainty on these estimates. Sampling of 16S rDNA sequences affects the estimates of endosymbiont diversity in sucking lice until we reach a threshold of genetic diversity, the size of which depends on the sampling strategy. Sampling by maximizing the diversity of 16S rDNA sequences is more efficient than randomly sampling available 16S rDNA sequences. Although simulation results validate estimates of multiple endosymbiont lineages in sucking lice, the bootstrap results suggest that the precise number of endosymbiont origins is still uncertain. PMID:27547523
Oscillating-flow regenerator test rig: Woven screen and metal felt results
NASA Technical Reports Server (NTRS)
Gedeon, D.; Wood, J. G.
1992-01-01
We present correlating expressions, in terms of Reynolds or Peclet numbers, for friction factors, Nusselt numbers, enhanced axial conduction ratios, and overall heat flux ratios in four porous regenerator samples representative of stirling cycle regenerators: two woven screen samples and two random wire samples. Error estimates and comparison of data with others suggest our correlations are reliable, but we need to test more samples over a range of porosities before our results will become generally useful.
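Correlating expressions of this kind are typically power-law forms in Reynolds (or Peclet) number. A hedged sketch of such a correlation is below; the coefficients are illustrative placeholders, not the paper's fitted values for woven screens or metal felts.

```python
def friction_factor(re, a=100.0, b=0.4, n=0.1):
    """Generic porous-regenerator correlation of the form f = a/Re + b*Re**(-n).
    Coefficients a, b, n are placeholders; the paper's fitted values for its
    four regenerator samples are not reproduced here."""
    return a / re + b * re ** (-n)

for re in (10, 100, 1000):
    print(f"Re = {re:5d}  f = {friction_factor(re):.3f}")
```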
Relation between urbanization and water quality of streams in the Austin area, Texas
Veenhuis, J.E.; Slade, R.M.
1990-01-01
The ratio of the number of samples with detectable concentrations to the total number of samples analyzed for 18 inorganic trace elements and the concentrations of many of these minor constituents increased with increasing development classifications. Twenty-two of the 42 synthetic organic compounds for which samples were analyzed were detected in one or more samples. The compounds were detected more frequently and in larger concentrations at the sites with more urban classifications.
Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P
2016-05-03
DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
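A simplified stand-in for the core filtering step DAMe performs (this is not the toolkit's actual code): keep a unique sequence only if it reaches a minimum copy number in a minimum number of PCR replicates of the same sample. The thresholds below are arbitrary.

```python
from collections import Counter

def filter_amplicons(replicates, min_copies=2, min_reps=2):
    """Keep unique sequences seen >= min_copies times in >= min_reps PCR
    replicates of one sample; returns survivors with their summed copy
    numbers across the qualifying replicates."""
    counts = [Counter(rep) for rep in replicates]
    kept = {}
    for seq in set().union(*counts):
        hits = [c[seq] for c in counts if c[seq] >= min_copies]
        if len(hits) >= min_reps:
            kept[seq] = sum(hits)
    return kept

reps = [["ACGT", "ACGT", "TTAA"], ["ACGT", "ACGT"], ["ACGT", "GGCC"]]
print(filter_amplicons(reps))   # {'ACGT': 4}; the singletons TTAA/GGCC are dropped
```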
Sample sizes to control error estimates in determining soil bulk density in California forest soils
Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber
2016-01-01
Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observations (n), for predicting the soil bulk density with a...
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival
Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas
2016-01-01
Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling days was required to obtain precise survival estimates for males and females separately. A reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was adequate for management of the fishery. PMID:26990561
Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu
2013-01-01
The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920
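The SSNR-based protocol itself is not detailed in the abstract; a generic alternative is to fit an inverse power-law learning curve to pilot accuracies and solve for the training-set size that approaches the fitted plateau. A sketch under that assumption (all numbers fabricated):

```python
import numpy as np
from scipy.optimize import curve_fit

n = np.array([25, 50, 100, 200, 400])           # pilot training-set sizes
acc = np.array([0.71, 0.78, 0.83, 0.86, 0.88])  # fabricated classifier accuracies

def learning_curve(n, a, b, c):
    return a - b * n ** (-c)                    # a = asymptotic accuracy

(a, b, c), _ = curve_fit(learning_curve, n, acc, p0=(0.9, 1.0, 0.5))
target = a - 0.02                               # within 2 points of the plateau
n_min = (b / (a - target)) ** (1 / c)           # invert a - b*n**-c = target
print(f"estimated minimum training-set size: {n_min:.0f}")
```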
Jacobs, V R; Niemeyer, M; Gottschalk, N; Schneider, K T; Kiechle, M
2005-12-01
Private umbilical cord blood (UCB) banking after delivery has increased over the last decade. For adult/somatic stem cell research, UCB is an essential source of stem cells, and researchers question whether the number of UCB samples available for research might be reduced by private banking. A survey among seven private blood banks in Germany, together with an analysis comparing the number of UCB samples donated for research within the STEMMAT project with private blood banking, was performed from 03/2003 to 06/2005 at the Frauenklinik (OB/GYN), Technical University Munich, Germany. Within 27.5 months, 1,551 UCB samples were collected for research purposes; the effective recruitment rate of 66.2% exceeded expectations. Private UCB banking [n = 24] was distributed among three cord blood banks [n = 16, 6 and 4]. The rate of private blood banking was 0.99% of all deliveries, thus reducing the effective rate for research purposes by only 1.5%. Under the assumption of active and successful recruitment of scientific UCB samples, private blood banking does not significantly reduce this rate and therefore is a negligible rival in the competition for sufficient numbers of UCB samples for research.
Sofaer, Helen R.; Jarnevich, Catherine S.
2017-01-01
Aim: The distributions of exotic species reflect patterns of human-mediated dispersal, species climatic tolerances and a suite of other biotic and abiotic factors. The relative importance of each of these factors will shape how the spread of exotic species is affected by ongoing economic globalization and climate change. However, patterns of trade may be correlated with variation in scientific sampling effort globally, potentially confounding studies that do not account for sampling patterns. Location: Global. Time period: Museum records, generally from the 1800s up to 2015. Major taxa studied: Plant species exotic to the United States. Methods: We used data from the Global Biodiversity Information Facility (GBIF) to summarize the number of plant species with exotic occurrences in the United States that also occur in each other country world-wide. We assessed the relative importance of trade and climatic similarity for explaining variation in the number of shared species while evaluating several methods to account for variation in sampling effort among countries. Results: Accounting for variation in sampling effort reversed the relative importance of trade and climate for explaining numbers of shared species. Trade was strongly correlated with numbers of shared U.S. exotic plants between the United States and other countries before, but not after, accounting for sampling variation among countries. Conversely, accounting for sampling effort strengthened the relationship between climatic similarity and species sharing. Using the number of records as a measure of sampling effort provided a straightforward approach for the analysis of occurrence data, whereas species richness estimators and rarefaction were less effective at removing sampling bias. Main conclusions: Our work provides support for broad-scale climatic limitation on the distributions of exotic species, illustrates the need to account for variation in sampling effort in large biodiversity databases, and highlights the difficulty in inferring causal links between the economic drivers of invasion and global patterns of exotic species occurrence.
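The effort adjustment can be pictured as adding log(number of records) as a covariate in a regression of shared-species counts on trade and climatic similarity. A toy sketch with fabricated values follows; the study's actual models are richer than this.

```python
import numpy as np

log_records = np.log([120, 4500, 300, 90000, 15000, 800])  # records per country (fabricated)
trade = np.array([0.2, 3.1, 0.5, 9.8, 4.0, 1.1])           # fabricated trade volumes
climate_sim = np.array([0.3, 0.6, 0.4, 0.8, 0.7, 0.5])     # fabricated climatic similarity
shared = np.array([5, 40, 9, 210, 80, 18])                 # fabricated shared species

# Log-linear least squares with a sampling-effort covariate to absorb effort bias.
X = np.column_stack([np.ones_like(trade), trade, climate_sim, log_records])
beta, *_ = np.linalg.lstsq(X, np.log(shared), rcond=None)
print(dict(zip(["intercept", "trade", "climate", "effort"], beta.round(2))))
```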
NASA Astrophysics Data System (ADS)
Marteinsson, V.; Klonowski, A.; Reynisson, E.; Vannier, P.; Sigurdsson, B. D.; Ólafsson, M.
2015-02-01
Colonization of life on Surtsey has been observed systematically since the formation of the island 50 years ago. Although the first colonisers were prokaryotes, such as bacteria and blue-green algae, most studies have focused on the settlement of plants and animals and less on microbial succession. To explore microbial colonization in diverse soils and the influence of associated vegetation and birds on numbers of environmental bacteria, we collected 45 samples from different soil types on the surface of the island. Total viable bacterial counts were performed with the plate count method at 22, 30 and 37 °C for all soil samples, and the amount of organic matter and nitrogen (N) was measured. Selected samples were also tested for coliforms, faecal coliforms, and aerobic and anaerobic bacteria. The subsurface biosphere was investigated by collecting liquid subsurface samples from a 181 m borehole with a special sampler. Diversity analysis of uncultivated biota in samples was performed by 16S rRNA gene sequence analysis and cultivation. Correlation was observed between nutrient deficits and the number of microorganisms in surface soil samples. The lowest number of bacteria (1 × 10^4-1 × 10^5 cells g^-1) was detected in almost pure pumice, but the count was significantly higher (1 × 10^6-1 × 10^9 cells g^-1) in vegetated soil or pumice with bird droppings. The number of faecal bacteria also correlated with the total number of bacteria and the type of soil. Bacteria belonging to Enterobacteriaceae were only detected in vegetated samples and samples containing bird droppings. The human pathogens Salmonella, Campylobacter and Listeria were not detected in any sample. Both thermophilic bacteria and archaea 16S rDNA sequences were found in the subsurface samples collected at 145 and 172 m depth at 80 and 54 °C, respectively, but no growth was observed in enrichments. The microbiota sequences generally showed low affiliation to any known 16S rRNA gene sequences.
Eblen, Denise R; Barlow, Kristina E; Naugle, Alecia Larew
2006-11-01
The U.S. Food Safety and Inspection Service (FSIS) pathogen reduction-hazard analysis critical control point systems final rule, published in 1996, established Salmonella performance standards for broiler chicken, cow and bull, market hog, and steer and heifer carcasses and for ground beef, chicken, and turkey meat. In 1998, the FSIS began testing to verify that establishments are meeting performance standards. Samples are collected in sets in which the number of samples is defined but varies according to product class. A sample set fails when the number of positive Salmonella samples exceeds the maximum number of positive samples allowed under the performance standard. Salmonella sample sets collected at 1,584 establishments from 1998 through 2003 were examined to identify factors associated with failure of one or more sets. Overall, 1,282 (80.9%) of establishments never had failed sets. In establishments that did experience set failure(s), generally the failed sets were collected early in the establishment testing history, with the exception of broiler establishments where failure(s) occurred both early and late in the course of testing. Small establishments were more likely to have experienced a set failure than were large or very small establishments, and broiler establishments were more likely to have failed than were ground beef, market hog, or steer-heifer establishments. Agency response to failed Salmonella sample sets in the form of in-depth verification reviews and related establishment-initiated corrective actions have likely contributed to declines in the number of establishments that failed sets. A focus on food safety measures in small establishments and broiler processing establishments should further reduce the number of sample sets that fail to meet the Salmonella performance standard.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plemons, R.E.; Hopwood, W.H. Jr.; Hamilton, J.H.
For a number of years the Oak Ridge Y-12 Plant Laboratory has been analyzing coal, predominantly for the utilities department of the Y-12 Plant. All laboratory procedures, except a Leco sulfur method which used the Leco Instruction Manual as a reference, were written based on the ASTM coal analyses. Sulfur is analyzed at the present time by two methods, gravimetric and Leco. The laboratory has two major endeavors for monitoring the quality of its coal analyses: (1) a control program run by the Plant Statistical Quality Control Department, in which Quality Control submits one sample for every nine samples submitted by the utilities departments and the laboratory analyzes a control sample along with the utilities samples; and (2) an exchange program with the DOE Coal Analysis Laboratory in Bruceton, Pennsylvania, in which the Y-12 Laboratory submits to the DOE Coal Laboratory, on even-numbered months, a sample that Y-12 has analyzed, and the DOE Coal Laboratory submits, on odd-numbered months, one of their analyzed samples to the Y-12 Plant Laboratory to be analyzed. The results of these control and exchange programs are monitored not only by laboratory personnel, but also by Statistical Quality Control personnel who provide statistical evaluations. After analysis and reporting of results, all utilities samples are retained by the laboratory until the coal contracts have been settled. The utilities departments have responsibility for the initiation and preparation of the coal samples. The samples normally received by the laboratory have been ground to 4-mesh, reduced to 0.5-gallon quantities, and sealed in air-tight containers. Sample identification numbers and a Request for Analysis are generated by the utilities departments.
Factors associated with number of duodenal samples obtained in suspected celiac disease.
Shamban, Leonid; Sorser, Serge; Naydin, Stan; Lebwohl, Benjamin; Shukr, Mousa; Wiemann, Charlotte; Yevsyukov, Daniel; Piper, Michael H; Warren, Bradley; Green, Peter H R
2017-12-01
Many people with celiac disease are undiagnosed, and there is evidence that insufficient duodenal samples may contribute to underdiagnosis. The aims of this study were to investigate whether more samples lead to a greater likelihood of a diagnosis of celiac disease and to elucidate factors that influence the number of samples collected. We identified patients from two community hospitals who were undergoing duodenal biopsy for indications (as identified by International Classification of Diseases code) compatible with possible celiac disease. Three cohorts were evaluated: no celiac disease (NCD, normal villi), celiac disease (villous atrophy, Marsh score 3), and possible celiac disease (PCD, Marsh score < 3). Endoscopic features, indication, setting, trainee presence, and patient demographic details were evaluated for their role in sample collection. 5997 patients met the inclusion criteria. Patients with a final diagnosis of celiac disease had a median of 4 specimens collected. The percentage of patients diagnosed with celiac disease with one sample was 0.3%, compared with 12.8% of those with six samples (P = 0.001). Patient factors that positively correlated with the number of samples collected were endoscopic features, demographic details, and indication (P = 0.001). Endoscopist factors that positively correlated with the number of samples collected were absence of a trainee, pediatric gastroenterologist, and outpatient setting (P < 0.001). The histological diagnosis rate of celiac disease increased significantly with six samples. Multiple factors influenced whether adequate biopsies were taken. Adherence to guidelines may increase the diagnosis rate of celiac disease.
(I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research
2017-01-01
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: “random chance,” which is based on probability sampling, “minimal information,” which yields at least one new code per sampling step, and “maximum information,” which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario. PMID:28746358
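The "random chance" scenario above is essentially a coupon-collector process: sampling stops once every code in the population has been observed at least once. A minimal simulation, assuming each code appears in any given information source with a fixed probability:

```python
import numpy as np

def steps_to_saturation(code_probs, rng):
    """Draw information sources until every code is observed at least once;
    code_probs[i] is the chance that code i appears in a single source."""
    seen = np.zeros(len(code_probs), dtype=bool)
    steps = 0
    while not seen.all():
        seen |= rng.uniform(size=len(code_probs)) < code_probs
        steps += 1
    return steps

rng = np.random.default_rng(1)
draws = [steps_to_saturation(np.full(30, 0.1), rng) for _ in range(500)]
print(f"median sample size to reach saturation: {int(np.median(draws))}")
```

Raising the mean probability of observing a code shrinks the required sample size far more than reducing the number of codes, consistent with the finding above.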
Simulation of design-unbiased point-to-particle sampling compared to alternatives on plantation rows
Thomas B. Lynch; David Hamlin; Mark J. Ducey
2016-01-01
Total quantities of tree attributes can be estimated in plantations by sampling on plantation rows using several methods. At random sample points on a row, either fixed row lengths or variable row lengths with a fixed number of sample trees can be assessed. Ratio of means or mean of ratios estimators can be developed for the fixed number of trees option but are not...
Hancock, Bruno C; Ketterhagen, William R
2011-10-14
Discrete element model (DEM) simulations of the discharge of powders from hoppers under gravity were analyzed to provide estimates of dosage form content uniformity during the manufacture of solid dosage forms (tablets and capsules). For a system that exhibits moderate segregation the effects of sample size, number, and location within the batch were determined. The various sampling approaches were compared to current best-practices for sampling described in the Product Quality Research Institute (PQRI) Blend Uniformity Working Group (BUWG) guidelines. Sampling uniformly across the discharge process gave the most accurate results with respect to identifying segregation trends. Sigmoidal sampling (as recommended in the PQRI BUWG guidelines) tended to overestimate potential segregation issues, whereas truncated sampling (common in industrial practice) tended to underestimate them. The size of the sample had a major effect on the absolute potency RSD. The number of sampling locations (10 vs. 20) had very little effect on the trends in the data, and the number of samples analyzed at each location (1 vs. 3 vs. 7) had only a small effect for the sampling conditions examined. The results of this work provide greater understanding of the effect of different sampling approaches on the measured content uniformity of real dosage forms, and can help to guide the choice of appropriate sampling protocols. Copyright © 2011 Elsevier B.V. All rights reserved.
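The three sampling schedules compared above can be sketched as different placements of sampling locations along the discharge. The sigmoidal spacing below is a generic S-curve, not necessarily the exact PQRI BUWG spacing, and the truncation window is likewise an assumption.

```python
import numpy as np

def sample_locations(n, scheme):
    """Fractional points of the hopper discharge at which samples are pulled."""
    u = np.linspace(0.02, 0.98, n)
    if scheme == "uniform":                    # spread evenly across the discharge
        return u
    if scheme == "truncated":                  # skip the start and end of discharge
        return np.linspace(0.15, 0.85, n)
    if scheme == "sigmoidal":                  # cluster locations near both ends
        return 0.5 * (1 + np.tanh(3 * (u - 0.5)) / np.tanh(1.5))
    raise ValueError(f"unknown scheme: {scheme}")

for s in ("uniform", "truncated", "sigmoidal"):
    print(s, sample_locations(5, s).round(2))
```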
The EIPeptiDi tool: enhancing peptide discovery in ICAT-based LC MS/MS experiments.
Cannataro, Mario; Cuda, Giovanni; Gaspari, Marco; Greco, Sergio; Tradigo, Giuseppe; Veltri, Pierangelo
2007-07-15
Isotope-coded affinity tags (ICAT) is a method for quantitative proteomics based on differential isotopic labeling, sample digestion and mass spectrometry (MS). The method allows the identification and relative quantification of proteins present in two samples and consists of the following phases. First, cysteine residues are either labeled using the ICAT Light or ICAT Heavy reagent (having identical chemical properties but different masses). Then, after whole sample digestion, the labeled peptides are captured selectively using the biotin tag contained in both ICAT reagents. Finally, the simplified peptide mixture is analyzed by nanoscale liquid chromatography-tandem mass spectrometry (LC-MS/MS). Nevertheless, the ICAT LC-MS/MS method still suffers from insufficient sample-to-sample reproducibility in peptide identification. In particular, the number and the type of peptides identified in different experiments can vary considerably and, thus, the statistical (comparative) analysis of sample sets is very challenging. Low information overlap at the peptide and, consequently, at the protein level is very detrimental in situations where the number of samples to be analyzed is high. We designed a method for improving the data processing and peptide identification in sample sets subjected to ICAT labeling and LC-MS/MS analysis, based on cross-validating MS/MS results. This method has been implemented in a tool, called EIPeptiDi, which boosts the ICAT data analysis software, improving peptide identification throughout the input data set. Heavy/Light (H/L) pairs quantified but not identified by the MS/MS routine are assigned to peptide sequences identified in other samples, using similarity criteria based on chromatographic retention time and Heavy/Light mass attributes. EIPeptiDi significantly improves the number of identified peptides per sample, proving that the proposed method has a considerable impact on the protein identification process and, consequently, on the amount of potentially critical information in clinical studies. The EIPeptiDi tool is available at http://bioingegneria.unicz.it/~veltri/projects/eipeptidi/ with a demo data set. EIPeptiDi significantly increases the number of peptides identified and quantified in analyzed samples, thus reducing the number of unassigned H/L pairs and allowing a better comparative analysis of sample data sets.
Multiclass classification of microarray data samples with a reduced number of genes
2011-01-01
Background: Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in Bioinformatics research. The problem gets harder as the number of classes is increased. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates about the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained. Results: A novel bound on the maximum number of genes that can be handled by binary classifiers in binary mediated multiclass classification algorithms of microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary mediated multiclass classifiers for microarray data samples. Conclusions: A comprehensive experimental work shows that the bound is indeed useful to induce accurate and sparse multiclass classifiers for microarray data samples. PMID:21342522
A Structure-Adaptive Hybrid RBF-BP Classifier with an Optimized Learning Strategy
Wen, Hui; Xie, Weixin; Pei, Jihong
2016-01-01
This paper presents a structure-adaptive hybrid RBF-BP (SAHRBF-BP) classifier with an optimized learning strategy. SAHRBF-BP is composed of a structure-adaptive RBF network and a BP network in cascade, where the number of RBF hidden nodes is adjusted adaptively according to the distribution of the sample space, the adaptive RBF network is used for nonlinear kernel mapping and the BP network is used for nonlinear classification. The optimized learning strategy is as follows: firstly, a potential function is introduced into the training sample space to adaptively determine the number of initial RBF hidden nodes and node parameters, and a form of heterogeneous-sample repulsive force is designed to further optimize the parameters of each generated RBF hidden node; the optimized structure-adaptive RBF network is then used for adaptive nonlinear mapping of the sample space; next, the number of adaptively generated RBF hidden nodes determines the number of subsequent BP input nodes, and the overall SAHRBF-BP classifier is built up; finally, different training sample sets are used to train the BP network parameters in SAHRBF-BP. Compared with other algorithms applied to different data sets, experiments show the superiority of SAHRBF-BP. Especially on most low-dimensional data sets and data sets with large numbers of samples, the classification performance of SAHRBF-BP outperforms other training SLFN algorithms. PMID:27792737
Pereira, Elisabete; Figueira, Celso; Aguiar, Nuno; Vasconcelos, Rita; Vasconcelos, Sílvia; Calado, Graça; Brandão, João; Prada, Susana
2013-09-01
Madeira forms a mid-Atlantic volcanic archipelago whose economy is largely dependent on tourism. There, one can encounter different types of sand beach: natural basaltic, natural calcareous and artificial calcareous. The microbiological and mycological quality of the sand was analyzed in two different years. Bacterial indicators were detected in a higher proportion of samples in 2010 (36.7%) than in 2011 (9.1%). Mycological indicators were detected in a similar percentage of samples in 2010 (68.3%) and 2011 (75%), even though the total number of colonies detected in 2010 was much higher (827 in 41 samples) than in 2011 (427 in 66 samples). Enterococci and potentially pathogenic and allergenic fungi (particularly Penicillium sp.) were the most common indicators detected in both years. Candida sp. yeasts were also commonly detected in the samples. Analysis of the 3rd quartile and maximum numbers of all indicators showed that artificial beaches tend to be more contaminated than natural ones, although the differences were not statistically significant. More monitoring data (number of bathers, sea birds, radiation intensity variation, and a greater number of samples) should be collected in order to confirm whether these differences are significant. In general, the sand quality at the archipelago's beaches was good. As sand may be a vector of diseases, an internationally agreed set of indicators and threshold values, together with compatible methodologies for assessing sand contamination, should be defined in order to provide bathers with an indication of beach sand quality, rather than only water quality. Copyright © 2013 Elsevier B.V. All rights reserved.
Gradient-free MCMC methods for dynamic causal modelling.
Sengupta, Biswa; Friston, Karl J; Penny, Will D
2015-05-15
In this technical note we compare the performance of four gradient-free MCMC samplers (random walk Metropolis sampling, slice-sampling, adaptive MCMC sampling and population-based MCMC sampling with tempering) in terms of the number of independent samples they can produce per unit computational time. For the Bayesian inversion of a single-node neural mass model, both adaptive and population-based samplers are more efficient compared with the random walk Metropolis sampler or slice-sampling; yet adaptive MCMC sampling is more promising in terms of compute time. Slice-sampling yields the highest number of independent samples from the target density, albeit at an almost 1000% increase in computational time in comparison to the most efficient algorithm (i.e., the adaptive MCMC sampler). Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Number of pins in two-stage stratified sampling for estimating herbage yield
William G. O' Regan; C. Eugene Conrad
1975-01-01
In a two-stage stratified procedure for sampling herbage yield, plots are stratified by a pin frame in stage one, and clipped. In stage two, clippings from selected plots are sorted, dried, and weighed. Sample size and distribution of plots between the two stages are determined by equations. A way to compute the effect of number of pins on the variance of estimated...
[Influence of PCR cycle number on microbial diversity analysis through next generation sequencing].
An, Yunhe; Gao, Lijuan; Li, Junbo; Tian, Yanjie; Wang, Jinlong; Zheng, Xuejuan; Wu, Huijuan
2016-08-25
Using high-throughput sequencing technology to study microbial diversity in complex samples has become one of the hottest issues in the field of microbial diversity research. In this study, DNA was extracted from soil and sheep rumen chyme samples, respectively. Then 25 ng of total DNA was used to amplify the 16S rRNA V3 region with 20, 25, or 30 PCR cycles, and the final sequencing library was constructed by mixing equal amounts of purified PCR products. Finally, the operational taxonomic unit (OTU) counts, rarefaction curves, and microbial numbers and species were compared through data analysis. It was found that, for the same amount of DNA template, a larger number of PCR cycles did not yield the most representative community composition, even though more species were detected. Overall, 25 PCR cycles gave the optimal species number and community composition proportions in both the soil and chyme samples.
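Rarefaction curves of the kind compared above can be computed by repeatedly subsampling reads without replacement and counting the OTUs recovered at each depth. A minimal sketch with an illustrative OTU table:

```python
import numpy as np

def rarefaction(otu_counts, depths, n_iter=100, seed=0):
    """Mean number of OTUs observed when the library is subsampled
    (without replacement) to each sequencing depth."""
    rng = np.random.default_rng(seed)
    reads = np.repeat(np.arange(len(otu_counts)), otu_counts)
    return [float(np.mean([len(np.unique(rng.choice(reads, d, replace=False)))
                           for _ in range(n_iter)]))
            for d in depths]

otu_counts = [500, 120, 60, 30, 10, 5, 2, 1]   # illustrative read counts per OTU
print(rarefaction(otu_counts, depths=[50, 200, 500]))
```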
Reich, Felix; Atanassova, Viktoria; Haunhorst, Eberhard; Klein, Günter
2008-09-30
For the presence and number of Campylobacter, 18 broiler flocks were sampled over a period of 18 months. A total of 70% of the flocks were positive for Campylobacter, with higher prevalence found in summer and autumn, compared to winter and spring. Positive flocks showed contamination rates above 90%, in negative flocks this was lower, mostly below 50%. The enumeration showed a decrease in Campylobacter during processing of positive flocks. The numbers were highest in carcasses after scalding/defeathering (mean 5.9 log10 cfu/carcass) and dropped by 0.7 log10 cfu/carcass after chilling. A positive correlation was observed between the number of Campylobacter present in the caeca and the number of bacteria present on carcasses and cut products. When a negative flock was slaughtered after Campylobacter positive flocks, the number of positive samples was higher compared to the case when a negative flock had been slaughtered previously. C. jejuni was isolated from 73.6% of the poultry samples.
Gao, Yang; Widschwendter, Martin; Teschendorff, Andrew E
2018-05-04
Normal tissue at risk of neoplastic transformation is characterized by somatic mutations, copy-number variation and DNA methylation changes. It is unclear, however, which type of alteration may be more informative of cancer risk. We analyzed genome-wide DNA methylation and copy-number calls from the same DNA assay in a cohort of healthy breast samples and age-matched normal samples collected adjacent to breast cancer. Using statistical methods to adjust for cell type heterogeneity, we show that DNA methylation changes can discriminate normal-adjacent from normal samples better than somatic copy-number variants. We validate this important finding in an independent dataset. These results suggest that DNA methylation alterations in the normal cell of origin may offer better cancer risk prediction and early detection markers than copy-number changes. Copyright © 2018. Published by Elsevier B.V.
Allele quantification using molecular inversion probes (MIP)
Wang, Yuker; Moorhead, Martin; Karlin-Neumann, George; Falkowski, Matthew; Chen, Chunnuan; Siddiqui, Farooq; Davis, Ronald W.; Willis, Thomas D.; Faham, Malek
2005-01-01
Detection of genomic copy number changes has been an important research area, especially in cancer. Several high-throughput technologies have been developed to detect these changes. Features that are important for the utility of technologies assessing copy number changes include the ability to interrogate regions of interest at the desired density as well as the ability to differentiate the two homologs. In addition, assessing formaldehyde fixed and paraffin embedded (FFPE) samples allows the utilization of the vast majority of cancer samples. To address these points we demonstrate the application of molecular inversion probe (MIP) technology to the study of copy number. MIP is a high-throughput genotyping technology capable of interrogating >20 000 single nucleotide polymorphisms in the same tube. We have shown the ability of MIP at this multiplex level to provide copy number measurements while obtaining the allele information. In addition we have demonstrated a proof of principle for copy number analysis in FFPE samples. PMID:16314297
Binomial leap methods for simulating stochastic chemical kinetics.
Tian, Tianhai; Burrage, Kevin
2004-12-01
This paper discusses efficient simulation methods for stochastic chemical kinetics. Based on the tau-leap and midpoint tau-leap methods of Gillespie [D. T. Gillespie, J. Chem. Phys. 115, 1716 (2001)], binomial random variables are used in these leap methods rather than Poisson random variables. The motivation for this approach is to improve the efficiency of the Poisson leap methods by using larger stepsizes. Unlike Poisson random variables whose range of sample values is from zero to infinity, binomial random variables have a finite range of sample values. This probabilistic property has been used to restrict possible reaction numbers and to avoid negative molecular numbers in stochastic simulations when larger stepsize is used. In this approach a binomial random variable is defined for a single reaction channel in order to keep the reaction number of this channel below the numbers of molecules that undergo this reaction channel. A sampling technique is also designed for the total reaction number of a reactant species that undergoes two or more reaction channels. Samples for the total reaction number are not greater than the molecular number of this species. In addition, probability properties of the binomial random variables provide stepsize conditions for restricting reaction numbers in a chosen time interval. These stepsize conditions are important properties of robust leap control strategies. Numerical results indicate that the proposed binomial leap methods can be applied to a wide range of chemical reaction systems with very good accuracy and significant improvement on efficiency over existing approaches. (c) 2004 American Institute of Physics.
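The key step is replacing the Poisson reaction count with a binomial one whose upper bound is the available molecule count. A sketch of a single binomial tau-leap update for first-order channels follows, using p = min(1, k*tau) as a crude per-molecule firing probability; the paper's exact probability definitions and stepsize conditions are not reproduced here.

```python
import numpy as np

def binomial_tau_leap_step(x, channels, stoich, tau, rng):
    """One leap: for each first-order channel j with reactant r and rate k,
    draw the firing count from Binomial(x[r], p), so it can never exceed
    the molecules available, avoiding negative populations."""
    x = x.copy()
    for j, (reactant, k) in enumerate(channels):
        p = min(1.0, k * tau)           # crude per-molecule probability
        fires = rng.binomial(x[reactant], p)
        x += fires * stoich[j]
    return x

rng = np.random.default_rng(2)
x0 = np.array([1000, 0])                # A -> B with rate constant k = 0.1
stoich = np.array([[-1, 1]])            # channel 0 consumes one A, produces one B
print(binomial_tau_leap_step(x0, channels=[(0, 0.1)], stoich=stoich, tau=0.5, rng=rng))
```

Species shared by several channels need the joint sampling treatment the abstract describes, so the total reaction count stays below the molecule count.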
Atmospheric CO2 Concentrations from Aircraft for 1972-1981, CSIRO Monitoring Program
Beardsmore, David J. [Commonwealth Scientific and Industrial Research Organization (CSIRO), Victoria, Australia; Pearman, Graeme I. [Commonwealth Scientific and Industrial Research Organization (CSIRO), Victoria, Australia
2012-01-01
From 1972 through 1981, air samples were collected in glass flasks from aircraft at a variety of latitudes and altitudes over Australia, New Zealand, and Antarctica. The samples were analyzed for CO2 concentrations with nondispersive infrared gas analysis. The resulting data contain the sampling dates, type of aircraft, flight number, flask identification number, sampling time, geographic sector, distance in kilometers from the listed distance measuring equipment (DME) station, station number of the radio navigation distance measuring equipment, altitude of the aircraft above mean sea level, sample analysis date, flask pressure, tertiary standards used for the analysis, analyzer used, and CO2 concentration. These data represent the first published record of CO2 concentrations in the Southern Hemisphere expressed in the WMO 1981 CO2 Calibration Scale and provide a precise record of atmospheric CO2 concentrations in the troposphere and lower stratosphere over Australia and New Zealand.
Carpenter, Danielle; Walker, Susan; Prescott, Natalie; Schalkwijk, Joost; Armour, John Al
2011-08-18
Copy number variation (CNV) contributes to the variation observed between individuals and can influence human disease progression, but the accurate measurement of individual copy numbers is technically challenging. In the work presented here we describe a modification to a previously described paralogue ratio test (PRT) method for genotyping the CCL3L1/CCL4L1 copy variable region, which we use to ascertain CCL3L1/CCL4L1 copy number in 1581 European samples. As the products of CCL3L1 and CCL4L1 potentially play a role in autoimmunity we performed case control association studies with Crohn's disease, rheumatoid arthritis and psoriasis clinical cohorts. We evaluate the PRT methodology used, paying particular attention to accuracy and precision, and highlight the problems of differential bias in copy number measurements. Our PRT methods for measuring copy number were of sufficient precision to detect very slight but systematic differential bias between results from case and control DNA samples in one study. We find no evidence for an association between CCL3L1 copy number and Crohn's disease, rheumatoid arthritis or psoriasis. Differential bias of this small magnitude, but applied systematically across large numbers of samples, would create a serious risk of false positive associations in copy number, if measured using methods of lower precision, or methods relying on single uncorroborated measurements. In this study the small differential bias detected by PRT in one sample set was resolved by a simple pre-treatment by restriction enzyme digestion.
An improved SRC method based on virtual samples for face recognition
NASA Astrophysics Data System (ADS)
Fu, Lijun; Chen, Deyun; Lin, Kezheng; Li, Ao
2018-07-01
The sparse representation classifier (SRC) performs classification by evaluating which class leads to the minimum representation error. However, in real-world conditions the number of available training samples is limited and affected by noise interference, so the training samples cannot accurately represent the test sample linearly. Therefore, in this paper, we first produce virtual samples by exploiting the original training samples, with the aim of increasing the number of training samples. Then, we take the intra-class difference as a data representation of partial noise, and utilize the intra-class differences and training samples simultaneously to represent the test sample in a linear way according to the theory of the SRC algorithm. Using weighted score-level fusion, the respective representation scores of the virtual samples and the original training samples are fused together to obtain the final classification result. The experimental results on multiple face databases show that our proposed method has a very satisfactory classification performance.
Wickremsinhe, Enaksha R; Perkins, Everett J
2015-03-01
Traditional pharmacokinetic analysis in nonclinical studies is based on the concentration of a test compound in plasma and requires approximately 100 to 200 μL blood collected per time point. However, the total blood volume of mice limits the number of samples that can be collected from an individual animal-often to a single collection per mouse-thus necessitating dosing multiple mice to generate a pharmacokinetic profile in a sparse-sampling design. Compared with traditional methods, dried blood spot (DBS) analysis requires smaller volumes of blood (15 to 20 μL), thus supporting serial blood sampling and the generation of a complete pharmacokinetic profile from a single mouse. Here we compare plasma-derived data with DBS-derived data, explain how to adopt DBS sampling to support discovery mouse studies, and describe how to generate pharmacokinetic and pharmacodynamic data from a single mouse. Executing novel study designs that use DBS enhances the ability to identify and streamline better drug candidates during drug discovery. Implementing DBS sampling can reduce the number of mice needed in a drug discovery program. In addition, the simplicity of DBS sampling and the smaller numbers of mice needed translate to decreased study costs. Overall, DBS sampling is consistent with 3Rs principles by achieving reductions in the number of animals used, decreased restraint-associated stress, improved data quality, direct comparison of interanimal variability, and the generation of multiple endpoints from a single study.
Pinto-Leite, C M; Rocha, P L B
2012-12-01
Empirical studies using visual search methods to investigate spider communities were conducted with different sampling protocols, including a variety of plot sizes, sampling efforts, and diurnal periods for sampling. We sampled 11 plots ranging in size from 5 by 10 m to 5 by 60 m. In each plot, we computed the total number of species detected every 10 min during 1 hr during the daytime and during the nighttime (0630 hours to 1100 hours, both a.m. and p.m.). We measured the influence of time effort on the measurement of species richness by comparing the curves produced by sample-based rarefaction and species richness estimation (first-order jackknife). We used a general linear model with repeated measures to assess whether the phase of the day during which sampling occurred and the differences in the plot lengths influenced the number of species observed and the number of species estimated. To measure the differences in species composition between the phases of the day, we used a multiresponse permutation procedure and a graphical representation based on nonmetric multidimensional scaling. After 50 min of sampling, we noted a decreased rate of species accumulation and a tendency of the estimated richness curves to reach an asymptote. We did not detect an effect of plot size on the number of species sampled. However, differences in observed species richness and species composition were found between phases of the day. Based on these results, we propose guidelines for visual search for tropical web spiders.
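The first-order jackknife estimator used above has a simple closed form: S_jack1 = S_obs + Q1 * (m - 1) / m, where Q1 is the number of species detected in exactly one of the m samples. A minimal sketch with illustrative incidence data:

```python
def jackknife1(incidence, m):
    """First-order jackknife richness: S_obs + Q1 * (m - 1) / m, where
    incidence maps each species to the number of samples it appeared in."""
    s_obs = len(incidence)
    q1 = sum(1 for k in incidence.values() if k == 1)
    return s_obs + q1 * (m - 1) / m

# species -> number of 10-min censuses (of m = 6) in which it was detected
obs = {"sp_a": 6, "sp_b": 3, "sp_c": 1, "sp_d": 1, "sp_e": 2}
print(round(jackknife1(obs, m=6), 2))   # 5 + 2 * 5/6 = 6.67
```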
NASA Astrophysics Data System (ADS)
Marteinsson, V.; Klonowski, A.; Reynisson, E.; Vannier, P.; Sigurdsson, B. D.; Ólafsson, M.
2014-09-01
Colonisation of life on Surtsey has been observed systematically since the formation of the island 50 years ago. Although the first colonisers were prokaryotes, such as bacteria and blue-green algae, most studies have focused on the settlement of plants and animals and less on microbial succession. To explore microbial colonization in diverse soils and the influence of associated vegetation and birds on numbers of environmental bacteria, we collected 45 samples from different soil types on the surface of the island. Total viable bacterial counts were performed with the plate count method at 22, 30 and 37 °C for all soil samples, and the amount of organic matter and nitrogen (N) was measured. Selected samples were also tested for coliforms, faecal coliforms, and aerobic and anaerobic bacteria. The deep subsurface biosphere was investigated by collecting liquid subsurface samples from a 182 m borehole with a special sampler. Diversity analysis of uncultivated biota in samples was performed by 16S rRNA gene sequence analysis and cultivation. Correlation was observed between N deficits and the number of microorganisms in surface soil samples. The lowest number of bacteria (1 × 10^4-1 × 10^5 g^-1) was detected in almost pure pumice, but the count was significantly higher (1 × 10^6-1 × 10^9 g^-1) in vegetated soil or pumice with bird droppings. The number of faecal bacteria also correlated with the total number of bacteria and the type of soil. Bacteria belonging to Enterobacteriaceae were only detected in vegetated samples and samples containing bird droppings. The human pathogens Salmonella, Campylobacter and Listeria were not detected in any sample. Both thermophilic bacteria and archaea 16S rDNA sequences were found in the subsurface samples collected at 145 m and 172 m depth at 80 °C and 54 °C, respectively, but no growth was observed in enrichments. The microbiota sequences generally showed low affiliation to any known 16S rRNA gene sequences.
Phylogenetic effective sample size.
Bartoszek, Krzysztof
2016-10-21
In this paper I address the question: how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes: the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations. Copyright © 2016 Elsevier Ltd. All rights reserved.
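For multivariate-normal tip data with correlation matrix R, a natural regression-style effective sample size is n_e = 1' R^-1 1, which reduces to n for independent tips; whether this matches the paper's exact definition is an assumption here. A sketch:

```python
import numpy as np

def regression_ess(corr):
    """Effective sample size n_e = 1' R^-1 1 for tip correlation matrix R;
    equals the number of tips n when R is the identity (independent tips)."""
    ones = np.ones(corr.shape[0])
    return float(ones @ np.linalg.solve(corr, ones))

# Toy 3-tip correlation structure induced by shared branches (values illustrative).
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.2],
              [0.2, 0.2, 1.0]])
print(f"n = 3, effective n = {regression_ess(R):.2f}")
```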
Exploring high dimensional free energy landscapes: Temperature accelerated sliced sampling
NASA Astrophysics Data System (ADS)
Awasthi, Shalini; Nair, Nisanth N.
2017-03-01
Biased sampling of collective variables is widely used to accelerate rare events in molecular simulations and to explore free energy surfaces. However, computational efficiency of these methods decreases with increasing number of collective variables, which severely limits the predictive power of the enhanced sampling approaches. Here we propose a method called Temperature Accelerated Sliced Sampling (TASS) that combines temperature accelerated molecular dynamics with umbrella sampling and metadynamics to sample the collective variable space in an efficient manner. The presented method can sample a large number of collective variables and is advantageous for controlled exploration of broad and unbound free energy basins. TASS is also shown to achieve quick free energy convergence and is practically usable with ab initio molecular dynamics techniques.
Contains the Air Sampling Logbook between 1-24-2011 thru 1-28-2011 from the Region 4 Yellow Bluff Air Study Wilcox County, Alabama SESD Project Identification Number:11-0068 November 2010-December 2010
Ahmed, W; Hodgers, L; Sidhu, J P S; Toze, S
2012-01-01
In this study, the microbiological quality of household tap water samples fed from rainwater tanks was assessed by monitoring the numbers of Escherichia coli bacteria and enterococci from 24 households in Southeast Queensland (SEQ), Australia. Quantitative PCR (qPCR) was also used for the quantitative detection of zoonotic pathogens in water samples from rainwater tanks and connected household taps. The numbers of zoonotic pathogens were also estimated in fecal samples from possums and various species of birds by using qPCR, as possums and birds are considered to be the potential sources of fecal contamination in roof-harvested rainwater (RHRW). Among the 24 households, 63% of rainwater tank and 58% of connected household tap water (CHTW) samples contained E. coli and exceeded Australian drinking water guidelines of <1 CFU E. coli per 100 ml water. Similarly, 92% of rainwater tanks and 83% of CHTW samples also contained enterococci. In all, 21%, 4%, and 13% of rainwater tank samples contained Campylobacter spp., Salmonella spp., and Giardia lamblia, respectively. Similarly, 21% of rainwater tank and 13% of CHTW samples contained Campylobacter spp. and G. lamblia, respectively. The number of E. coli (P = 0.78), Enterococcus (P = 0.64), Campylobacter (P = 0.44), and G. lamblia (P = 0.50) cells in rainwater tanks did not differ significantly from the numbers observed in the CHTW samples. Among the 40 possum fecal samples tested, Campylobacter spp., Cryptosporidium parvum, and G. lamblia were detected in 60%, 13%, and 30% of samples, respectively. Among the 38 bird fecal samples tested, Campylobacter spp., Salmonella spp., C. parvum, and G. lamblia were detected in 24%, 11%, 5%, and 13% of the samples, respectively. The microbiological quality of the household tap water fed from rainwater tanks tested in this study thus appeared to be highly variable. Regular cleaning of roofs and gutters, along with pruning of overhanging tree branches, might also prove effective in reducing animal fecal contamination of rainwater tanks.
Fortes, Esther D; David, John; Koeritzer, Bob; Wiedmann, Martin
2013-05-01
There is a continued need to develop improved rapid methods for detection of foodborne pathogens. The aim of this project was to evaluate the 3M Molecular Detection System (3M MDS), which uses isothermal DNA amplification, and the 3M Molecular Detection Assay Listeria, using environmental samples obtained from retail delicatessens and meat, seafood, and dairy processing plants. Environmental sponge samples were tested for Listeria with the 3M MDS after 22 and 48 h of enrichment in 3M Modified Listeria Recovery Broth (3M mLRB); enrichments were also used for cultural detection of Listeria spp. Among 391 samples tested for Listeria, 74 were positive by both the 3M MDS and the cultural method, 310 were negative by both methods, 2 were positive by the 3M MDS and negative by the cultural method, and 1 was negative by the 3M MDS and positive by the cultural method. Four samples were removed from the sample set, prior to statistical analyses, due to potential cross-contamination during testing. Listeria isolates from positive samples represented L. monocytogenes, L. innocua, L. welshimeri, and L. seeligeri. Overall, the 3M MDS and culture-based detection after enrichment in 3M mLRB did not differ significantly (at α = 0.05) with regard to the number of positive samples when chi-square analyses were performed for (i) the number of positive samples after 22 h, (ii) the number of positive samples after 48 h, and (iii) the number of positive samples after 22 and/or 48 h of enrichment in 3M mLRB. Among 288 sampling sites that were tested with duplicate sponges, 67 each tested positive with the 3M MDS and the traditional U.S. Food and Drug Administration Bacteriological Analytical Manual method, further supporting the conclusion that the 3M MDS performs equivalently to traditional methods when used with environmental sponge samples.
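For readers who want to reproduce this kind of paired-method comparison, the sketch below tabulates the aggregate counts reported above (74 both-positive, 310 both-negative, 2 MDS-only, 1 culture-only) and applies an exact McNemar test to the discordant pairs. The abstract's chi-square analyses were run on enrichment-time subsets not reproduced here, so this is an illustrative alternative, not the study's own calculation.

```python
from scipy.stats import binomtest

both_pos, both_neg = 74, 310
mds_only, culture_only = 2, 1            # discordant pairs

# Exact McNemar test: under H0 the discordant pairs split 50/50.
result = binomtest(mds_only, mds_only + culture_only, p=0.5)
print(f"discordant pairs = {mds_only + culture_only}, p = {result.pvalue:.3f}")
# p = 1.000 here: no evidence of a difference between the two methods
```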
Support, shape and number of replicate samples for tree foliage analysis.
Luyssaert, Sebastiaan; Mertens, Jan; Raitio, Hannu
2003-06-01
Many fundamental features of a sampling program are determined by the heterogeneity of the object under study and the settings for the Type I error rate (alpha), the Type II error rate (beta), the effect size (ES), the number of replicate samples, and the sample support, a feature that is often overlooked. The number of replicates, alpha, beta, ES, and sample support are interconnected. The effect of the sample support and its shape on the required number of replicate samples was investigated by means of a resampling method. The method was applied to a simulated distribution of Cd in the crown of a Salix fragilis L. tree. Increasing the dimensions of the sample support results in a decrease in the variance of the element concentration under study. Analysis of the variance is often the foundation of statistical tests; therefore, valid statistical testing requires the use of a fixed sample support during the experiment. This requirement might be difficult to meet in time-series analyses and long-term monitoring programs. Sample supports whose largest dimension lies in the direction of greatest heterogeneity, i.e. the direction representing the crown height, will give more accurate results than supports with other shapes. Taking the relationships between the sample support and the variance of the element concentrations in tree crowns into account provides guidelines for sampling efficiency in terms of precision and costs. In terms of time, the optimal support to test whether the average Cd concentration of the crown exceeds a threshold value is 0.405 m3 (alpha = 0.05, beta = 0.20, ES = 1.0 mg kg(-1) dry mass). The average weight of this support is 23 g dry mass, and 11 replicate samples need to be taken. It should be noted that in this case the optimal support applies to Cd under conditions similar to those of the simulation, but not necessarily to all examinations for this tree species, element, and hypothesis test.
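The interplay of alpha, beta, ES, and variance described above reduces, for a one-sided test of a mean against a threshold, to the familiar normal-approximation formula n = ((z_{1-alpha} + z_{1-beta}) * sigma / ES)^2. The sketch below implements it; the standard deviation is an illustrative stand-in, since the study's variance estimates are not given in the abstract.

```python
import math
from scipy.stats import norm

def n_replicates(sigma, es, alpha=0.05, beta=0.20, one_sided=True):
    """Replicates needed to detect a mean shift of `es` (same units as sigma)."""
    za = norm.ppf(1 - alpha) if one_sided else norm.ppf(1 - alpha / 2)
    zb = norm.ppf(1 - beta)
    return math.ceil(((za + zb) * sigma / es) ** 2)

# Hypothetical: sd of Cd concentration 1.1 mg/kg, ES = 1.0 mg/kg dry mass
print(n_replicates(sigma=1.1, es=1.0))   # -> 8 replicate samples
```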
Occurrence of invertebrates at 38 stream sites in the Mississippi Embayment study unit, 1996-99
Caskey, Brian J.; Justus, B.G.; Zappia, Humbert
2002-01-01
A total of 88 invertebrate species and 178 genera representing 59 families, 8 orders, 6 classes, and 3 phyla were identified at 38 stream sites in the Mississippi Embayment Study Unit from 1996 through 1999 as part of the National Water-Quality Assessment Program. Sites were selected based on land use within the drainage basins and the availability of long-term streamflow data. Invertebrates were sampled as part of an overall sampling design to provide information related to the status of and trends in water quality in the Mississippi Embayment Study Unit, which includes parts of Arkansas, Kentucky, Louisiana, Mississippi, Missouri, and Tennessee. Invertebrate sampling and processing were conducted using nationally standardized techniques developed for the National Water-Quality Assessment Program. These techniques included both a semi-quantitative method, which targeted habitats where invertebrate diversity is expected to be highest, and a qualitative multihabitat method, which samples all available habitat types within a sampling reach. All invertebrate samples were shipped to the USGS National Water-Quality Laboratory (NWQL), where they were processed. Of the 365 taxa identified, 156 were identified with the semi-quantitative method, which involved sampling a known quantity of what was expected to be the richest habitat, woody debris. The qualitative method, which involved sampling all available habitats, identified 345 taxa. The number of organisms identified in the semi-quantitative samples ranged from 74 to 3,295, whereas the number of taxa identified ranged from 9 to 54. The number of organisms identified in the qualitative samples ranged from 42 to 29,634, whereas the number of taxa ranged from 18 to 81. Of all the organisms identified, chironomid taxa were the most frequently identified, and plecopteran taxa were among the least frequently identified.
Optimal tumor sampling for immunostaining of biomarkers in breast carcinoma
2011-01-01
Introduction: Biomarkers, such as Estrogen Receptor (ER), are used to determine therapy and prognosis in breast carcinoma. Immunostaining assays of biomarker expression have a high rate of inaccuracy; for example, estimates are as high as 20% for ER. Biomarkers have been shown to be heterogeneously expressed in breast tumors, and this heterogeneity may contribute to the inaccuracy of immunostaining assays. Currently, no evidence-based standards exist for the amount of tumor that must be sampled in order to correct for biomarker heterogeneity. The aim of this study was to determine the optimal number of 20X fields necessary to estimate a representative measurement of expression in a whole tissue section for selected biomarkers: ER, HER-2, AKT, ERK, S6K1, GAPDH, Cytokeratin, and MAP-Tau. Methods: Two collections of whole tissue sections of breast carcinoma were immunostained for the biomarkers. Expression was quantified using the Automated Quantitative Analysis (AQUA) method of quantitative immunofluorescence. Simulated sampling of various numbers of fields (ranging from one to thirty-five) was performed for each marker. The optimal number was selected for each marker via resampling techniques and minimization of prediction error over an independent test set. Results: The optimal number of 20X fields varied by biomarker, ranging from three to fourteen fields. More heterogeneous markers, such as MAP-Tau protein, required a larger sample of 20X fields to produce a representative measurement. Conclusions: The optimal number of 20X fields that must be sampled to produce a representative measurement of biomarker expression varies by marker, with more heterogeneous markers requiring a larger number. The clinical implication of these findings is that breast biopsies consisting of a small number of fields may be inadequate to represent whole-tumor biomarker expression for many markers. Additionally, for biomarkers newly introduced into clinical use, especially if therapeutic response is dictated by level of expression, the optimal size of tissue sample must be determined on a marker-by-marker basis. PMID:21592345
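A minimal version of the simulated-sampling step described above can be written as a resampling loop: draw k fields, compare the subsample mean to the whole-section value, and take the smallest k that meets a precision target. This is a simplified variant (a fixed relative tolerance rather than the paper's independent test set), and the field scores below are hypothetical stand-ins for AQUA data.

```python
import numpy as np

rng = np.random.default_rng(5)

def min_fields(field_scores, tol=0.10, reps=2000):
    """Smallest number of 20X fields whose resampled mean falls within
    `tol` (relative) of the whole-section mean in 95% of draws."""
    truth = field_scores.mean()
    for k in range(1, len(field_scores)):
        means = np.array([
            rng.choice(field_scores, size=k, replace=False).mean()
            for _ in range(reps)
        ])
        if np.quantile(np.abs(means - truth) / truth, 0.95) <= tol:
            return k
    return len(field_scores)

# Hypothetical AQUA scores for 35 fields of a heterogeneous marker
scores = rng.lognormal(mean=3.0, sigma=0.5, size=35)
print(min_fields(scores))
```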
Wareham, K J; Hyde, R M; Grindlay, D; Brennan, M L; Dean, R S
2017-10-04
Randomised controlled trials (RCTs) are a key component of the veterinary evidence base. Sample sizes and defined outcome measures are crucial components of RCTs. The aim of this study was to describe the sample size and number of outcome measures of veterinary RCTs published in 2011, funded by the pharmaceutical industry or not. A structured search of PubMed identified RCTs examining the efficacy of pharmaceutical interventions. The number of outcome measures, the number of animals enrolled per trial, whether a primary outcome was identified, and the presence of a sample size calculation were extracted from the RCTs. The source of funding was identified for each trial, and groups were compared on the above parameters. Literature searches returned 972 papers; 86 papers comprising 126 individual trials were analysed. The median number of outcomes per trial was 5.0; there were no significant differences across funding groups (p = 0.133). The median number of animals enrolled per trial was 30.0; this was similar across funding groups (p = 0.302). A primary outcome was identified in 40.5% of trials and was significantly more likely to be stated in trials funded by a pharmaceutical company. A very low percentage of trials reported a sample size calculation (14.3%). Failure to report primary outcomes, failure to justify sample sizes, and the reporting of multiple outcome measures were common features in all of the clinical trials examined in this study. It is possible some of these factors may be affected by the source of funding of the studies, but the influence of funding needs to be explored with a larger number of trials. Some veterinary RCTs provide a weak evidence base, and targeted strategies are required to improve the quality of veterinary RCTs to ensure there is reliable evidence on which to base clinical decisions.
Ahmad, S; Srivastava, P K
2007-04-01
Investigations were carried out to study the effect of heart incorporation (0%, 15%, and 20%) and increasing levels of fat (20% and 25%) on the physicochemical (pH, moisture content, and thiobarbituric acid (TBA) number) and microbiological (total plate count and yeast and mold count) quality and shelf life of semi-dry sausages of buffalo meat during refrigerated storage (4°C). Different levels of fat significantly (p<0.05) increased the pH of the sausage samples. However, different levels of heart incorporation did not significantly (p>0.05) affect the pH, moisture content, or TBA number of the sausage samples. Fresh samples had pH, moisture content, and TBA number in the ranges of 5.15-5.28, 42.4-47.4%, and 0.073-0.134, respectively. Refrigerated storage significantly (p<0.05) increased the TBA number of control samples, while storage did not significantly (p>0.05) increase the TBA number of sodium ascorbate (SA)-treated samples. Total plate counts of the twelve sausage samples were under the TFTC (too few to count) limit at the initial stage. Incorporation of different levels of heart and also increasing levels of fat did not significantly (p>0.05) increase the log TPC/g values. Yeasts and molds were not detected in the twelve samples of semi-dry fermented sausages in their fresh condition. Storage revealed a consistent decrease in pH and moisture content; refrigerated storage significantly (p<0.05) reduced both. TBA numbers, total plate counts, and yeast and mold counts of controls were found to increase significantly (p<0.05) during refrigerated storage. In SA-treated sausages, only the TPC and the yeast and mold count increased significantly (p<0.05) during refrigerated storage. The shelf life of the sausages was found to be 60 days under refrigerated storage (4°C).
Evaluation of Porcelain Cup Soil Water Samplers for Bacteriological Sampling
Dazzo, Frank B.; Rothwell, Donald F.
1974-01-01
The validity of obtaining soil water for fecal coliform analyses by porcelain cup soil water samplers was examined. Numbers from samples of manure slurry drawn through porcelain cups were reduced 100- to 10,000,000-fold compared to numbers obtained from the external manure slurry, and 65% of the cups yielded coliform-free samples. Fecal coliforms adsorbed to cups apparently were released, thus influencing the counts of subsequent samples. Fecal coliforms persisted in soil water samplers buried in soil and thus could significantly influence the coliform counts of water samples obtained a month later. These studies indicate that porcelain cup soil water samplers do not yield valid water samples for fecal coliform analyses. PMID:16349998
2008-12-01
[Standard report-documentation form fields removed. Recoverable fragment: "…exceeding 1×10(4) CFU/cm(2). The coefficient of variation (CV) for sample-to-sample loading within an experiment was 13.6% for spores and 6.1% for…"]
Image subsampling and point scoring approaches for large-scale marine benthic monitoring programs
NASA Astrophysics Data System (ADS)
Perkins, Nicholas R.; Foster, Scott D.; Hill, Nicole A.; Barrett, Neville S.
2016-07-01
Benthic imagery is an effective tool for the quantitative description of ecologically and economically important benthic habitats and biota. The recent development of autonomous underwater vehicles (AUVs) allows surveying at spatial scales that were previously unfeasible. However, an AUV collects a large number of images, the scoring of which is time and labour intensive. There is a need to optimise the way that subsamples of imagery are chosen and scored to gain meaningful inferences for ecological monitoring studies. We examine the trade-off between the number of images selected within transects and the number of random points scored within images on the percent cover of target biota, the typical output of such monitoring programs. We also investigate the efficacy of various image selection approaches, such as systematic or random, on the bias and precision of cover estimates. We use simulated biotas that vary in size, abundance and distributional pattern. We find that a relatively small sampling effort is required to minimise bias. Increased precision for groups that are likely to be the focus of monitoring programs is best gained by increasing the number of images sampled rather than the number of points scored within images. For rare species, sampling using point count approaches is unlikely to provide sufficient precision, and alternative sampling approaches may need to be employed. The approach by which images are selected (simple random sampling, regularly spaced, etc.) had no discernible effect on mean and variance estimates, regardless of the distributional pattern of biota. Field validation of our findings is provided through Monte Carlo resampling analysis of a previously scored benthic survey from temperate waters. We show that point count sampling approaches are capable of providing relatively precise cover estimates for candidate groups that are not overly rare. The amount of sampling required, in terms of both the number of images and the number of points, varies with the abundance, size and distributional pattern of target biota. Therefore, we advocate either the incorporation of prior knowledge or the use of baseline surveys to establish key properties of intended target biota in the initial stages of monitoring programs.
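The images-versus-points trade-off can be sketched with a two-stage (beta-binomial) simulation: true cover varies between images, and points within an image are Bernoulli draws. With the hypothetical parameters below, adding images shrinks the standard error of the cover estimate faster than adding points, mirroring the abstract's conclusion; none of the numbers come from the study itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def cover_se(n_images, n_points, a=5.0, b=20.0, reps=3000):
    """SE of estimated percent cover; image-level cover ~ Beta(a, b)."""
    est = np.empty(reps)
    for r in range(reps):
        p_img = rng.beta(a, b, size=n_images)     # true cover per image
        hits = rng.binomial(n_points, p_img)      # scored points per image
        est[r] = hits.sum() / (n_images * n_points)
    return est.std()

# Same total effort (500 points), different allocations:
print("50 images x 10 points:", cover_se(50, 10))
print("10 images x 50 points:", cover_se(10, 50))
```

Because between-image variation dominates the error budget, the 50-image allocation yields the smaller standard error even though both designs score the same number of points.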
Quantification of hookworm ova from wastewater matrices using quantitative PCR.
Gyawali, Pradip; Ahmed, Warish; Sidhu, Jatinder P; Jagals, Paul; Toze, Simon
2017-07-01
A quantitative PCR (qPCR) assay was used to quantify Ancylostoma caninum ova in wastewater and sludge samples. We estimated the average gene copy number for a single ovum using a mixed population of ova. The average gene copy numbers derived from the mixed population were used to estimate numbers of hookworm ova in A. caninum-seeded and unseeded wastewater and sludge samples. The newly developed qPCR assay estimated an average of 3.7×10(3) gene copies per ovum, which was then validated by seeding known numbers of hookworm ova into treated wastewater. The qPCR estimated averages of (1.1±0.1), (8.6±2.9) and (67.3±10.4) ova for treated wastewater that was seeded with (1±0), (10±2) and (100±21) ova, respectively. The further application of the qPCR assay for the quantification of A. caninum ova was determined by seeding known numbers of ova into the wastewater matrices. The qPCR results indicated that 50%, 90% and 67% of treated wastewater (1 L), raw wastewater (1 L) and sludge (~4 g) samples, respectively, had variable numbers of A. caninum gene copies. After conversion of the qPCR-estimated gene copy numbers to ova, the treated wastewater, raw wastewater, and sludge samples had averages of 0.02, 1.24 and 67 ova, respectively. The results of this study indicated that qPCR can be used for the quantification of hookworm ova from wastewater and sludge samples; however, caution is advised in interpreting qPCR-generated data for health risk assessment. Copyright © 2017. Published by Elsevier B.V.
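The copies-to-ova conversion used above is a single division by the assay's calibration factor; a minimal sketch, with the 3.7×10(3) copies-per-ovum figure taken from this abstract and a hypothetical input value:

```python
COPIES_PER_OVUM = 3.7e3   # average qPCR gene copies per A. caninum ovum (this study)

def ova_from_copies(gene_copies: float) -> float:
    """Convert a qPCR gene-copy estimate to an approximate ovum count."""
    return gene_copies / COPIES_PER_OVUM

print(ova_from_copies(4.6e3))   # hypothetical sample -> ~1.2 ova
```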
Low, Dennis J.; Chichester, Douglas C.
2006-01-01
This study, by the U.S. Geological Survey (USGS) in cooperation with the Pennsylvania Department of Environmental Protection (PADEP), provides a compilation of ground-water-quality data for a 25-year period (January 1, 1979, through August 11, 2004) based on water samples from wells. The data are from eight source agencies—Borough of Carroll Valley, Chester County Health Department, Pennsylvania Department of Environmental Protection-Ambient and Fixed Station Network, Montgomery County Health Department, Pennsylvania Drinking Water Information System, Pennsylvania Department of Agriculture, Susquehanna River Basin Commission, and the U.S. Geological Survey. The ground-water-quality data from the different source agencies varied in type and number of analyses; however, the analyses are represented by 12 major analyte groups: biological (bacteria and viruses), fungicides, herbicides, insecticides, major ions, minor ions (including trace elements), nutrients (dominantly nitrate and nitrite as nitrogen), pesticides, radiochemicals (dominantly radon or radium), volatile organic compounds, wastewater compounds, and water characteristics (dominantly field pH, field specific conductance, and hardness). A summary map shows the areal distribution of wells with ground-water-quality data statewide and by major watersheds and source agency. Maps of 35 watersheds within Pennsylvania are used to display the areal distribution of water-quality information. Additional maps emphasize the areal distribution with respect to 13 major geolithologic units in Pennsylvania and concentration ranges of nitrate (as nitrogen). Summary data tables by source agency provide information on the number of wells and samples collected for each of the 35 watersheds and analyte groups. The number of wells sampled for ground-water-quality data varies considerably across Pennsylvania. Of the 8,012 wells sampled, the greatest concentrations of wells are in the southeast (Berks, Bucks, Chester, Delaware, Lancaster, Montgomery, and Philadelphia Counties), in the vicinity of Pittsburgh, and in the northwest (Erie County). The number of wells sampled is relatively sparse in south-central (Adams, Cambria, Cumberland, and Franklin Counties), central (Centre, Indiana, and Snyder Counties), and north-central (Bradford, Potter, and Tioga Counties) Pennsylvania. Little to no data are available for approximately one-third of the state. Water characteristics and nutrients were the most frequently sampled major analyte groups; approximately 21,000 samples were collected for each group. Major and minor ions were the next most frequently sampled major analyte groups; approximately 17,000 and 12,000 samples were collected, respectively. For the remaining eight major analyte groups, the number of samples collected ranged from a low of 307 samples (wastewater compounds) to a high of approximately 3,000 samples (biological). The number of samples that exceeded a maximum contaminant level (MCL) or secondary maximum contaminant level (SMCL) by major analyte group also varied. Of the 2,988 samples in the biological analyte group, 53 percent had water that exceeded an MCL. Almost 2,500 samples were collected and analyzed for volatile organic compounds; 14 percent exceeded an MCL. Other major analyte groups that frequently exceeded MCLs or SMCLs included major ions (17,465 samples and a 33.9 percent exceedance), minor ions (11,905 samples and a 17.1 percent exceedance), and water characteristics (21,183 samples and a 20.3 percent exceedance).
Samples collected and analyzed for fungicides, herbicides, insecticides, and pesticides (4,062 samples), radiochemicals (1,628 samples), wastewater compounds (307 samples), and nutrients (20,822 samples) had the lowest exceedances of 0.3, 8.4, 0.0, and 8.8 percent, respectively.
Sana, Dandara Emery Morais; Mayrink de Miranda, Priscila; Pitol, Bruna Caroline Vieira; Moran, Mariana Soares; Silva, Nayara Nascimento Toledo; Guerreiro da Silva, Ismael Dali Cotrim; de Cássia Stocco, Rita; Beçak, Willy; Lima, Angélica Alves; Carneiro, Cláudia Martins
2013-09-01
Herein, we evaluated cervical samples from normal or HPV-infected tissue to determine whether the relative nuclear/cytoplasmic ratio (NA/CA) and the presence of nonclassical cytological criteria can serve as novel cytological criteria for the diagnosis of HPV. Significantly larger NA/CA ratios were found for the HPV-ATYPIA+ and HPV+ATYPIA+ groups compared with the HPV-ATYPIA- group, regardless of collection method. For the samples collected with a spatula, only three samples from the HPV-ATYPIA- group showed four or more nonclassical parameters (i.e., were positive), while a larger number of the samples in the HPV-ATYPIA+, HPV+ATYPIA-, and HPV+ATYPIA+ groups were positive (13, 4, and 13 samples, respectively). Among those collected with a brush, no sample showed four or more nonclassical criteria in the HPV-ATYPIA- group, while a number of samples were positive in the HPV-ATYPIA+, HPV+ATYPIA-, and HPV+ATYPIA+ groups (4, 3, and 4 samples, respectively). HPV infection was associated with significant morphometrical changes; no increase in the NA/CA ratio was found in the HPV+ATYPIA- samples compared with the HPV-ATYPIA- samples collected with either a spatula or a brush. In conclusion, by including nonclassical cytological criteria in the patient diagnosis, we were able to reduce the number of false negative and false positive HPV diagnoses made using conventional cytology alone. Copyright © 2013 Wiley Periodicals, Inc.
Estimating species richness and accumulation by modeling species occurrence and detectability
Dorazio, R.M.; Royle, J. Andrew; Soderstrom, B.; Glimskarc, A.
2006-01-01
A statistical model is developed for estimating species richness and accumulation by formulating these community-level attributes as functions of model-based estimators of species occurrence while accounting for imperfect detection of individual species. The model requires a sampling protocol wherein repeated observations are made at a collection of sample locations selected to be representative of the community. This temporal replication provides the data needed to resolve the ambiguity between species absence and nondetection when species are unobserved at sample locations. Estimates of species richness and accumulation are computed for two communities, an avian community and a butterfly community. Our model-based estimates suggest that detection failures in many bird species were attributed to low rates of occurrence, as opposed to simply low rates of detection. We estimate that the avian community contains a substantial number of uncommon species and that species richness greatly exceeds the number of species actually observed in the sample. In fact, predictions of species accumulation suggest that even doubling the number of sample locations would not have revealed all of the species in the community. In contrast, our analysis of the butterfly community suggests that many species are relatively common and that the estimated richness of species in the community is nearly equal to the number of species actually detected in the sample. Our predictions of species accumulation suggest that the number of sample locations actually used in the butterfly survey could have been cut in half and the asymptotic richness of species still would have been attained. Our approach of developing occurrence-based summaries of communities while allowing for imperfect detection of species is broadly applicable and should prove useful in the design and analysis of surveys of biodiversity.
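To make the occurrence-based logic concrete, the sketch below computes expected observed richness from per-species occurrence and detection probabilities, assuming independence across sample locations and repeat visits; the probability vectors are hypothetical, not estimates from the bird or butterfly data, and the full model in the paper estimates them rather than assuming them.

```python
import numpy as np

def expected_observed_richness(psi, p, n_sites, n_visits):
    """E[# species seen at least once]: a species occupies a site with
    probability psi and, if present, is detected on each visit with
    probability p; sites and visits are assumed independent."""
    psi, p = np.asarray(psi), np.asarray(p)
    seen_at_site = psi * (1 - (1 - p) ** n_visits)
    return np.sum(1 - (1 - seen_at_site) ** n_sites)

# Hypothetical community: many uncommon species, moderate detectability
psi = np.linspace(0.05, 0.6, 50)
p = np.full(50, 0.4)
for n in (10, 20, 40):
    print(n, round(expected_observed_richness(psi, p, n, n_visits=3), 1))
```

Running the accumulation curve out in this way is exactly how one can ask whether doubling the number of sample locations would, or would not, reveal the remaining species.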
Estimation of the rain signal in the presence of large surface clutter
NASA Technical Reports Server (NTRS)
Ahamad, Atiq; Moore, Richard K.
1994-01-01
The principal limitation on the use of a spaceborne imaging SAR as a rain radar is the surface-clutter problem. Signals may be estimated in the presence of noise by averaging large numbers of independent samples. This method was applied to obtain an estimate of the rain echo by averaging a set of N_c samples of the clutter in a separate measurement and subtracting the clutter estimate from the combined estimate. The number of samples required for successful estimation (within 10-20%) at off-vertical angles of incidence appears to be prohibitively large. However, by appropriately degrading the resolution in both range and azimuth, the required number of samples can be obtained. For vertical incidence, the number of samples required for successful estimation is reasonable. In estimating the clutter it was assumed that the surface echo is the same outside the rain volume as it is within the rain volume. This may be true for the forest echo, but for convective storms over the ocean the surface echo outside the rain volume is very different from that within. It is suggested that the experiment be performed at vertical incidence over forest to overcome this limitation.
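Under the usual speckle model (single-look intensity whose standard deviation equals its mean), the variance of the clutter-subtracted rain estimate is roughly (S+C)^2/N + C^2/N_c, so the number of looks N needed for a target relative error can be solved for directly. The sketch below uses that back-of-envelope model; the clutter-to-signal ratios are illustrative, not values from the paper.

```python
import math

def required_looks(csr, rel_err, n_clutter):
    """Looks N so that sd(S_hat)/S <= rel_err when estimating rain power S
    under clutter power C = csr * S, with n_clutter separate clutter looks.
    Model: var(S_hat) = (S + C)^2 / N + C^2 / n_clutter (exponential speckle)."""
    slack = rel_err**2 - csr**2 / n_clutter
    if slack <= 0:
        return math.inf          # the clutter estimate alone is too noisy
    return math.ceil((1 + csr) ** 2 / slack)

for csr in (0.1, 1.0, 10.0):     # near-vertical vs strong off-nadir clutter
    print(csr, required_looks(csr, rel_err=0.15, n_clutter=10_000))
```

With a clutter-to-signal ratio of 10 the required look count runs into the thousands, which is the sense in which off-vertical estimation becomes prohibitive without degrading resolution to gather more looks.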
Early detection of nonnative alleles in fish populations: When sample size actually matters
Croce, Patrick Della; Poole, Geoffrey C.; Payne, Robert A.; Gresswell, Bob
2017-01-01
Reliable detection of nonnative alleles is crucial for the conservation of sensitive native fish populations at risk of introgression. Typically, nonnative alleles in a population are detected through the analysis of genetic markers in a sample of individuals. Here we show that common assumptions associated with such analyses yield substantial overestimates of the likelihood of detecting nonnative alleles. We present a revised equation to estimate the likelihood of detecting nonnative alleles in a population with a given level of admixture. The new equation incorporates the effects of the genotypic structure of the sampled population and shows that conventional methods overestimate the likelihood of detection, especially when nonnative or F-1 hybrid individuals are present. Under such circumstances—which are typical of early stages of introgression and therefore most important for conservation efforts—our results show that improved detection of nonnative alleles arises primarily from increasing the number of individuals sampled rather than increasing the number of genetic markers analyzed. Using the revised equation, we describe a new approach to determining the number of individuals to sample and the number of diagnostic markers to analyze when attempting to monitor the arrival of nonnative alleles in native populations.
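The core point can be reproduced with a toy calculation: if admixture is carried by a few F1 hybrids (heterozygous nonnative at every diagnostic marker), markers are redundant within an individual, and detection hinges on sampling at least one hybrid. The closed forms below contrast that with the conventional allele-wise independence assumption; the admixture level is hypothetical, and the paper's revised equation handles general genotypic structures rather than this pure-F1 extreme.

```python
def p_detect_naive(q, n_ind, n_markers):
    """Conventional assumption: each of the 2*N*M sampled allele copies is
    independently nonnative with probability q (the admixture proportion)."""
    return 1 - (1 - q) ** (2 * n_ind * n_markers)

def p_detect_f1_only(h, n_ind):
    """Early introgression: a fraction h of individuals are F1 hybrids, and a
    nonnative allele is seen iff at least one hybrid is sampled."""
    return 1 - (1 - h) ** n_ind

h = 0.02                        # 2% F1 hybrids -> admixture q = h / 2
print(p_detect_naive(q=h / 2, n_ind=30, n_markers=10))   # ~0.998 (overestimate)
print(p_detect_f1_only(h, n_ind=30))                     # ~0.45
```

Note that the second probability does not involve the number of markers at all, which is why, in this regime, sampling more individuals beats genotyping more loci.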
Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.
You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary
2011-02-01
The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the number of individuals within clusters (cluster size). Variable cluster sizes are common, and this variation alone may have a significant impact on study power. Previous approaches have taken this into account by either adjusting the total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using a t-test, use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes, and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for a trial with unequal cluster sizes to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment of the mean cluster size alone or simultaneous adjustment of the mean cluster size and the number of clusters, and is a flexible alternative and useful complement to existing methods. Comparisons indicated that, under some conditions, the relative efficiency defined here is greater than the relative efficiency reported in the literature; under those conditions the literature measure underestimates the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter therefore suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
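For comparison with the existing approaches mentioned above, a widely cited approximation (not this article's noncentrality-based measure) inflates the individually randomized sample size by a design effect built from the mean cluster size, the coefficient of variation of cluster sizes, and the intracluster correlation; a minimal sketch with hypothetical trial parameters:

```python
import math

def design_effect(m_bar, cv, icc):
    """Design effect for unequal cluster sizes (Eldridge-style approximation):
    DE = 1 + ((cv^2 + 1) * m_bar - 1) * icc."""
    return 1 + ((cv**2 + 1) * m_bar - 1) * icc

def clusters_needed(n_srs, m_bar, cv, icc):
    """Clusters per arm, given the simple-random-sample size n_srs per arm."""
    return math.ceil(n_srs * design_effect(m_bar, cv, icc) / m_bar)

# Hypothetical trial: 128 subjects/arm under individual randomization,
# mean cluster size 20, cluster-size CV 0.6, ICC 0.05
print(design_effect(20, 0.6, 0.05))         # ~2.31
print(clusters_needed(128, 20, 0.6, 0.05))  # ~15 clusters per arm
```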
2017-09-28
[Report-documentation header removed.] In forensic DNA analysis, the interpretation of a sample acquired from the environment may be dependent upon the assumption on the number of individuals from which the evidence arose. Degraded and… [text truncated] …NOCIt results to those obtained when allele counting or maximum likelihood estimator (MLE) methods are employed. NOCIt does not depend upon an AT and…
2017-10-01
AWARD NUMBER: W81XWH-16-1-0524. TITLE: Non-Uniformly Sampled MR Correlated Spectroscopic Imaging in Breast Cancer and Nonlinear Reconstruction. PERIOD COVERED: 30 Sep 2016 - 29 Sep 2017. [Remaining report-documentation form fields removed; the report notes that the views expressed are those of the author(s) and should not be construed as an official Department of the Army position, policy or decision unless so designated by other documentation.]
Impact of xynthia tempest on viral contamination of shellfish.
Grodzki, Marco; Ollivier, Joanna; Le Saux, Jean-Claude; Piquet, Jean-Côme; Noyer, Mathilde; Le Guyader, Françoise S
2012-05-01
Viral contamination in oyster and mussel samples was evaluated after a massive storm with hurricane-force winds, named "Xynthia tempest," destroyed a number of sewage treatment plants in an area harboring many shellfish farms. Although up to 90% of samples were found to be contaminated 2 days after the disaster, the detected viral concentrations were low. A 1-month follow-up showed a rapid decrease in the number of positive samples, even for norovirus.
Code of Federal Regulations, 2013 CFR
2013-01-01
[Flattened regulation table: factor grades and absolute limits (AL) across numbered 50-count samples. Footnotes: AL is the absolute limit permitted in an individual 33-count sample; limits apply at the port of entry into the United States.]
Code of Federal Regulations, 2014 CFR
2014-01-01
[Flattened regulation table: factor grades and absolute limits (AL) across numbered 50-count samples. Footnotes: AL is the absolute limit permitted in an individual 33-count sample; limits apply at the port of entry into the United States.]
Environmental sampling can be difficult and expensive to carry out. Those taking the samples would like to integrate their knowledge of the system of study or their judgment about the system into the sample selection process to decrease the number of necessary samples. However,...
Does use of a PACS increase the number of images per study? A case study in ultrasound.
Horii, Steven; Nisenbaum, Harvey; Farn, James; Coleman, Beverly; Rowling, Susan; Langer, Jill; Jacobs, Jill; Arger, Peter; Pinheiro, Lisa; Klein, Wendy; Reber, Michele; Iyoob, Christopher
2002-03-01
The purpose of this study was to determine if the use of a picture archiving and communications system (PACS) in ultrasonography increased the number of images acquired per examination. The hypothesis that such an increase does occur was based on anecdotal information; this study sought to test the hypothesis. A random sample of all ultrasound examination types was drawn from the period 1998 through 1999. The ultrasound PACS in use (ACCESS; Kodak Health Information Systems, Dallas, TX) records the number of grayscale and color images saved as part of each study. Each examination in the sample was checked in the ultrasound PACS database, and the number of grayscale and color images was recorded. The comparison film-based sample was drawn from the period 1994 through 1995. The number of examinations of each type selected was based on the overall statistics of the section; that is, the sample was designed to represent the approximate frequency with which the various examination types are done. For film-based image counts, the jackets were retrieved, and the numbers of grayscale and color images were counted. The number of images obtained per examination (for most examinations) in ultrasound increased with PACS use. This result, however, has to be examined for possible systematic biases, because ultrasound practice has changed in the time since the authors stopped using film routinely. The use of PACS in ultrasonography was not associated with an increase in the number of images per examination based solely on the use of PACS, with the exception of neonatal head studies. Increases in the number of images per study were otherwise associated with examinations for which changes in protocols resulted in increased image counts.
Getting DNA copy numbers without control samples
Ortiz-Estevez, Maria; Aramburu, Ander; Rubio, Angel
2012-01-01
Background: The selection of the reference to scale the data in a copy number analysis has paramount importance for achieving accurate estimates. Usually this reference is generated using control samples included in the study. However, these control samples are not always available, and in these cases an artificial reference must be created. A proper generation of this signal is crucial in terms of both noise and bias. We propose NSA (Normality Search Algorithm), a scaling method that works with and without control samples. It is based on the assumption that genomic regions enriched in SNPs with identical copy numbers in both alleles are likely to be normal. These normal regions are predicted for each sample individually and used to calculate the final reference signal. NSA can be applied to any CN data regardless of the microarray technology and preprocessing method. It also finds an optimal weighting of the samples, minimizing possible batch effects. Results: Five human datasets (a subset of HapMap samples and Glioblastoma Multiforme (GBM), Ovarian, Prostate and Lung Cancer experiments) have been analyzed. It is shown that, using only tumoral samples, NSA is able to remove the bias in the copy number estimation and to reduce the noise, and therefore to increase the ability to detect copy number aberrations (CNAs). These improvements allow NSA to detect recurrent aberrations more accurately than other state-of-the-art methods. Conclusions: NSA provides a robust and accurate reference for scaling probe signal data to CN values without the need for control samples. It minimizes the problems of bias, noise and batch effects in the estimation of CNs. Therefore, the NSA scaling approach helps to detect recurrent CNAs better than current methods. The automatic selection of references makes it useful for performing bulk analysis of many GEO or ArrayExpress experiments without the need to develop a parser to find the normal samples or possible batches within the data. The method is available in the open-source R package NSA, which is an add-on to the aroma.cn framework. http://www.aroma-project.org/addons. PMID:22898240
PLAN SECTIONS AND ELEVATIONS OF VESSEL SAMPLING STATIONS "P", "Q", ...
PLAN SECTIONS AND ELEVATIONS OF VESSEL SAMPLING STATIONS "P", "Q", "S" CELLS MAIN PROCESSING BUILDING (CPP-601). INL DRAWING NUMBER 200-0601-00-291-053694. ALTERNATE ID NUMBER CPP-E-1394. - Idaho National Engineering Laboratory, Idaho Chemical Processing Plant, Fuel Reprocessing Complex, Scoville, Butte County, ID
DETERMINING THE MINIMUM NUMBER OF SITES FOR BIOASSESSMENT OF THE OHIO RIVER
For wadeable stream bioassessment, much work has been done to determine the number of samples to composite or the appropriate reach length for obtaining an adequate sample. Proportions of stream miles in a given condition are then often reported at the watershed or ecoregion lev...
Berrang, M E; Smith, D P; Hinton, A
2006-02-01
Because of the escape of highly contaminated gut contents from the cloaca of positive carcasses, Campylobacter numbers recovered from broiler carcass skin samples increase during automated feather removal. Vinegar is known to have antimicrobial action. The objective of this study was to determine the effect of vinegar placed in the cloaca prior to feather removal on the numbers of Campylobacter recovered from broiler breast skin. Broilers were stunned, killed, and bled in a pilot processing plant. Vinegar was placed in the colons of the chickens prior to scalding. Carcasses were scalded, and Campylobacter numbers were determined on breast skin before and after passage through a commercial-style feather-picking machine. Campylobacter numbers recovered from the breast skin of untreated control carcasses increased during feather removal from 1.3 log CFU per sample prior to defeathering to 4.2 log afterward. Placement of water in the colon before scalding had no effect on Campylobacter numbers. Campylobacter numbers recovered from the breast skin of carcasses treated with vinegar also increased during defeathering but to a significantly lesser extent. Treated carcasses experienced only a 1-log increase from 1.6 log CFU per sample before feather removal to 2.6 log CFU per sample afterward. Application of an effective food-grade antimicrobial in the colon prior to scald can limit the increase in Campylobacter contamination of broiler carcasses during defeathering.
Sampling designs for HIV molecular epidemiology with application to Honduras.
Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I
2005-11-01
Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.
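A skeletal version of the two-stage design described here is sketched below, with a hypothetical sampling frame; the strata, cluster names, and sizes are invented for illustration, and a real survey would also weight selections by the variation and cost considerations the abstract mentions.

```python
import random

random.seed(7)

# Hypothetical frame: stratum -> cluster -> enrolled individual IDs
frame = {
    "north": {"clinic_A": list(range(40)), "clinic_B": list(range(25)),
              "clinic_C": list(range(60))},
    "south": {"clinic_D": list(range(30)), "clinic_E": list(range(50))},
}

def stratified_cluster_sample(frame, clusters_per_stratum=2, per_cluster=8):
    chosen = []
    for stratum, clusters in frame.items():
        k = min(clusters_per_stratum, len(clusters))
        for name in random.sample(sorted(clusters), k=k):   # random clusters
            ids = clusters[name]
            m = min(per_cluster, len(ids))
            chosen += [(stratum, name, i) for i in random.sample(ids, k=m)]
    return chosen

print(len(stratified_cluster_sample(frame)))   # 2 strata x 2 clusters x 8 = 32
```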
[Comparison of the Conventional Centrifuged and Filtrated Preparations in Urine Cytology].
Sekita, Nobuyuki; Shimosakai, Hirofumi; Nishikawa, Rika; Sato, Hiroaki; Kouno, Hiroyoshi; Fujimura, Masaaki; Mikami, Kazuo
2016-03-01
The urine cytology test is one of the most important tools for the diagnosis of malignant urinary tract tumors, and it is also of great value for predicting malignancy. However, the sensitivity of this test is not high enough to screen for malignant cells. In our laboratory, we were able to attain a higher sensitivity of urine cytology tests after changing the preparation method for urine samples. The differences in cytodiagnosis between the two methods are discussed here. From January 2012 to June 2013, 2,031 urine samples were prepared using the conventional centrifuge method (C method), and from September 2013 to March 2015, 2,453 urine samples were prepared using the filtration method (F method) for the cytology test. When samples classified as category 4 or 5 were defined as cytologically positive, the sensitivity of the test with samples prepared using the F method was significantly higher than with samples prepared using the C method (72% vs 28%, p<0.001). The number of cells on the glass slides prepared by the F method was significantly higher than that for samples prepared by the C method (p<0.001). After introduction of the F method, the number of false negative cases decreased in the urine cytology test, because a larger number of cells was visible and atypical or malignant epithelial cells were more easily detected. Therefore, this method has a higher sensitivity than the conventional C method, as the sensitivity of urine cytology tests relies partially on the number of cells visualized in the prepared samples.
Yenilmez, Firdes; Düzgün, Sebnem; Aksoy, Aysegül
2015-01-01
In this study, kernel density estimation (KDE) was coupled with ordinary two-dimensional kriging (OK) to reduce the number of sampling locations used in the measurement and kriging of dissolved oxygen (DO) concentrations in Porsuk Dam Reservoir (PDR), with conservation of the spatial correlation structure in the DO distribution as a target. KDE was used as a tool to aid in the identification of the sampling locations that would be removed from the sampling network in order to decrease the total number of samples. Accordingly, several networks were generated in which sampling locations were reduced from 65 to 10, in increments of 4 or 5 points at a time, based on kernel density maps. DO variograms were constructed, and DO values in PDR were kriged. The performance of the networks in DO estimation was evaluated through various error metrics, standard error maps (SEM), and whether the spatial correlation structure was conserved or not. Results indicated that smaller numbers of sampling points resulted in a loss of information about the spatial correlation structure of DO. The minimum number of representative sampling points for PDR was 35. The efficacy of the sampling-location selection method was tested against networks generated by experts. It was shown that the evaluation approach proposed in this study provided a better sampling network design, one in which the spatial correlation structure of DO was sustained for kriging.
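A condensed sketch of the two ingredients follows, using scipy's Gaussian KDE to rank station density and the pykrige package for ordinary kriging of the thinned network. Both library choices are assumptions (the authors do not name their software), and the coordinates and DO values are synthetic.

```python
import numpy as np
from scipy.stats import gaussian_kde
from pykrige.ok import OrdinaryKriging

rng = np.random.default_rng(8)
x, y = rng.uniform(0, 10, 65), rng.uniform(0, 4, 65)   # hypothetical stations
do = 8.0 - 0.3 * x + rng.normal(0, 0.4, 65)            # synthetic DO (mg/L)

# 1) KDE over station coordinates: removal candidates sit in dense areas
density = gaussian_kde(np.vstack([x, y]))(np.vstack([x, y]))
keep = np.argsort(density)[:35]          # retain the 35 least-redundant sites

# 2) Ordinary kriging on the reduced network
ok = OrdinaryKriging(x[keep], y[keep], do[keep], variogram_model="spherical")
z_hat, z_var = ok.execute("grid", np.linspace(0, 10, 50), np.linspace(0, 4, 20))
print(z_hat.shape)   # (20, 50) kriged DO surface
```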
Fenlon, D R
1981-04-01
Of 1241 samples of seagull faeces examined, 12.9% were found to contain salmonellae. The proportion of positive samples was significantly higher (17-21%) near sewage outfalls. Twenty-seven serotypes were isolated, including a new serotype named Salmonella grampian. The range and frequency of serotypes carried by gulls was similar to those in the human population, suggesting sewage as a possible source of gull infection. The number of salmonellae found in positive samples was low (0.18-191 g(-1) faeces). This was similar to the numbers found in sewage, 10-80 l(-1), suggesting gulls may only carry infected material without infecting themselves. Antibiotic resistance in the isolates was low, with only 21 showing resistance to the antibiotics tested, although most of these resistances were determined by resistance transfer plasmids.
Practical quantum random number generator based on measuring the shot noise of vacuum states
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shen Yong; Zou Hongxin; Tian Liang
2010-06-15
The shot noise of vacuum states is a kind of quantum noise and is totally random. In this paper a nondeterministic random number generation scheme based on measuring the shot noise of vacuum states is presented and experimentally demonstrated. We use a homodyne detector to measure the shot noise of vacuum states. Considering that the frequency bandwidth of our detector is limited, we derive the optimal sampling rate so that sampling points have the least correlation with each other. We also choose a method to extract random numbers from the sampling values, and prove that the influence of classical noise can be avoided with this method, so that the detector does not have to be shot-noise limited. The random numbers generated with this scheme have passed the ENT and Diehard tests.
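The extraction step can be illustrated generically: threshold the analog noise samples into raw bits, then apply von Neumann debiasing so residual bias is removed. This is a stand-in for the paper's own extraction method, which the abstract does not specify, and Gaussian pseudo-noise stands in for digitized homodyne measurements.

```python
import numpy as np

rng = np.random.default_rng(2)
samples = rng.normal(size=200_000)       # stand-in for digitized shot noise

bits = (samples > np.median(samples)).astype(np.uint8)   # raw thresholded bits

# Von Neumann debiasing on non-overlapping pairs: 01 -> 0, 10 -> 1, else drop
pairs = bits[: bits.size // 2 * 2].reshape(-1, 2)
keep = pairs[:, 0] != pairs[:, 1]
random_bits = pairs[keep, 0]

print(random_bits.size, random_bits.mean())  # ~half the pairs survive, mean ~0.5
```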
Williams, Michael S; Cao, Yong; Ebel, Eric D
2013-07-15
Levels of pathogenic organisms in food and water have steadily declined in many parts of the world. A consequence of this reduction is that the proportion of samples that test positive for the most contaminated product-pathogen pairings has fallen to less than 0.1. While this is unequivocally beneficial to public health, datasets with very few enumerated samples present an analytical challenge because a large proportion of the observations are censored values. One application of particular interest to risk assessors is the fitting of a statistical distribution function to datasets collected at some point in the farm-to-table continuum. The fitted distribution forms an important component of an exposure assessment. A number of studies have compared different fitting methods and proposed lower limits on the proportion of samples in which the organisms of interest are identified and enumerated, with the recommended lower limit of enumerated samples being 0.2. This recommendation may not be applicable to food safety risk assessments for a number of reasons, which include the development of new Bayesian fitting methods, the use of highly sensitive screening tests, and the generally larger sample sizes found in surveys of food commodities. This study evaluates the performance of a Markov chain Monte Carlo fitting method when used in conjunction with a screening test and enumeration of positive samples by the Most Probable Number technique. The results suggest that levels of contamination for common product-pathogen pairs, such as Salmonella on poultry carcasses, can be reliably estimated with the proposed fitting method and sample sizes in excess of 500 observations. The results do, however, demonstrate that simple guidelines for this application, such as a minimum proportion of positive samples, cannot be provided. Published by Elsevier B.V.
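A minimal sketch of this kind of Bayesian fit follows: flat priors and a random-walk Metropolis sampler for the mean and log-sd of the log10 concentration, with left-censoring below a detection limit. The simulated data and limits are hypothetical, and a real assessment would additionally model MPN measurement error, which is omitted here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Simulated survey: log10 CFU/sample, censored below the detection limit
mu_t, sd_t, lod, n = -1.0, 1.0, 0.0, 600
x = rng.normal(mu_t, sd_t, size=n)
obs = x[x >= lod]                      # enumerated positives (~16% of samples)
n_cens = n - obs.size

def log_post(mu, log_sd):              # flat priors -> posterior = likelihood
    sd = np.exp(log_sd)
    return (norm.logpdf(obs, mu, sd).sum()
            + n_cens * norm.logcdf(lod, mu, sd))

theta = np.array([0.0, 0.0])
lp = log_post(*theta)
draws = []
for i in range(20_000):
    prop = theta + rng.normal(scale=0.05, size=2)
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        theta, lp = prop, lp_prop
    if i >= 5_000 and i % 10 == 0:
        draws.append(theta.copy())

post = np.array(draws)
print(post[:, 0].mean(), np.exp(post[:, 1]).mean())   # ~ -1.0 and ~ 1.0
```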
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.
Ares, Manuel
2014-10-01
This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
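The standard-curve arithmetic implied by measurement type (2) is a linear fit of threshold cycle against log10 copy number; a sketch with hypothetical dilution-series values (the protocol itself does not supply numbers):

```python
import numpy as np

# Hypothetical 10-fold dilution series of the target with known copy numbers
copies = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
ct     = np.array([30.1, 26.8, 23.4, 20.0, 16.7])   # measured threshold cycles

slope, intercept = np.polyfit(np.log10(copies), ct, deg=1)
efficiency = 10 ** (-1 / slope) - 1    # 1.0 would mean perfect doubling

def copies_from_ct(c):
    """Starting copy number of an unknown sample, from its threshold cycle."""
    return 10 ** ((c - intercept) / slope)

print(f"slope {slope:.2f}, efficiency {efficiency:.2%}")
print(f"unknown at Ct 24.8 -> {copies_from_ct(24.8):,.0f} copies")
```

A slope near -3.3 corresponds to the theoretical doubling per cycle described above; relative comparisons against a reference sequence reduce to differences of threshold cycles on the same scale.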
The effect of a cannula milk sampling technique on the microbiological diagnosis of bovine mastitis.
Friman, M; Hiitiö, H; Niemi, M; Holopainen, J; Pyörälä, S; Simojoki, H
2017-08-01
Two methods of collecting milk samples from mastitic bovine mammary quarters were compared. Samples were taken in a consistent order in which standard aseptic technique sampling was done first, followed by insertion of a sterile cannula through the teat canal and collection of a second sample. Microbiological results of those two sampling techniques were compared. Milk samples were analysed using multiplex real-time polymerase chain reaction (PCR). The cannula technique produced a reduced number of microbial species or groups of species per sample compared with conventional sampling. Staphylococcus spp. were the most common species identified and were detected more often during conventional sampling than with cannula sampling. Staphylococcus spp. identified in milk samples could also have originated from the teat canal without being present in the milk. The number of samples positive for Trueperella pyogenes or yeasts in the conventional samples was twice as high as in the cannula samples, indicating that the presence of Trueperella pyogenes and yeast species should not necessarily be interpreted as being the causative agents of bovine intra-mammary infections (IMI). Copyright © 2017 Elsevier Ltd. All rights reserved.
Stability and bias of classification rates in biological applications of discriminant analysis
Williams, B.K.; Titus, K.; Hines, J.E.
1990-01-01
We assessed the sampling stability of classification rates in discriminant analysis by using a factorial design with factors for multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. Simulation results indicated strong bias in correct classification rates when group sample sizes were small and when overlap among groups was high. We also found that the stability of the correct classification rates was influenced by these factors, indicating that the number of samples required for a given level of precision increases with the amount of overlap among groups. In a review of 60 published studies, we found that 57% of the articles presented results on classification rates, though few of them mentioned potential biases in their results. Wildlife researchers should choose the total number of samples per group to be at least 2 times the number of variables to be measured when overlap among groups is low. Substantially more samples are required as the overlap among groups increases.
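The optimism of resubstitution classification rates at small sample sizes is easy to demonstrate; a sketch using scikit-learn (an assumption on tooling, since the original study used its own simulation code) with two heavily overlapping Gaussian groups:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)

n, p = 15, 8                    # ~2x samples per variable, high group overlap
X = np.vstack([rng.normal(0.0, 1.0, (n, p)),
               rng.normal(0.5, 1.0, (n, p))])
y = np.repeat([0, 1], n)

resub = LinearDiscriminantAnalysis().fit(X, y).score(X, y)
cv = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
print(f"resubstitution {resub:.2f} vs cross-validated {cv:.2f}")  # resub inflated
```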
Predominant bacteria in an activated sludge reactor for the degradation of cutting fluids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, C.A.; Claus, G.W.; Taylor, P.A.
1983-01-01
For the first time, an activated sludge reactor established for the degradation of cutting fluids was examined for predominant bacteria. In addition, both total and viable numbers of bacteria in the reactor were determined so that the percentage of each predominant type in the total reactor population could be determined. Three samples were studied, and a total of 15 genera were detected. In each sample, the genus Pseudomonas and the genus Microcyclus were present in high numbers. Three other genera, Acinetobacter, Alcaligenes, and Corynebacterium, were also found in every sample but in lower numbers. In one sample, numerous appendage bacteria were present, and one of these, the genus Seliberia, was the most predominant organism in that sample. However, in the other two samples no appendage bacteria were detected. Six genera were found in this reactor that have not been previously reported either in cutting fluids in use or in other activated sludge systems. These genera were Aeromonas, Hyphomonas, Listeria, Microcyclus, Moraxella, and Spirosoma. None of the predominant bacteria belonged to groups of strict pathogens. 22 references, 6 figures, 3 tables.
Radial q-space sampling for DSI
Baete, Steven H.; Yutzy, Stephen; Boada, Fernando E.
2015-01-01
Purpose Diffusion Spectrum Imaging (DSI) has been shown to be an effective tool for non-invasively depicting the anatomical details of brain microstructure. Existing implementations of DSI sample the diffusion encoding space using a rectangular grid. Here we present a different implementation of DSI whereby a radially symmetric q-space sampling scheme for DSI (RDSI) is used to improve the angular resolution and accuracy of the reconstructed Orientation Distribution Functions (ODF). Methods Q-space is sampled by acquiring several q-space samples along a number of radial lines. Each of these radial lines in q-space is analytically connected to a value of the ODF at the same angular location by the Fourier slice theorem. Results Computer simulations and in vivo brain results demonstrate that RDSI correctly estimates the ODF when moderately high b-values (4000 s/mm2) and number of q-space samples (236) are used. Conclusion The nominal angular resolution of RDSI depends on the number of radial lines used in the sampling scheme, and only weakly on the maximum b-value. In addition, the radial analytical reconstruction reduces truncation artifacts which affect Cartesian reconstructions. Hence, a radial acquisition of q-space can be favorable for DSI. PMID:26363002
Bacterial contaminants in carbonated soft drinks sold in Bangladesh markets.
Akond, Muhammad Ali; Alam, Saidul; Hasan, S M R; Mubassara, Sanzida; Uddin, Sarder Nasir; Shirin, Momena
2009-03-31
A total of 225 carbonated soft drink (CSD) samples from nine brands, from various locations in five metropolitan cities of Bangladesh, were examined to determine their bacteriological quality. Most samples were not in compliance with microbiological standards set by organizations such as the World Health Organization (WHO). Pseudomonas aeruginosa was the predominant species, with an incidence of 95%. Streptococcus spp. and Bacillus stearothermophilus were the next most prevalent, with numbers ranging from 6 to 122 and 9 to 105 cfu/100 ml, respectively. Fifty-four percent of the samples yielded Salmonella spp. at numbers ranging from 2 to 90 cfu/100 ml. Total coliform (TC) and faecal coliform (FC) counts were found in 68-100% and 76-100% of samples of individual brands, at numbers ranging from 5 to 213 and 3 to 276 cfu/100 ml, respectively. According to WHO standards, 60-88% of samples from six brands and 32% and 40% of samples from two other brands belonged to the intermediate risk group, with FC counts of 100-1000 cfu/100 ml. Heterotrophic plate counts, however, were under the permissible limit in all 225 samples. These findings suggest that carbonated soft drinks commercially available in Bangladesh pose substantial risks to public health.
Selected ground-water-quality data in Pennsylvania - 1979-2006
Low, Dennis J.; Chichester, Douglas C.; Zarr, Linda F.
2009-01-01
This study, by the U.S. Geological Survey (USGS) in cooperation with the Pennsylvania Department of Environmental Protection (PADEP), provides a compilation of ground-water-quality data for a 28-year period (January 1, 1979, through December 31, 2006) based on water samples from wells and springs. The data are from 14 source agencies or programs—Borough of Carroll Valley, Chester County Health Department, Montgomery County Health Department, Pennsylvania Department of Agriculture, Pennsylvania Department of Environmental Protection 2002 Pennsylvania Water-Quality Assessment, Pennsylvania Department of Environmental Protection Act 537 Sewage Facilities Program, Pennsylvania Department of Environmental Protection–Ambient and Fixed Station Network, Pennsylvania Department of Environmental Protection–North-Central Region, Pennsylvania Department of Environmental Protection–South-Central Region, Pennsylvania Drinking Water Information System, Pennsylvania Topographic and Geologic Survey, Susquehanna River Basin Commission, U.S. Environmental Protection Agency, and the U.S. Geological Survey. The ground-water-quality data from the different source agencies or programs varied in type and number of analyses; however, the analyses are represented by 11 major analyte groups: antibiotics, major ions, microorganisms (bacteria, viruses, and other microorganisms), minor ions (including trace elements), nutrients (predominantly nitrate and nitrite as nitrogen), pesticides, pharmaceuticals, radiochemicals (predominantly radon or radium), volatiles (volatile organic compounds), wastewater compounds, and water characteristics (field measurements, predominantly field pH, field specific conductance, and hardness). For the USGS and the PADEP–North-Central Region, the pesticide analyte group was broken down into fungicides, herbicides, and insecticides. Summary maps show the areal distribution of wells and springs with ground-water-quality data statewide by source agency or program. Summary data tables by source agency or program provide information on the number of wells and springs and samples collected for each of the 35 watersheds and analyte groups. The number of wells and springs sampled for ground-water-quality data varies considerably across Pennsylvania. Of the 24,772 wells and springs sampled, the greatest concentration of wells and springs is in the southeast (Berks, Bucks, Chester, Delaware, Lancaster, Montgomery, and Philadelphia Counties) and in the northwest (Erie County). The number of wells and springs sampled is relatively sparse in north-central (Cameron, Elk, Forest, McKean, Potter, and Warren Counties) Pennsylvania. Little to no data are available for approximately one-fourth of the state. Nutrients and water characteristics were the most frequently sampled major analyte groups—43,025 and 30,583 samples, respectively. Minor ions and major ions were the next most frequently sampled major analyte groups—26,972 and 13,115 samples, respectively. For the remaining 10 major analyte groups, the number of samples collected ranged from a low of 24 samples (antibiotic compounds) to a high of 4,674 samples (microorganisms). The number of samples that exceeded a maximum contaminant level (MCL) or secondary maximum contaminant level (SMCL) by major analyte group also varied. Of the 4,674 samples in the microorganism analyte group, 50.2 percent had water that exceeded an MCL. Of the 4,528 samples collected and analyzed for volatile organic compounds, 23.5 percent exceeded an MCL.
Other major analyte groups that frequently exceeded MCLs or SMCLs included major ions (18,343 samples, 27.7 percent exceedance), minor ions (26,972 samples, 44.7 percent exceedance), pesticides (4,868 samples, 0.7 percent exceedance), water characteristics (30,583 samples, 19.3 percent exceedance), and radiochemicals (1,866 samples, 9.6 percent exceedance). Samples collected and analyzed for antibiotics (24 samples), fungicides (1,273 samples), herbicides (1,470 samples), insecticides (1,424 samples), nutrients (43,025 samples), pharmaceuticals (28 samples), and wastewater compounds (328 samples) had the lowest exceedances of 0.0, 2.4, 1.2, <1.0, 8.3, 0.0, and <1.0 percent, respectively.
Using Electronic Data Interchange to Report Product Quality
1993-03-01
[Report documentation residue; the recoverable fragments list EDI transaction-set segments such as SPS (Sampling Parameters for Summary Statistics), DTM (Date/Time Reference), REF (Reference Numbers), and STA (Statistics).]
An Investigation of Lost Time and Utilization in a Sample of First-Term Male and Female Soldiers
1982-10-01
[Report documentation residue. Recoverable details: Technical Report, April 1981-October 1982; authors Joel M. Savell, Carlos K. Rigby, and Andrew A. Zbikowski; cites Fitzgibbons, D., & Moch, M., Employee absenteeism: a multivariate analysis with replication, Organizational Behavior and Human Performance, 1980, 26, 349.]
Yingkajorn, Mingkwan; Sermwitayawong, Natthawan; Palittapongarnpimp, Prasit; Nishibuchi, Mitsuaki; Robins, William P; Mekalanos, John J; Vuddhakul, Varaporn
2014-05-01
Correlation between the numbers of Vibrio parahaemolyticus and its specific bacteriophages in cockles was investigated from June 2009 to May 2010 in Hat Yai, Songkhla, Thailand. Cockles obtained monthly from a local market were sampled to determine the numbers of V. parahaemolyticus and bacteriophages that could form plaques on ten strains of pandemic and nonpandemic V. parahaemolyticus. In addition, V. parahaemolyticus isolates from clinical samples from Hat Yai hospital over the same period were investigated. All 139 cockles sampled were positive for V. parahaemolyticus. However, only 76 of them were positive for bacteriophages. During the testing period, the number of bacteriophages was not significantly correlated with the incidence of V. parahaemolyticus-infected patients, but the numbers of V. parahaemolyticus isolates from the cockle samples were closely related to the number of infected patients. The bacteriophages isolated from V. parahaemolyticus also infected Vibrio alginolyticus and Vibrio mimicus, suggesting that their broad host range may enable genetic exchange between V. parahaemolyticus and closely related Vibrio spp. In conclusion, this study indicated that the number of V. parahaemolyticus in cockles may be a useful tool for predicting the relative risk of infection by V. parahaemolyticus in this area of Thailand.
Correlated Observations, the Law of Small Numbers and Bank Runs.
Horváth, Gergely; Kiss, Hubert János
2016-01-01
Empirical descriptions and studies suggest that generally depositors observe a sample of previous decisions before deciding if to keep their funds deposited or to withdraw them. These observed decisions may exhibit different degrees of correlation across depositors. In our model depositors decide sequentially and are assumed to follow the law of small numbers in the sense that they believe that a bank run is underway if the number of observed withdrawals in their sample is large. Theoretically, with highly correlated samples and infinite depositors runs occur with certainty, while with random samples it needs not be the case, as for many parameter settings the likelihood of bank runs is zero. We investigate the intermediate cases and find that i) decreasing the correlation and ii) increasing the sample size reduces the likelihood of bank runs, ceteris paribus. Interestingly, the multiplicity of equilibria, a feature of the canonical Diamond-Dybvig model that we use also, disappears almost completely in our setup. Our results have relevant policy implications.
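The decision rule described here lends itself to a compact simulation. The sketch below is a hedged toy version, not the authors' model: the sample size, withdrawal threshold, share of impatient depositors, and the definition of a "run" are all illustrative assumptions; correlated observation is approximated by sampling the most recent decisions, random observation by a uniform draw from all previous decisions.

```python
# Toy sketch of a "law of small numbers" withdrawal rule: each depositor
# observes N earlier decisions and withdraws if the observed number of
# withdrawals reaches a threshold.
import random

def run_probability(n_depositors=1000, sample_size=5, threshold=3,
                    share_impatient=0.1, correlated=True, trials=500):
    runs = 0
    for _ in range(trials):
        decisions = []  # 1 = withdraw, 0 = keep deposited
        for _ in range(n_depositors):
            impatient = random.random() < share_impatient  # withdraws regardless
            if correlated:
                sample = decisions[-sample_size:]          # overlapping recent decisions
            else:
                k = min(sample_size, len(decisions))
                sample = random.sample(decisions, k)       # random draw from history
            panic = len(sample) == sample_size and sum(sample) >= threshold
            decisions.append(1 if (impatient or panic) else 0)
        if sum(decisions) > 2 * share_impatient * n_depositors:
            runs += 1                # arbitrary cutoff: call this outcome a run
    return runs / trials

for corr in (True, False):
    print("correlated" if corr else "random", run_probability(correlated=corr))
```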
Horowitz, Arthur J.; Clarke, Robin T.; Merten, Gustavo Henrique
2015-01-01
Since the 1970s, there has been both continuing and growing interest in developing accurate estimates of the annual fluvial transport (fluxes and loads) of suspended sediment and sediment-associated chemical constituents. This study provides an evaluation of the effects of manual sample numbers (from 4 to 12 year−1) and sample scheduling (random-based, calendar-based and hydrology-based) on the precision, bias and accuracy of annual suspended sediment flux estimates. The evaluation is based on data from selected US Geological Survey daily suspended sediment stations in the USA and covers basins ranging in area from just over 900 km2 to nearly 2 million km2 and annual suspended sediment fluxes ranging from about 4 kt year−1 to about 200 Mt year−1. The results appear to indicate that there is a scale effect for random-based and calendar-based sampling schemes, with larger sample numbers required as basin size decreases. All the sampling schemes evaluated display some level of positive (overestimates) or negative (underestimates) bias. The study further indicates that hydrology-based sampling schemes are likely to generate the most accurate annual suspended sediment flux estimates with the fewest samples, regardless of basin size. This type of scheme seems most appropriate when the determination of suspended sediment concentrations, sediment-associated chemical concentrations, annual suspended sediment and annual suspended sediment-associated chemical fluxes only represent a few of the parameters of interest in multidisciplinary, multiparameter monitoring programmes. The results are just as applicable to the calibration of autosamplers/suspended sediment surrogates currently used to measure/estimate suspended sediment concentrations and, ultimately, annual suspended sediment fluxes, because manual samples are required to adjust the sample data/measurements generated by these techniques so that they provide depth-integrated and cross-sectionally representative data.
Two-sample binary phase 2 trials with low type I error and low sample size
Litwin, Samuel; Basickes, Stanley; Ross, Eric A.
2017-01-01
Summary We address design of two-stage clinical trials comparing experimental and control patients. Our end-point is success or failure, however measured, with null hypothesis that the chance of success in both arms is p0 and alternative that it is p0 among controls and p1 > p0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules, E ≥ m, with two-sample rules of the form E – C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. PMID:28118686
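A minimal sketch of the combined rejection rule follows, reduced to a single stage for brevity (the published designs are two-stage, and the thresholds m and r below are assumed values, not taken from the paper). It estimates the type I error of requiring both E ≥ m and E − C > r under the null, with 2:1 randomization.

```python
# Monte Carlo type I error of the combined rule E >= m AND E - C > r under
# the null success rate p0; nE, nC, m, r are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
p0 = 0.2
nE, nC = 40, 20          # 2:1 experimental:control randomization
m, r = 14, 5             # one-sample and two-sample thresholds (assumed)

E = rng.binomial(nE, p0, size=200_000)
C = rng.binomial(nC, p0, size=200_000)
reject = (E >= m) & (E - C > r)
print(f"estimated type I error at p0={p0}: {reject.mean():.4f}")
```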
Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.
2017-01-01
Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. The impacts of spatial autocorrelation, number of desired classes, and form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
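As a dependency-free illustration of the sampling-based idea, the sketch below computes class breaks from a random sample of a large attribute vector and measures agreement with full-data breaks. Quantile breaks stand in here for the optimal classifiers the paper examines (e.g. Fisher-Jenks), so the numbers are only indicative of the speed/accuracy tradeoff.

```python
# Sampling-based choropleth classification: breaks from a sample vs. breaks
# from the full data, scored by the share of identically classified values.
import numpy as np

rng = np.random.default_rng(2)
x = rng.lognormal(mean=0, sigma=1, size=1_000_000)   # skewed attribute ("large data")
k = 5                                                # desired number of classes
q = np.linspace(0, 1, k + 1)[1:-1]                   # interior quantile levels

def classify(values, breaks):
    return np.searchsorted(breaks, values)           # class index per observation

full_breaks = np.quantile(x, q)
for n in (100, 1_000, 10_000):
    sample_breaks = np.quantile(rng.choice(x, n, replace=False), q)
    acc = np.mean(classify(x, full_breaks) == classify(x, sample_breaks))
    print(f"sample size {n:>6}: classification agreement {acc:.3f}")
```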
Falling number sampling variation within trucks at first point of sale
USDA-ARS?s Scientific Manuscript database
Falling number (FN) is a test widely performed on raw samples of wheat and barley as a means to indicate the level of enzyme activity, alpha-amylase, associated with seed germination. In most circumstances of wheat, high activity levels are associated with decreased quality of the end products, and...
Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.
Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides
2010-01-01
We investigated the sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to whether or not the sample size was adequate to estimate the HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate the HIV prevalence rate in 35 studies. The use of convenience samples limits statistical inference for the whole group. We observed an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and the HIV prevalence rate in this group.
Ferrer-Paris, José Rafael; Sánchez-Mercado, Ada; Rodríguez, Jon Paul
2013-03-01
The development of efficient sampling protocols is an essential prerequisite to evaluate and identify priority conservation areas. There are few protocols for fauna inventory and monitoring at wide geographical scales in the tropics, where the complexity of communities and high biodiversity levels make the implementation of efficient protocols more difficult. We propose here a simple strategy to optimize the capture of dung beetles, applied to sampling with baited traps and generalizable to other sampling methods. We analyzed data from eight transects sampled between 2006 and 2008 with the aim of developing a uniform sampling design that allows species richness, abundance and composition to be confidently estimated at wide geographical scales. We examined four characteristics of any sampling design that affect the effectiveness of the sampling effort: the number of traps, sampling duration, type and proportion of bait, and spatial arrangement of the traps along transects. We used species accumulation curves, rank-abundance plots, indicator species analysis, and multivariate correlograms. We captured 40,337 individuals (115 species/morphospecies of 23 genera). Most species were attracted by both dung and carrion, but two-thirds had greater relative abundance in traps baited with human dung. Different aspects of the sampling design influenced each diversity attribute in different ways. To obtain reliable richness estimates, the number of traps was the most important aspect. Accurate abundance estimates were obtained when the sampling period was increased, while the spatial arrangement of traps was decisive in capturing the species composition pattern. An optimum sampling strategy for accurate estimates of richness, abundance and diversity should: (1) set 50-70 traps to maximize the number of species detected, (2) collect samples over 48-72 hours and set trap groups along the transect to reliably estimate species abundance, (3) set traps in groups of at least 10 traps to suitably record the local species composition, and (4) separate trap groups by a distance greater than 5-10 km to avoid spatial autocorrelation. To evaluate other sampling protocols, we recommend first identifying the elements of the sampling design that could affect sampling effort (the number of traps, sampling duration, type and proportion of bait) and their spatial distribution (spatial arrangement of the traps), and then evaluating how they affect richness, abundance and species composition estimates.
The Preliminary Examination of Organics in the Returned Stardust Samples from Comet Wild 2
NASA Technical Reports Server (NTRS)
Sandford, S. A.; Aleon, J.; Alexander, C.; Butterworth, A.; Clemett, S. J.; Cody, G.; Cooper, G.; Dworkin, J. P.; Flynn, G. J.; Gilles, M. K.
2006-01-01
The primary objective of STARDUST is to collect coma samples from comet 81P/Wild 2. These samples were collected by impact onto aerogel tiles on Jan 2, 2004 when the spacecraft flew through the comet's coma at a relative velocity of about 6.1 km/sec. Measurements of dust impacts on the front of the spacecraft suggest that the aerogel particle collector was impacted by 2800 +/- 500 particles larger than 15 microns in diameter. Following recovery of the Sample Return Capsule (SRC) on Jan 15, 2006, the aerogel collector trays will be removed in a clean room at JSC. After documentation of the collection, selected aerogel tiles will be removed, and aerogel and cometary samples will be extracted for study. A number of different extraction techniques will be used, each optimized for the analytical technique that is to be used. The STARDUST Mission will carry out a 6-month preliminary examination (PE) of a small portion of the returned samples. The examination of the samples will be made by a number of subteams that will concentrate on specific aspects of the samples. One of these is the Organics PE Team (see the author list above for team members). These team members will use a number of analytical techniques to produce a preliminary characterization of the abundance and nature of the organics (if any) in the returned samples.
OSATE Overview & Community Updates
2015-02-15
[Report documentation residue. Recoverable details: author Julien Delange; topics include main language capabilities, modeling patterns and model samples for beginners, Error-Model examples, EMV2 model constructs, and demonstration of tools and case material.]
Luo, Yong; Wu, Dapeng; Zeng, Shaojiang; Gai, Hongwei; Long, Zhicheng; Shen, Zheng; Dai, Zhongpeng; Qin, Jianhua; Lin, Bingcheng
2006-09-01
A novel sample injection method for chip CE was presented. This injection method uses hydrostatic pressure, generated by emptying the sample waste reservoir, for sample loading and electrokinetic force for dispensing. The injection was performed on a double-cross microchip. One cross, created by the sample and separation channels, is used for formation of a sample plug. Another cross, formed by the sample and controlling channels, is used for plug control. By varying the electric field in the controlling channel, the sample plug volume can be linearly adjusted. Hydrostatic pressure takes advantage of its ease of generation on a microfluidic chip, without any electrode or external pressure pump, thus allowing a sample injection with a minimum number of electrodes. The potential of this injection method was demonstrated by a four-separation-channel chip CE system. In this system, parallel sample separation can be achieved with only two electrodes, which is otherwise impossible with conventional injection methods. Hydrostatic pressure maintains the sample composition during the sample loading, allowing the injection to be free of injection bias.
Comparison of chain sampling plans with single and double sampling plans
NASA Technical Reports Server (NTRS)
Stephens, K. S.; Dodge, H. F.
1976-01-01
The efficiency of chain sampling is examined through matching of operating characteristics (OC) curves of chain sampling plans (ChSP) with single and double sampling plans. In particular, the operating characteristics of some ChSP-0, 3 and 1, 3 as well as ChSP-0, 4 and 1, 4 are presented, where the number pairs represent the first and the second cumulative acceptance numbers. The fact that the ChSP procedure uses cumulative results from two or more samples and that the parameters can be varied to produce a wide variety of operating characteristics raises the question whether it may be possible for such plans to provide a given protection with less inspection than with single or double sampling plans. The operating ratio values reported illustrate the possibilities of matching single and double sampling plans with ChSP. It is shown that chain sampling plans provide improved efficiency over single and double sampling plans having substantially the same operating characteristics.
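For reference, Dodge's basic ChSP-1 chain sampling plan (accept on 0 defectives in the current sample of n, or on 1 defective provided the previous i samples had none) has the closed-form OC function Pa = P0 + P1·P0^i. The sketch below compares it with single sampling plans; the two-stage cumulative ChSP-0,3/1,3-type plans discussed in the abstract are not reproduced here.

```python
# OC-curve comparison: Dodge's ChSP-1 chain plan vs. single sampling plans.
from math import comb

def binom_pmf(d, n, p):
    return comb(n, d) * p**d * (1 - p)**(n - d)

def pa_chsp1(p, n=20, i=3):
    # accept: 0 defectives now, or 1 defective and i preceding clean samples
    P0, P1 = binom_pmf(0, n, p), binom_pmf(1, n, p)
    return P0 + P1 * P0**i

def pa_single(p, n=20, c=0):
    # accept: at most c defectives in a single sample of n
    return sum(binom_pmf(d, n, p) for d in range(c + 1))

for p in (0.005, 0.01, 0.02, 0.05, 0.10):
    print(f"p={p:.3f}  ChSP-1: {pa_chsp1(p):.3f}  "
          f"single c=0: {pa_single(p):.3f}  single c=1: {pa_single(p, c=1):.3f}")
```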
Hot Deformation of Ti-6Al-4V Single-Colony Samples (Preprint)
2008-02-01
[Report documentation residue. Recoverable details: authors A.A. Salem (Universal Technology Corp.) and S.L. Semiatin (AFRL/RXLMP); abstract fragment: "...strength, corrosion resistance, and low density, Ti-6Al-4V is the most commonly used alpha/beta titanium alloy. It accounts for approximately 80..."]
Reinforced Concrete Beams under Combined Axial and Lateral Loading.
1982-01-01
[Report documentation residue. Recoverable details: author Golden E. Lane, Jr.; contract F29601-76-C-015; abstract fragment: "...acquisition system. The voltage output from the system's digital multimeter was recorded on a floppy disk. The sampling rate was approximately two samples per second for every channel. The same system was used to reduce and plot the data. TEST APPARATUS: Figure 9 shows a schematic drawing of the load..."]
Fenlon, D. R.
1981-01-01
Of 1241 samples of seagull faeces examined, 12.9% were found to contain salmonellae. The proportion of positive samples was significantly higher (17-21%) near sewage outfalls. Twenty-seven serotypes were isolated, including a new serotype named Salmonella grampian. The range and frequency of serotypes carried by gulls was similar to those in the human population, suggesting sewage as a possible source of gull infection. The number of salmonellae found in positive samples was low (0.18-191 g-1 faeces). This was similar to the numbers found in sewage, 10-80 l-1, suggesting gulls may only carry infected material without becoming infected themselves. Antibiotic resistance among the isolates was low, with only 21 showing resistance to the antibiotics tested, although most of these resistances were carried on resistance transfer plasmids. PMID:7462604
Analysis of the research sample collections of Uppsala biobank.
Engelmark, Malin T; Beskow, Anna H
2014-10-01
Uppsala Biobank is the joint and only biobank organization of its two principals, Uppsala University and Uppsala University Hospital. Biobanks are required to keep updated registries on sample collection composition and management in order to fulfill legal regulations. We report here the results from the first comprehensive, overall analysis of the 131 research sample collections organized in the biobank. The results show that the median number of samples in the collections was 700 and that the number of samples varied from less than 500 to over one million. Blood samples, such as whole blood, serum, and plasma, were included in the vast majority, 84.0%, of the research sample collections. Also, as much as 95.5% of the newly collected samples within healthcare included blood samples, which further supports the concept that blood samples have fundamental importance for medical research. Tissue samples were also commonly used and occurred in 39.7% of the research sample collections, often combined with other types of samples. In total, 96.9% of the 131 sample collections included samples collected for healthcare, showing the importance of healthcare as a research infrastructure. Of the collections that had accessed existing samples from healthcare, as much as 96.3% included tissue samples from the Department of Pathology, which shows the importance of pathology samples as a resource for medical research. Analysis of different research areas shows that the most common of known public health diseases are covered. The collections that had generated the most publications, up to over 300, contained a large number of samples collected systematically and repeatedly over many years. More knowledge about existing biobank materials, together with public registries on sample collections, will support research collaborations, improve transparency, and bring us closer to the goals of biobanks: to save and prolong human lives and to improve health and quality of life.
Influence of wave-front sampling in adaptive optics retinal imaging
Laslandes, Marie; Salas, Matthias; Hitzenberger, Christoph K.; Pircher, Michael
2017-01-01
A wide range of wave-front sampling densities, relative to the number of corrector elements, has been used in retinal adaptive optics (AO) instruments. We developed a model to characterize the link between the number of actuators, the number of wave-front sampling points, and AO correction performance. Based on available data from aberration measurements in the human eye, 1000 wave-fronts were generated for the simulations. The AO correction performance in the presence of these representative aberrations was simulated for different deformable mirror and Shack-Hartmann wave-front sensor combinations. Predictions of the model were experimentally tested through in vivo measurements in 10 eyes, including retinal imaging with an AO scanning laser ophthalmoscope. According to our study, a ratio of wave-front sampling points to actuator elements of 2 is sufficient to achieve high-resolution in vivo images of photoreceptors. PMID:28271004
Variational Approach to Enhanced Sampling and Free Energy Calculations
NASA Astrophysics Data System (ADS)
Valsson, Omar; Parrinello, Michele
2014-08-01
The ability of widely used sampling methods, such as molecular dynamics or Monte Carlo simulations, to explore complex free energy landscapes is severely hampered by the presence of kinetic bottlenecks. A large number of solutions have been proposed to alleviate this problem. Many are based on the introduction of a bias potential which is a function of a small number of collective variables. However, constructing such a bias is not simple. Here we introduce a functional of the bias potential and an associated variational principle. The bias that minimizes the functional relates in a simple way to the free energy surface. This variational principle can be turned into a practical, efficient, and flexible sampling method. A number of numerical examples are presented which include the determination of a three-dimensional free energy surface. We argue that, besides being numerically advantageous, our variational approach provides a convenient and novel standpoint for looking at the sampling problem.
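For reference, the functional can be written as follows (a sketch from the published method; s denotes the collective variables, F(s) the free energy, β the inverse temperature, and p(s) a chosen target distribution; the minimizer is defined up to an additive constant):

```latex
\Omega[V] \;=\; \frac{1}{\beta}\,
  \ln \frac{\int ds\; e^{-\beta\,[F(s)+V(s)]}}{\int ds\; e^{-\beta F(s)}}
  \;+\; \int ds\; p(s)\,V(s),
\qquad
V_{\min}(s) \;=\; -F(s) \;-\; \frac{1}{\beta}\,\ln p(s).
```

At the minimum, the biased distribution of the collective variables equals the target p(s), which is what makes the minimizing bias directly informative about the free energy surface.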
Characterization of dendritic cells in lip and oral cavity squamous cell carcinoma.
Costa, Nádia Lago; Gonçalves, Andréia Souza; Martins, Allisson Filipe Lopes; Arantes, Diego Antônio Costa; Silva, Tarcília Aparecida; Batista, Aline Carvalho
2016-07-01
There may be differences in the antitumor immunity induced by dendritic cells (DCs) during the development of squamous cell carcinoma (SCC) located in the lip rather than in the oral cavity. The aim of this study was to evaluate the number of immature and mature DCs in SCC and potentially malignant disorders of the oral cavity and lip. Immunohistochemistry was used to identify the number (cells/mm(2)) of immature (CD1a(+)) or mature (CD83(+)) DCs in samples of oral cavity SCC (OCSCC) (n = 39), lip SCC (LSCC) (n = 23), leukoplakia (LK) (n = 21), actinic cheilitis (AC) (n = 13), and normal mucosa of the oral cavity (OC control, n = 12) and the lip (lip control, n = 11). The number of CD1a(+) cells tended to be higher in the OC control samples compared with the LK (P = 0.04) and OCSCC (P = 0.21). In contrast, this cell population was lower in the lip control than in AC or LSCC (P < 0.05). The number of CD83(+) cells was increased in the LSCC samples compared with the AC and lip control (P = 0.0001) and in OCSCC compared with both the LK (P = 0.001) and OC control (P = 0.0001) samples. LSCC showed an elevated number of CD1a(+) and CD83(+) cells compared with OCSCC (P = 0.03). The population of mature DCs was lower than the population of immature DCs in all of the tested groups (P < 0.05). There were greater numbers of both mature and immature DCs in the LSCC samples than in the OCSCC samples, which could contribute to establishing a more effective antitumor immune response for this neoplasm. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Van Damme, Inge; Mattheus, Wesley; Bertrand, Sophie; De Zutter, Lieven
2018-05-01
The tonsils, oral cavity and faeces of 94 pigs at slaughter were sampled to assess the numbers of total aerobic bacteria, Enterobacteriaceae and Escherichia coli in the rectal content, tonsils and oral cavity of pigs at the time of evisceration. Moreover, the prevalence, numbers and types of Salmonella spp. were determined. Mean numbers of Enterobacteriaceae in tonsils and the oral cavity differed between slaughterhouses. The proportion of Enterobacteriaceae relative to total aerobic bacteria differed between the different tissues, though large variations were observed between animals. Salmonella spp. were mostly detected in oral cavity swabs (n = 51, 54%), of which six samples were contaminated at numbers over 2.0 log CFU/100 cm2. Salmonella spp. were also recovered from 17 tonsillar tissue samples (18%) and 12 tonsillar swabs (13%). Of the 29 rectal content samples from which Salmonella was recovered (31%), most showed low contamination, in the range between -1 and 0 log CFU/g. The predominant serotypes were S. Typhimurium and its monophasic variant, which were recovered from 33 and 13 pigs, respectively. In most cases, the same serotypes and MLVA profiles were found in pigs slaughtered on the same day, suggesting a common source of contamination. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dilution effects on ultrafine particle emissions from Euro 5 and Euro 6 diesel and gasoline vehicles
NASA Astrophysics Data System (ADS)
Louis, Cédric; Liu, Yao; Martinet, Simon; D'Anna, Barbara; Valiente, Alvaro Martinez; Boreave, Antoinette; R'Mili, Badr; Tassel, Patrick; Perret, Pascal; André, Michel
2017-11-01
Dilution and temperature used during sampling of vehicle exhaust can modify particle number concentration and size distribution. Two experiments were performed on a chassis dynamometer to assess the effects of exhaust dilution and temperature on particle number and particle size distribution for Euro 5 and Euro 6 vehicles. In the first experiment, the effects of dilution (ratios from 8 to 4,000) and temperature (ranging from 50 °C to 150 °C) on particle quantification were investigated directly from the tailpipe for a diesel and a gasoline Euro 5 vehicle. In the second experiment, particle emissions from Euro 6 diesel and gasoline vehicles sampled directly from the tailpipe were compared with constant volume sampling (CVS) measurements under similar sampling conditions. Low primary dilutions (3-5) increased particle number concentration by a factor of 2 compared with high primary dilutions (12-20). Low dilution temperatures (50 °C) yielded 1.4-3 times higher particle number concentrations than high dilution temperatures (150 °C). For the Euro 6 gasoline vehicle with direct injection, CVS particle number concentrations were higher than those at the tailpipe by factors of 6, 80 and 22 for the Artemis urban, road and motorway cycles, respectively. For the same vehicle, the particle size distribution measured at the tailpipe was centred on 10 nm, and the particles were smaller than those measured after the CVS, whose distribution was centred between 50 nm and 70 nm. The high particle concentration (≈10^6 #/cm³) and the growth in diameter measured in the CVS highlight aerosol transformations, such as nucleation, condensation and coagulation, occurring in the sampling system, which might have biased the particle measurements.
Screen Space Ambient Occlusion Based Multiple Importance Sampling for Real-Time Rendering
NASA Astrophysics Data System (ADS)
Zerari, Abd El Mouméne; Babahenini, Mohamed Chaouki
2018-03-01
We propose a new approximation technique for accelerating the Global Illumination algorithm for real-time rendering. The proposed approach is based on the Screen-Space Ambient Occlusion (SSAO) method, which approximates the global illumination for large, fully dynamic scenes at interactive frame rates. Current algorithms that are based on the SSAO method suffer from difficulties due to the large number of samples that are required. In this paper, we propose an improvement to the SSAO technique by integrating it with a Multiple Importance Sampling technique that combines a stratified sampling method with an importance sampling method, with the objective of reducing the number of samples. Experimental evaluation demonstrates that our technique can produce high-quality images in real time and is significantly faster than traditional techniques.
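The balance-heuristic weighting that underlies multiple importance sampling can be stated in a few lines. The sketch below is a generic 1-D numerical illustration of the technique, not the authors' renderer; the integrand and the two sampling strategies (a stratified-uniform sampler and an importance sampler) are assumed stand-ins.

```python
# Multiple importance sampling with the balance heuristic:
# w_i(x) = n_i p_i(x) / sum_j n_j p_j(x), estimating the integral of f on [0,1].
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(np.pi * x) ** 4            # integrand (stand-in for occlusion)

# strategy 1: stratified-uniform samples, pdf p1(x) = 1 on [0, 1]
n1 = 16
x1 = (np.arange(n1) + rng.random(n1)) / n1
p1 = lambda x: np.ones_like(x)

# strategy 2: importance sampling with pdf p2(x) = (pi/2) sin(pi x)
n2 = 16
x2 = np.arccos(1 - 2 * rng.random(n2)) / np.pi  # inverse-CDF sampling of p2
p2 = lambda x: (np.pi / 2) * np.sin(np.pi * x)

def balance_weight(x, n_self, p_self, n_other, p_other):
    return n_self * p_self(x) / (n_self * p_self(x) + n_other * p_other(x))

est = (np.sum(balance_weight(x1, n1, p1, n2, p2) * f(x1) / p1(x1)) / n1 +
       np.sum(balance_weight(x2, n2, p2, n1, p1) * f(x2) / p2(x2)) / n2)
print(f"MIS estimate: {est:.4f}  (exact: {3/8:.4f})")
```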
Agodi, A; Auxilia, F; Barchitta, M; Cristina, M L; D'Alessandro, D; Mura, I; Nobile, M; Pasquarella, C
2015-07-01
Recent studies have shown a higher rate of surgical site infections in hip prosthesis implantation using unidirectional airflow ventilation compared with turbulent ventilation. However, these studies did not measure the air microbial quality of operating theatres (OTs), and assumed it to be compliant with the recommended standards for this ventilation technique. This study aimed to evaluate airborne microbial contamination in OTs during hip and knee replacement surgery and to compare the findings with values recommended for joint replacement surgery. Air sampling was performed in 28 OTs supplied with unidirectional, turbulent and mixed airflow ventilation. Samples were collected using passive sampling to determine the index of microbial air contamination (IMA). Active sampling was also performed in some of the OTs. The average number of people in the OT and the number of door openings during the sampling period were recorded. In total, 1228 elective prosthesis procedures (60.1% hip and 39.9% knee) were included in this study. Of the passive samplings performed during surgical activity in unidirectional airflow ventilation OTs (U-OTs) and mixed airflow OTs (M-OTs), 58.9% and 87.6% had IMA values >2, respectively. Of the samplings performed during surgical activity in turbulent airflow OTs (T-OTs) and in turbulent airflow OTs with the surgical team wearing Steri-Shield Turbo Helmets (TH-OTs), 8.6% and 60% had IMA values ≤2, respectively. Positive correlation was found between IMA values and both the number of people in the OT and the number of door openings (P < 0.001). In addition, correlation was found between active and passive sampling (P < 0.001). These findings challenge the belief that unidirectional systems always provide acceptable airborne bacterial counts. Copyright © 2015 The Healthcare Infection Society. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin
When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: central 95% of response, and 10^-4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depends on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large database and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.
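The flavor of such a trial-based characterization can be shown with a toy harness. The sketch below is not the report's methods: the bounding rule, inflation factor, and source distribution are all assumptions chosen for illustration; the point is the structure of scoring a conservative estimator's reliability over many sparse-sample trials.

```python
# Toy harness: how often does a simple conservative rule actually bound the
# true 97.5th percentile when only n samples are available?
import numpy as np

rng = np.random.default_rng(4)
true_q = np.quantile(rng.lognormal(0, 1, 2_000_000), 0.975)  # reference value

def bound(sample, inflation=2.0):
    # assumed rule: sample mean + inflated z * sample std (conservatism knob)
    return sample.mean() + inflation * 1.96 * sample.std(ddof=1)

for n in (5, 10, 30):
    hits = sum(bound(rng.lognormal(0, 1, n)) >= true_q for _ in range(10_000))
    print(f"n={n:>2}: bound covers the true 97.5th percentile "
          f"in {hits / 10_000:.1%} of trials")
```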
Accounting for Incomplete Species Detection in Fish Community Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
McManamay, Ryan A; Orth, Dr. Donald J; Jager, Yetta
2013-01-01
Riverine fish assemblages are heterogeneous and very difficult to characterize with a one-size-fits-all approach to sampling. Furthermore, detecting changes in fish assemblages over time requires accounting for variation in sampling designs. We present a modeling approach that permits heterogeneous sampling by accounting for site and sampling covariates (including method) in a model-based framework for estimation (versus a sampling-based framework). We snorkeled during three surveys and electrofished during a single survey in a suite of delineated habitats stratified by reach type. We developed single-species occupancy models to determine covariates influencing patch occupancy and species detection probabilities, whereas community occupancy models estimated species richness in light of incomplete detections. For most species, information-theoretic criteria showed higher support for models that included patch size and reach as covariates of occupancy. In addition, models including patch size and sampling method as covariates of detection probabilities also had higher support. Detection probability estimates for snorkeling surveys were higher for larger non-benthic species, whereas electrofishing was more effective at detecting smaller benthic species. The number of sites and sampling occasions required to accurately estimate occupancy varied among fish species. For rare benthic species, our results suggested that a higher number of occasions, and especially the addition of electrofishing, may be required to improve detection probabilities and obtain accurate occupancy estimates. Community models suggested that richness was 41% higher than the number of species actually observed, and the addition of an electrofishing survey increased estimated richness by 13%. These results can inform future fish assemblage monitoring efforts by guiding sampling designs, such as site selection (e.g. stratifying based on patch size), and by determining the effort required (e.g. number of sites versus occasions).
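The single-species models referred to here build on the standard single-season occupancy likelihood, in which a site is occupied with probability ψ and, if occupied, the species is detected on each of K occasions with probability p. A covariate-free sketch of that core likelihood follows (the simulation parameters are made up, not the study's data; the study's models add site and sampling covariates such as patch size and method).

```python
# Maximum-likelihood fit of the basic single-season occupancy model.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
psi_true, p_true, n_sites, K = 0.6, 0.4, 200, 3       # K survey occasions
z = rng.random(n_sites) < psi_true                    # latent occupancy state
y = (rng.random((n_sites, K)) < p_true) & z[:, None]  # detection histories

def negloglik(theta):
    psi, p = 1 / (1 + np.exp(-theta))                 # logit -> probability
    d = y.sum(axis=1)                                 # detections per site
    site_lik = psi * p**d * (1 - p)**(K - d)          # occupied-site term
    site_lik = np.where(d == 0, site_lik + (1 - psi), site_lik)  # never detected
    return -np.log(site_lik).sum()

fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
print(f"psi_hat={psi_hat:.2f} (true {psi_true}), p_hat={p_hat:.2f} (true {p_true})")
```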
A novel method for sex determination by detecting the number of X chromosomes.
Nakanishi, Hiroaki; Shojo, Hideki; Ohmori, Takeshi; Hara, Masaaki; Takada, Aya; Adachi, Noboru; Saito, Kazuyuki
2015-01-01
A novel method for sex determination, based on detecting the number of X chromosomes, was established. Current methods, based on the detection of the Y chromosome, can directly identify an unknown sample as male, but female gender is determined indirectly, by not detecting the Y chromosome. A direct determination of female gender is important because the quality of the Y-chromosome DNA (e.g., fragmentation or an amelogenin-Y null allele) may lead to a false result. We therefore developed a novel sex determination method that analyzes the number of X chromosomes using a copy number variation (CNV) detection technique (the comparative Ct method). In this study, we designed a primer set targeting the amelogenin-X gene outside the CNV region to determine the X chromosome copy number, thereby excluding the influence of the CNV region on the comparative Ct value. The number of X chromosomes was determined statistically using the CopyCaller software with real-time PCR. All DNA samples from participants (20 males, 20 females) were evaluated correctly using this method with 1 ng of template DNA. A minimum of 0.2 ng of template DNA was found to be necessary for accurate sex determination with this method. When using ultraviolet-irradiated template DNA as mock forensic samples, the sex of the samples could not be determined by short tandem repeat (STR) analysis but was correctly determined using our method. Thus, we successfully developed a method of sex determination based on the number of X chromosomes. Our novel method will be useful in forensic practice for sex determination.
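The comparative Ct calculation that such an assay builds on reduces to the familiar 2^-ΔΔCt formula. A minimal sketch follows, assuming a male calibrator with one X copy and an autosomal reference target; the Ct values are invented for illustration.

```python
# Comparative Ct (2^-ddCt) estimate of X-chromosome copy number relative to a
# male calibrator (1 X copy), using an autosomal reference assay.
def x_copy_number(ct_x, ct_ref, ct_x_cal, ct_ref_cal, calibrator_copies=1):
    d_ct_sample = ct_x - ct_ref              # sample: X target minus reference
    d_ct_calibrator = ct_x_cal - ct_ref_cal  # calibrator: same difference
    dd_ct = d_ct_sample - d_ct_calibrator
    return calibrator_copies * 2 ** (-dd_ct)

# hypothetical Ct values: a female sample against a male calibrator (~2 copies)
print(f"estimated X copies: {x_copy_number(24.1, 25.0, 25.2, 25.1):.2f}")
```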
40 CFR 761.283 - Determination of the number of samples to collect and sample collection locations.
Code of Federal Regulations, 2011 CFR
2011-07-01
... sampling points after the recleaning, but select three new pairs of sampling coordinates. (i) Beginning in the southwest corner (lower left when facing magnetic north) of the area to be sampled, measure in... new pair of sampling coordinates. Continue to select pairs of sampling coordinates until three are...
Impact on enzyme activity as a new quality index of wastewater.
Balestri, Francesco; Moschini, Roberta; Cappiello, Mario; Del-Corso, Antonella; Mura, Umberto
2013-03-15
The aim of this study was to define a new indicator for the quality of wastewaters that are released into the environment. A quality index is proposed for wastewater samples in terms of the inertness of wastewater samples toward enzyme activity. This involves taking advantage of the sensitivity of enzymes to pollutants that may be present in the waste samples. The effect of wastewater samples on the rate of a number of different enzyme-catalyzed reactions was measured, and the results for all the selected enzymes were analyzed in an integrated fashion (multi-enzymatic sensor). This approach enabled us to define an overall quality index, the "Impact on Enzyme Function" (IEF-index), which is composed of three indicators: i) the Synoptic parameter, related to the average effect of the waste sample on each component of the enzymatic sensor; ii) the Peak parameter, related to the maximum effect observed among all the effects exerted by the sample on the sensor components; and, iii) the Interference parameter, related to the number of sensor components that are affected less than a fixed threshold value. A number of water based samples including public potable tap water, fluids from urban sewage systems, wastewater disposal from leather, paper and dye industries were analyzed and the IEF-index was then determined. Although the IEF-index cannot discriminate between different types of wastewater samples, it could be a useful parameter in monitoring the improvement of the quality of a specific sample. However, by analyzing an adequate number of waste samples of the same type, even from different local contexts, the profile of the impact of each component of the multi-enzymatic sensor could be typical for specific types of waste. The IEF-index is proposed as a supplementary qualification score for wastewaters, in addition to the certification of the waste's conformity to legal requirements. Copyright © 2013 Elsevier Ltd. All rights reserved.
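The abstract does not give closed formulas for the three indicators, so the following sketch is a guessed formalization only: the per-enzyme effect is taken as the fractional change in reaction rate versus a control assay, and the threshold is an assumed cutoff.

```python
# Guessed sketch of the three IEF-index indicators for a multi-enzymatic sensor.
import numpy as np

def ief_indicators(rates_sample, rates_control, threshold=0.10):
    effect = np.abs(np.asarray(rates_sample) / np.asarray(rates_control) - 1.0)
    synoptic = effect.mean()                   # average effect across enzymes
    peak = effect.max()                        # strongest single-enzyme effect
    interference = int(np.sum(effect < threshold))  # enzymes essentially unaffected
    return synoptic, peak, interference

# hypothetical reaction rates for a 5-enzyme sensor (sample vs. control)
s, p, i = ief_indicators([0.8, 1.0, 0.35, 0.95, 0.9], [1.0] * 5)
print(f"Synoptic={s:.2f}  Peak={p:.2f}  Interference={i}/5 below threshold")
```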
Noorani, Ayesha; Lynch, Andy G.; Achilleos, Achilleas; Eldridge, Matthew; Bower, Lawrence; Weaver, Jamie M.J.; Crawte, Jason; Ong, Chin-Ann; Shannon, Nicholas; MacRae, Shona; Grehan, Nicola; Nutzinger, Barbara; O'Donovan, Maria; Hardwick, Richard; Tavaré, Simon; Fitzgerald, Rebecca C.
2017-01-01
The scientific community has avoided using tissue samples from patients that have been exposed to systemic chemotherapy to infer the genomic landscape of a given cancer. Esophageal adenocarcinoma is a heterogeneous, chemoresistant tumor for which the availability and size of pretreatment endoscopic samples are limiting. This study compares whole-genome sequencing data obtained from chemo-naive and chemo-treated samples. The quality of whole-genomic sequencing data is comparable across all samples regardless of chemotherapy status. Inclusion of samples collected post-chemotherapy increased the proportion of late-stage tumors. When comparing matched pre- and post-chemotherapy samples from 10 cases, the mutational signatures, copy number, and SNV mutational profiles reflect the expected heterogeneity in this disease. Analysis of SNVs in relation to allele-specific copy-number changes pinpoints the common ancestor to a point prior to chemotherapy. For cases in which pre- and post-chemotherapy samples do show substantial differences, the timing of the divergence is near-synchronous with endoreduplication. Comparison across a large prospective cohort (62 treatment-naive, 58 chemotherapy-treated samples) reveals no significant differences in the overall mutation rate, mutation signatures, specific recurrent point mutations, or copy-number events in respect to chemotherapy status. In conclusion, whole-genome sequencing of samples obtained following neoadjuvant chemotherapy is representative of the genomic landscape of esophageal adenocarcinoma. Excluding these samples reduces the material available for cataloging and introduces a bias toward the earlier stages of cancer. PMID:28465312
Ruiz-Toledo, Jovani; Vandame, Rémy; Castro-Chan, Ricardo Alberto; Penilla-Navarro, Rosa Patricia; Gómez, Jaime; Sánchez, Daniel
2018-05-10
In this paper, we show the results of investigating the presence of organochlorine pesticides in honey and pollen samples from managed colonies of the honey bee, Apis mellifera L., and of the stingless bee Scaptotrigona mexicana Guérin. Three colonies of each species were moved into each of two sites. Three samples of pollen and three samples of honey were collected from each colony: the first collection occurred at the beginning of the study and the following ones every six months for a year. Thus, the total number of samples collected was 36 for honey (18 for A. mellifera and 18 for S. mexicana) and 36 for pollen (18 for A. mellifera and 18 for S. mexicana). We found that 88.44% and 93.33% of honey samples, and 22.22% and 100% of pollen samples, of S. mexicana and A. mellifera, respectively, tested positive for at least one organochlorine. The most abundant pesticides were Heptachlor (44% of the samples), γ-HCH (36%), DDT (19%), Endrin (18%) and DDE (11%). Despite the short foraging range of S. mexicana, the number of pesticides quantified in its honey samples was similar to that of A. mellifera. Paradoxically, we found a smaller number of organochlorines in pollen samples of S. mexicana than in those of A. mellifera, perhaps indicating a low abundance of pollen sources within the foraging range of this species.
7 CFR 75.48 - Identification number.
Code of Federal Regulations, 2014 CFR
2014-01-01
The Director may require the use of official identification numbers in connection with seed certificated or sampled under the Act. When identification numbers are required, they shall be specified by the...
7 CFR 75.48 - Identification number.
Code of Federal Regulations, 2013 CFR
2013-01-01
The Director may require the use of official identification numbers in connection with seed certificated or sampled under the Act. When identification numbers are required, they shall be specified by the...
7 CFR 75.48 - Identification number.
Code of Federal Regulations, 2010 CFR
2010-01-01
The Director may require the use of official identification numbers in connection with seed certificated or sampled under the Act. When identification numbers are required, they shall be specified by the...
7 CFR 75.48 - Identification number.
Code of Federal Regulations, 2012 CFR
2012-01-01
The Director may require the use of official identification numbers in connection with seed certificated or sampled under the Act. When identification numbers are required, they shall be specified by the...
7 CFR 75.48 - Identification number.
Code of Federal Regulations, 2011 CFR
2011-01-01
The Director may require the use of official identification numbers in connection with seed certificated or sampled under the Act. When identification numbers are required, they shall be specified by the...
Zoonoses action plan Salmonella monitoring programme: an investigation of the sampling protocol.
Snary, E L; Munday, D K; Arnold, M E; Cook, A J C
2010-03-01
The Zoonoses Action Plan (ZAP) Salmonella Programme was established by the British Pig Executive to monitor Salmonella prevalence in quality-assured British pigs at slaughter by testing a sample of pigs with a meat juice enzyme-linked immunosorbent assay for antibodies against group B and C(1) Salmonella. Farms were assigned a ZAP level (1 to 3) depending on the monitored prevalence, and ZAP 2 or 3 farms were required to act to reduce the prevalence. The ultimate goal was to reduce the risk of human salmonellosis attributable to British pork. A mathematical model has been developed to describe the ZAP sampling protocol. Results show that the probability of assigning a farm the correct ZAP level was high, except for farms that had a seroprevalence close to the cutoff points between different ZAP levels. Sensitivity analyses identified that the probability of assigning a farm to the correct ZAP level was dependent on the sensitivity and specificity of the test, the number of batches taken to slaughter each quarter, and the number of samples taken per batch. The variability of the predicted seroprevalence was reduced as the number of batches or samples increased and, away from the cutoff points, the probability of being assigned the correct ZAP level increased as the number of batches or samples increased. In summary, the model described here provided invaluable insight into the ZAP sampling protocol. Further work is required to understand the impact of the program for Salmonella infection in British pig farms and therefore on human health.
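The core of such a model can be sketched as a binomial calculation: with an imperfect test, the apparent prevalence is se·p + (1 − sp)·(1 − p), and the ZAP level follows from where the observed positive fraction falls relative to the cutoffs. The version below is illustrative only; the cutoffs, test sensitivity/specificity, and sample number are assumptions, not the programme's actual values.

```python
# Probability that a farm with true seroprevalence `prev` is assigned each
# (assumed) ZAP level, given n samples and imperfect test characteristics.
from math import comb

def level_probabilities(prev, n=50, se=0.85, sp=0.95, cutoffs=(0.50, 0.75)):
    p_pos = se * prev + (1 - sp) * (1 - prev)       # apparent (test) prevalence
    pmf = [comb(n, k) * p_pos**k * (1 - p_pos)**(n - k) for k in range(n + 1)]
    levels = [0.0, 0.0, 0.0]
    for k, prob in enumerate(pmf):
        frac = k / n
        levels[0 if frac < cutoffs[0] else 1 if frac < cutoffs[1] else 2] += prob
    return levels  # P(ZAP 1), P(ZAP 2), P(ZAP 3)

for prev in (0.2, 0.45, 0.55, 0.8):
    p1, p2, p3 = level_probabilities(prev)
    print(f"true prev {prev:.2f}: ZAP1 {p1:.2f}, ZAP2 {p2:.2f}, ZAP3 {p3:.2f}")
```

Consistent with the abstract, a calculation of this shape shows misassignment concentrating near the cutoffs and shrinking as the number of samples grows.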
The Metals in Replicate Samples data set contains the analytical results of measurements of up to 2 metals in 172 replicate (duplicate) samples from 86 households. Measurements were made in samples of blood. Duplicate samples for a small percentage of the total number of sample...
Counting glomeruli and podocytes: rationale and methodologies
Puelles, Victor G.; Bertram, John F.
2015-01-01
Purpose of review There is currently much interest in the numbers of both glomeruli and podocytes. This interest stems from greater understanding of the effects of suboptimal fetal events on nephron endowment, the associations between low nephron number and chronic cardiovascular and kidney disease in adults, and the emergence of the podocyte depletion hypothesis. Recent findings Obtaining accurate and precise estimates of glomerular and podocyte number has proven surprisingly difficult. When whole kidneys or large tissue samples are available, design-based stereological methods are considered gold-standard because they are based on principles that negate systematic bias. However, these methods are often tedious and time-consuming, and oftentimes inapplicable when dealing with small samples such as biopsies. Therefore, novel methods suitable for small tissue samples, and innovative approaches to facilitate high-throughput measurements, such as magnetic resonance imaging (MRI) to estimate glomerular number and flow cytometry to estimate podocyte number, have recently been described. Summary This review describes current gold-standard methods for estimating glomerular and podocyte number, as well as methods developed in the past 3 years. We are now better placed than ever before to accurately and precisely estimate glomerular and podocyte number, and to examine relationships between these measurements and kidney health and disease. PMID:25887899
45 CFR 98.102 - Content of Error Rate Reports.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Funds and State Matching and Maintenance-of-Effort (MOE Funds): (1) Percentage of cases with an error... cases in the sample with an error compared to the total number of cases in the sample; (2) Percentage of cases with an improper payment (both over and under payments), expressed as the total number of cases in...
ERIC Educational Resources Information Center
Kim, Soyoung; Olejnik, Stephen
2005-01-01
The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…
Improved argument-FFT frequency offset estimation for QPSK coherent optical Systems
NASA Astrophysics Data System (ADS)
Han, Jilong; Li, Wei; Yuan, Zhilin; Li, Haitao; Huang, Liyan; Hu, Qianggao
2016-02-01
A frequency offset estimation (FOE) algorithm based on the fast Fourier transform (FFT) of the signal's argument is investigated, which does not require removing the modulated data phase. In this paper, we analyze the flaw of the argument-FFT algorithm and propose a combined FOE algorithm, in which the absolute value of the frequency offset (FO) is accurately calculated by the argument-FFT algorithm with a relatively large number of samples and the sign of the FO is determined by an FFT-based interpolated discrete Fourier transform (DFT) algorithm with a relatively small number of samples. Compared with previous algorithms based on the argument-FFT, the proposed algorithm has low complexity and can still work effectively with relatively few samples.
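For readers who want to experiment with FFT-based FOE, here is a minimal numpy sketch of the classic fourth-power variant for QPSK rather than the paper's argument-FFT method: raising the signal to the fourth power strips the data modulation and leaves a spectral line at four times the offset. The sampling rate, offset, and block length below are illustrative, not values from the paper.

```python
import numpy as np

def estimate_fo_qpsk(samples, fs):
    """Frequency-offset estimate for QPSK via the fourth-power FFT.

    samples ** 4 removes the QPSK phase modulation, leaving a tone at
    4x the frequency offset, located by an FFT peak search. The
    estimation range is limited to +/- fs / 8.
    """
    n = len(samples)
    spectrum = np.fft.fft(samples ** 4)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    return freqs[np.argmax(np.abs(spectrum))] / 4.0

# Synthetic check: QPSK symbols with a 200 Hz offset at fs = 10 kHz.
rng = np.random.default_rng(0)
fs, fo, n = 10_000.0, 200.0, 4096
symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
rx = symbols * np.exp(2j * np.pi * fo * np.arange(n) / fs)
print(estimate_fo_qpsk(rx, fs))  # close to 200.0
```

As with the argument-FFT method, resolution improves with the number of samples n, since the FFT bin spacing is fs/n.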
Patty, Philipus J; Frisken, Barbara J
2006-04-01
We compare results for the number-weighted mean radius and polydispersity obtained either by directly fitting number distributions to dynamic light-scattering data or by converting results obtained by fitting intensity-weighted distributions. We find that results from fits using number distributions are angle independent and that converting intensity-weighted distributions is not always reliable, especially when the polydispersity of the sample is large. We compare the results of fitting symmetric and asymmetric distributions, as represented by Gaussian and Schulz distributions, respectively, to data for extruded vesicles and find that the Schulz distribution provides a better estimate of the size distribution for these samples.
Stabley, Deborah L; Harris, Ashlee W; Holbrook, Jennifer; Chubbs, Nicholas J; Lozo, Kevin W; Crawford, Thomas O; Swoboda, Kathryn J; Funanage, Vicky L; Wang, Wenlan; Mackenzie, William; Scavina, Mena; Sol-Church, Katia; Butchbach, Matthew E R
2015-07-01
Proximal spinal muscular atrophy (SMA) is an early-onset motor neuron disease characterized by loss of α-motor neurons and associated muscle atrophy. SMA is caused by deletion or other disabling mutation of survival motor neuron 1 (SMN1). In the human genome, a large duplication of the SMN-containing region gives rise to a second copy of this gene (SMN2) that is distinguishable by a single nucleotide change in exon 7. Within the SMA population, there is substantial variation in SMN2 copy number; in general, those individuals with SMA who have a high SMN2 copy number have a milder disease. Because SMN2 functions as a disease modifier, its accurate copy number determination may have clinical relevance. In this study, we describe the development of an assay to assess SMN1 and SMN2 copy numbers in DNA samples using an array-based digital PCR (dPCR) system. This dPCR assay can accurately and reliably measure the number of SMN1 and SMN2 copies in DNA samples. In a cohort of SMA patient-derived cell lines, the assay confirmed a strong inverse correlation between SMN2 copy number and disease severity. Array dPCR is a practical technique to determine, accurately and reliably, SMN1 and SMN2 copy numbers from SMA samples.
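The copy-number arithmetic behind digital PCR is compact enough to sketch. Partitioning follows a Poisson model, so a fraction p of positive partitions implies a mean of lambda = -ln(1 - p) copies per partition; a target-to-reference concentration ratio, doubled for a diploid reference, then gives copies per genome. This is generic dPCR math with invented counts and partition volume, not the authors' assay.

```python
import math

def dpcr_conc(n_positive, n_total, partition_volume_ul=0.000755):
    """Copies per microliter from dPCR counts via Poisson correction."""
    lam = -math.log(1.0 - n_positive / n_total)  # mean copies/partition
    return lam / partition_volume_ul

# Hypothetical SMN2-like target against a diploid (2-copy) reference:
target = dpcr_conc(9000, 20000)
reference = dpcr_conc(6500, 20000)
print(round(2.0 * target / reference))  # estimated copies per genome
```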
Lowe, Terrence (Peter); Tebbs, Kerry; Sparling, Donald W.
2016-01-01
Three types of macroinvertebrate collecting devices, Gerking box traps, D-shaped sweep nets, and activity traps, have commonly been used to sample macroinvertebrates when conducting rapid biological assessments of North American wetlands. We compared collections of macroinvertebrates identified to the family level made with these devices in 6 constructed and 2 natural wetlands on the Delmarva Peninsula of Maryland. We also assessed their potential efficacy in comparisons among wetlands using several proportional and richness attributes. Differences in median diversity among samples from the 3 devices were significant; the sweep-net samples had the greatest diversity and the activity-trap samples had the least diversity. Differences in median abundance were not significant between the Gerking box-trap samples and sweep-net samples, but median abundance among activity-trap samples was significantly lower than among samples of the other 2 devices. Within samples, the proportions of median diversity composed of major class and order groupings were similar among the 3 devices. However, the proportions of median abundance composed of the major class and order groupings within activity-trap samples were not similar to those of the other 2 devices. There was a slight but significant increase in the total number of families captured when we combined activity-trap samples with Gerking box-trap samples or with sweep-net samples, and the per-sample median number of families in the combined activity-trap and sweep-net samples was significantly higher than that of the combined activity-trap and Gerking box-trap samples. We detected significant differences among wetlands for 4 macroinvertebrate attributes with the Gerking box-trap data, 6 attributes with sweep-net data, and 5 attributes with the activity-trap data. A small but significant increase in the number of attributes showing differences among wetlands occurred when we combined activity-trap samples with those of the Gerking box trap or sweep net.
Nichols, J.D.; Boulinier, T.; Hines, J.E.; Pollock, K.H.; Sauer, J.R.
1998-01-01
Inferences about spatial variation in species richness and community composition are important both to ecological hypotheses about the structure and function of communities and to community-level conservation and management. Few sampling programs for animal communities provide complete censuses, and usually some species present go undetected. We present estimators useful for drawing inferences about comparative species richness and composition between different sampling locations when not all species are detected in sampling efforts. Based on capture-recapture models using the robust design, our methods estimate relative species richness, the proportion of species in one location that are also found in another, and the number of species found in one location but not in another. The methods use data on the presence or absence of each species at different sampling occasions (or locations) to estimate the number of species not detected at any occasion (or location). This approach permits estimation of the number of species in the sampled community and in subsets of the community, which is useful for estimating the fraction of species shared by two communities. We provide an illustration of our estimation methods by comparing bird species richness and composition in two locations sampled by routes of the North American Breeding Bird Survey. In this example analysis, the two locations (and associated bird communities) represented different levels of urbanization. Estimates of relative richness, proportion of shared species, and number of species present on one route but not the other indicated that the route with the smaller fraction of urban area had greater richness and a larger number of species that were not found on the more urban route than vice versa. We developed a software package, COMDYN, for computing estimates based on the methods. Because these estimation methods explicitly deal with sampling in which not all species are detected, we recommend their use for addressing questions about species richness and community composition.
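The paper's estimators come from capture-recapture models under the robust design, as implemented in COMDYN. As a simpler illustration of the same underlying idea, inferring undetected species from detection frequencies, here is a sketch of the related bias-corrected Chao2 estimator, which needs only the numbers of species detected on exactly one and exactly two occasions; the incidence matrix below is made up.

```python
def chao2(incidence):
    """Bias-corrected Chao2 richness estimate.

    incidence: one 0/1 row per sampling occasion, one column per
    species, with 1 where the species was detected.
    """
    m = len(incidence)                      # sampling occasions
    det = [sum(col) for col in zip(*incidence)]
    s_obs = sum(1 for d in det if d > 0)
    q1 = sum(1 for d in det if d == 1)      # species detected once
    q2 = sum(1 for d in det if d == 2)      # species detected twice
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))

occasions = [[1, 1, 0, 0, 1],
             [1, 0, 1, 0, 0],
             [1, 0, 0, 0, 0]]               # 3 occasions x 5 species
print(chao2(occasions))                     # 6.0: about 2 unseen species
```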
Performance evaluation of DNA copy number segmentation methods.
Pierre-Jean, Morgane; Rigaill, Guillem; Neuvial, Pierre
2015-07-01
A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. To make an objective and reproducible performance assessment, we have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling publicly available SNP microarray data from genomic regions with known copy-number state. The original data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. This article describes this framework and its application to a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. This comparison study may be reproduced using the open source and cross-platform R package jointseg, which implements the proposed data generation and evaluation framework: http://r-forge.r-project.org/R/?group_id=1562. © The Author 2014. Published by Oxford University Press.
Chen, Qixuan; Li, Jingguang
2014-05-01
Many recent studies have examined the association between number acuity, which is the ability to rapidly and non-symbolically estimate the quantity of items appearing in a scene, and symbolic math performance. However, various contradictory results have been reported. To comprehensively evaluate the association between number acuity and symbolic math performance, we conduct a meta-analysis to synthesize the results observed in previous studies. First, a meta-analysis of cross-sectional studies (36 samples, N = 4705) revealed a significant positive correlation between these skills (r = 0.20, 95% CI = [0.14, 0.26]); the association remained after considering other potential moderators (e.g., whether general cognitive abilities were controlled). Moreover, a meta-analysis of longitudinal studies revealed 1) that number acuity may prospectively predict later math performance (r = 0.24, 95% CI = [0.11, 0.37]; 6 samples) and 2) that number acuity is retrospectively correlated to early math performance as well (r = 0.17, 95% CI = [0.07, 0.26]; 5 samples). In summary, these pieces of evidence demonstrate a moderate but statistically significant association between number acuity and math performance. Based on the estimated effect sizes, power analyses were conducted, which suggested that many previous studies were underpowered due to small sample sizes. This may account for the disparity between findings in the literature, at least in part. Finally, the theoretical and practical implications of our meta-analytic findings are presented, and future research questions are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.
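For readers wanting the mechanics behind such pooled correlations, below is a minimal sketch of fixed-effect pooling via Fisher's z transformation, the standard machinery for meta-analyzing correlation coefficients. A random-effects model is more typical in practice, and the study values here are invented, not taken from the meta-analysis.

```python
import math
from statistics import NormalDist

def pool_correlations(studies, alpha=0.05):
    """Fixed-effect pooling of correlations via Fisher's z.

    studies: (r, n) pairs. Each z = atanh(r) is weighted by n - 3;
    the pooled z and its normal CI are back-transformed with tanh.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    w_total = sum(n - 3 for _, n in studies)
    z_bar = sum(math.atanh(r) * (n - 3) for r, n in studies) / w_total
    half = z_crit * math.sqrt(1.0 / w_total)
    return math.tanh(z_bar), (math.tanh(z_bar - half), math.tanh(z_bar + half))

# Invented study results (r, sample size):
r, ci = pool_correlations([(0.25, 120), (0.15, 80), (0.22, 200)])
print(round(r, 3), tuple(round(x, 3) for x in ci))
```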
Electrophysiological responses to feedback during the application of abstract rules.
Walsh, Matthew M; Anderson, John R
2013-11-01
Much research focuses on how people acquire concrete stimulus-response associations from experience; however, few neuroscientific studies have examined how people learn about and select among abstract rules. To address this issue, we recorded ERPs as participants performed an abstract rule-learning task. In each trial, they viewed a sample number and two test numbers. Participants then chose a test number using one of three abstract mathematical rules they freely selected from: greater than the sample number, less than the sample number, or equal to the sample number. No one rule was always rewarded, but some rules were rewarded more frequently than others. To maximize their earnings, participants needed to learn which rules were rewarded most frequently. All participants learned to select the best rules for repeating and novel stimulus sets that obeyed the overall reward probabilities. Participants differed, however, in the extent to which they overgeneralized those rules to repeating stimulus sets that deviated from the overall reward probabilities. The feedback-related negativity (FRN), an ERP component thought to reflect reward prediction error, paralleled behavior. The FRN was sensitive to item-specific reward probabilities in participants who detected the deviant stimulus set, and the FRN was sensitive to overall reward probabilities in participants who did not. These results show that the FRN is sensitive to the utility of abstract rules and that the individual's representation of a task's states and actions shapes behavior as well as the FRN.
Electrophysiological Responses to Feedback during the Application of Abstract Rules
Walsh, Matthew M.; Anderson, John R.
2017-01-01
Much research focuses on how people acquire concrete stimulus–response associations from experience; however, few neuroscientific studies have examined how people learn about and select among abstract rules. To address this issue, we recorded ERPs as participants performed an abstract rule-learning task. In each trial, they viewed a sample number and two test numbers. Participants then chose a test number using one of three abstract mathematical rules they freely selected from: greater than the sample number, less than the sample number, or equal to the sample number. No one rule was always rewarded, but some rules were rewarded more frequently than others. To maximize their earnings, participants needed to learn which rules were rewarded most frequently. All participants learned to select the best rules for repeating and novel stimulus sets that obeyed the overall reward probabilities. Participants differed, however, in the extent to which they overgeneralized those rules to repeating stimulus sets that deviated from the overall reward probabilities. The feedback-related negativity (FRN), an ERP component thought to reflect reward prediction error, paralleled behavior. The FRN was sensitive to item-specific reward probabilities in participants who detected the deviant stimulus set, and the FRN was sensitive to overall reward probabilities in participants who did not. These results show that the FRN is sensitive to the utility of abstract rules and that the individual's representation of a task's states and actions shapes behavior as well as the FRN. PMID:23915052
7 CFR 29.133 - Identification number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... 7 Agriculture 2 2013-01-01 2013-01-01 false Identification number. 29.133 Section 29.133... REGULATIONS TOBACCO INSPECTION Regulations Miscellaneous § 29.133 Identification number. The Director may require the use of official identification numbers in connection with tobacco certificated or sampled...
7 CFR 29.133 - Identification number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 2 2010-01-01 2010-01-01 false Identification number. 29.133 Section 29.133... REGULATIONS TOBACCO INSPECTION Regulations Miscellaneous § 29.133 Identification number. The Director may require the use of official identification numbers in connection with tobacco certificated or sampled...
7 CFR 29.133 - Identification number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 7 Agriculture 2 2012-01-01 2012-01-01 false Identification number. 29.133 Section 29.133... REGULATIONS TOBACCO INSPECTION Regulations Miscellaneous § 29.133 Identification number. The Director may require the use of official identification numbers in connection with tobacco certificated or sampled...
7 CFR 29.133 - Identification number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 7 Agriculture 2 2014-01-01 2014-01-01 false Identification number. 29.133 Section 29.133... REGULATIONS TOBACCO INSPECTION Regulations Miscellaneous § 29.133 Identification number. The Director may require the use of official identification numbers in connection with tobacco certificated or sampled...
7 CFR 29.133 - Identification number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 2 2011-01-01 2011-01-01 false Identification number. 29.133 Section 29.133... REGULATIONS TOBACCO INSPECTION Regulations Miscellaneous § 29.133 Identification number. The Director may require the use of official identification numbers in connection with tobacco certificated or sampled...
49 CFR 1244.4 - Sampling of waybills.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 9 2010-10-01 2010-10-01 false Sampling of waybills. 1244.4 Section 1244.4... PROPERTY-RAILROADS § 1244.4 Sampling of waybills. (a) Subject railroads shall file waybill sample... expected sampling rates for the manual system are as follows: Numbers of carloads on waybill Expected...
49 CFR 1244.4 - Sampling of waybills.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 9 2011-10-01 2011-10-01 false Sampling of waybills. 1244.4 Section 1244.4... PROPERTY-RAILROADS § 1244.4 Sampling of waybills. (a) Subject railroads shall file waybill sample... expected sampling rates for the manual system are as follows: Numbers of carloads on waybill Expected...
Improving EEG-Based Motor Imagery Classification for Real-Time Applications Using the QSA Method.
Batres-Mendoza, Patricia; Ibarra-Manzano, Mario A; Guerra-Hernandez, Erick I; Almanza-Ojeda, Dora L; Montoro-Sanjose, Carlos R; Romero-Troncoso, Rene J; Rostro-Gonzalez, Horacio
2017-01-01
We present an improvement to the quaternion-based signal analysis (QSA) technique to extract electroencephalography (EEG) signal features with a view to developing real-time applications, particularly in motor imagery (MI) cognitive processes. The proposed methodology (iQSA, improved QSA) extracts features such as the average, variance, homogeneity, and contrast of EEG signals related to motor imagery in a more efficient manner (i.e., by reducing the number of samples needed to classify the signal and improving the classification percentage) compared to the original QSA technique. Specifically, we can sample the signal in variable time periods (from 0.5 s to 3 s, in half-a-second intervals) to determine the relationship between the number of samples and their effectiveness in classifying signals. In addition, to strengthen the classification process, a number of boosting-technique-based decision trees were implemented. The results show an 82.30% accuracy rate for 0.5 s samples and 73.16% for 3 s samples. This is a significant improvement compared to the original QSA technique that offered results from 33.31% to 40.82% without a sampling window and from 33.44% to 41.07% with a sampling window, respectively. We can thus conclude that iQSA is better suited to develop real-time applications.
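To show the boosted-tree stage in miniature, the sketch below trains scikit-learn's AdaBoost (boosted decision stumps by default) on crude per-window statistics of synthetic signals. These stand-in features are not the quaternion-derived iQSA features, and the accuracy printed says nothing about real EEG; the sketch only illustrates the classification machinery.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

def window_features(sig):
    """Crude per-window statistics standing in for iQSA features."""
    return [sig.mean(), sig.var(), np.abs(np.diff(sig)).mean()]

# Synthetic two-class windows: class 1 has slightly higher variance.
X = np.array([window_features(rng.normal(0.0, 1.0 + 0.3 * (i % 2), 128))
              for i in range(200)])
y = np.array([i % 2 for i in range(200)])

clf = AdaBoostClassifier(n_estimators=50)   # boosted decision stumps
print(cross_val_score(clf, X, y, cv=5).mean())
```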
Radial q-space sampling for DSI.
Baete, Steven H; Yutzy, Stephen; Boada, Fernando E
2016-09-01
Diffusion spectrum imaging (DSI) has been shown to be an effective tool for noninvasively depicting the anatomical details of brain microstructure. Existing implementations of DSI sample the diffusion encoding space using a rectangular grid. Here we present a different implementation of DSI whereby a radially symmetric q-space sampling scheme for DSI is used to improve the angular resolution and accuracy of the reconstructed orientation distribution functions. Q-space is sampled by acquiring several q-space samples along a number of radial lines. Each of these radial lines in q-space is analytically connected to a value of the orientation distribution functions at the same angular location by the Fourier slice theorem. Computer simulations and in vivo brain results demonstrate that radial diffusion spectrum imaging correctly estimates the orientation distribution functions when moderately high b-values (4,000 s/mm²) and number of q-space samples (236) are used. The nominal angular resolution of radial diffusion spectrum imaging depends on the number of radial lines used in the sampling scheme, and only weakly on the maximum b-value. In addition, the radial analytical reconstruction reduces truncation artifacts which affect Cartesian reconstructions. Hence, a radial acquisition of q-space can be favorable for DSI. Magn Reson Med 76:769-780, 2016. © 2015 Wiley Periodicals, Inc.
Improving EEG-Based Motor Imagery Classification for Real-Time Applications Using the QSA Method
Batres-Mendoza, Patricia; Guerra-Hernandez, Erick I.; Almanza-Ojeda, Dora L.; Montoro-Sanjose, Carlos R.
2017-01-01
We present an improvement to the quaternion-based signal analysis (QSA) technique to extract electroencephalography (EEG) signal features with a view to developing real-time applications, particularly in motor imagery (MI) cognitive processes. The proposed methodology (iQSA, improved QSA) extracts features such as the average, variance, homogeneity, and contrast of EEG signals related to motor imagery in a more efficient manner (i.e., by reducing the number of samples needed to classify the signal and improving the classification percentage) compared to the original QSA technique. Specifically, we can sample the signal in variable time periods (from 0.5 s to 3 s, in half-a-second intervals) to determine the relationship between the number of samples and their effectiveness in classifying signals. In addition, to strengthen the classification process, a number of boosting-technique-based decision trees were implemented. The results show an 82.30% accuracy rate for 0.5 s samples and 73.16% for 3 s samples. This is a significant improvement compared to the original QSA technique that offered results from 33.31% to 40.82% without a sampling window and from 33.44% to 41.07% with a sampling window, respectively. We can thus conclude that iQSA is better suited to develop real-time applications. PMID:29348744
NASA Astrophysics Data System (ADS)
Semenova, T. A.; Golovchenko, A. V.
2017-07-01
The population density and taxonomic structure of micromycetes were monitored for six months in a model experiment with natural and mechanically fragmented (fine and coarse) samples of sphagnum. Sphagnum fragmentation favored an increase in the number of micromycetes only during the first week of the experiment. On average, the number of micromycetes in fine-fragmented samples was two times greater than that in the coarse-fragmented samples. The diversity of micromycetes increased in the fragmented samples of sphagnum owing to the activation of some species, which had remained in an inactive state as spores in the peat before fragmentation.
2010-03-16
[List-of-tables fragment from a report on Air Force Environmental Restoration Program (ERP) sites: entries A-2a and A-4d through A-4e identify IRIS chemicals of interest on the ATSDR CERCLA priority list of hazardous substances and rank them by the number of Air Force ERP soil and groundwater samples in which they were detected.]
View of container of green-colored lunar soil in Lunar Receiving Laboratory
1971-08-13
S71-43052 (August 1971) --- A close-up view of a container full of green-colored lunar soil in the Non-Sterile Nitrogen Processing Line (NNPL) in the Lunar Receiving Laboratory (LRL) at the Manned Spacecraft Center (MSC). This sample, broken down into six separate samples after this photo was made, was made up of comprehensive fines from near Spur Crater on the Apennine Front. The numbers assigned to the sample include numbers 15300 through 15305. Astronauts David R. Scott and James B. Irwin took the sample during their second extravehicular activity (EVA) at a ground elapsed time (GET) of 146:05 to 146:06.
Microbiological evaluation of South Australian rock lobster meat.
Yap, A S
1977-12-01
Samples of frozen precooked rock lobster meat from five South Australian fish-processing plants situated in the West Coast and south-east regions were tested over a period of six months during the 1974/5 lobster fishing season. The most probable number (MPN) of E. coli and coliforms, Staphylococcus aureus and Salmonella, as well as total plate count (TPC), were determined in 480 samples. Monthly geometric mean TPC ranged from 1600/g to 25,000/g. The highest geometric means of the MPN of coliforms and E. coli were 4.9/g and 1.8/g, respectively. The highest geometric mean number of staphylococci was 18.6/g. Salmonella was not detected in the 480 units tested. Only 0.4% of the samples had TPC exceeding 100,000/g. Coliforms and E. coli were not present in 76.1% and 92.7%, respectively, of the samples tested. Staphylococcus aureus was not detected in 67.7% of the samples. The numbers of organisms in 82% of the samples fell within the microbiological standards proposed by the National Health and Medical Research Council of Australia for frozen precooked foods. The results of this study demonstrate the microbial quality of precooked lobster meat attainable when good manufacturing practices are used.
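Since the abstract leans on most-probable-number estimates, here is a sketch of how an MPN is obtained from a dilution series by maximum likelihood; scipy's root finder stands in for the published MPN tables, and the tube counts and inoculum sizes are hypothetical.

```python
from math import exp
from scipy.optimize import brentq

def mpn(dilution_series):
    """Maximum-likelihood most probable number per gram (or ml).

    dilution_series: (n_tubes, n_positive, inoculum_amount) per level.
    Solves the standard MPN score equation
      sum_i [ g_i v_i e^(-x v_i) / (1 - e^(-x v_i)) - (n_i - g_i) v_i ] = 0
    for the organism density x.
    """
    def score(x):
        return sum(g * v * exp(-x * v) / (1.0 - exp(-x * v))
                   - (n - g) * v
                   for n, g, v in dilution_series)
    return brentq(score, 1e-9, 1e9)

# Hypothetical 3-tube series with 0.1, 0.01 and 0.001 g inocula:
print(round(mpn([(3, 3, 0.1), (3, 1, 0.01), (3, 0, 0.001)]), 1))
```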
Okubo, Torahiko; Osaki, Takako; Nozaki, Eriko; Uemura, Akira; Sakai, Kouhei; Matushita, Mizue; Matsuo, Junji; Nakamura, Shinji; Kamiya, Shigeru; Yamaguchi, Hiroyuki
2017-01-01
Although human occupancy is a source of airborne bacteria, the role of walkers on bacterial communities in built environments is poorly understood. Therefore, we visualized the impact of walker occupancy combined with other factors (temperature, humidity, atmospheric pressure, dust particles) on airborne bacterial features in the Sapporo underground pedestrian space in Sapporo, Japan. Air samples (n = 18; 4,800 L per sample) were collected at 8:00 h to 20:00 h on 3 days (regular sampling) and at early morning / late night (5:50 h to 7:50 h / 22:15 h to 24:45 h) on one day (baseline sampling), and the numbers of CFUs (colony-forming units) and OTUs (operational taxonomic units) and other factors were determined. The results revealed that temperature, humidity, and atmospheric pressure changed with weather. The number of walkers increased greatly in the morning and evening on each regular sampling day, although total walker numbers did not differ significantly among regular sampling days. A slight increase in small dust particles (0.3–0.5 μm) was observed on the days with higher temperature regardless of regular or baseline sampling. During regular sampling, CFU levels varied irregularly among days, and OTUs of 22 phylum types were observed, with the majority being from Firmicutes or Proteobacteria (γ-), including Staphylococcus sp. derived from human individuals. The data obtained from regular samplings revealed that although no direct interaction of walker occupancy and airborne CFU and OTU features was observed upon Pearson's correlation analysis, cluster analysis indicated an obvious lineage consisting of walker occupancy, CFU numbers, OTU types, small dust particles, and seasonal factors (including temperature and humidity). Meanwhile, during baseline sampling, both walker and CFU numbers were similarly minimal. Taken together, the results revealed a positive correlation of walker occupancy with airborne bacteria that increased with increases in temperature and humidity in the presence of airborne small particles. Moreover, the results indicated that small dust particles at high temperature and humidity may be a crucial factor responsible for stabilizing the bacteria released from walkers in built environments. The findings presented herein advance our knowledge and understanding of the relationship between humans and bacterial communities in built environments, and will help improve public health in urban communities.
Dal Grande, Eleonora; Chittleborough, Catherine R; Campostrini, Stefano; Taylor, Anne W
2016-04-18
Emerging communication technologies have had an impact on population-based telephone surveys worldwide. Our objective was to examine the potential biases of health estimates in South Australia, a state of Australia, obtained via current landline telephone survey methodologies, and to report on the impact of mobile-only households on household surveys. Data from an annual multi-stage, systematic, clustered-area, face-to-face population survey, the Health Omnibus Survey (approximately 3000 interviews annually), included questions about telephone ownership to assess the population that was non-contactable by current telephone sampling methods (2006 to 2013). Univariable analyses (2010 to 2013) and trend analyses were conducted for sociodemographic and health indicator variables in relation to telephone status. The relative coverage bias (RCB) of two hypothetical telephone samples was assessed by examining the prevalence estimates of health status and health risk behaviours (2010 to 2013): directory-listed numbers, consisting mainly of landline telephone numbers and a small proportion of mobile telephone numbers; and a random digit dialling (RDD) sample of landline telephone numbers, which excludes mobile-only households. Telephone (landline and mobile) coverage in South Australia is very high (97%). Mobile telephone ownership increased slightly (7.4%), rising from 89.7% in 2006 to 96.3% in 2013; mobile-only households increased by 431% over the eight-year period, from 5.2% in 2006 to 27.6% in 2013. Only half of the households have either a mobile or landline number listed in the telephone directory. There were small differences in the prevalence estimates for current asthma, arthritis, diabetes and obesity between the hypothetical telephone samples and the overall sample. However, the prevalence estimate for diabetes was slightly underestimated (RCB value of -0.077) in 2013. Mixed RCB results were found for having a mental health condition for both telephone samples. Current smoking prevalence was lower for both hypothetical telephone samples in absolute differences and RCB values: -0.136 to -0.191 for RDD landline samples and -0.129 to -0.313 for directory-listed samples. These findings suggest that landline-based sampling frames used in Australia, when appropriately weighted, produce reliable representative estimates for some health indicators but not for all. Researchers need to be aware of their limitations and potentially biased estimates.
van Hassel, Daniël; van der Velden, Lud; de Bakker, Dinny; van der Hoek, Lucas; Batenburg, Ronald
2017-12-04
Our research is based on a technique for time sampling, an innovative method for measuring the working hours of Dutch general practitioners (GPs), which was deployed in an earlier study. In that study, 1051 GPs were questioned about their activities in real time by sending them one SMS text message every 3 h during 1 week. The required sample size for this method is important for health workforce planners to know if they want to apply it to target groups who are hard to reach or if fewer resources are available. In this time-sampling method, however, standard power analysis is not sufficient for calculating the required sample size, as it accounts only for sample fluctuation and not for the fluctuation of measurements taken from every participant. We investigated the impact of the number of participants and the frequency of measurements per participant upon the confidence intervals (CIs) for the hours worked per week. Statistical analyses of the time-use data we obtained from GPs were performed. Ninety-five percent CIs were calculated, using equations and simulation techniques, for different numbers of GPs included in the dataset and for different frequencies of measurements per participant. Our results showed that the one-tailed CI, including sample and measurement fluctuation, decreased from 21 h to 3 h as the number of GPs increased from 1 to 50. Beyond this point, precision continued to increase, but the gain from each additional GP became smaller. Likewise, the analyses showed how the number of participants required decreased if more measurements per participant were taken. For example, one measurement per 3-h time slot during the week requires 300 GPs to achieve a CI of 1 h, while one measurement per hour requires 100 GPs to obtain the same result. The sample size needed for time-use research based on a time-sampling technique depends on the design and aim of the study. In this paper, we showed how the precision of the measurement of hours worked each week by GPs strongly varied according to the number of GPs included and the frequency of measurements per GP during the week measured. The best balance between both dimensions will depend upon different circumstances, such as the target group and the budget available.
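To make the trade-off concrete, here is a rough Monte Carlo sketch of how the 95% CI half-width for mean weekly hours shrinks with the number of GPs at a fixed number of SMS probes per week. The 45-hour mean, 8-hour between-GP SD, and 56 three-hour slots per week are invented for illustration and are not the study's parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

def ci_halfwidth(n_gps, n_slots, reps=2000):
    """Monte Carlo 95% CI half-width for mean weekly working hours.

    Each GP's true hours vary between GPs (SD 8 around 45 h); a week
    holds 56 three-hour slots, of which n_slots are probed by SMS.
    """
    widths = []
    for _ in range(reps):
        true_hours = rng.normal(45.0, 8.0, n_gps)
        p_work = np.clip(true_hours / (56 * 3.0), 0.0, 1.0)
        est = rng.binomial(n_slots, p_work) / n_slots * 56 * 3.0
        widths.append(1.96 * est.std(ddof=1) / np.sqrt(n_gps))
    return float(np.mean(widths))

for n in (10, 50, 100, 300):
    print(n, round(ci_halfwidth(n_gps=n, n_slots=56), 2))
```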
Okubo, Torahiko; Osaki, Takako; Nozaki, Eriko; Uemura, Akira; Sakai, Kouhei; Matushita, Mizue; Matsuo, Junji; Nakamura, Shinji; Kamiya, Shigeru
2017-01-01
Although human occupancy is a source of airborne bacteria, the role of walkers on bacterial communities in built environments is poorly understood. Therefore, we visualized the impact of walker occupancy combined with other factors (temperature, humidity, atmospheric pressure, dust particles) on airborne bacterial features in the Sapporo underground pedestrian space in Sapporo, Japan. Air samples (n = 18; 4,800 L per sample) were collected at 8:00 h to 20:00 h on 3 days (regular sampling) and at early morning / late night (5:50 h to 7:50 h / 22:15 h to 24:45 h) on one day (baseline sampling), and the numbers of CFUs (colony-forming units) and OTUs (operational taxonomic units) and other factors were determined. The results revealed that temperature, humidity, and atmospheric pressure changed with weather. The number of walkers increased greatly in the morning and evening on each regular sampling day, although total walker numbers did not differ significantly among regular sampling days. A slight increase in small dust particles (0.3–0.5 μm) was observed on the days with higher temperature regardless of regular or baseline sampling. During regular sampling, CFU levels varied irregularly among days, and OTUs of 22 phylum types were observed, with the majority being from Firmicutes or Proteobacteria (γ-), including Staphylococcus sp. derived from human individuals. The data obtained from regular samplings revealed that although no direct interaction of walker occupancy and airborne CFU and OTU features was observed upon Pearson's correlation analysis, cluster analysis indicated an obvious lineage consisting of walker occupancy, CFU numbers, OTU types, small dust particles, and seasonal factors (including temperature and humidity). Meanwhile, during baseline sampling, both walker and CFU numbers were similarly minimal. Taken together, the results revealed a positive correlation of walker occupancy with airborne bacteria that increased with increases in temperature and humidity in the presence of airborne small particles. Moreover, the results indicated that small dust particles at high temperature and humidity may be a crucial factor responsible for stabilizing the bacteria released from walkers in built environments. The findings presented herein advance our knowledge and understanding of the relationship between humans and bacterial communities in built environments, and will help improve public health in urban communities. PMID:28922412
Tan, Ling; Hu, Yerong; Tao, Yongguang; Wang, Bin; Xiao, Jun; Tang, Zhenjie; Lu, Ting
2018-01-01
Background: To identify whether RET is a potential target for NSCLC treatment, we examined the status of the RET gene in 631 early and mid stage NSCLC cases from south central China. Methods: RET expression was identified by Western blot. RET-positive expression samples were verified by immunohistochemistry. RET gene mutation, copy number variation, and rearrangement were analyzed by DNA Sanger sequencing, TaqMan copy number assays, and reverse transcription-PCR. ALK and ROS1 expression levels were tested by Western blot and EGFR mutation using Sanger sequencing. Results: The RET-positive rate was 2.5% (16/631). RET-positive expression was related to poorer tumor differentiation (P < 0.05). In the 16 RET-positive samples, only two samples of moderately and poorly differentiated lung adenocarcinomas displayed RET rearrangement, both in RET-KIF5B fusion partners. Neither ALK nor ROS1 translocation was found. The EGFR mutation rate in RET-positive samples was significantly lower than in RET-negative samples (P < 0.05). Conclusion: RET-positive expression in early and mid stage NSCLC cases from south central China is relatively low and is related to poorer tumor differentiation. RET gene alterations (copy number gain and rearrangement) exist in all RET-positive samples. RET-positive expression is a relatively independent factor in NSCLC patients, which indicates that the RET gene may be a novel target site for personalized treatment of NSCLC. PMID:29473341
Practicability of monitoring soil Cd, Hg, and Pb pollution based on a geochemical survey in China.
Xia, Xueqi; Yang, Zhongfang; Li, Guocheng; Yu, Tao; Hou, Qingye; Mutelo, Admire Muchimamui
2017-04-01
Repeated visiting, i.e., sampling and analysis at two or more temporal points, is one of the important ways of monitoring soil heavy metal contamination. However, given the cost involved, determining the number of samples and the temporal interval, and their capability to detect a certain change, is a key technical problem to be solved. This depends on the spatial variation of the parameters in the monitoring units. The "National Multi-Purpose Regional Geochemical Survey" (NMPRGS) project in China acquired the spatial distribution of heavy metals using a high-density sampling method in most of the arable regions of China. Based on soil Cd, Hg, and Pb data, and taking administrative regions as the monitoring units, the numbers of samples and temporal intervals that may be used for monitoring soil heavy metal contamination were determined. The spatial variation of the elements was found to differ widely among the NMPRGS regions, which makes it difficult to determine the minimum detectable changes (MDC), the number of samples, and the temporal intervals for revisiting. This paper recommends a suitable number of samples (n_r) for each region that balances cost, practicability, and monitoring precision. Under n_r, MDC values are acceptable for all the regions, and the minimum temporal intervals are practical, ranging from 3.3 to 13.3 years. Copyright © 2017 Elsevier Ltd. All rights reserved.
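The dependence of the minimum detectable change on sample number follows from the standard two-sample power formula: with n samples per campaign and spatial standard deviation sigma, the detectable change is (z_alpha + z_beta) * sigma * sqrt(2/n). The sketch below applies this generic formula; the Cd spatial SD and sample numbers are invented, not values from the paper.

```python
from math import sqrt
from statistics import NormalDist

def mdc(spatial_sd, n, alpha=0.05, power=0.80):
    """Minimum detectable change in a unit's mean concentration between
    two sampling campaigns of n samples each (two-sided test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return (z_a + z_b) * spatial_sd * sqrt(2.0 / n)

# Invented monitoring unit: soil Cd with spatial SD 0.12 mg/kg.
for n in (50, 100, 400):
    print(n, round(mdc(0.12, n), 3), "mg/kg")
```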
Laukkanen-Ninios, R.; Ortiz Martínez, P.; Siitonen, A.; Fredriksson-Ahomaa, M.; Korkeala, H.
2013-01-01
Sporadic and epidemiologically linked Yersinia enterocolitica strains (n = 379) isolated from fecal samples from human patients, tonsil or fecal samples from pigs collected at slaughterhouses, and pork samples collected at meat stores were genotyped using multiple-locus variable-number tandem-repeat analysis (MLVA) with six loci, i.e., V2A, V4, V5, V6, V7, and V9. In total, 312 different MLVA types were found. Similar types were detected (i) in fecal samples collected from human patients over 2 to 3 consecutive years, (ii) in samples from humans and pigs, and (iii) in samples from pigs that originated from the same farms. Among porcine strains, we found farm-specific MLVA profiles. Variations in the numbers of tandem repeats from one to four for variable-number tandem-repeat (VNTR) loci V2A, V5, V6, and V7 were observed within a farm. MLVA was applicable for serotypes O:3, O:5,27, and O:9 and appeared to be a highly discriminating tool for distinguishing sporadic and outbreak-related strains. With long-term use, interpretation of the results became more challenging due to variations in more-discriminating loci, as was observed for strains originating from pig farms. Additionally, we encountered unexpectedly short V2A VNTR fragments and sequenced them. According to the sequencing results, updated guidelines for interpreting V2A VNTR results were prepared. PMID:23637293
Accuracy or precision: Implications of sample design and methodology on abundance estimation
Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.
2015-01-01
Sampling by spatially replicated counts (point counts) is an increasingly popular method of estimating the population size of organisms. Challenges exist when sampling by the point-count method: it is often impractical to sample the entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either a few large sample units or many small sample units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of the number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than those derived from scenarios with many sample units of small area. It is important to consider the accuracy and precision of abundance estimates during the sample design process, with study goals and objectives fully recognized; in practice, however, and with consequence, this consideration is often an afterthought that occurs during data analysis.
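A toy version of such a simulation is sketched below: square sample units of two sizes, matched in total sampled area, are placed over a clustered population and the density estimates compared. It ignores detection error, which drives the accuracy-versus-precision pattern the authors report via N-mixture models, so it only illustrates the sampling machinery; all dimensions and counts are invented.

```python
import numpy as np

rng = np.random.default_rng(7)

def survey(n_units, unit_area, reps=500, n_organisms=500):
    """Monte Carlo density estimates from square sample units placed at
    random in a 10 x 10 area holding a clustered population (perfect
    detection assumed; units may overlap)."""
    side = np.sqrt(unit_area)
    ests = []
    for _ in range(reps):
        centres = rng.uniform(0, 10, (10, 2))
        pts = (centres[rng.integers(0, 10, n_organisms)]
               + rng.normal(0, 0.5, (n_organisms, 2)))
        origins = rng.uniform(0, 10 - side, (n_units, 2))
        counts = [np.sum((pts[:, 0] >= x) & (pts[:, 0] < x + side)
                         & (pts[:, 1] >= y) & (pts[:, 1] < y + side))
                  for x, y in origins]
        ests.append(np.mean(counts) / unit_area)
    ests = np.array(ests)
    return round(float(ests.mean()), 2), round(float(ests.std()), 2)

# Same total sampled area: 25 units of area 1 vs 4 units of area 6.25.
print("many small units (mean, SD):", survey(25, 1.0))
print("few large units  (mean, SD):", survey(4, 6.25))
print("true density:", 500 / 100.0)
```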
High-efficiency multiphoton boson sampling
NASA Astrophysics Data System (ADS)
Wang, Hui; He, Yu; Li, Yu-Huai; Su, Zu-En; Li, Bo; Huang, He-Liang; Ding, Xing; Chen, Ming-Cheng; Liu, Chang; Qin, Jian; Li, Jin-Peng; He, Yu-Ming; Schneider, Christian; Kamp, Martin; Peng, Cheng-Zhi; Höfling, Sven; Lu, Chao-Yang; Pan, Jian-Wei
2017-06-01
Boson sampling is considered as a strong candidate to demonstrate 'quantum computational supremacy' over classical computers. However, previous proof-of-principle experiments suffered from small photon number and low sampling rates owing to the inefficiencies of the single-photon sources and multiport optical interferometers. Here, we develop two central components for high-performance boson sampling: robust multiphoton interferometers with 99% transmission rate and actively demultiplexed single-photon sources based on a quantum dot-micropillar with simultaneously high efficiency, purity and indistinguishability. We implement and validate three-, four- and five-photon boson sampling, and achieve sampling rates of 4.96 kHz, 151 Hz and 4 Hz, respectively, which are over 24,000 times faster than previous experiments. Our architecture can be scaled up for a larger number of photons and with higher sampling rates to compete with classical computers, and might provide experimental evidence against the extended Church-Turing thesis.
Two-sample binary phase 2 trials with low type I error and low sample size.
Litwin, Samuel; Basickes, Stanley; Ross, Eric A
2017-04-30
We address the design of two-stage clinical trials comparing experimental and control patients. Our end point is success or failure, however measured, with the null hypothesis that the chance of success in both arms is p_0 and the alternative that it is p_0 among controls and p_1 > p_0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules, E ≥ m, with two-sample rules of the form E - C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than the 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. Copyright © 2017 John Wiley & Sons, Ltd.
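The operating characteristics of such a combined rule can be computed exactly by enumerating the joint binomial outcome space. The sketch below does this for a single-stage version of the rule (the paper's designs are two-stage); the arm sizes, m, and r are invented for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def reject_prob(p_e, p_c, n_e, n_c, m, r):
    """P(E >= m and E - C > r) for E ~ Bin(n_e, p_e), C ~ Bin(n_c, p_c).

    Exact enumeration of the joint outcome space for the combined
    one-sample (E >= m) and two-sample (E - C > r) decision rule.
    """
    total = 0.0
    for e in range(m, n_e + 1):
        c_max = min(e - r - 1, n_c)          # the rule needs C < E - r
        if c_max >= 0:
            total += binom_pmf(e, n_e, p_e) * sum(
                binom_pmf(c, n_c, p_c) for c in range(c_max + 1))
    return total

# Invented single-stage rule, 2:1 randomization (40 vs 20 patients):
print("type I error:", round(reject_prob(0.2, 0.2, 40, 20, m=13, r=4), 4))
print("power:", round(reject_prob(0.4, 0.2, 40, 20, m=13, r=4), 4))
```

Evaluating reject_prob over a grid of (m, r) pairs is one way to search for designs that hold the local type I error near 5% at minimal sample size.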
Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling
2006-01-01
Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
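The closing recommendation translates directly into a design-effect-inflated sample-size calculation, n = deff × z² p(1−p)/d². The sketch below applies that standard formula with the paper's rule-of-thumb design effect of 2; the prevalence and margin are invented.

```python
from math import ceil
from statistics import NormalDist

def rds_sample_size(p, margin, deff=2.0, alpha=0.05):
    """Sample size for a prevalence estimate within +/- margin,
    inflating the simple-random-sampling size by a design effect."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(deff * z**2 * p * (1 - p) / margin**2)

# Invented target: 20% prevalence within 5 percentage points.
print(rds_sample_size(0.20, 0.05))  # about twice the SRS size of ~246
```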
Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities.
Taghavi, Zeinab; Movahedi, Narjes S; Draghici, Sorin; Chitsaz, Hamidreza
2013-10-01
Identification of every single genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because there are typically millions of cells in a microbial sample, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is simply intractable for a large number of cells. However, there is hope for breaking this barrier, as the number of different cell types with distinct genome sequences is usually much smaller than the number of cells. Here, we present a novel divide and conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample with a sequencing cost and computational complexity proportional to the number of genome types, rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide and conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach. Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/.
Estimating and comparing microbial diversity in the presence of sequencing errors
Chiu, Chun-Huo
2016-01-01
Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. This approach aims to compare diversity estimates for equally-large or equally-complete samples; it is based on the seamless rarefaction and extrapolation sampling curves of Hill numbers, specifically for q = 0, 1 and 2. (2) An asymptotic approach refers to the comparison of the estimated asymptotic diversity profiles. That is, this approach compares the estimated profiles for complete samples or samples whose size tends to be sufficiently large. It is based on statistical estimation of the true Hill number of any order q ≥ 0. In the two approaches, replacing the spurious singleton count by our estimated count, we can greatly remove the positive biases associated with diversity estimates due to spurious singletons and also make fair comparisons across microbial communities, as illustrated in our simulation results and in applying our method to analyze sequencing data from viral metagenomes. PMID:26855872
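Hill numbers themselves are easy to compute from an abundance vector: qD = (Σ p_i^q)^(1/(1−q)), with the q = 1 limit equal to exp(Shannon entropy). The sketch below gives the empirical (plug-in) version only; it applies neither the paper's singleton correction nor its asymptotic estimators, and the counts are invented.

```python
import numpy as np

def hill_number(counts, q):
    """Empirical Hill number of order q from abundance counts.

    qD = (sum p_i^q) ** (1 / (1 - q)); the q = 1 limit is
    exp(Shannon entropy). No correction for unseen taxa or for
    spurious singletons is applied here.
    """
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    if q == 1:
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

abundances = [50, 30, 10, 5, 3, 1, 1]
for q in (0, 1, 2):   # richness, Shannon diversity, Simpson diversity
    print(q, round(hill_number(abundances, q), 2))
```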
Barkhofen, Sonja; Bartley, Tim J; Sansoni, Linda; Kruse, Regina; Hamilton, Craig S; Jex, Igor; Silberhorn, Christine
2017-01-13
Sampling the distribution of bosons that have undergone a random unitary evolution is strongly believed to be a computationally hard problem. Key to outperforming classical simulations of this task is to increase both the number of input photons and the size of the network. We propose driven boson sampling, in which photons are input within the network itself, as a means to approach this goal. We show that the mean number of photons entering a boson sampling experiment can exceed one photon per input mode, while maintaining the required complexity, potentially leading to less stringent requirements on the input states for such experiments. When using heralded single-photon sources based on parametric down-conversion, this approach offers an ∼e-fold enhancement in the input state generation rate over scattershot boson sampling, reaching the scaling limit for such sources. This approach also offers a dramatic increase in the signal-to-noise ratio with respect to higher-order photon generation from such probabilistic sources, which removes the need for photon number resolution during the heralding process as the size of the system increases.
Exact sampling of graphs with prescribed degree correlations
NASA Astrophysics Data System (ADS)
Bassler, Kevin E.; Del Genio, Charo I.; Erdős, Péter L.; Miklós, István; Toroczkai, Zoltán
2015-08-01
Many real-world networks exhibit correlations between the node degrees. For instance, in social networks nodes tend to connect to nodes of similar degree and conversely, in biological and technological networks, high-degree nodes tend to be linked with low-degree nodes. Degree correlations also affect the dynamics of processes supported by a network structure, such as the spread of opinions or epidemics. The proper modelling of these systems, i.e., without uncontrolled biases, requires the sampling of networks with a specified set of constraints. We present a solution to the sampling problem when the constraints imposed are the degree correlations. In particular, we develop an exact method to construct and sample graphs with a specified joint-degree matrix, which is a matrix providing the number of edges between all the sets of nodes of a given degree, for all degrees, thus completely specifying all pairwise degree correlations, and additionally, the degree sequence itself. Our algorithm always produces independent samples without backtracking. The complexity of the graph construction algorithm is O(NM), where N is the number of nodes and M is the number of edges.
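To ground the joint-degree matrix concept, here is a small sketch that tallies the matrix for a given graph using networkx; constructing and sampling graphs with a prescribed matrix, as the paper does, is the substantially harder inverse problem.

```python
import networkx as nx

def joint_degree_matrix(g):
    """Tally J[j][k]: the number of edges joining a degree-j node to a
    degree-k node, the constraint matrix used by the sampling method."""
    deg = dict(g.degree())
    jdm = {}
    for u, v in g.edges():
        j, k = sorted((deg[u], deg[v]))
        jdm.setdefault(j, {})
        jdm[j][k] = jdm[j].get(k, 0) + 1
    return jdm

# Path graph on 5 nodes: degrees are 1, 2, 2, 2, 1.
print(joint_degree_matrix(nx.path_graph(5)))
# {1: {2: 2}, 2: {2: 2}}
```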
Method and system for providing precise multi-function modulation
NASA Technical Reports Server (NTRS)
Davarian, Faramaz (Inventor); Sumida, Joe T. (Inventor)
1989-01-01
A method and system is disclosed which provides precise multi-function digitally implementable modulation for a communication system. The invention provides a modulation signal for a communication system in response to an input signal from a data source. A digitized time response is generated from samples of a time domain representation of a spectrum profile of a selected modulation scheme. The invention generates and stores coefficients for each input symbol in accordance with the selected modulation scheme. The output signal is provided by a plurality of samples, each sample being generated by summing the products of a predetermined number of the coefficients and a predetermined number of the samples of the digitized time response. In a specific illustrative implementation, the samples of the output signals are converted to analog signals, filtered and used to modulate a carrier in a conventional manner. The invention is versatile in that it allows for the storage of the digitized time responses and corresponding coefficient lookup table of a number of modulation schemes, any of which may then be selected for use in accordance with the teachings of the invention.
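The coefficient-times-stored-sample summation the patent describes is, in digital terms, a table-lookup pulse-shaping operation. Below is a schematic numpy sketch of that idea, not the patented implementation: each symbol coefficient scales a stored pulse time response that is accumulated into the output sample stream. The pulse shape, oversampling factor, and symbol values are arbitrary.

```python
import numpy as np

def shape_symbols(symbols, pulse, sps):
    """Generate output samples by summing coefficient x stored-sample
    products: each symbol scales the stored pulse time response, which
    is accumulated into the output at the symbol's position."""
    out = np.zeros(len(symbols) * sps + len(pulse) - 1, dtype=complex)
    for i, s in enumerate(symbols):
        out[i * sps : i * sps + len(pulse)] += s * pulse
    return out

sps = 8                                    # samples per symbol
t = np.arange(-4 * sps, 4 * sps + 1) / sps
pulse = np.sinc(t) * np.hamming(len(t))    # stored time-response table
symbols = np.exp(1j * np.pi / 4 * np.array([1, 3, 5, 7, 1, 3]))  # QPSK
waveform = shape_symbols(symbols, pulse, sps)
print(len(waveform), round(float(np.max(np.abs(waveform))), 2))
```

Swapping in a different stored pulse table changes the modulation's spectrum profile without touching the summation logic, which reflects the multi-function flexibility the abstract emphasizes.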
Accelerating root system phenotyping of seedlings through a computer-assisted processing pipeline.
Dupuy, Lionel X; Wright, Gladys; Thompson, Jacqueline A; Taylor, Anna; Dekeyser, Sebastien; White, Christopher P; Thomas, William T B; Nightingale, Mark; Hammond, John P; Graham, Neil S; Thomas, Catherine L; Broadley, Martin R; White, Philip J
2017-01-01
There are numerous systems and techniques to measure the growth of plant roots. However, phenotyping large numbers of plant roots for breeding and genetic analyses remains challenging. One major difficulty is to achieve high throughput and resolution at a reasonable cost per plant sample. Here we describe a cost-effective root phenotyping pipeline, on which we perform time and accuracy benchmarking to identify bottlenecks in such pipelines and strategies for their acceleration. Our root phenotyping pipeline was assembled with custom software and low-cost material and equipment. Results show that sample preparation and the handling of samples during screening are the most time-consuming tasks in root phenotyping. Algorithms can be used to speed up the extraction of root traits from image data, but when applied to large numbers of images, there is a trade-off between the time needed to process the data and the errors contained in the database. Scaling up root phenotyping to large numbers of genotypes will require not only automation of sample preparation and sample handling, but also efficient algorithms for error detection for more reliable replacement of manual interventions.
Simplified pupal surveys of Aedes aegypti (L.) for entomologic surveillance and dengue control.
Barrera, Roberto
2009-07-01
Pupal surveys of Aedes aegypti (L.) are useful indicators of risk for dengue transmission, although sample sizes for reliable estimations can be large. This study explores two methods for making pupal surveys more practical yet reliable and used data from 10 pupal surveys conducted in Puerto Rico during 2004-2008. The number of pupae per person for each sampling followed a negative binomial distribution, thus showing aggregation. One method found a common aggregation parameter (k) for the negative binomial distribution, a finding that enabled the application of a sequential sampling method requiring few samples to determine whether the number of pupae/person was above a vector density threshold for dengue transmission. A second approach used the finding that the mean number of pupae/person is correlated with the proportion of pupa-infested households and calculated equivalent threshold proportions of pupa-positive households. A sequential sampling program was also developed for this method to determine whether observed proportions of infested households were above threshold levels. These methods can be used to validate entomological thresholds for dengue transmission.
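The second method's conversion rests on a property of the negative binomial: with mean m and common aggregation parameter k, the probability that a household yields at least one pupa is P(X ≥ 1) = 1 − (1 + m/k)^(−k). The sketch below applies this generic identity; the threshold and k are invented, and the per-person versus per-household units are simplified for illustration.

```python
def prop_positive(mean_pupae, k):
    """Proportion of pupa-positive premises implied by a mean pupal
    density, for negative binomial counts with aggregation k:
    P(X >= 1) = 1 - (1 + m / k) ** (-k)."""
    return 1.0 - (1.0 + mean_pupae / k) ** (-k)

# Invented transmission threshold of 0.5 pupae/person, common k = 0.3:
print(round(prop_positive(0.5, 0.3), 3))
```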
Sticky trap and stem-tap sampling protocols for the Asian citrus psyllid (Hemiptera: Psyllidae)
USDA-ARS's Scientific Manuscript database
Sampling statistics were obtained to develop a sampling protocol for estimating numbers of adult Diaphorina citri in citrus using two different sampling methods: yellow sticky traps and stem-tap samples. A 4.0 ha block of mature orange trees was stratified into ten 0.4 ha strata and sampled using...
A high-throughput microRNA expression profiling system.
Guo, Yanwen; Mastriano, Stephen; Lu, Jun
2014-01-01
As small noncoding RNAs, microRNAs (miRNAs) regulate diverse biological functions, including physiological and pathological processes. The expression and deregulation of miRNA levels contain rich information with diagnostic and prognostic relevance and can reflect pharmacological responses. The increasing interest in miRNA-related research demands global miRNA expression profiling on large numbers of samples. We describe here a robust protocol that supports high-throughput sample labeling and detection on hundreds of samples simultaneously. This method employs 96-well-based miRNA capture from total RNA samples and on-site biochemical reactions, coupled with bead-based detection in 96-well format for hundreds of miRNAs per sample. With its low cost, high throughput, high detection specificity, and flexibility to profile both small and large numbers of samples, this protocol can be adapted to a wide range of laboratory settings.
Designing occupancy studies: general advice and allocating survey effort
MacKenzie, D.I.; Royle, J. Andrew
2005-01-01
1. The fraction of sampling units in a landscape where a target species is present (occupancy) is an extensively used concept in ecology. Yet in many applications the species will not always be detected in a sampling unit even when present, resulting in biased estimates of occupancy. Given that sampling units are surveyed repeatedly within a relatively short timeframe, a number of similar methods have now been developed to provide unbiased occupancy estimates. However, practical guidance on the efficient design of occupancy studies has been lacking. 2. In this paper we comment on a number of general issues related to designing occupancy studies, including the need for clear objectives that are explicitly linked to science or management, selection of sampling units, timing of repeat surveys and allocation of survey effort. Advice on the number of repeat surveys per sampling unit is considered in terms of the variance of the occupancy estimator, for three possible study designs. 3. We recommend that sampling units should be surveyed a minimum of three times when detection probability is high (>0.5 per survey), unless a removal design is used. 4. We found that an optimal removal design will generally be the most efficient, but we suggest it may be less robust to assumption violations than a standard design. 5. Our results suggest that for a rare species it is more efficient to survey more sampling units less intensively, while for a common species fewer sampling units should be surveyed more intensively. 6. Synthesis and applications. Reliable inferences can only result from quality data. To make the best use of logistical resources, study objectives must be clearly defined; sampling units must be selected, and repeated surveys timed appropriately; and a sufficient number of repeated surveys must be conducted. Failure to do so may compromise the integrity of the study. The guidance given here on study design issues is particularly applicable to studies of species occurrence and distribution, habitat selection and modelling, metapopulation studies and monitoring programmes.
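One simple quantity behind the advice on repeat surveys is the cumulative detection probability for a species that is present, p* = 1 - (1 - p)^K after K surveys with constant per-survey detection probability p. The sketch below computes the smallest K that meets a target p*; it is a related back-of-envelope calculation, not the variance-based criterion the paper uses for its recommendations.

```python
import math

def min_repeat_surveys(p, target=0.95):
    """Smallest K with 1 - (1 - p)**K >= target: the number of repeat
    surveys needed to detect a present species at least once with the
    target cumulative probability, given per-survey detection probability p."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))

print(min_repeat_surveys(0.5))   # -> 5 surveys for 95% cumulative detection
print(min_repeat_surveys(0.8))   # -> 2
```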
DOE-2 sample run book: Version 2.1E
DOE Office of Scientific and Technical Information (OSTI.GOV)
Winkelmann, F.C.; Birdsall, B.E.; Buhl, W.F.
1993-11-01
The DOE-2 Sample Run Book shows inputs and outputs for a variety of building and system types. The samples start with a simple structure and continue to a high-rise office building, a medical building, three small office buildings, a bar/lounge, a single-family residence, a small office building with daylighting, a single family residence with an attached sunspace, a "parameterized" building using input macros, and a metric input/output example. All of the samples use Chicago TRY weather. The main purpose of the Sample Run Book is instructional. It shows the relationship of LOADS-SYSTEMS-PLANT-ECONOMICS inputs, displays various input styles, and illustrates many of the basic and advanced features of the program. Many of the sample runs are preceded by a sketch of the building showing its general appearance and the zoning used in the input. In some cases we also show a 3-D rendering of the building as produced by the program DrawBDL. Descriptive material has been added as comments in the input itself. We find that a number of users have loaded these samples onto their editing systems and use them as "templates" for creating new inputs. Another way of using them would be to store various portions as files that can be read into the input using the ##include command, which is part of the Input Macro feature introduced in version DOE-2.1D. Note that the energy rate structures here are the same as in the DOE-2.1D samples, but have been rewritten using the new DOE-2.1E commands and keywords for ECONOMICS. The samples contained in this report are the same as those found on the DOE-2 release files. However, the output numbers that appear here may differ slightly from those obtained from the release files. The output on the release files can be used as a check set to compare results on your computer.
Activity classification using the GENEA: optimum sampling frequency and number of axes.
Zhang, Shaoyan; Murray, Peter; Zillmer, Ruediger; Eston, Roger G; Catt, Michael; Rowlands, Alex V
2012-11-01
The GENEA shows high accuracy for classification of sedentary, household, walking, and running activities when sampling at 80 Hz on three axes. It is not known whether it is possible to decrease this sampling frequency and/or the number of axes without detriment to classification accuracy. The purpose of this study was to compare the classification rate of activities on the basis of data from a single axis, dual axes, and three axes, with sampling rates ranging from 5 to 80 Hz. Sixty participants (age, 49.4 (6.5) yr; BMI, 24.6 (3.4) kg·m⁻²) completed 10-12 semistructured activities in the laboratory and outdoor environment while wearing a GENEA accelerometer on the right wrist. We analyzed data from a single axis, dual axes, and three axes at sampling rates of 5, 10, 20, 40, and 80 Hz. Mathematical models based on features extracted from the mean, SD, fast Fourier transform, and wavelet decomposition were built for each combination of number of axes and sampling rate to classify activities as sedentary, household, walking, or running. Classification accuracy was high irrespective of the number of axes for data collected at 80 Hz (96.93% ± 0.97%), 40 Hz (97.4% ± 0.73%), 20 Hz (96.86% ± 1.12%), and 10 Hz (97.01% ± 1.01%) but dropped for data collected at 5 Hz (94.98% ± 1.36%). Sampling frequencies >10 Hz and/or more than one axis of measurement were not associated with greater classification accuracy. Lower sampling rates and measurement of a single axis would result in a lower data load, longer battery life, and higher efficiency of data processing. Further research should investigate whether a lower sampling rate and a single axis affect classification accuracy when considering a wider range of activities.
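A sketch of the kind of window-level feature extraction the study describes (mean, SD and a spectral feature per axis), with downsampling done by striding; the window length, feature choice and names are assumptions for illustration, not the study's exact feature set.

```python
import numpy as np

def window_features(acc, fs):
    """Features of one accelerometer-axis window: mean, SD, and the dominant
    non-DC frequency of the FFT magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(acc - acc.mean()))
    freqs = np.fft.rfftfreq(acc.size, d=1.0 / fs)
    return np.array([acc.mean(), acc.std(), freqs[spectrum.argmax()]])

rng = np.random.default_rng(0)
acc_80hz = rng.standard_normal(80 * 5)            # 5 s of one axis at 80 Hz
feats_80 = window_features(acc_80hz, fs=80)
feats_10 = window_features(acc_80hz[::8], fs=10)  # decimated to 10 Hz
```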
Anisotropy of the Hot Plastic Deformation of Ti-6Al-4V Single-Colony Samples (Preprint)
2009-04-01
Journal article preprint, April 2009 (in-house contract; program element 62102F; authors A.A. Salem and S.L. ...). ...resistance, and low density, Ti-6Al-4V is the most commonly used alpha/beta titanium alloy. It accounts for approximately 80% of the total titanium used in...
Compatible Basal Area and Number of Trees Estimators from Remeasured Horizontal Point Samples
Francis A. Roesch; Edwin J. Green; Charles T. Scott
1989-01-01
Compatible groups of estimators for total value at time 1 (V1), survivor growth (S), and ingrowth (I) for use with permanent horizontal point samples are evaluated for the special cases of estimating the change in both the number of trees and basal area. Caveats which should be observed before any one compatible grouping of estimators is chosen...
ERIC Educational Resources Information Center
Spearing, Debra; Woehlke, Paula
To assess the effect on discriminant analysis in terms of correct classification into two groups, the following parameters were systematically altered using Monte Carlo techniques: sample sizes; proportions of one group to the other; number of independent variables; and covariance matrices. The pairing of the off diagonals (or covariances) with…
ERIC Educational Resources Information Center
Green, Samuel B.; Thompson, Marilyn S.; Levy, Roy; Lo, Wen-Juo
2015-01-01
Traditional parallel analysis (T-PA) estimates the number of factors by sequentially comparing sample eigenvalues with eigenvalues for randomly generated data. Revised parallel analysis (R-PA) sequentially compares the "k"th eigenvalue for sample data to the "k"th eigenvalue for generated data sets, conditioned on "k"-…
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.
Code of Federal Regulations, 2013 CFR
2013-07-01
§ 761.308 Sample selection by random number generation on any two-dimensional square grid (40 CFR, Protection of Environment; sampling procedure under § 761.79(b)(3)). (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.
Code of Federal Regulations, 2011 CFR
2011-07-01
§ 761.308 Sample selection by random number generation on any two-dimensional square grid (40 CFR, Protection of Environment; sampling procedure under § 761.79(b)(3)). (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.
Code of Federal Regulations, 2010 CFR
2010-07-01
§ 761.308 Sample selection by random number generation on any two-dimensional square grid (40 CFR, Protection of Environment; sampling procedure under § 761.79(b)(3)). (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.
Code of Federal Regulations, 2014 CFR
2014-07-01
§ 761.308 Sample selection by random number generation on any two-dimensional square grid (40 CFR, Protection of Environment; sampling procedure under § 761.79(b)(3)). (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...
40 CFR 761.308 - Sample selection by random number generation on any two-dimensional square grid.
Code of Federal Regulations, 2012 CFR
2012-07-01
§ 761.308 Sample selection by random number generation on any two-dimensional square grid (40 CFR, Protection of Environment; sampling procedure under § 761.79(b)(3)). (a) Divide the surface area of the non-porous surface into rectangular or square areas having a...
Relationships between Perron-Frobenius eigenvalue and measurements of loops in networks
NASA Astrophysics Data System (ADS)
Chen, Lei; Kou, Yingxin; Li, Zhanwu; Xu, An; Chang, Yizhe
2018-07-01
The Perron-Frobenius eigenvalue (PFE) is widely used as a measurement of the number of loops in a network, but the exact relationship between the PFE and the number of loops has not been established: is it strictly monotonically increasing? And how does the PFE relate to other measurements of loops in networks, such as the average loop degree of nodes and the distribution of loop ranks? We investigate these questions using samples of ER random networks, NW small-world networks, and BA scale-free networks. The results confirm that, across all samples, both the number of loops in a network and the average loop degree of nodes increase with the PFE in general trend, but neither is strictly monotonically increasing, so the PFE can serve only as a rough estimative measurement of the number of loops and of the average loop degree of nodes. Furthermore, we find that the loop ranks of a majority of samples obey a Weibull distribution, whose scale parameter A and shape parameter B have approximate power-law relationships with the PFE of the samples.
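For a nonnegative adjacency matrix the PFE is simply the spectral radius, which the Perron-Frobenius theorem guarantees is attained by a real eigenvalue. A minimal numpy sketch, with a hypothetical ER-style directed network standing in for the paper's samples:

```python
import numpy as np

def perron_frobenius_eigenvalue(adj):
    """Spectral radius of a nonnegative adjacency matrix; by the
    Perron-Frobenius theorem this is attained by a real eigenvalue."""
    return max(abs(np.linalg.eigvals(adj)))

# Hypothetical ER-style directed network on 50 nodes (not the paper's samples).
rng = np.random.default_rng(1)
n, p = 50, 0.05
adj = (rng.random((n, n)) < p).astype(float)
np.fill_diagonal(adj, 0.0)                   # exclude self-loops
print(perron_frobenius_eigenvalue(adj))
```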
Malone, E; Elliott, C; Kennedy, G; Savage, D; Regan, L
2011-05-01
A simple, new method permitting the simultaneous determination and confirmation of trace residues of 24 different growth promoters and metabolites using liquid chromatography-mass spectrometry was developed and validated. The compounds were extracted from bovine tissue using acetonitrile; sodium sulphate was also added at this stage to aid purification. The resulting mixture was then evaporated to approximately 1 ml, centrifuged at high speed, and an aliquot injected onto the LC-MS/MS system. The calculated CCα values ranged between 0.11 and 0.46 µg kg⁻¹; the calculated CCβ values were in the range 0.19-0.79 µg kg⁻¹. Accuracy, measurement uncertainty, repeatability and linearity were also determined for each analyte. The analytical method was applied to a number of bovine tissue samples imported into Ireland from third countries. Progesterone was found in a number of samples at concentrations ranging between 0.28 and 30.30 µg kg⁻¹. Alpha- and beta-testosterone were also found in a number of samples at concentrations ranging between 0.22 and 8.63 µg kg⁻¹ and between 0.16 and 2.08 µg kg⁻¹, respectively.
Clostridium perfringens in the Environment
Matches, Jack R.; Liston, John; Curran, Donald
1974-01-01
Clostridium perfringens was isolated from samples collected in Puget Sound in the state of Washington and areas considered as possible sources of these organisms to Puget Sound. The distribution of C. perfringens in the total Clostridium population was determined for fish gut contents and sediments collected in highly polluted and less polluted areas, sewage samples, freshwater sediments, and soils. The greatest numbers of C. perfringens were obtained from marine sediments collected near the sewage outfall at West Point. Fewer isolates were made from fish collected from less polluted stations, although the number of C. perfringens remained high in sediments from other Puget Sound stations. The proportion of C. perfringens in the total Clostridium populations varied between 56 and 71% for sewage samples and only 0.4 to 4.1% for freshwater sediments and soil samples. Only 25 C. perfringens isolates out of 137 from fish guts, or 18%, were identifiable serologically and these fell into 12 groups. C. perfringens were fed to fish and the fish were sacrificed after varying lengths of time. The number of C. perfringens increased slightly in the gut during the first 24 h and then the numbers decreased rapidly for the next 120 h. PMID:4371684
1994-03-01
...labels of α, which are called significance levels. The hypothesis tests are done based on the α levels. The maximum probabilities of making a type II error...critical values at specific α levels. This procedure is done for each of the 50,000 samples. The number of samples passing each test at those specific α levels is counted. The ratio of the number of accepted samples to 50,000 gives the percentage point. Then, subtracting that value from one would...
Cluster Stability Estimation Based on a Minimal Spanning Trees Approach
NASA Astrophysics Data System (ADS)
Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora
2009-08-01
Among the areas of data and text mining employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples; in effect, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotic normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edge quantity is set, and the partition quality is represented by the worst cluster, corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments presented in the paper demonstrate the ability of the approach to detect the true number of clusters.
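The edge count used here as a validity measure can be computed directly: build the minimal spanning tree of the pooled points and count edges whose endpoints come from different samples (the Friedman-Rafsky statistic). A minimal scipy sketch, assuming Euclidean distances and two equal-size synthetic samples:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def cross_sample_mst_edges(x_a, x_b):
    """Number of minimal-spanning-tree edges joining the two samples
    (the Friedman-Rafsky two-sample statistic)."""
    pooled = np.vstack([x_a, x_b])
    labels = np.r_[np.zeros(len(x_a)), np.ones(len(x_b))]
    mst = minimum_spanning_tree(squareform(pdist(pooled))).tocoo()
    return int(np.sum(labels[mst.row] != labels[mst.col]))

rng = np.random.default_rng(2)
a, b = rng.normal(0, 1, (30, 2)), rng.normal(0, 1, (30, 2))
print(cross_sample_mst_edges(a, b))  # large counts suggest well-mingled samples
```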
Sample selection via angular distance in the space of the arguments of an artificial neural network
NASA Astrophysics Data System (ADS)
Fernández Jaramillo, J. M.; Mayerle, R.
2018-05-01
In the construction of an artificial neural network (ANN), proper splitting of the available samples plays a major role in the training process. The selection of subsets for training, testing and validation affects the generalization ability of the neural network, and the number of samples affects the time required to design and train the ANN. This paper introduces an efficient and simple method for reducing the set of samples used to train a neural network. The method reduces the time required to calculate the network coefficients while keeping diversity and avoiding overtraining of the ANN due to the presence of similar samples. The proposed method is based on the calculation of the angle between two vectors, each representing one input of the neural network. When the angle formed between samples is smaller than a defined threshold, only one input is accepted for training. The accepted inputs are scattered throughout the sample space. Tidal records are used to demonstrate the proposed method. The results of a cross-validation show that with few inputs the outputs are inaccurate and depend on the selection of the first sample, but as the number of inputs increases the accuracy improves and the differences among scenarios with different starting samples are greatly reduced. A comparison with the K-means clustering algorithm shows that, for this application, the proposed method produces a more accurate network with a smaller number of samples.
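A minimal sketch of the selection rule as described: accept a sample only if its angle to every already-accepted sample exceeds a threshold. The greedy order, the 5° threshold and the synthetic inputs are assumptions for illustration.

```python
import numpy as np

def select_by_angle(samples, threshold_deg):
    """Keep a sample only if its angle to every previously kept sample
    exceeds the threshold; for angles in [0, 180 deg] this is equivalent
    to all pairwise cosines being below cos(threshold)."""
    kept = [samples[0]]
    cos_thr = np.cos(np.radians(threshold_deg))
    for x in samples[1:]:
        cosines = [x @ k / (np.linalg.norm(x) * np.linalg.norm(k)) for k in kept]
        if max(cosines) < cos_thr:
            kept.append(x)
    return np.array(kept)

rng = np.random.default_rng(3)
inputs = rng.random((500, 24))         # e.g. 24-sample windows of tidal levels
reduced = select_by_angle(inputs, threshold_deg=5.0)
```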
7 CFR 42.104 - Sampling plans and defects.
Code of Federal Regulations, 2010 CFR
2010-01-01
§ 42.104 Sampling plans and defects (7 CFR, Agriculture; Regulations and Standards for Condition of Food Containers, Procedures for Stationary Lot Sampling and Inspection). (a) Sampling plans. Sections 42.109 through 42.111 show the number...
7 CFR 42.104 - Sampling plans and defects.
Code of Federal Regulations, 2011 CFR
2011-01-01
§ 42.104 Sampling plans and defects (7 CFR, Agriculture; Regulations and Standards for Condition of Food Containers, Procedures for Stationary Lot Sampling and Inspection). (a) Sampling plans. Sections 42.109 through 42.111 show the number...
An Investigation of the Sampling Distribution of the Congruence Coefficient.
ERIC Educational Resources Information Center
Broadbooks, Wendy J.; Elmore, Patricia B.
This study developed and investigated an empirical sampling distribution of the congruence coefficient. The effects of sample size, number of variables, and population value of the congruence coefficient on the sampling distribution of the congruence coefficient were examined. Sample data were generated on the basis of the common factor model and…
NASA Astrophysics Data System (ADS)
Qiang, Wei
2011-12-01
We describe a sampling scheme for two-dimensional (2D) solid-state NMR experiments that can be readily applied to sensitivity-limited samples. The scheme uses a continuous, non-uniform sampling profile for the indirect dimension, i.e., the acquisition number decreases as a function of the evolution time (t1) in the indirect dimension. For a beta amyloid (Aβ) fibril sample, we observed an overall 40-50% signal enhancement as measured by cross-peak volume, while the cross-peak linewidths remained comparable to those obtained by regular sampling and processing strategies. Linear and Gaussian decay functions for the acquisition numbers result in similar percentage increases in signal. In addition, we demonstrate that this sampling approach can be applied with different dipolar recoupling approaches such as radiofrequency assisted diffusion (RAD) and finite-pulse radio-frequency-driven recoupling (fpRFDR). This sampling scheme is especially suitable for sensitivity-limited samples that require long signal averaging for each t1 point, for instance biological membrane proteins where only a small fraction of the sample is isotopically labeled.
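A sketch of such a continuous non-uniform acquisition schedule with a Gaussian decay; the decay width, the minimum scan floor and the normalization shown are illustrative choices, not the values used in the study.

```python
import numpy as np

def gaussian_scan_schedule(n_t1, max_scans, min_scans=4, sigma_frac=0.6):
    """Number of acquisitions per t1 increment, decaying as a Gaussian."""
    t1 = np.arange(n_t1)
    scans = max_scans * np.exp(-((t1 / (sigma_frac * n_t1)) ** 2))
    return np.maximum(min_scans, np.round(scans)).astype(int)

schedule = gaussian_scan_schedule(n_t1=64, max_scans=128)
# One simple way to keep amplitudes comparable across the indirect dimension
# is to divide each t1 trace by its scan count before Fourier transformation.
```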
Rodic, Maja; Zhou, Xinlin; Tikhomirova, Tatiana; Wei, Wei; Malykh, Sergei; Ismatulina, Victoria; Sabirova, Elena; Davidova, Yulia; Tosto, Maria Grazia; Lemelin, Jean-Pascal; Kovas, Yulia
2015-01-01
The present study evaluated 626 5-7-year-old children in the UK, China, Russia, and Kyrgyzstan on a cognitive test battery measuring: (1) general skills; (2) non-symbolic number sense; (3) symbolic number understanding; (4) simple arithmetic - operating with numbers; and (5) familiarity with numbers. Although most inter-population differences were small, 13% of the variance in arithmetic skills could be explained by the sample, replicating the pattern previously found with older children in PISA. Furthermore, the same cognitive skills were related to early arithmetic in these diverse populations. Only understanding of symbolic number explained variation in mathematical performance in all samples. We discuss the results in terms of potential influences of socio-demographic, linguistic and genetic factors on individual differences in mathematics. © 2014 John Wiley & Sons Ltd.
Herrick, Robert F; Stewart, James H; Allen, Joseph G
2016-02-01
PCBs in building materials such as caulks and sealants are a largely unrecognized source of contamination in the building environment. Schools are of particular interest, as the period of extensive school construction (about 1950 to 1980) coincides with the time of greatest use of PCBs as plasticizers in building materials. In the USA, we estimate that the number of schools with PCB in building caulk ranges from 12,960 to 25,920 based upon the number of schools built in the time of PCB use and the proportion of buildings found to contain PCB caulk and sealants. Field and laboratory studies have demonstrated that PCBs from both interior and exterior caulking can be the source of elevated PCB air concentrations in these buildings, at levels that exceed health-based PCB exposure guidelines for building occupants. Air sampling in buildings containing PCB caulk has shown that the airborne PCB concentrations can be highly variable, even in repeat samples collected within a room. Sampling and data analysis strategies that recognize this variability can provide the basis for informed decision making about compliance with health-based exposure limits, even in cases where small numbers of samples are taken. The health risks posed by PCB exposures, particularly among children, mandate precautionary approaches to managing PCBs in building materials.
Scholtens, Ingrid; Laurensse, Emile; Molenaar, Bonnie; Zaaijer, Stephanie; Gaballo, Heidi; Boleij, Peter; Bak, Arno; Kok, Esther
2013-09-25
Nowadays most animal feed products imported into Europe have a GMO (genetically modified organism) label. This means that they contain European Union (EU)-authorized GMOs. For enforcement of these labeling requirements, it is necessary, with the rising number of EU-authorized GMOs, to perform an increasing number of analyses. In addition to this, it is necessary to test products for the potential presence of EU-unauthorized GMOs. Analysis for EU-authorized and -unauthorized GMOs in animal feed has thus become laborious and expensive. Initial screening steps may reduce the number of GMO identification methods that need to be applied, but with the increasing diversity also screening with GMO elements has become more complex. For the present study, the application of an informative detailed 24-element screening and subsequent identification strategy was applied in 50 animal feed samples. Almost all feed samples were labeled as containing GMO-derived materials. The main goal of the study was therefore to investigate if a detailed screening strategy would reduce the number of subsequent identification analyses. An additional goal was to test the samples in this way for the potential presence of EU-unauthorized GMOs. Finally, to test the robustness of the approach, eight of the samples were tested in a concise interlaboratory study. No significant differences were found between the results of the two laboratories.
Li, Chun-Hong; Zuo, Hua-Li; Zhang, Qian; Wang, Feng-Qin; Hu, Yuan-Jia; Qian, Zheng-Ming; Li, Wen-Jia; Xia, Zhi-Ning; Yang, Feng-Qing
2017-01-01
Background: As one of the bioactive components in Cordyceps sinensis (CS), proteins have rarely been used as index components to study the correlation between protein components and the producing areas of natural CS. Objective: Protein components of 26 natural CS samples produced in Qinghai, Tibet, and Sichuan provinces were analyzed and compared to investigate the relationship among 26 different producing areas. Materials and Methods: Proteins from the 26 producing areas were extracted with Tris-HCl buffer containing Triton X-100 and separated using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and two-dimensional electrophoresis (2-DE). Results: The SDS-PAGE results indicated that the number of protein bands and the optical density curves of proteins in the 26 CS samples differed slightly. However, the 2-DE results showed that the numbers and abundance of protein spots in the protein profiles of the 26 samples were clearly different and showed a certain association with producing areas. Conclusions: Based on the expression values of matched protein spots, the 26 batches of CS samples can be divided into two main categories (Tibet and Qinghai) by hierarchical cluster analysis. SUMMARY: The number of protein bands and the optical density curves of proteins in the 26 Cordyceps sinensis samples differed slightly on the SDS-PAGE protein profiles. The numbers and abundance of protein spots in the protein profiles of the 26 samples were clearly different on the two-dimensional electrophoresis maps. The 26 producing areas of natural Cordyceps sinensis samples were divided into two main categories (Tibet and Qinghai) by hierarchical cluster analysis based on the values of matched protein spots. Abbreviations Used: SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis, 2-DE: Two-dimensional electrophoresis, CS: Cordyceps sinensis, TCMs: Traditional Chinese medicines PMID:28250651
Li, Chun-Hong; Zuo, Hua-Li; Zhang, Qian; Wang, Feng-Qin; Hu, Yuan-Jia; Qian, Zheng-Ming; Li, Wen-Jia; Xia, Zhi-Ning; Yang, Feng-Qing
2017-01-01
As one of the bioactive components in Cordyceps sinensis (CS), proteins have rarely been used as index components to study the correlation between protein components and the producing areas of natural CS. Protein components of 26 natural CS samples produced in Qinghai, Tibet, and Sichuan provinces were analyzed and compared to investigate the relationship among 26 different producing areas. Proteins from the 26 producing areas were extracted with Tris-HCl buffer containing Triton X-100 and separated using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and two-dimensional electrophoresis (2-DE). The SDS-PAGE results indicated that the number of protein bands and the optical density curves of proteins in the 26 CS samples differed slightly. However, the 2-DE results showed that the numbers and abundance of protein spots in the protein profiles of the 26 samples were clearly different and showed a certain association with producing areas. Based on the expression values of matched protein spots, the 26 batches of CS samples can be divided into two main categories (Tibet and Qinghai) by hierarchical cluster analysis. Abbreviations used: SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis, 2-DE: Two-dimensional electrophoresis, CS: Cordyceps sinensis, TCMs: Traditional Chinese medicines.
NASA Astrophysics Data System (ADS)
Aptikaeva, O. I.; Gamburtsev, A. G.; Martyushov, A. N.
2012-12-01
We have investigated the numbers of emergency hospitalizations in mental and drug-treatment hospitals in Kazan in 1996-2006 and in Moscow in 1984-1996. Samples were analyzed by disease type, sex, age, and place of residence (city or village). The study aims to discover differences and common traits in the structures of the hospitalization series in these samples and their possible relationships with changing parameters of the environment. We found similar structures in series of samples of the same type in both Moscow and Kazan. In some cases, the cyclic structures of the series of numbers of hospitalizations change simultaneously with series of changes in solar activity and in the rotation rate of the Earth.
The use of mini-samples in palaeomagnetism
NASA Astrophysics Data System (ADS)
Böhnel, Harald; Michalk, Daniel; Nowaczyk, Norbert; Naranjo, Gildardo Gonzalez
2009-10-01
Rock cores of ~25 mm diameter are widely used in palaeomagnetism. Occasionally smaller diameters have been used as well which represents distinct advantages in terms of throughput, weight of equipment and core collections. How their orientation precision compares to 25 mm cores, however, has not been evaluated in detail before. Here we compare the site mean directions and their statistical parameters for 12 lava flows sampled with 25 mm cores (standard samples, typically 8 cores per site) and with 12 mm drill cores (mini-samples, typically 14 cores per site). The site-mean directions for both sample sizes appear to be indistinguishable in most cases. For the mini-samples, site dispersion parameters k on average are slightly lower than for the standard samples reflecting their larger orienting and measurement errors. Applying the Wilcoxon signed-rank test the probability that k or α95 have the same distribution for both sizes is acceptable only at the 17.4 or 66.3 per cent level, respectively. The larger mini-core numbers per site appears to outweigh the lower k values yielding also slightly smaller confidence limits α95. Further, both k and α95 are less variable for mini-samples than for standard size samples. This is interpreted also to result from the larger number of mini-samples per site, which better averages out the detrimental effect of undetected abnormal remanence directions. Sampling of volcanic rocks with mini-samples therefore does not present a disadvantage in terms of the overall obtainable uncertainty of site mean directions. Apart from this, mini-samples do present clear advantages during the field work, as about twice the number of drill cores can be recovered compared to 25 mm cores, and the sampled rock unit is then more widely covered, which reduces the contribution of natural random errors produced, for example, by fractures, cooling joints, and palaeofield inhomogeneities. Mini-samples may be processed faster in the laboratory, which is of particular advantage when carrying out palaeointensity experiments.
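The statistics compared here are the standard Fisher (1953) quantities, which make the trade-off explicit: by the common approximation α95 ≈ 140°/√(kN), a larger number N of mini-cores per site can offset a somewhat lower precision parameter k. A minimal numpy sketch, assuming unit direction vectors and hypothetical scatter:

```python
import numpy as np

def fisher_stats(dirs):
    """Fisher (1953) precision parameter k and 95% confidence angle alpha95
    for a set of unit direction vectors (one per row)."""
    n = len(dirs)
    r = np.linalg.norm(dirs.sum(axis=0))     # length of the resultant vector
    k = (n - 1) / (n - r)
    alpha95 = np.degrees(np.arccos(1 - (n - r) / r * (20 ** (1 / (n - 1)) - 1)))
    return k, alpha95

rng = np.random.default_rng(4)
v = rng.normal([0, 0, 1], 0.05, (14, 3))     # 14 mini-cores, hypothetical scatter
v /= np.linalg.norm(v, axis=1, keepdims=True)
print(fisher_stats(v))
```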
Shaffer, Patrick; Valsson, Omar; Parrinello, Michele
2016-01-01
The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments, many problems remain out of reach for these methods, which has led to a vigorous effort in this area. One of the most important unsolved problems is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin. PMID:26787868
Evaluating mixed samples as a source of error in non-invasive genetic studies using microsatellites
Roon, David A.; Thomas, M.E.; Kendall, K.C.; Waits, L.P.
2005-01-01
The use of noninvasive genetic sampling (NGS) for surveying wild populations is increasing rapidly. Currently, only a limited number of studies have evaluated potential biases associated with NGS. This paper evaluates the potential errors associated with analysing mixed samples drawn from multiple animals. Most NGS studies assume that mixed samples will be identified and removed during the genotyping process. We evaluated this assumption by creating 128 mixed samples of extracted DNA from brown bear (Ursus arctos) hair samples. These mixed samples were genotyped and screened for errors at six microsatellite loci according to protocols consistent with those used in other NGS studies. Five mixed samples produced acceptable genotypes after the first screening. However, all mixed samples produced multiple alleles at one or more loci, amplified as only one of the source samples, or yielded inconsistent electropherograms by the final stage of the error-checking process. These processes could potentially reduce the number of individuals observed in NGS studies, but errors should be conservative within demographic estimates. Researchers should be aware of the potential for mixed samples and carefully design gel analysis criteria and error checking protocols to detect mixed samples.
Blais, Lucie; Vilain, Anne; Kettani, Fatima-Zohra; Forget, Amélie; Lalonde, Geneviève; Beauchesne, Marie-France; Ducharme, Francine M; Lemière, Catherine
2014-01-01
Objectives and hypotheses: Adherence to inhaled corticosteroids (ICS) is a major issue in asthma. This study aimed to estimate the accuracy of the days’ supply and number of refills allowed, variables recorded in Québec claims databases and used to estimate adherence, and to develop correction factors, if required. We hypothesised that the accuracy of the days’ supply for ICS would be low whereas the accuracy of the number of refills allowed would be high. Setting: 40 community pharmacies in Québec (Canada) and a medication registry. Participants: We collected data for 1108 ICS original prescriptions stored in the 40 pharmacies (sample 1), and we obtained a second sample of 2676 ICS prescriptions selected from reMed, a medication registry (sample 2). Primary and secondary outcomes: We estimated the concordance of the days’ supply and number of refills between Québec claims databases and the original prescription from sample 1. We developed a correction factor for the days’ supply in sample 1 and validated it in sample 2. Analyses were stratified by age: 0–11 and 12–64 years. Results: In sample 1, the concordance for the days’ supply was 39.6% (95% CI 37.6% to 41.6%) in those aged 0–11 years and 56% (54.9% to 57.2%) in those aged 12–64 years. The concordance increased to 59.4% (58.2% to 60.5%) in those aged 0–11 years and 74.2% (73.5% to 74.9%) in those aged 12–64 years after applying the correction factors in sample 2. The concordance for the refills allowed was 92.1% (91% to 93.1%) in those aged 0–11 years and 93.1% (92.5% to 93.7%) in those aged 12–64 years in sample 1. Conclusions: The accuracy of the days’ supply was moderate among those aged 0–11 years and substantial among those aged 12–64 years after applying the correction factors. The accuracy of the number of refills was almost perfect in both groups. PMID:25432902
Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F
2015-01-01
Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.
Morvan, B; Bonnemoy, F; Fonty, G; Gouet, P
1996-03-01
Total numbers of bacteria, cellulolytic bacteria, and H2-utilizing microbial populations (methanogenic archaea, acetogenic and sulfate-reducing bacteria) were enumerated in fresh rumen samples from sheep, cattle, buffaloes, deer, and llamas, and in caecal samples from horses. Methanogens and sulfate reducers were found in all samples, whereas acetogens were not detected in some samples from each animal. Methanogenic archaea were the largest H2-utilizing populations in all animals, and a correlation was observed between the numbers of methanogens and those of cellulolytic microorganisms. Higher counts of acetogens were found in horses and llamas (1 × 10⁴ and 4 × 10⁴ cells ml⁻¹, respectively).
A Monte Carlo Program for Simulating Selection Decisions from Personnel Tests
ERIC Educational Resources Information Center
Petersen, Calvin R.; Thain, John W.
1976-01-01
Relative to test and criterion parameters and cutting scores, the correlation coefficient, sample size, and number of samples to be drawn (all inputs), this program calculates decision classification rates across samples and for combined samples. Several other related indices are also computed. (Author)
Ademi, Abdulakim; Grozdanov, Anita; Paunović, Perica; Dimitrov, Aleksandar T
2015-01-01
Summary: A model consisting of an equation that includes graphene thickness distribution is used to calculate theoretical 002 X-ray diffraction (XRD) peak intensities. An analysis was performed upon graphene samples produced by two different electrochemical procedures: electrolysis in aqueous electrolyte and electrolysis in molten salts, both using a nonstationary current regime. Herein, the model is enhanced by a partitioning of the corresponding 2θ interval, resulting in significantly improved accuracy of the results. The model curves obtained exhibit excellent fitting to the XRD intensity curves of the studied graphene samples. The employed equation parameters make it possible to calculate the j-layer graphene region coverage of the graphene samples, and hence the number of graphene layers. The results of the thorough analysis are in agreement with the number of graphene layers calculated from Raman spectra C-peak position values and indicate that the graphene samples studied are few-layered. PMID:26665083
Phenotypic constraints promote latent versatility and carbon efficiency in metabolic networks.
Bardoscia, Marco; Marsili, Matteo; Samal, Areejit
2015-07-01
System-level properties of metabolic networks may be the direct product of natural selection or arise as a by-product of selection on other properties. Here we study the effect of direct selective pressure for growth or viability in particular environments on two properties of metabolic networks: latent versatility to function in additional environments and carbon usage efficiency. Using a Markov chain Monte Carlo (MCMC) sampling based on flux balance analysis (FBA), we sample from a known biochemical universe random viable metabolic networks that differ in the number of directly constrained environments. We find that the latent versatility of sampled metabolic networks increases with the number of directly constrained environments and with the size of the networks. We then show that the average carbon wastage of sampled metabolic networks across the constrained environments decreases with the number of directly constrained environments and with the size of the networks. Our work expands the growing body of evidence about nonadaptive origins of key functional properties of biological networks.
Dobbs, N A; Twelves, C J; Ramirez, A J; Towlson, K E; Gregory, W M; Richards, M A
1993-01-01
We have studied the practical implications and acceptability to patients of pharmacokinetic studies in 34 women receiving anthracyclines for advanced breast cancer. The following parameters were recorded: age, ECOG performance status, psychological state (Rotterdam Symptom Checklist), cytotoxic drug and dose, number of venepunctures for treatment and sampling, and the time when the sampling cannula was removed. Immediately after finishing pharmacokinetic sampling, patients completed a questionnaire which revealed that (i) all patients understood sampling was for research, (ii) 35% of patients experienced problems with sampling, and (iii) benefits from participation were perceived by 56% of patients. Of 20 patients later questioned after completion of their treatment course, 40% recalled difficulties with blood sampling. We were unable to identify factors that predict in advance which patients will tolerate pharmacokinetic studies poorly, but the number of venepunctures should be minimised. Patients may also perceive benefits from 'non-therapeutic' research.
Effect of Thermodiffusion Nitriding on Cytocompatibility of Ti-6Al-4V Titanium Alloy
NASA Astrophysics Data System (ADS)
Pohrelyuk, I. M.; Tkachuk, O. V.; Proskurnyak, R. V.; Boiko, N. M.; Kluchivska, O. Yu.; Stoika, R. S.
2016-04-01
A nitrided layer was formed on the surface of Ti-6Al-4V titanium alloy by thermodiffusion saturation in nitrogen at atmospheric pressure. A study of the viability of pseudonormal human embryonic kidney cells of the HEK293T line showed that their cultivation in the presence of the untreated alloy sample is accompanied by a statistically significant reduction in the number of living cells compared with the control sample (untreated cells), whereas their cultivation in the presence of the nitrided alloy sample does not change the cell number considerably. In addition, it was shown that cell behavior in the presence of the nitrided sample differs only slightly from the control, whereas the growth of cells in the presence of the untreated alloy differed significantly from that in the control sample, showing small groups of cells instead of large clusters.
Optimal design in pediatric pharmacokinetic and pharmacodynamic clinical studies.
Roberts, Jessica K; Stockmann, Chris; Balch, Alfred; Yu, Tian; Ward, Robert M; Spigarelli, Michael G; Sherwin, Catherine M T
2015-03-01
It is not trivial to conduct clinical trials with pediatric participants. Ethical, logistical, and financial considerations add to the complexity of pediatric studies. Optimal design theory allows investigators the opportunity to apply mathematical optimization algorithms to define how to structure their data collection to answer focused research questions. These techniques can be used to determine an optimal sample size, optimal sample times, and the number of samples required for pharmacokinetic and pharmacodynamic studies. The aim of this review is to demonstrate how to determine the optimal sample size, optimal sample times, and number of samples required from each patient by presenting specific examples using optimal design tools. Additionally, this review discusses the relative usefulness of sparse vs rich data. It is intended to educate clinicians, as well as basic research scientists, who plan to conduct a pharmacokinetic/pharmacodynamic clinical trial in pediatric patients. © 2015 John Wiley & Sons Ltd.
Lee, K V; Moon, R D; Burkness, E C; Hutchison, W D; Spivak, M
2010-08-01
The parasitic mite Varroa destructor Anderson & Trueman (Acari: Varroidae) is arguably the most detrimental pest of the European-derived honey bee, Apis mellifera L. Unfortunately, beekeepers lack a standardized sampling plan to make informed treatment decisions. Based on data from 31 commercial apiaries, we developed sampling plans for use by beekeepers and researchers to estimate the density of mites in individual colonies or whole apiaries. Beekeepers can estimate a colony's mite density with a chosen level of precision by dislodging mites from approximately 300 adult bees taken from one brood box frame in the colony, and they can extrapolate to mite density on a colony's adults and pupae combined by doubling the number of mites on adults. For sampling whole apiaries, beekeepers can repeat the process in each of n = 8 colonies, regardless of apiary size. Researchers desiring greater precision can estimate mite density in an individual colony by examining three 300-bee sample units. Extrapolation to density on adults and pupae may require independent estimates of numbers of adults, of pupae, and of their respective mite densities. Researchers can estimate apiary-level mite density by taking one 300-bee sample unit per colony, but should do so from a variable number of colonies, depending on apiary size. These practical sampling plans will allow beekeepers and researchers to quantify mite infestation levels and enhance understanding and management of V. destructor.
NASA Astrophysics Data System (ADS)
Janniche, G. S.; Mouvet, C.; Albrechtsen, H.-J.
2011-04-01
Vertical variation in the sorption and mineralization potential of mecoprop (MCPP), isoproturon and acetochlor was investigated at low concentrations (μg-range) at the cm-scale in unsaturated sub-surface limestone samples and saturated sandy aquifer samples from an agricultural catchment in Brévilles, France. From two intact core drills, four heterogeneous limestone sections were collected from 4.50 to 26.40 m below surface (mbs) and divided into 12 sub-samples of 8-25 cm length, and one sandy aquifer section from 19.20 to 19.53 m depth was divided into 7 sub-samples of 4-5 cm length. In the sandy aquifer section, acetochlor and isoproturon sorption increased substantially with depth, on average 78% (acetochlor) and 61% (isoproturon) per 5 cm. The number of acetochlor and isoproturon degraders (most-probable-number) was also higher in the bottom half of the aquifer section (93 to >16 000/g) than in the upper half (4-71/g). One 50 cm long limestone section with a distinct shift in color showed a clear shift in mineralization, number of degraders and sorption: in the two brown, uppermost samples, up to 31% mecoprop and up to 9% isoproturon was mineralized during 231 days, the numbers of mecoprop and isoproturon degraders were 1300 to >16 000/g, and the sorption of both isoproturon and acetochlor was more than three times higher, compared to the two deeper, grayish samples just below, where mineralization (≤4%) and numbers of degraders (1-520/g) were low for all three herbicides. In both the unsaturated limestone and the sandy aquifer, variations and even distinct shifts in mineralization, number of specific degraders and sorption were seen within just 4-15 cm of vertical distance. A simple conceptual model of herbicides leaching to groundwater through 10 m of unsaturated limestone was established, and calculations showed that a 30 cm active layer with the measured sorption and mineralization values hardly impacted the fate of the investigated herbicides, whereas a total active-layer thickness of 1 m would substantially increase natural attenuation.
Variability in Population Density of House Dust Mites of Bitlis and Muş, Turkey.
Aykut, M; Erman, O K; Doğan, S
2016-05-01
This study was conducted to investigate the relationship between the number of house dust mites/g dust and different physical and environmental variables. A total of 1,040 house dust samples were collected from houses in Bitlis and Muş Provinces, Turkey, between May 2010 and February 2012. Overall, 751 (72.2%) of dust samples were mite positive. The number of mites/g dust varied between 20 and 1,840 in mite-positive houses. A significant correlation was detected between mean number of mites and altitude of houses, frequency of monthly vacuum cleaning, number of individuals in the household, and relative humidity. No association was found between the number of mites and temperature, type of heating, existence of allergic diseases, age and structure of houses. A maximum number of mites were detected in summer and a minimum number was detected in autumn. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
HYPERSAMP - HYPERGEOMETRIC ATTRIBUTE SAMPLING SYSTEM BASED ON RISK AND FRACTION DEFECTIVE
NASA Technical Reports Server (NTRS)
De, Salvo L. J.
1994-01-01
HYPERSAMP is a demonstration of an attribute sampling system developed to determine the minimum sample size required for any preselected value for consumer's risk and fraction of nonconforming. This statistical method can be used in place of MIL-STD-105E sampling plans when a minimum sample size is desirable, such as when tests are destructive or expensive. HYPERSAMP utilizes the Hypergeometric Distribution and can be used for any fraction nonconforming. The program employs an iterative technique that circumvents the obstacle presented by the factorial of a non-whole number. HYPERSAMP provides the required Hypergeometric sample size for any equivalent real number of nonconformances in the lot or batch under evaluation. Many currently used sampling systems, such as the MIL-STD-105E, utilize the Binomial or the Poisson equations as an estimate of the Hypergeometric when performing inspection by attributes. However, this is primarily because of the difficulty in calculation of the factorials required by the Hypergeometric. Sampling plans based on the Binomial or Poisson equations will result in the maximum sample size possible with the Hypergeometric. The difference in the sample sizes between the Poisson or Binomial and the Hypergeometric can be significant. For example, a lot size of 400 devices with an error rate of 1.0% and a confidence of 99% would require a sample size of 400 (all units would need to be inspected) for the Binomial sampling plan and only 273 for a Hypergeometric sampling plan. The Hypergeometric results in a savings of 127 units, a significant reduction in the required sample size. HYPERSAMP is a demonstration program and is limited to sampling plans with zero defectives in the sample (acceptance number of zero). Since it is only a demonstration program, the sample size determination is limited to sample sizes of 1500 or less. The Hypergeometric Attribute Sampling System demonstration code is a spreadsheet program written for IBM PC compatible computers running DOS and Lotus 1-2-3 or Quattro Pro. This program is distributed on a 5.25 inch 360K MS-DOS format diskette, and the program price includes documentation. This statistical method was developed in 1992.
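The zero-acceptance calculation the program demonstrates is straightforward to reproduce with the hypergeometric distribution: find the smallest sample size whose probability of containing zero nonconforming units is at most the consumer's risk. A sketch using scipy (not the original Lotus 1-2-3 spreadsheet), which reproduces the abstract's 400-unit example:

```python
from scipy.stats import hypergeom

def min_sample_size(lot_size, fraction_defective, consumer_risk):
    """Smallest n for which a sample with zero observed defectives still
    limits the consumer's risk (acceptance number zero)."""
    defectives = round(lot_size * fraction_defective)
    for n in range(1, lot_size + 1):
        if hypergeom.pmf(0, lot_size, defectives, n) <= consumer_risk:
            return n
    return lot_size

# Reproduces the abstract's example: lot of 400, 1% nonconforming, 99% confidence.
print(min_sample_size(400, 0.01, 0.01))   # -> 273
```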
Gavett, Brandon E
2015-03-01
The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed.
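The underlying computation is a direct application of Bayes' theorem. A minimal sketch, using the abstract's two base rates and an assumed 10% prevalence of impairment (the prevalence is a hypothetical value, not a WMS-IV figure):

```python
def prob_normal(rate_normal, rate_impaired, prevalence_impaired):
    """Bayes' theorem: probability an examinee is cognitively normal given a
    finding (e.g. k or more abnormal scores) with the stated base rates in
    normal and impaired samples."""
    p_normal = 1 - prevalence_impaired
    joint_normal = rate_normal * p_normal
    return joint_normal / (joint_normal + rate_impaired * prevalence_impaired)

# Abstract's base rates for "at least one subtest scaled score below 7",
# with an assumed 10% prevalence of impairment:
print(prob_normal(0.654, 0.856, 0.10))    # ~0.87
```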
Smith, Blair H; Hannaford, Philip C; Elliott, Alison M; Smith, W Cairns; Chambers, W Alastair
2005-04-01
Sampling for primary care research must strike a balance between efficiency and external validity. For most conditions, even a large population sample will yield a small number of cases, yet other sampling techniques risk problems with extrapolation of findings. To compare the efficiency and external validity of two sampling methods for both an intervention study and epidemiological research in primary care--a convenience sample and a general population sample--comparing the response and follow-up rates, the demographic and clinical characteristics of each sample, and calculating the 'number needed to sample' (NNS) for a hypothetical randomized controlled trial. In 1996, we selected two random samples of adults from 29 general practices in Grampian, for an epidemiological study of chronic pain. One sample of 4175 was identified by an electronic questionnaire that listed patients receiving regular analgesic prescriptions--the 'repeat prescription sample'. The other sample of 5036 was identified from all patients on practice lists--the 'general population sample'. Questionnaires, including demographic, pain and general health measures, were sent to all. A similar follow-up questionnaire was sent in 2000 to all those agreeing to participate in further research. We identified a potential group of subjects for a hypothetical trial in primary care based on a recently published trial (those aged 25-64, with severe chronic back pain, willing to participate in further research). The repeat prescription sample produced better response rates than the general sample overall (86% compared with 82%, P < 0.001), from both genders and from the oldest and youngest age groups. The NNS using convenience sampling was 10 for each member of the final potential trial sample, compared with 55 using general population sampling. There were important differences between the samples in age, marital and employment status, social class and educational level. However, among the potential trial sample, there were no demographic differences. Those from the repeat prescription sample had poorer indices than the general population sample in all pain and health measures. The repeat prescription sampling method was approximately five times more efficient than the general population method. However demographic and clinical differences in the repeat prescription sample might hamper extrapolation of findings to the general population, particularly in an epidemiological study, and demonstrate that simple comparison with age and gender of the target population is insufficient.
A fast learning method for large scale and multi-class samples of SVM
NASA Astrophysics Data System (ADS)
Fan, Yu; Guo, Huiming
2017-06-01
A fast learning method for multi-class SVM (Support Vector Machine) classification, based on a binary tree, is presented to address the low training efficiency of SVMs on large-scale, multi-class sample sets. A bottom-up method builds the binary-tree hierarchy, and a sub-classifier is trained on the samples belonging to each node of the resulting hierarchy. During learning, a first clustering of the training samples generates several class clusters. Central points are extracted from those clusters that contain only one type of sample. For clusters containing two types of samples, the cluster numbers for their positive and negative samples are set according to their degree of mixture, a secondary clustering is performed, and central points are then extracted from the resulting sub-class clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by integrating the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, maintains high classification accuracy while greatly reducing the number of samples and effectively improving learning efficiency.
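The core idea, replacing each class (or mixed cluster) by a small set of cluster centers before SVM training, can be sketched as follows. This is not the authors' exact binary-tree algorithm; the dataset and cluster counts are illustrative, and scikit-learn is assumed to be available.

    # Sketch of cluster-based sample reduction before SVM training (the core
    # idea only, not the authors' binary-tree hierarchy).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=5000, n_classes=3, n_informative=6)

    centers, labels = [], []
    for cls in np.unique(y):
        # Replace each class by a small set of cluster centers.
        km = KMeans(n_clusters=20, n_init=5).fit(X[y == cls])
        centers.append(km.cluster_centers_)
        labels.append(np.full(20, cls))

    X_red = np.vstack(centers)       # 60 training points instead of 5000
    y_red = np.concatenate(labels)
    clf = SVC(kernel="rbf").fit(X_red, y_red)
    print(clf.score(X, y))           # accuracy of the reduced-sample SVM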
Soil moisture optimal sampling strategy for Sentinel 1 validation super-sites in Poland
NASA Astrophysics Data System (ADS)
Usowicz, Boguslaw; Lukowski, Mateusz; Marczewski, Wojciech; Lipiec, Jerzy; Usowicz, Jerzy; Rojek, Edyta; Slominska, Ewa; Slominski, Jan
2014-05-01
Soil moisture (SM) exhibits high temporal and spatial variability that depends not only on the rainfall distribution, but also on the topography of the area, the physical properties of the soil and the vegetation characteristics. This large variability prevents reliable estimation of SM in the surface layer from ground point measurements, especially at large spatial scales. Remote sensing measurements can estimate the spatial distribution of SM in the Earth's surface layer better than point measurements; however, they require validation. This study attempts to characterize the SM distribution by determining its spatial variability in relation to the number and location of ground point measurements. The strategy takes into account gravimetric and TDR measurements with different sampling steps, abundance and distribution of measuring points at the scales of an arable field, a wetland and a commune (areas of 0.01, 1 and 140 km2, respectively), under different SM conditions. Mean SM values were only weakly sensitive to changes in the number and arrangement of sampling points, whereas parameters describing the dispersion responded more strongly. Spatial analysis showed autocorrelation of SM, with correlation lengths that depended on the number and distribution of points within the adopted grids. Directional analysis revealed differentiated anisotropy of SM for different grids and numbers of measuring points. It can therefore be concluded that both the number of samples and their layout over the experimental area are reflected in the parameters characterizing the SM distribution. This suggests the need to use at least two sampling variants, differing in the number and positioning of the measurement points, with at least 20 points in each; this follows from the standard error and the range of spatial variability, which change little as the number of samples increases above this figure. The gravimetric method gives a more varied distribution of SM than TDR measurements. It should be noted that reducing the number of samples in the measuring grid flattens the SM distributions obtained by both methods and increases the estimation error at the same time. A grid of sensors for permanent measurement points should include points that have similar SM distributions in their vicinity. The results of the analysis, including the number of points, the maximum correlation ranges and the acceptable estimation error, should be taken into account when choosing the measurement points. Adoption or adjustment of the distribution of the measurement points should be verified by additional measuring campaigns during dry and wet periods. The presented approach seems appropriate for creating regional-scale test (super) sites to validate products of satellites equipped with SAR (Synthetic Aperture Radar) operating in C-band, with spatial resolution suited to the single-field scale, such as ERS-1, ERS-2, Radarsat and Sentinel-1, which is going to be launched in the next few months. The work was partially funded by the Government of Poland through an ESA Contract under the PECS ELBARA_PD project No. 4000107897/13/NL/KML.
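The spatial-autocorrelation analysis mentioned above rests on the empirical semivariogram; a minimal Python sketch with synthetic stand-in coordinates and SM values follows.

    # Empirical semivariogram sketch for point SM measurements (synthetic
    # coordinates and values stand in for the field data).
    import numpy as np

    rng = np.random.default_rng(0)
    xy = rng.uniform(0, 100, size=(50, 2))   # sampling point coordinates (m)
    sm = 0.25 + 0.05 * np.sin(xy[:, 0] / 20) + rng.normal(0, 0.01, 50)

    def semivariogram(xy, z, bins):
        d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
        g = 0.5 * (z[:, None] - z[None, :]) ** 2   # pairwise semivariances
        i, j = np.triu_indices(len(z), k=1)        # each pair once
        which = np.digitize(d[i, j], bins)
        return [g[i, j][which == b].mean() for b in range(1, len(bins))]

    # Mean semivariance per 10 m distance class; its levelling-off distance
    # approximates the correlation length discussed above.
    print(semivariogram(xy, sm, bins=np.arange(0, 60, 10)))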
Fujikawa, Hiroshi
2017-01-01
Microbial concentration in samples of a food product lot has been generally assumed to follow the log-normal distribution in food sampling, but this distribution cannot accommodate the concentration of zero. In the present study, first, a probabilistic study with the most probable number (MPN) technique was done for a target microbe present at a low (or zero) concentration in food products. Namely, based on the number of target pathogen-positive samples in the total samples of a product found by a qualitative, microbiological examination, the concentration of the pathogen in the product was estimated by means of the MPN technique. The effects of the sample size and the total sample number of a product were then examined. Second, operating characteristic (OC) curves for the concentration of a target microbe in a product lot were generated on the assumption that the concentration of a target microbe could be expressed with the Poisson distribution. OC curves for Salmonella and Cronobacter sakazakii in powdered formulae for infants and young children were successfully generated. The present study suggested that the MPN technique and the Poisson distribution would be useful for qualitative microbiological test data analysis for a target microbe whose concentration in a lot is expected to be low.
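Under the Poisson assumption above, a lot is accepted when all n qualitative samples test negative, which yields OC curves like the following sketch; the sample mass and sample number are illustrative, not the paper's exact plan.

    # Hedged sketch of a Poisson-based OC curve for qualitative testing.
    import math

    def p_accept(conc_per_g, sample_g=25.0, n_samples=10):
        p_negative = math.exp(-conc_per_g * sample_g)  # P(no cells in one sample)
        return p_negative ** n_samples                 # P(all samples negative)

    for c in (0.0001, 0.001, 0.01, 0.1):  # cells per gram
        print(f"{c:g} cells/g -> P(accept) = {p_accept(c):.3f}")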
Optimal Time-Resource Allocation for Energy-Efficient Physical Activity Detection
Thatte, Gautam; Li, Ming; Lee, Sangwon; Emken, B. Adar; Annavaram, Murali; Narayanan, Shrikanth; Spruijt-Metz, Donna; Mitra, Urbashi
2011-01-01
The optimal allocation of samples for physical activity detection in a wireless body area network for health-monitoring is considered. The number of biometric samples collected at the mobile device fusion center, from both device-internal and external Bluetooth heterogeneous sensors, is optimized to minimize the transmission power for a fixed number of samples, and to meet a performance requirement defined using the probability of misclassification between multiple hypotheses. A filter-based feature selection method determines an optimal feature set for classification, and a correlated Gaussian model is considered. Using experimental data from overweight adolescent subjects, it is found that allocating a greater proportion of samples to sensors which better discriminate between certain activity levels can result in either a lower probability of error or energy-savings ranging from 18% to 22%, in comparison to equal allocation of samples. The current activity of the subjects and the performance requirements do not significantly affect the optimal allocation, but employing personalized models results in improved energy-efficiency. As the number of samples is an integer, an exhaustive search to determine the optimal allocation is typical, but computationally expensive. To this end, an alternate, continuous-valued vector optimization is derived which yields approximately optimal allocations and can be implemented on the mobile fusion center due to its significantly lower complexity. PMID:21796237
An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions
Li, Weixuan; Lin, Guang
2015-03-21
Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes’ rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample accurately and efficiently representing the posterior with a limited number of forward simulations.
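Ingredient 1), the adaptively refitted Gaussian-mixture proposal, can be sketched as below (without the polynomial chaos surrogate); the bimodal target is illustrative, not the paper's inverse problem, and scipy/scikit-learn are assumed.

    # Minimal adaptive importance sampling loop with a GM proposal.
    import numpy as np
    from scipy.stats import norm
    from sklearn.mixture import GaussianMixture

    def target_pdf(x):  # illustrative unnormalized bimodal "posterior"
        return norm.pdf(x, -3, 0.5) + norm.pdf(x, 3, 0.5)

    rng = np.random.default_rng(1)
    samples = rng.normal(0, 5, size=2000)  # broad initial proposal
    for _ in range(3):                      # adaptation iterations
        gm = GaussianMixture(n_components=2).fit(samples.reshape(-1, 1))
        samples = gm.sample(2000)[0].ravel()
        w = target_pdf(samples) / np.exp(gm.score_samples(samples.reshape(-1, 1)))
        # Resample by importance weights to concentrate on the posterior.
        samples = rng.choice(samples, size=2000, p=w / w.sum())

    print(gm.means_.ravel())  # should sit near the two modes, -3 and +3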
MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.
Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi
2018-03-05
Currently, a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and a Poisson distribution for emitting the read counts, at each genomic position. MSeq-CNV is applied to simulated data and also to real data of six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.
Dawson-Coates, J A; Chase, J C; Funk, V; Booy, M H; Haines, L R; Falkenberg, C L; Whitaker, D J; Olafson, R W; Pearson, T W
2003-08-01
Atlantic salmon, Salmo salar L., were exposed to Kudoa thyrsites (Myxozoa, Myxosporea)-containing sea water for 15 months, and then harvested and assessed for parasite burden and fillet quality. At harvest, parasites were enumerated in muscle samples from a variety of somatic and opercular sites, and mean counts were determined for each fish. After 6 days storage at 4 degrees C, fillet quality was determined by visual assessment and by analysis of muscle firmness using a texture analyzer. Fillet quality could best be predicted by determining mean parasite numbers and spore counts in all eight tissue samples (somatic and opercular) or in four fillet samples, as the counts from opercular samples alone showed greater variability and thus decreased reliability. The variability in both plasmodia and spore numbers between tissue samples taken from an individual fish indicated that the parasites were not uniformly distributed in the somatic musculature. Therefore, to best predict the probable level of fillet degradation caused by K. thyrsites infections, multiple samples must be taken from each fish. If this is performed, a mean plasmodia count of 0.3 mm^-2 or a mean spore count of 4.0 x 10^5 g^-1 of tissue are the levels where the probability of severe myoliquefaction becomes a significant risk.
Bremsstrahlung-Based Imaging and Assays of Radioactive, Mixed and Hazardous Waste
NASA Astrophysics Data System (ADS)
Kwofie, J.; Wells, D. P.; Selim, F. A.; Harmon, F.; Duttagupta, S. P.; Jones, J. L.; White, T.; Roney, T.
2003-08-01
A new nondestructive accelerator-based x-ray fluorescence (AXRF) approach has been developed to identify heavy metals in large-volume samples. Such samples are an important part of the process and waste streams of U.S. Department of Energy sites, as well as other industries such as mining and milling. Distributions of heavy metal impurities in these process and waste samples can range from homogeneous to highly inhomogeneous, and non-destructive assays and imaging that can address both are urgently needed. Our approach is based on using high-energy, pulsed bremsstrahlung beams (3-6.5 MeV) from small electron accelerators to produce K-shell atomic fluorescence x-rays. In addition, we exploit pair-production, Compton scattering and x-ray transmission measurements from these beams to probe locations of high density and high atomic number. The excellent penetrability of these beams allows assays and images for soil-like samples at least 15 g/cm2 thick, with elemental impurities of atomic number greater than approximately 50. Fluorescence yield of a variety of targets was measured as a function of impurity atomic number, impurity homogeneity, and sample thickness. We report on actual and potential detection limits of heavy metal impurities in a soil matrix for a variety of samples, and on the potential for imaging, using AXRF and these related probes.
An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Weixuan; Lin, Guang, E-mail: guanglin@purdue.edu
2015-08-01
Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes' rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample accurately and efficiently representing the posterior with a limited number of forward simulations.
45 CFR 160.536 - Statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 45 Public Welfare 1 2010-10-01 2010-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...
42 CFR 1003.133 - Statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 42 Public Health 5 2011-10-01 2011-10-01 false Statistical sampling. 1003.133 Section 1003.133... AUTHORITIES CIVIL MONEY PENALTIES, ASSESSMENTS AND EXCLUSIONS § 1003.133 Statistical sampling. (a) In meeting... statistical sampling study as evidence of the number and amount of claims and/or requests for payment as...
45 CFR 160.536 - Statistical sampling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 45 Public Welfare 1 2011-10-01 2011-10-01 false Statistical sampling. 160.536 Section 160.536... REQUIREMENTS GENERAL ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.536 Statistical sampling. (a) In... statistical sampling study as evidence of the number of violations under § 160.406 of this part, or the...
42 CFR 1003.133 - Statistical sampling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 42 Public Health 5 2010-10-01 2010-10-01 false Statistical sampling. 1003.133 Section 1003.133... AUTHORITIES CIVIL MONEY PENALTIES, ASSESSMENTS AND EXCLUSIONS § 1003.133 Statistical sampling. (a) In meeting... statistical sampling study as evidence of the number and amount of claims and/or requests for payment as...
A number of articles have investigated the impact of sampling design on remotely sensed landcover accuracy estimates. Gong and Howarth (1990) found significant differences for Kappa accuracy values when comparing purepixel sampling, stratified random sampling, and stratified sys...
Qualitative Meta-Analysis on the Hospital Task: Implications for Research
ERIC Educational Resources Information Center
Noll, Jennifer; Sharma, Sashi
2014-01-01
The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…
Sampling strategies for estimating brook trout effective population size
Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher
2012-01-01
The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...
Lico, Michael S.; Pennington, Nyle
1999-01-01
The U.S. Geological Survey, in cooperation with the Tahoe Regional Planning Agency and the Lahontan Regional Water-Quality Control Board, sampled Lake Tahoe, major tributary streams to Lake Tahoe, and several other lakes in the Lake Tahoe Basin for manmade organic compounds during 1997-99. Gasoline components were found in all samples collected from Lake Tahoe during the summer boating season. Methyl tert-butyl ether (MTBE), benzene, toluene, ethylbenzene, and xylenes (BTEX) were the commonly detected compounds in these samples. Most samples from tributary streams and lakes with no motorized boating had no detectable concentrations of gasoline components. Motorized boating activity appears to be directly linked in space and time to the occurrence of these gasoline components. Other sources of gasoline components to Lake Tahoe, such as the atmosphere, surface runoff, and subsurface flow, are minor compared to the input by motorized boating. Water sampled from Lake Tahoe during mid-winter, when motorized boating activity is low, had no MTBE and only one sample had any detectable BTEX compounds. Soluble pesticides rarely were detected in water samples from the Lake Tahoe Basin. The only detectable concentrations of these compounds were in samples from Blackwood and Taylor Creeks collected during spring runoff. Concentrations found in these samples were low, in the 1 to 4 nanograms per liter range. Organochlorine compounds were detected in samples collected from semipermeable membrane devices (SPMD's) deployed in Lake Tahoe, tributary streams, and Upper Angora Lake. In Lake Tahoe, SPMD samples collected offshore from urbanized areas contained the largest number and highest concentrations of organochlorine compounds. The most commonly detected organochlorine compounds were cis- and trans-chlordane, p, p'-DDE, and hexachlorobenzene. In tributary streams, SPMD samples collected during spring runoff generally had higher combined concentrations of organochlorine compounds than those collected during baseflow conditions. Upper Angora Lake had the fewest organochlorine compounds detected of all lake samples. Dioxins and furans were not detected in SPMD samples from two sites in Lake Tahoe or from two tributary streams. The number of polycyclic aromatic hydrocarbon (PAH) compounds and their combined concentrations generally were higher in samples from Lake Tahoe than those from tributary streams. Areas of high motorized boating activity at Lake Tahoe had the largest number and highest concentrations of PAH's. PAH compounds were detected in samples from SPMD's in four of six tributary streams during spring runoff, all tributary streams during baseflow conditions, and at all lake sites. The most commonly detected PAH's in tributary streams during spring runoff were phenanthrene, fluoranthene, pyrene, and chrysene, and during baseflow conditions were phenanthrene, 1-methylphenanthrene, diethylnaphthalene, and pyrene. Upper Truckee River, which has an urban area in its drainage basin, had the largest number and highest combined concentration of PAH's of all stream samples. Bottom sediment from Lake Tahoe had detectable concentrations of p-cresol, a phenol, in all but one sample. A sample collected near Chambers Lodge contained phenol at an estimated concentration of 4 micrograms per kilogram (µg/kg). Bottom-sediment samples from tributary streams had no detectable concentrations of organochlorine or PAH compounds.
Several compounds were detected in bottom sediment from Upper Angora Lake at high concentrations. These compounds and their concentrations were p, p'-DDD (10 µg/kg), p, p'-DDE (7.4 µg/kg), 2,6-dimethylnaphthalene (estimated at 190 µg/kg), pentachlorophenol (3,000 µg/kg), and p-cresol (4,400 µg/kg).
NASA Technical Reports Server (NTRS)
Tomberlin, T. J.
1985-01-01
Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
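For the two-stage case with a simple linear cost function, the textbook optimal-allocation result can be sketched as follows; the formula and all numbers are a generic illustration, not necessarily the report's exact derivation.

    # Textbook two-stage allocation sketch: with survey cost
    # C = c1*n_areas + c2*n_areas*m_residents, the variance-optimal number of
    # residents per study area is m* = sqrt((c1/c2) * (var_within/var_between)).
    import math

    def optimal_m(c1, c2, var_between, var_within):
        return math.sqrt((c1 / c2) * (var_within / var_between))

    def n_areas_for_budget(budget, c1, c2, m):
        return budget / (c1 + c2 * m)

    # Illustrative costs and variance components:
    m = optimal_m(c1=500.0, c2=20.0, var_between=0.4, var_within=1.6)
    print(f"residents per area: {m:.1f}")                                   # 10.0
    print(f"areas within budget: {n_areas_for_budget(70_000, 500.0, 20.0, m):.0f}")  # 100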
Purushothaman, Jasmine; Kharusi, Lubna Al; Mills, Claudia E; Ghielani, Hamed; Marzouki, Mohammad Al
2013-12-11
A bloom of the hydromedusan jellyfish, Timoides agassizii, occurred in February 2011 off the coast of Sohar, Al Batinah, Sultanate of Oman, in the Gulf of Oman. This species was first observed in 1902 in great numbers off Haddummati Atoll in the Maldive Islands in the Indian Ocean and has rarely been seen since. The species appeared briefly in large numbers off Oman in 2011 and subsequent observation of our 2009 samples of zooplankton from Sohar revealed that it was also present in low numbers (two collected) in one sample in 2009; these are the first records in the Indian Ocean north of the Maldives. Medusae collected off Oman were almost identical to those recorded previously from the Maldive Islands, Papua New Guinea, the Marshall Islands, Guam, the South China Sea, and Okinawa. T. agassizii is a species that likely lives for several months. It was present in our plankton samples together with large numbers of the oceanic siphonophore Physalia physalis only during a single month's samples, suggesting that the temporary bloom off Oman was likely due to the arrival of mature, open ocean medusae into nearshore waters. We see no evidence that T. agassizii has established a new population along Oman, since if so, it would likely have been present in more than one sample period. We are unable to deduce further details of the life cycle of this species from blooms of many mature individuals nearshore, about a century apart. Examination of a single damaged T. agassizii medusa from Guam, calls into question the existence of its congener, T. latistyla, known only from a single specimen.
Modeling the effect of temperature on survival rate of Listeria monocytogenes in yogurt.
Szczawiński, J; Szczawińska, M E; Łobacz, A; Jackowska-Tracz, A
2016-01-01
The aim of the study was to (i) evaluate the behavior of Listeria monocytogenes in a commercially produced yogurt, (ii) determine the survival/inactivation rates of L. monocytogenes during cold storage of yogurt and (iii) generate primary and secondary mathematical models to predict the behavior of these bacteria during storage at different temperatures. The samples of yogurt were inoculated with a mixture of three L. monocytogenes strains and stored at 3, 6, 9, 12 and 15°C for 16 days. The number of listeriae was determined after 0, 1, 2, 3, 5, 7, 9, 12, 14 and 16 days of storage. From each sample a series of decimal dilutions was prepared and plated onto ALOA agar (agar for Listeria according to Ottaviani and Agosti). It was found that the applied temperature and storage time significantly influenced the survival rate of listeriae (p<0.01). The number of L. monocytogenes in all the samples decreased linearly with storage time. The slowest decrease in the number of the bacteria was found in the samples stored at 6°C (D-10 value = 243.9 h), whereas the greatest reduction was observed in the samples stored at 15°C (D-10 value = 87.0 h). The number of L. monocytogenes was correlated with the pH value of the samples (p<0.01). The natural logarithm of the mean survival/inactivation rates of L. monocytogenes calculated from the primary model was fitted to two secondary models, namely linear and polynomial. The mathematical equations obtained from both secondary models can be applied as a tool for predicting the survival/inactivation rate of L. monocytogenes in yogurt stored at temperatures from 3 to 15°C; however, the polynomial model gave a better fit to the experimental data.
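A minimal sketch of the primary model: fit a log-linear decline of log10 counts against time and read the D-value (time for a 1-log10 reduction) as -1/slope. The counts below are synthetic, loosely mimicking the reported D-value of about 87 h at 15°C.

    # Log-linear survival fit and D-value (synthetic data).
    import numpy as np

    t_h = np.array([0, 24, 48, 72, 120, 168, 216, 288, 336, 384], float)
    log_n = 6.0 - t_h / 87.0 + np.random.default_rng(2).normal(0, 0.05, t_h.size)

    slope, intercept = np.polyfit(t_h, log_n, 1)  # log10 N = intercept + slope*t
    print(f"D-value = {-1.0 / slope:.1f} h")      # close to 87 h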
A statistical treatment of bioassay pour fractions
NASA Astrophysics Data System (ADS)
Barengoltz, Jack; Hughes, David
A bioassay is a method for estimating the number of bacterial spores on a spacecraft surface for the purpose of demonstrating compliance with planetary protection (PP) requirements (Ref. 1). The details of the process may be seen in the appropriate PP document (e.g., for NASA, Ref. 2). In general, the surface is mechanically sampled with a damp sterile swab or wipe. The completion of the process is colony formation in a growth medium in a plate (Petri dish); the colonies are counted. Consider a set of samples from randomly selected, known areas of one spacecraft surface, for simplicity. One may calculate the mean and standard deviation of the bioburden density, which is the ratio of counts to area sampled. The standard deviation represents an estimate of the variation from place to place of the true bioburden density commingled with the precision of the individual sample counts. The accuracy of individual sample results depends on the equipment used, the collection method, and the culturing method. One aspect that greatly influences the result is the pour fraction, which is the quantity of fluid added to the plates divided by the total fluid used in extracting spores from the sampling equipment. In an analysis of a single sample’s counts due to the pour fraction, one seeks to answer the question: if a certain number of spores are counted with a known pour fraction, what is the probability that there is an additional number of spores in the part of the rinse not poured? This is given for specific values by the binomial distribution density, where detection (of culturable spores) is success and the probability of success is the pour fraction. A special summation over the binomial distribution, equivalent to adding for all possible values of the true total number of spores, is performed. This distribution, when normalized, will almost yield the desired quantity. It is the probability that the additional number of spores does not exceed a certain value. Of course, for a desired value of uncertainty, one must invert the calculation. However, this probability of finding exactly the number of spores in the poured part is correct only in the case where all values of the true number of spores greater than or equal to the adjusted count are equally probable. This is not realistic, of course, but the result can only overestimate the uncertainty. So it is useful. In probability speak, one has the conditional probability given any true total number of spores. Therefore one must multiply it by the probability of each possible true count, before the summation. If the counts for a sample set (of which this is one sample) are available, one may use the calculated variance and the normal probability distribution. In this approach, one assumes a normal distribution and neglects the contribution from spatial variation. The former is a common assumption. The latter can only add to the conservatism (overestimate the number of spores at some level of confidence). A more straightforward approach is to assume a Poisson probability distribution for the measured total sample set counts, and use the product of the number of samples and the mean number of counts per sample as the mean of the Poisson distribution. It is necessary to set the total count to 1 in the Poisson distribution when the actual total count is zero.
Finally, even when the planetary protection requirements for spore burden refer only to the mean values, they require an adjustment for pour fraction and method efficiency (a PP specification based on independent data). The adjusted mean values are a 50/50 proposition (e.g., the probability of the true total counts in the sample set exceeding the estimate is 0.50). However, this is highly unconservative when the total counts are zero. No adjustment to the mean values occurs for either pour fraction or efficiency. The recommended approach is once again to set the total counts to 1, but now applied to the mean values. Then one may apply the corrections to the revised counts. It can be shown by the methods developed in this work that this change is usually conservative enough to increase the level of confidence in the estimate to 0.5. 1. NASA. (2005) Planetary protection provisions for robotic extraterrestrial missions. NPR 8020.12C, April 2005, National Aeronautics and Space Administration, Washington, DC. 2. NASA. (2010) Handbook for the Microbiological Examination of Space Hardware, NASA-HDBK-6022, National Aeronautics and Space Administration, Washington, DC.
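The normalized binomial summation described above works out to a negative-binomial weighting over the number of spores missed; a short sketch with illustrative counts and pour fraction:

    # With k spores counted at pour fraction f, weighting each true total
    # N = k + j by C(N, k) * f**k * (1-f)**j and normalizing over j gives
    # P(j missed) = C(k+j, k) * f**(k+1) * (1-f)**j (a negative binomial).
    from math import comb

    def p_missed_at_most(m, k, f):
        return sum(comb(k + j, k) * f ** (k + 1) * (1 - f) ** j
                   for j in range(m + 1))

    # e.g. 4 colonies counted with 80% of the rinse poured:
    for m in (0, 1, 2, 5):
        print(f"P(missed <= {m}) = {p_missed_at_most(m, 4, 0.8):.3f}")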
NASA Astrophysics Data System (ADS)
Angulo, M.; Balestra, B.
2013-12-01
Our project is dedicated to the study of sediment samples gathered from the Gulf of Cadiz during the Integrated Ocean Drilling Program (IODP), Expedition 339 at Site U1390. The Gulf of Cadiz is an area of the Atlantic Ocean located directly south of Portugal and the western end of the Strait of Gibraltar. We analyzed the sediment samples to obtain the numbers of coccoliths per gram of sediment in each sample. Coccolithophores (from which the coccoliths are the fossils that remain in the sediments) are one of the most abundant groups of living phytoplankton and they are significant components of marine sediment. In order to prepare our samples for counting (the process by which we determined the number of coccoliths in our samples) we utilized about 23 samples from the uppermost 17 meters of the sediment core. This process involved collecting subsamples of each individual sample and then oven drying them. Then we weighed them on a microbalance to collect the desired amount of sample (between 2 and 4 mg). Several of our samples were slightly below or above this desired amount due to human error. Once we gathered the desired amounts of sample for our project, we used a filtration system to obtain filters. Then we put the filters into an oven to dry them. After the samples were dried over the course of a day, we prepared them for viewing through the microscope. To do this, we cut the filter and placed it upon a microscope slide. Then we applied oil to the slide, added a cover slip, and placed it under the light microscope (LM). We looked at five different views in each filter under the microscope, counting the number of coccoliths in each view. The counts were expressed in terms of numbers of coccoliths per gram of sediment (total coccolith concentration). Our results showed that the amount of coccoliths in the sediment samples decreased during cold periods, such as ice ages, and fluctuated in an upward pattern as the climate warmed. This project was a first step for further research, namely to continue to determine how climate has changed in the past ~20,000 years in the investigated area.
Approximation of Failure Probability Using Conditional Sampling
NASA Technical Reports Server (NTRS)
Giesy. Daniel P.; Crespo, Luis G.; Kenney, Sean P.
2008-01-01
In analyzing systems which depend on uncertain parameters, one technique is to partition the uncertain parameter domain into a failure set and its complement, and judge the quality of the system by estimating the probability of failure. If this is done by a sampling technique such as Monte Carlo and the probability of failure is small, accurate approximation can require so many sample points that the computational expense is prohibitive. Previous work of the authors has shown how to bound the failure event by sets of such simple geometry that their probabilities can be calculated analytically. In this paper, it is shown how to make use of these failure bounding sets and conditional sampling within them to substantially reduce the computational burden of approximating failure probability. It is also shown how the use of these sampling techniques improves the confidence intervals for the failure probability estimate for a given number of sample points and how they reduce the number of sample point analyses needed to achieve a given level of confidence.
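The idea, though not the authors' implementation, can be sketched as follows: with a bounding set B of analytically known probability, estimate P(failure) = P(B) x P(failure | B) by sampling only inside B; the failure criterion here is illustrative.

    # Conditional-sampling estimate of a small failure probability.
    import numpy as np

    rng = np.random.default_rng(3)
    # Bounding box B = [0.8, 1.0] x [0.8, 1.0] inside the unit square of
    # two uniform parameters; its probability is analytic.
    p_B = 0.2 * 0.2

    def fails(u):  # illustrative nonlinear failure criterion inside B
        return u[:, 0] * u[:, 1] > 0.85

    u = rng.uniform([0.8, 0.8], [1.0, 1.0], size=(100_000, 2))  # sample in B only
    p_F = p_B * fails(u).mean()
    print(f"P(failure) ~= {p_F:.5f}")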
Experimental breakdown of selected anodized aluminum samples in dilute plasmas
NASA Technical Reports Server (NTRS)
Grier, Norman T.; Domitz, Stanley
1992-01-01
Anodized aluminum samples representative of Space Station Freedom structural material were tested for electrical breakdown under space plasma conditions. In space, a potential arises across the insulating anodized coating when the spacecraft structure is driven to a negative bias relative to the external plasma potential due to plasma-surface interaction phenomena. For the anodized materials used in the tests, it was found that the breakdown voltage varied from 100 to 2000 volts depending on the sample. The current in the arcs depended on the sample, the capacitor, and the voltage. The level of the arc currents varied from 60 to 1000 amperes. The plasma number density varied from 3 x 10^6 to 10^3 ions per cc. The time between arcs increased as the number density was lowered. Corona testing of anodized samples revealed that samples with higher corona inception voltages had higher arcing inception voltages. From this it is concluded that corona testing may provide a method of screening the samples.
Resampling methods in Microsoft Excel® for estimating reference intervals
Theodorsson, Elvar
2015-01-01
Computer-intensive resampling/bootstrap methods are feasible when calculating reference intervals from non-Gaussian or small reference samples. Microsoft Excel® in version 2010 or later includes native functions which lend themselves well to this purpose, including recommended interpolation procedures for estimating 2.5 and 97.5 percentiles. The purpose of this paper is to introduce the reader to resampling estimation techniques in general, and to the use of Microsoft Excel® 2010 for estimating reference intervals in particular. Parametric methods are preferable to resampling methods when the distributions of observations in the reference samples are Gaussian or can be transformed to that distribution, even when the number of reference samples is less than 120. Resampling methods are appropriate when the distribution of data from the reference samples is non-Gaussian and when the number of reference individuals and corresponding samples is of the order of 40. At least 500-1000 random samples with replacement should be taken from the results of measurement of the reference samples. PMID:26527366
Resampling methods in Microsoft Excel® for estimating reference intervals.
Theodorsson, Elvar
2015-01-01
Computer-intensive resampling/bootstrap methods are feasible when calculating reference intervals from non-Gaussian or small reference samples. Microsoft Excel® in version 2010 or later includes native functions which lend themselves well to this purpose, including recommended interpolation procedures for estimating 2.5 and 97.5 percentiles. The purpose of this paper is to introduce the reader to resampling estimation techniques in general, and to the use of Microsoft Excel® 2010 for estimating reference intervals in particular. Parametric methods are preferable to resampling methods when the distributions of observations in the reference samples are Gaussian or can be transformed to that distribution, even when the number of reference samples is less than 120. Resampling methods are appropriate when the distribution of data from the reference samples is non-Gaussian and when the number of reference individuals and corresponding samples is of the order of 40. At least 500-1000 random samples with replacement should be taken from the results of measurement of the reference samples.
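The paper's recipe is built in Microsoft Excel; an equivalent sketch in Python (synthetic reference values, percentile bootstrap with 1000 resamples):

    # Percentile-bootstrap reference interval from a small reference sample.
    import numpy as np

    rng = np.random.default_rng(4)
    ref = rng.lognormal(mean=1.0, sigma=0.4, size=40)  # 40 reference samples

    boot = rng.choice(ref, size=(1000, ref.size), replace=True)
    lower = np.percentile(boot, 2.5, axis=1).mean()    # smoothed 2.5th percentile
    upper = np.percentile(boot, 97.5, axis=1).mean()   # smoothed 97.5th percentile
    print(f"reference interval ~ [{lower:.2f}, {upper:.2f}]")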
Schmidt, Robert L; Howard, Kirsten; Hall, Brian J; Layfield, Lester J
2012-12-01
Sample adequacy is an important aspect of overall fine-needle aspiration cytology (FNAC) performance. FNAC effectiveness is augmented by an increasing number of needle passes, but increased needle passes are associated with higher costs and greater risk of adverse events. The objective of this study was to compare the impact of several different sampling policies on FNAC effectiveness and adverse event rates using discrete event simulation. We compared 8 different sampling policies in 12 different sampling environments. All sampling policies were effective when the per-pass accuracy is high (>80%). Rapid on-site evaluation (ROSE) improves FNAC effectiveness when the per-pass adequacy rate is low. ROSE is unlikely to be cost-effective in sampling environments in which the per-pass adequacy is high. Alternative ROSE assessors (eg, cytotechnologists) may be a cost-effective alternative to pathologists when the per-pass adequacy rate is moderate (60%-80%) or when the number of needle passes is limited.
Estimation of total bacteria by real-time PCR in patients with periodontal disease.
Brajović, Gavrilo; Popović, Branka; Puletić, Miljan; Kostić, Marija; Milasin, Jelena
2016-01-01
Periodontal diseases are associated with the presence of elevated levels of bacteria within the gingival crevice. The aim of this study was to evaluate the total amount of bacteria in subgingival plaque samples from patients with periodontal disease. A quantitative evaluation of the total bacterial amount using quantitative real-time polymerase chain reaction (qRT-PCR) was performed on 20 samples from patients with ulceronecrotic periodontitis and on 10 samples from healthy subjects. The estimation of the total bacterial amount was based on the 16S rRNA gene copy number, determined by comparison with the Ct values/gene copy numbers of the standard curve. A statistically significant difference between the average gene copy number of total bacteria in periodontal patients (2.55 x 10⁷) and healthy controls (2.37 x 10⁶) was found (p = 0.01). Also, a trend toward higher gene copy numbers in deeper periodontal lesions (> 7 mm) was indicated by a positive correlation coefficient (r = 0.073). The quantitative estimation of total bacteria based on gene copy number could be an important additional tool in diagnosing periodontitis.
Sequential sampling: a novel method in farm animal welfare assessment.
Heath, C A E; Main, D C J; Mullan, S; Haskell, M J; Browne, W J
2016-02-01
Lameness in dairy cows is an important welfare issue. As part of a welfare assessment, herd level lameness prevalence can be estimated from scoring a sample of animals, where higher levels of accuracy are associated with larger sample sizes. As the financial cost is related to the number of cows sampled, smaller samples are preferred. Sequential sampling schemes have been used for informing decision making in clinical trials. Sequential sampling involves taking samples in stages, where sampling can stop early depending on the estimated lameness prevalence. When welfare assessment is used for a pass/fail decision, a similar approach could be applied to reduce the overall sample size. The sampling schemes proposed here apply the principles of sequential sampling within a diagnostic testing framework. This study develops three sequential sampling schemes of increasing complexity to classify 80 fully assessed UK dairy farms, each with known lameness prevalence. Using the Welfare Quality herd-size-based sampling scheme, the first 'basic' scheme involves two sampling events. At the first sampling event half the Welfare Quality sample size is drawn, and then depending on the outcome, sampling either stops or is continued and the same number of animals is sampled again. In the second 'cautious' scheme, an adaptation is made to ensure that correctly classifying a farm as 'bad' is done with greater certainty. The third scheme is the only scheme to go beyond lameness as a binary measure and investigates the potential for increasing accuracy by incorporating the number of severely lame cows into the decision. The three schemes are evaluated with respect to accuracy and average sample size by running 100 000 simulations for each scheme, and a comparison is made with the fixed size Welfare Quality herd-size-based sampling scheme. All three schemes performed almost as well as the fixed size scheme but with much smaller average sample sizes. For the third scheme, an overall association between lameness prevalence and the proportion of lame cows that were severely lame on a farm was found. However, as this association was found to not be consistent across all farms, the sampling scheme did not prove to be as useful as expected. The preferred scheme was therefore the 'cautious' scheme for which a sampling protocol has also been developed.
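The 'basic' two-stage scheme can be simulated along the following lines; the half-sample size, cutoff and stopping margin are illustrative, not the Welfare Quality values.

    # Simulation sketch of a two-stage sequential sampling scheme: draw half
    # the fixed sample, stop early if the interim prevalence is clearly below
    # or above the pass/fail cutoff, otherwise draw the second half.
    import numpy as np

    rng = np.random.default_rng(5)

    def two_stage_classification(true_prev, half_n=30, cutoff=0.20, margin=0.08):
        s1 = rng.random(half_n) < true_prev
        p1 = s1.mean()
        if abs(p1 - cutoff) > margin:          # confident: stop early
            return p1 > cutoff, half_n
        s2 = rng.random(half_n) < true_prev    # uncertain: sample again
        p = (s1.sum() + s2.sum()) / (2 * half_n)
        return p > cutoff, 2 * half_n

    results = [two_stage_classification(0.35) for _ in range(10_000)]
    frac_bad = np.mean([r[0] for r in results])
    avg_n = np.mean([r[1] for r in results])
    print(f"classified 'bad': {frac_bad:.2%}, average sample size: {avg_n:.1f}")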
NASA Astrophysics Data System (ADS)
Beukema, J. J.; Dekker, R.
2011-06-01
A 40-y series of consistently collected samples (15 fixed sampling sites, constant sampled area of 15 × 0.95 m2, annual sampling only in late-winter/early-spring seasons, and consistent sieving and sorting procedures; restriction to 50 easily recognizable species) of macrozoobenthos on Balgzand, a tidal flat area in the westernmost part of the Wadden Sea (The Netherlands), revealed significantly increasing trends of species richness. Total numbers of species annually encountered increased from ~28 to ~38. Mean species density (number of species found per sampling site) increased from ~13 to ~18 per 0.95 m2. During the 40 years of the 1970-2009 period of observation, 4 exotic species invaded the area: (in order of first appearance) Ensis directus, Marenzelleria viridis, Crassostrea gigas, and Hemigrapsus takanoi. Another 5 species recently moved to Balgzand from nearby (subtidal) locations. Together, these 9 new species on the tidal flats explained by far most of the increase in total species numbers, but accounted for only one-third of the observed increase in species density (as a consequence of the restricted distribution of most of them). Species density increased particularly by a substantial number of species that showed increasing trends in the numbers of tidal flat sites they occupied. Most of these wider-spreading species were found to suffer from cold winters. During the 40-y period of observation, winter temperatures rose by about 2°C and cold winters became less frequent. The mean number of cold-sensitive species found per site significantly increased by almost 2 per 0.95 m2. Among the other species (not sensitive to low winter temperatures), 6 showed a rising and 2 a declining trend in number of occupied sites, resulting in a net long-term increase in species density amounting to another gain of 1.6 per 0.95 m2. Half of the 50 studied species did not show such long-term trend, nor were invaders. Thus, each of 3 groups (local or alien invaders/winter-sensitive species/other increasing species) contributed to a roughly similar extent to the overall increase in species density.
Caciagli, P; Verderio, A
2003-06-30
Several aspects of enzyme-linked immunosorbent assay (ELISA) procedures and data analysis have been examined in an attempt to find a rapid and reliable method for discriminating between 'positive' and 'negative' results when testing a large number of samples. A layout of ELISA plates was designed to reduce uncontrolled variation and to optimize the number of negative and positive controls. A transformation using the fourth root (A^(1/4)) of the optical density readings corrected for the blank (A) stabilized the variance of most ELISA data examined. Transformed A values were used to calculate the true limits, at a set protection level, for false positives (C) and false negatives (D). Methods are discussed to reduce the number of undifferentiated samples, i.e. the samples with a response falling between C and D. The whole procedure was set up for use with an electronic spreadsheet. With the addition of a few instructions of the type 'if ... then ... else' in the spreadsheet, the ELISA results were obtained in the simple trichotomous form 'negative/undefined/positive'. This allowed rapid analysis of more than 1100 maize samples (in fact almost 8000 ELISA samples) tested for the presence of seven aphid-borne viruses.
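The spreadsheet's trichotomous rule translates directly into code; in this Python sketch the limits C and D are illustrative placeholders for the calculated true limits.

    # Trichotomous ELISA classification on fourth-root-transformed readings.
    def classify(a, c_limit=0.35, d_limit=0.55):
        t = max(a, 0.0) ** 0.25  # variance-stabilizing A**(1/4) transform
        if t < c_limit:
            return "negative"
        if t > d_limit:
            return "positive"
        return "undefined"

    for a in (0.005, 0.06, 0.60):  # blank-corrected absorbances
        print(a, "->", classify(a))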
Inverse sampling regression for pooled data.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Eskridge, Kent; Crossa, José
2017-06-01
Because pools are tested instead of individuals in group testing, this technique is helpful for estimating prevalence in a population or for classifying a large number of individuals into two groups at a low cost. For this reason, group testing is a well-known means of saving costs and producing precise estimates. In this paper, we developed a mixed-effect group testing regression that is useful when the data-collecting process is performed using inverse sampling. This model allows including covariate information at the individual level to incorporate heterogeneity among individuals and identify which covariates are associated with positive individuals. We present an approach to fit this model using maximum likelihood and we performed a simulation study to evaluate the quality of the estimates. Based on the simulation study, we found that the proposed regression method for inverse sampling with group testing produces parameter estimates with low bias when the pre-specified number of positive pools (r) to stop the sampling process is at least 10 and the number of clusters in the sample is also at least 10. We performed an application with real data and we provide an NLMIXED code that researchers can use to implement this method.
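As a baseline for the mixed-effect inverse-sampling model above, the classic fixed-pool prevalence MLE is a two-line computation; the pool counts below are illustrative.

    # Classic group-testing prevalence MLE: with pools of k individuals and a
    # fraction q_hat of negative pools, p_hat = 1 - q_hat**(1/k).
    def prevalence_mle(n_pools, n_positive_pools, pool_size):
        q_hat = (n_pools - n_positive_pools) / n_pools
        return 1.0 - q_hat ** (1.0 / pool_size)

    # e.g. 12 positive pools out of 100, pooling 10 individuals per pool:
    print(f"{prevalence_mle(100, 12, 10):.4f}")  # ~0.0127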
Enhanced chromium adsorption capacity via plasma modification of natural zeolites
NASA Astrophysics Data System (ADS)
Cagomoc, Charisse Marie D.; Vasquez, Magdaleno R., Jr.
2017-01-01
Natural zeolites such as mordenite are excellent adsorbents for heavy metals. To enhance the adsorption capacity of zeolite, sodium-exchanged samples were irradiated with 13.56 MHz capacitively coupled radio frequency (RF) argon gas discharge. Hexavalent chromium [Cr(VI)] was used as the test heavy metal. Pristine and plasma-treated zeolite samples were soaked in 50 mg/L Cr solution and the amount of adsorbed Cr(VI) on the zeolites was calculated at predetermined time intervals. Compared with untreated zeolite samples, initial Cr(VI) uptake was 70% higher for plasma-treated zeolite granules (50 W 30 min) after 1 h of soaking. After 24 h, all plasma-treated zeolites showed increased Cr(VI) uptake. For a 2- to 4-month period, Cr(VI) uptake increased about 130% compared with untreated zeolite granules. X-ray diffraction analyses between untreated and treated zeolite samples revealed no major difference in terms of its crystal structure. However, for plasma-treated samples, an increase in the number of surface defects was observed from scanning electron microscopy images. This increase in the number of surface defects induced by plasma exposure played a crucial role in increasing the number of active sorption sites on the zeolite surface.
163 years of refinement: the British Geological Survey sample registration scheme
NASA Astrophysics Data System (ADS)
Howe, M. P.
2011-12-01
The British Geological Survey manages the largest UK geoscience samples collection, including:
- 15,000 onshore boreholes, including over 250 km of drillcore
- Vibrocores, gravity cores and grab samples from over 32,000 UK marine sample stations, and 640 boreholes
- Over 3 million UK fossils, including a "type and stratigraphic" reference collection of 250,000 fossils, 30,000 of which are "type, figured or cited"
- A comprehensive microfossil collection, including many borehole samples
- 290 km of drillcore and 4.5 million cuttings samples from over 8,000 UK continental shelf hydrocarbon wells
- Over one million mineralogical and petrological samples, including 200,000 thin sections
The current registration scheme was introduced in 1848 and is similar to that used by Charles Darwin on the Beagle. Every Survey collector or geologist has been issued a unique prefix code of one or more letters, and these were handwritten on preprinted numbers, arranged in books of 1 - 5,000 and 5,001 - 10,000. Similar labels are now computer printed. Other prefix codes are used for corporate collections, such as borehole samples, thin sections, microfossils, macrofossil sections, museum reference fossils, display quality rock samples and fossil casts. Such numbers convey significant immediate information to the curator, without the need to consult detailed registers. The registration numbers have been recorded in a series of over 1,000 registers, complete with metadata including sample ID, locality, horizon, collector and date. Citations are added as appropriate. Parent-child relationships are noted when re-registering sub-subsamples. For example, a borehole sample BDA1001 could have been subsampled for a petrological thin section and off-cut (E14159), a fossil thin section (PF365), micropalynological slides (MPA273), one of which included a new holotype (MPK111), and a figured macrofossil (GSE1314). All main corporate collections now have publicly available online databases, such as PalaeoSaurus (fossils), Britrocks (mineralogy and petrology) and ComBo (combined onshore and offshore boreholes). ComBo links to core images, when available. Similar links are under development for Britrocks and PalaeoSaurus, with the latter also to include HR laser-scanned digital models. These databases also link to internal and public GIS systems and to the BGS digital field data capture system. PalaeoSaurus holds an identification/authority/date history for each specimen, as well as recording type status, and figure and citation details. Similar comments can be added to Britrocks and ComBo. For several years, the BGS has provided online web access to the databases for the discovery of physical samples, including parent-child links and citation information. Regrettably, authors frequently fail to cite sample registration numbers (nineteenth century geologists were sometimes better than their twenty-first century counterparts), or to supply copies of, or links to, the data generated, despite it being a condition of sample access. The need for editors and referees to enforce the inclusion of sample registration numbers, and for authors to lodge copies of papers, reports and data with the sample providers, is more important than yet another new database.
Rotor assembly and method for automatically processing liquids
Burtis, Carl A.; Johnson, Wayne F.; Walker, William A.
1992-01-01
A rotor assembly for performing a relatively large number of processing steps upon a sample, such as a whole blood sample, and a diluent, such as water, includes a rotor body for rotation about an axis and including a network of chambers within which various processing steps are performed upon the sample and diluent and passageways through which the sample and diluent are transferred. A transfer mechanism is movable through the rotor body by the influence of a magnetic field generated adjacent the transfer mechanism and movable along the rotor body, and the assembly utilizes centrifugal force, a transfer of momentum and capillary action to perform any of a number of processing steps such as separation, aliquoting, transference, washing, reagent addition and mixing of the sample and diluent within the rotor body. The rotor body is particularly suitable for automatic immunoassay analyses.
Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion
NASA Astrophysics Data System (ADS)
Scott, Charles J. C.; Thom, Alex J. W.
2017-09-01
We consider the sampling of the coupled cluster expansion within stochastic coupled cluster theory. Observing the limitations of previous approaches due to the inherently non-linear behavior of a coupled cluster wavefunction representation, we propose new approaches based on an intuitive, well-defined condition for sampling weights and on sampling the expansion in cluster operators of different excitation levels. We term these modifications even and truncated selections, respectively. Utilising both approaches demonstrates dramatically improved calculation stability as well as reduced computational and memory costs. These modifications are particularly effective at higher truncation levels owing to the large number of terms within the cluster expansion that can be neglected, as demonstrated by the reduction of the number of terms to be sampled when truncating at triple excitations by 77% and hextuple excitations by 98%.
Tokar, Tomas; Pastrello, Chiara; Ramnarine, Varune R.; Zhu, Chang-Qi; Craddock, Kenneth J.; Pikor, Larrisa A.; Vucic, Emily A.; Vary, Simon; Shepherd, Frances A.; Tsao, Ming-Sound; Lam, Wan L.; Jurisica, Igor
2018-01-01
In many cancers, significantly down- or upregulated genes are found within chromosomal regions with DNA copy number alteration opposite to the expression changes. Generally, this paradox has been overlooked as noise, but can potentially be a consequence of interference of epigenetic regulatory mechanisms, including microRNA-mediated control of mRNA levels. To explore potential associations between microRNAs and paradoxes in non-small-cell lung cancer (NSCLC) we curated and analyzed lung adenocarcinoma (LUAD) data, comprising gene expressions, copy number aberrations (CNAs) and microRNA expressions. We integrated data from 1,062 tumor samples and 241 normal lung samples, including newly-generated array comparative genomic hybridization (aCGH) data from 63 LUAD samples. We identified 85 “paradoxical” genes whose differential expression consistently contrasted with aberrations of their copy numbers. Paradoxical status of 70 out of 85 genes was validated on sample-wise basis using The Cancer Genome Atlas (TCGA) LUAD data. Of these, 41 genes are prognostic and form a clinically relevant signature, which we validated on three independent datasets. By meta-analysis of results from 9 LUAD microRNA expression studies we identified 24 consistently-deregulated microRNAs. Using TCGA-LUAD data we showed that deregulation of 19 of these microRNAs explains differential expression of the paradoxical genes. Our results show that deregulation of paradoxical genes is crucial in LUAD and their expression pattern is maintained epigenetically, defying gene copy number status. PMID:29507679
Anamnart, Witthaya; Pattanawongsa, Attarat; Intapan, Pewpan Maleewong; Maleewong, Wanchai
2010-01-01
We succeeded in stimulating the excretion of Strongyloides stercoralis larvae in stool by oral administration of a single dose of 400 mg albendazole to strongyloidiasis patients. This result overcame the false-negative results of stool examination due to low larval numbers. Stool samples were collected from 152 asymptomatic strongyloidiasis patients in the morning, prior to eating. After breakfast, the patients were given a dose of 400 mg albendazole, and stool samples were collected the following morning. Agar plate culture (APC), the modified formalin-ether concentration technique (MFECT), and direct-smear (DS) methods were used to examine stool specimens within 3 h after defecation. The results before and after albendazole administration were compared. All APCs that were positive became negative after albendazole administration, while MFECT showed a 1.4- to 18.0-fold increase in larval numbers in 97.4% (148/152) of the samples. The DSs were positive in 3 out of 3 smears at a larval number of ≥45 larvae per g (lpg) of stool, and in 1 or 2 out of 3 smears at a larval number between 35 and 44 lpg. At a larval number of <35 lpg, the DS became negative. Interestingly, 90.5% (19/21) of the samples that were negative by all methods before albendazole administration became positive by MFECT after the treatment. Thus, MFECT can be effectively used for diagnosis of strongyloidiasis with prior administration of albendazole to the subject. PMID:20844212
Evaluation of a Professional Practice Model in the Ambulatory Care Setting
2014-03-10
Participant comments noted poor communication about rescheduling, with messages left for doctors or nurses frequently going unreturned... across three time periods. Sample/Methods: Nursing staff (n=42) and patients (n=1220) were recruited using non-purposive sampling for the satisfaction surveys.
Climate Change Mitigation: Can the U.S. Intelligence Community Help?
2013-06-01
satellite sensors to establish the concentration of atmospheric CO2 in parts per million (ppm mole fraction) in samples collected at multiple... measurements. Spatial sampling density, the number of sensors or, in the case of satellite imagery, the number and resolution of the images, likewise influences... Somewhat paradoxically, sensor accuracy from either remote (satellite) or in situ sensors is an important consideration, but it must also be evaluated
2016-04-14
study dynamic events such as melting, evaporation, crystallization, dissolution, self-assembly, membrane disruption, and sample movement tracking... polymeric hairy nanoparticles, suprastructures... the AFM will permit us to study dynamic events such as melting, evaporation, crystallization, dissolution, self-assembly, and membrane disruption
Francis A. Roesch; Todd A. Schroeder; James T. Vogt
2017-01-01
The resilience of a National Forest Inventory and Monitoring sample design can sometimes depend upon the degree to which it can adapt to fluctuations in funding. If a budget reduction necessitates the observation of fewer plots per year, some practitioners weigh the problem as a tradeoff between reducing the total number of plots and measuring the original number of...
ERIC Educational Resources Information Center
Nejem, Khamis Mousa; Muhanna, Wafa
2013-01-01
The purpose of this study was to investigate the effect of using computer games in teaching mathematics on developing the number sense of fourth grade students. To achieve this purpose, a study sample of (81) students was selected from the fourth grade. This sample was divided into two groups. One group was randomly chosen to be the experimental…
Potential Reporting Bias in Neuroimaging Studies of Sex Differences.
David, Sean P; Naudet, Florian; Laude, Jennifer; Radua, Joaquim; Fusar-Poli, Paolo; Chu, Isabella; Stefanick, Marcia L; Ioannidis, John P A
2018-04-17
Numerous functional magnetic resonance imaging (fMRI) studies have reported sex differences. To empirically evaluate for evidence of excessive significance bias in this literature, we searched for published fMRI studies of human brain to evaluate sex differences, regardless of the topic investigated, in Medline and Scopus over 10 years. We analyzed the prevalence of conclusions in favor of sex differences and the correlation between study sample sizes and number of significant foci identified. In the absence of bias, larger studies (better powered) should identify a larger number of significant foci. Across 179 papers, median sample size was n = 32 (interquartile range 23-47.5). A median of 5 foci related to sex differences were reported (interquartile range, 2-9.5). Few articles (n = 2) had titles focused on no differences or on similarities (n = 3) between sexes. Overall, 158 papers (88%) reached "positive" conclusions in their abstract and presented some foci related to sex differences. There was no statistically significant relationship between sample size and the number of foci (-0.048% increase for every 10 participants, p = 0.63). The extremely high prevalence of "positive" results and the lack of the expected relationship between sample size and the number of discovered foci reflect probable reporting bias and excess significance bias in this literature.
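The study's central diagnostic, regressing the number of reported foci on study sample size, is easy to reproduce in outline. The sketch below uses simulated stand-in data (Poisson foci counts with no true dependence on sample size), not the 179 reviewed papers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical per-study data: sample size and number of significant foci.
n = rng.integers(15, 80, size=179)
foci = rng.poisson(5, size=179)          # no true dependence on n

slope, intercept, r, p, se = stats.linregress(n, foci)
print(f"foci per extra participant: {slope:.3f} (p = {p:.2f})")
# Absent bias, better-powered (larger) studies should show a positive slope;
# a flat relationship, as reported, is consistent with excess significance.
```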
Comparing thin slices of verbal communication behavior of varying number and duration.
Carcone, April Idalski; Naar, Sylvie; Eggly, Susan; Foster, Tanina; Albrecht, Terrance L; Brogan, Kathryn E
2015-02-01
The aim of this study was to assess the accuracy of thin slices to characterize the verbal communication behavior of counselors and patients engaged in Motivational Interviewing sessions relative to fully coded sessions. Four thin slice samples that varied in number (four versus six slices) and duration (one- versus two-minutes) were extracted from a previously coded dataset. In the parent study, an observational code scheme was used to characterize specific counselor and patient verbal communication behaviors. For the current study, we compared the frequency of communication codes and the correlations among the full dataset and each thin slice sample. Both the proportion of communication codes and strength of the correlation demonstrated the highest degree of accuracy when a greater number (i.e., six versus four) and duration (i.e., two- versus one-minute) of slices were extracted. These results suggest that thin slice sampling may be a useful and accurate strategy to reduce coding burden when coding specific verbal communication behaviors within clinical encounters. We suggest researchers interested in using thin slice sampling in their own work conduct preliminary research to determine the number and duration of thin slices required to accurately characterize the behaviors of interest. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
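A thin-slice extraction can be sketched as follows; the session timeline, code labels, and evenly spaced slice placement are hypothetical stand-ins, and the parent study's coding scheme is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical fully coded session: one communication code per utterance,
# with an approximate onset time in minutes.
onsets = np.sort(rng.uniform(0, 45, size=300))
codes = rng.choice(["counselor_reflect", "patient_change_talk"], size=300)

def thin_slice_freq(onsets, codes, n_slices, slice_min, session_min=45):
    """Proportion of each code inside evenly spaced thin slices."""
    starts = np.linspace(0, session_min - slice_min, n_slices)
    mask = np.zeros(onsets.size, dtype=bool)
    for s in starts:
        mask |= (onsets >= s) & (onsets < s + slice_min)
    vals, counts = np.unique(codes[mask], return_counts=True)
    return dict(zip(vals, counts / counts.sum()))

# Six two-minute slices, to be compared against the full-session proportions.
print(thin_slice_freq(onsets, codes, n_slices=6, slice_min=2))
```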
Purcell, Maureen K.; Powers, Rachel L.; Besijn, Bonnie; Hershberger, Paul K.
2017-01-01
We report the development and validation of two quantitative PCR (qPCR) assays to detect Nanophyetus salmincola DNA in water samples and in fish and snail tissues. Analytical and diagnostic validation demonstrated good sensitivity, specificity, and repeatability of both qPCR assays. The N. salmincola DNA copy number in kidney tissue was significantly correlated with metacercaria counts based on microscopy. Extraction methods were optimized for the sensitive qPCR detection of N. salmincola DNA in settled water samples. Artificially spiked samples suggested that the 1-cercaria/L threshold corresponded to an estimated log10 copies per liter ≥ 6.0. Significant correlation of DNA copy number per liter and microscopic counts indicated that the estimated qPCR copy number was a good predictor of the number of waterborne cercariae. However, the detection of real-world samples below the estimated 1-cercaria/L threshold suggests that the assays may also detect other N. salmincola life stages, nonintact cercariae, or free DNA that settles with the debris. In summary, the qPCR assays reported here are suitable for identifying and quantifying all life stages of N. salmincola that occur in fish tissues, snail tissues, and water.
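Converting qPCR quantification cycles to estimated copy numbers via a standard curve is the step underlying the 1-cercaria/L threshold. The sketch below assumes a hypothetical linear standard curve; the slope and intercept are illustrative, not the validated assay's values.

```python
import numpy as np

# Hypothetical standard curve: Cq = intercept + slope * log10(copies).
slope, intercept = -3.32, 38.0   # slope of -3.32 implies ~100% PCR efficiency

def log10_copies(cq, volume_l=1.0):
    """Estimated log10 DNA copies per litre from a qPCR quantification cycle."""
    return (cq - intercept) / slope - np.log10(volume_l)

cq_values = np.array([17.8, 21.4, 25.0])
est = log10_copies(cq_values)
# Flag samples at or above the ~1-cercaria/L threshold (log10 copies >= 6.0).
print(est.round(2), est >= 6.0)
```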
40 CFR 761.283 - Determination of the number of samples to collect and sample collection locations.
Code of Federal Regulations, 2012 CFR
2012-07-01
... sites at this example location: a loading dock, a transformer storage lot, and a disposal pit. The... (three samples). The non-liquid PCB remediation wastes present at the transformer storage lot are oily...
40 CFR 761.283 - Determination of the number of samples to collect and sample collection locations.
Code of Federal Regulations, 2010 CFR
2010-07-01
... sites at this example location: a loading dock, a transformer storage lot, and a disposal pit. The... (three samples). The non-liquid PCB remediation wastes present at the transformer storage lot are oily...
Planning and processing multistage samples with a computer program, MUST.
John W. Hazard; Larry E. Stewart
1974-01-01
A computer program was written to handle multistage sampling designs in insect populations. It is, however, general enough to be used for any population where the number of stages does not exceed three. The program handles three types of sampling situations, all of which assume equal probability sampling. Option 1 takes estimates of sample variances, costs, and either...
Stable Estimation of a Covariance Matrix Guided by Nuclear Norm Penalties
Chi, Eric C.; Lange, Kenneth
2014-01-01
Estimation of a covariance matrix or its inverse plays a central role in many statistical methods. For these methods to work reliably, estimated matrices must not only be invertible but also well-conditioned. The current paper introduces a novel prior to ensure a well-conditioned maximum a posteriori (MAP) covariance estimate. The prior shrinks the sample covariance estimator towards a stable target and leads to a MAP estimator that is consistent and asymptotically efficient. Thus, the MAP estimator gracefully transitions towards the sample covariance matrix as the number of samples grows relative to the number of covariates. The utility of the MAP estimator is demonstrated in two standard applications – discriminant analysis and EM clustering – in this sampling regime. PMID:25143662
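The qualitative behaviour described, an estimator that moves from a stable target toward the sample covariance as the number of samples grows relative to the number of covariates, can be illustrated with a generic linear shrinkage estimator. This is a stand-in, not the paper's nuclear-norm MAP estimator, and the weighting heuristic below is an assumption.

```python
import numpy as np

def shrunk_covariance(X, alpha=None):
    """Linear shrinkage of the sample covariance toward a scaled identity.

    Not the paper's MAP estimator; a generic stand-in with the same
    qualitative behaviour: the weight on the sample covariance grows
    with the ratio of samples n to covariates p.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    target = np.trace(S) / p * np.eye(p)   # stable, well-conditioned target
    if alpha is None:
        alpha = n / (n + p)                # heuristic weight, assumed here
    return alpha * S + (1 - alpha) * target

X = np.random.default_rng(3).normal(size=(40, 100))  # n << p
Sigma = shrunk_covariance(X)
print(np.linalg.cond(Sigma))  # well-conditioned despite a singular sample cov
```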
Lenart, Anna; Wolny-Koładka, Katarzyna
2013-01-01
The present study aimed to identify the effect of heavy metal concentration and soil pH on the abundance of selected soil microorganisms within the ArcelorMittal Poland steelworks, Cracow. The analysis included 20 soil samples, in which the concentrations of Fe, Zn, Cd, Pb, Ni, Cu, Mn and Cr and soil pH were evaluated together with the numbers of mesophilic bacteria, fungi, Actinomycetes and Azotobacter spp. In the majority of samples soil pH was alkaline. Heavy metal limits were exceeded in eight samples; in one of them, the Zn concentration exceeded the limit 31-fold. Chromium was the element that most significantly limited the numbers of bacteria and Actinomycetes.
30 CFR 90.210 - Respirable dust samples; report to operator.
Code of Federal Regulations, 2010 CFR
2010-07-01
... MINE SAFETY AND HEALTH MANDATORY HEALTH STANDARDS-COAL MINERS WHO HAVE EVIDENCE OF THE DEVELOPMENT OF PNEUMOCONIOSIS Sampling Procedures § 90.210 Respirable dust samples; report to operator. (a) The Secretary shall... for voiding any samples; and, (7) The Social Security Number of the part 90 miner. (b) Upon receipt...
What about N? A methodological study of sample-size reporting in focus group studies.
Carlsen, Benedicte; Glenton, Claire
2011-03-11
Focus group studies are increasingly published in health related journals, but we know little about how researchers use this method, particularly how they determine the number of focus groups to conduct. The methodological literature commonly advises researchers to follow principles of data saturation, although practical advice on how to do this is lacking. Our objectives were, firstly, to describe the current status of sample size in focus group studies reported in health journals and, secondly, to assess whether and how researchers explain the number of focus groups they carry out. We searched PubMed for studies that had used focus groups and that had been published in open access journals during 2008, and extracted data on the number of focus groups and on any explanation authors gave for this number. We also did a qualitative assessment of the papers with regard to how the number of groups was explained and discussed. We identified 220 papers published in 117 journals. In these papers, insufficient reporting of sample sizes was common. The number of focus groups conducted varied greatly (mean 8.4, median 5, range 1 to 96). Thirty-seven (17%) studies attempted to explain the number of groups. Six studies referred to rules of thumb in the literature, three stated that they were unable to organize more groups for practical reasons, while 28 studies stated that they had reached a point of saturation. Among those stating that they had reached a point of saturation, several appeared not to have followed principles from grounded theory, where data collection and analysis form an iterative process until saturation is reached. Studies with high numbers of focus groups did not offer explanations for the number of groups, and none of the reviewed papers discussed having too much data as a study weakness. Based on these findings we suggest that journals adopt more stringent requirements for focus group method reporting. The often poor and inconsistent reporting seen in these studies may also reflect the lack of clear, evidence-based guidance about deciding on sample size. More empirical research is needed to develop focus group methodology.
Chiang, Kuo-Szu; Bock, Clive H; Lee, I-Hsuan; El Jarroudi, Moussa; Delfosse, Philippe
2016-12-01
The effect of rater bias and assessment method on hypothesis testing was studied for representative experimental designs for plant disease assessment using balanced and unbalanced data sets. Data sets with the same number of replicate estimates for each of two treatments are termed "balanced" and those with unequal numbers of replicate estimates are termed "unbalanced". The three assessment methods considered were nearest percent estimates (NPEs), an amended 10% incremental scale, and the Horsfall-Barratt (H-B) scale. Estimates of severity of Septoria leaf blotch on leaves of winter wheat were used to develop distributions for a simulation model. The experimental designs are presented here in the context of simulation experiments which consider the optimal design for the number of specimens (individual units sampled) and the number of replicate estimates per specimen for a fixed total number of observations (total sample size for the treatments being compared). The criterion used to gauge each method was the power of the hypothesis test. As expected, at a given fixed number of observations, the balanced experimental designs invariably resulted in a higher power compared with the unbalanced designs at different disease severity means, mean differences, and variances. Based on these results, with unbiased estimates using NPE, the recommended number of replicate estimates taken per specimen is 2 (from a sample of specimens of at least 30), because this conserves resources. Furthermore, for biased estimates, an apparent difference in the power of the hypothesis test was observed between assessment methods and between experimental designs. Results indicated that, regardless of experimental design or rater bias, an amended 10% incremental scale has slightly less power compared with NPEs, and that the H-B scale is more likely than the others to cause a type II error. These results suggest that choice of assessment method, optimizing sample number and number of replicate estimates, and using a balanced experimental design are important criteria to consider to maximize the power of hypothesis tests for comparing treatments using disease severity estimates.
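The balanced-versus-unbalanced comparison at a fixed total number of observations can be sketched with a simple simulation. The severity means, standard deviation, and two-sample t-test below are illustrative assumptions, not the paper's simulation model, which was built from Septoria severity distributions and rater-bias models.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def power(n1, n2, delta=5.0, sd=8.0, reps=2000, alpha=0.05):
    """Simulated power of a two-sample t-test on severity estimates."""
    hits = 0
    for _ in range(reps):
        a = rng.normal(20.0, sd, n1)          # treatment 1 severity (%)
        b = rng.normal(20.0 + delta, sd, n2)  # treatment 2 severity (%)
        hits += stats.ttest_ind(a, b).pvalue < alpha
    return hits / reps

# Fixed total of 60 observations: balanced vs unbalanced allocation.
print(power(30, 30), power(10, 50))  # balanced design yields higher power
```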
Darnaude, Audrey M.
2016-01-01
Background Mixture models (MM) can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some or all nursery-signatures, may also need to be estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM), under several incomplete sampling and nursery-signature separation scenarios. Methods We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011), from four distinct nursery habitats (Mediterranean lagoons). Artificial nursery-source and mixed-stock datasets were produced considering five different sampling scenarios, where 0–4 lagoons were excluded from the nursery-source dataset, and six nursery-signature separation scenarios, which simulated data separated by 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI) and uncertainty (SE) were computed to assess reliability for each of the three sets of MM parameters. Results Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06) when all nursery-sources were sampled, but exhibited large variability among cohorts and increased with the number of non-sampled sources, up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but these estimates tended to be less biased and more uncertain than mixing proportion ones across all sampling scenarios (BI < 0.13, SE < 0.29). Increasing separation among nursery signatures improved the reliability of mixing proportion estimates, but led to non-linear responses in baseline signature parameters. Low uncertainty but a consistent underestimation bias affected the estimated number of nursery sources across all incomplete sampling scenarios. Discussion ML-MM produced reliable estimates of mixing proportions and nursery-signatures under an important range of incomplete sampling and nursery-signature separation scenarios. The method failed, however, to estimate the true number of nursery sources, reflecting a pervasive issue affecting mixture models within and beyond the ML framework. Large differences in bias and uncertainty found among cohorts were linked to differences in the separation of chemical signatures among nursery habitats. Simulation approaches, such as those presented here, could be useful to evaluate the sensitivity of MM results to separation and variability in nursery-signatures for other species, habitats or cohorts. PMID:27761305
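When the baseline nursery signatures are known, the mixing proportions can be estimated by maximum likelihood with a short EM iteration. The sketch below uses hypothetical two-element signatures and Gaussian components; it is a minimal stand-in for the ML-MM approach, not the study's implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(5)

# Hypothetical baseline nursery signatures (2 otolith elements, 3 sources).
means = [np.array([0.0, 0.0]), np.array([3.0, 0.5]), np.array([1.0, 3.0])]
cov = np.eye(2) * 0.5

# Simulated mixed stock with true proportions 0.5 / 0.3 / 0.2.
true_pi = np.array([0.5, 0.3, 0.2])
z = rng.choice(3, size=300, p=true_pi)
X = np.stack([rng.multivariate_normal(means[k], cov) for k in z])

# EM updates for the mixing proportions, component densities held fixed.
dens = np.stack([mvn(m, cov).pdf(X) for m in means], axis=1)  # (n, K)
pi = np.full(3, 1 / 3)
for _ in range(200):
    resp = dens * pi
    resp /= resp.sum(axis=1, keepdims=True)   # responsibilities
    pi = resp.mean(axis=0)
print(pi.round(2))  # close to the true proportions when sources separate well
```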
Automatic sample Dewar for MX beam-line
DOE Office of Scientific and Technical Information (OSTI.GOV)
Charignon, T.; Tanchon, J.; Trollier, T.
2014-01-29
It is very common for crystals of large biological macromolecules to show considerable variation in the quality of their diffraction. In order to increase the number of samples that are tested for diffraction quality before any full data collections at the ESRF*, an automatic sample Dewar has been implemented. The conception and performance of the Dewar are reported in this paper. The automatic sample Dewar has a 240-sample capacity with automatic loading/unloading ports. The storage Dewar is capable of working with robots and can be integrated into a fully automatic MX** beam-line. The samples are positioned in front of the loading/unloading ports with an automatic rotating plate. A view port has been implemented for data matrix camera reading of each sample loaded in the Dewar. The Dewar is insulated with polyurethane foam that keeps liquid nitrogen consumption below 1.6 L/h, and the static insulation also makes vacuum equipment and maintenance unnecessary. This Dewar will be useful for increasing the number of samples tested at synchrotrons.
Pikkemaat, M G; Rapallini, M L B A; Karp, M T; Elferink, J W A
2010-08-01
Tetracyclines are extensively used in veterinary medicine. For the detection of tetracycline residues in animal products, a broad array of methods is available. Luminescent bacterial biosensors represent an attractive, inexpensive, simple and fast method for screening large numbers of samples. A previously developed cell-biosensor method was subjected to an evaluation study using over 300 routine poultry samples, and the results were compared with a microbial inhibition test. The cell-biosensor assay yielded many more suspect samples, 10.2% versus 2% with the inhibition test, all of which could be confirmed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Only one sample contained a concentration above the maximum residue limit (MRL) of 100 microg kg(-1), while residue levels in most of the suspect samples were very low (<10 microg kg(-1)). The method appeared to be specific and robust. Using an experimental set-up comprising the analysis of a series of three sample dilutions allowed an appropriate cut-off for confirmatory analysis, limiting the number of samples requiring further analysis to a minimum.
A comparison of liver sampling techniques in dogs.
Kemp, S D; Zimmerman, K L; Panciera, D L; Monroe, W E; Leib, M S; Lanz, O I
2015-01-01
The liver sampling technique in dogs that consistently provides samples adequate for accurate histopathologic interpretation is not known. To compare histopathologic results of liver samples obtained by punch, cup, and 14 gauge needle to large wedge samples collected at necropsy. Seventy dogs undergoing necropsy. Prospective study. Liver specimens were obtained from the left lateral liver lobe with an 8 mm punch, a 5 mm cup, and a 14 gauge needle. After sample acquisition, two larger tissue samples were collected near the center of the left lateral lobe to be used as a histologic standard for comparison. Histopathologic features and numbers of portal triads in each sample were recorded. The mean number of portal triads obtained by each sampling method was 2.9 in needle samples, 3.4 in cup samples, 12 in punch samples, and 30.7 in the necropsy samples. The diagnoses in 66% of needle samples, 60% of cup samples, and 69% of punch samples were in agreement with the necropsy samples, and these proportions were not significantly different from each other. The corresponding kappa coefficients were 0.59 for needle biopsies, 0.52 for cup biopsies, and 0.62 for punch biopsies. The histopathologic interpretation of a liver sample in the dog is unlikely to vary if the liver biopsy specimen contains at least 3-12 portal triads. However, in comparison to the large necropsy samples, the accuracy of all tested methods was relatively low. Copyright © 2014 by the American College of Veterinary Internal Medicine.
Recommendations for representative ballast water sampling
NASA Astrophysics Data System (ADS)
Gollasch, Stephan; David, Matej
2017-05-01
Until now, the purpose of ballast water sampling studies was predominantly limited to general scientific interest to determine the variety of species arriving in ballast water in a recipient port. Knowing the variety of species arriving in ballast water also contributes to the assessment of relative species introduction vector importance. Further, some sampling campaigns addressed awareness raising or the determination of organism numbers per water volume to evaluate the species introduction risk by analysing the propagule pressure of species. A new aspect of ballast water sampling, which this contribution addresses, is compliance monitoring and enforcement of ballast water management standards as set by, e.g., the IMO Ballast Water Management Convention. To achieve this, sampling methods which result in representative ballast water samples are essential. We recommend such methods based on practical tests conducted on two commercial vessels also considering results from our previous studies. The results show that different sampling approaches influence the results regarding viable organism concentrations in ballast water samples. It was observed that the sampling duration (i.e., length of the sampling process), timing (i.e., in which point in time of the discharge the sample is taken), the number of samples and the sampled water quantity are the main factors influencing the concentrations of viable organisms in a ballast water sample. Based on our findings we provide recommendations for representative ballast water sampling.
Comparison of oral fluid collection methods for the molecular detection of hepatitis B virus.
Portilho, M M; Mendonça, Acf; Marques, V A; Nabuco, L C; Villela-Nogueira, C A; Ivantes, Cap; Lewis-Ximenez, L L; Lampe, E; Villar, L M
2017-11-01
This study aims to compare the efficiency of four oral fluid collection methods (Salivette, FTA Card, spitting and DNA-Sal) for detecting HBV DNA by qualitative PCR. Seventy-four individuals (32 HBV reactive and 42 with no HBV markers) donated serum and oral fluid. In-house qualitative PCR was used to detect HBV in both sample types, and commercial quantitative PCR was used for serum. HBV DNA was detected in all serum samples from HBV-infected individuals and was not detected in the control group. In the HBV group, HBV DNA was detected in 17 samples collected with the Salivette device, 16 samples collected with the FTA Card device, 16 samples collected by spitting and 13 samples collected with the DNA-Sal device. Samples corresponding to a higher viral load in the paired serum sample could be detected using all oral fluid collection methods, but the Salivette device yielded the largest number of positive samples and detected a wide range of viral loads. It was possible to detect HBV DNA using all devices tested, but the highest number of positive samples was observed with the Salivette device, which showed high concordance with the viral loads observed in the paired serum samples. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd. All rights reserved.
Landguth, Erin L.; Gedy, Bradley C.; Oyler-McCance, Sara J.; Garey, Andrew L.; Emel, Sarah L.; Mumma, Matthew; Wagner, Helene H.; Fortin, Marie-Josée; Cushman, Samuel A.
2012-01-01
The influence of study design on the ability to detect the effects of landscape pattern on gene flow is one of the most pressing methodological gaps in landscape genetic research. To investigate the effect of study design on landscape genetics inference, we used a spatially-explicit, individual-based program to simulate gene flow in a spatially continuous population inhabiting a landscape with gradual spatial changes in resistance to movement. We simulated a wide range of combinations of number of loci, number of alleles per locus and number of individuals sampled from the population. We assessed how these three aspects of study design influenced the statistical power to successfully identify the generating process among competing hypotheses of isolation-by-distance, isolation-by-barrier, and isolation-by-landscape resistance using a causal modelling approach with partial Mantel tests. We modelled the statistical power to identify the generating process as a response surface for equilibrium and non-equilibrium conditions after introduction of isolation-by-landscape resistance. All three variables (loci, alleles and sampled individuals) affect the power of causal modelling, but to different degrees. Stronger partial Mantel r correlations between landscape distances and genetic distances were found when more loci were used and when loci were more variable, which makes comparisons of effect size between studies difficult. Number of individuals did not affect the accuracy through mean equilibrium partial Mantel r, but larger samples decreased the uncertainty (increasing the precision) of equilibrium partial Mantel r estimates. We conclude that amplifying more (and more variable) loci is likely to increase the power of landscape genetic inferences more than increasing number of individuals.
Landguth, E.L.; Fedy, B.C.; Oyler-McCance, S.J.; Garey, A.L.; Emel, S.L.; Mumma, M.; Wagner, H.H.; Fortin, M.-J.; Cushman, S.A.
2012-01-01
The influence of study design on the ability to detect the effects of landscape pattern on gene flow is one of the most pressing methodological gaps in landscape genetic research. To investigate the effect of study design on landscape genetics inference, we used a spatially-explicit, individual-based program to simulate gene flow in a spatially continuous population inhabiting a landscape with gradual spatial changes in resistance to movement. We simulated a wide range of combinations of number of loci, number of alleles per locus and number of individuals sampled from the population. We assessed how these three aspects of study design influenced the statistical power to successfully identify the generating process among competing hypotheses of isolation-by-distance, isolation-by-barrier, and isolation-by-landscape resistance using a causal modelling approach with partial Mantel tests. We modelled the statistical power to identify the generating process as a response surface for equilibrium and non-equilibrium conditions after introduction of isolation-by-landscape resistance. All three variables (loci, alleles and sampled individuals) affect the power of causal modelling, but to different degrees. Stronger partial Mantel r correlations between landscape distances and genetic distances were found when more loci were used and when loci were more variable, which makes comparisons of effect size between studies difficult. Number of individuals did not affect the accuracy through mean equilibrium partial Mantel r, but larger samples decreased the uncertainty (increasing the precision) of equilibrium partial Mantel r estimates. We conclude that amplifying more (and more variable) loci is likely to increase the power of landscape genetic inferences more than increasing number of individuals. ?? 2011 Blackwell Publishing Ltd.
An In Situ Method for Sizing Insoluble Residues in Precipitation and Other Aqueous Samples
Axson, Jessica L.; Creamean, Jessie M.; Bondy, Amy L.; Capracotta, Sonja S.; Warner, Katy Y.; Ault, Andrew P.
2015-01-01
Particles are frequently incorporated into clouds or precipitation, influencing climate by acting as cloud condensation or ice nuclei, taking up coatings during cloud processing, and removing species through wet deposition. Many of these particles, particularly ice nuclei, can remain suspended within cloud droplets/crystals as insoluble residues. While previous studies have measured the soluble or bulk mass of species within clouds and precipitation, no studies to date have determined the number concentration and size distribution of insoluble residues in precipitation or cloud water using in situ methods. Herein, for the first time we demonstrate that Nanoparticle Tracking Analysis (NTA) is a powerful in situ method for determining the total number concentration, number size distribution, and surface area distribution of insoluble residues in precipitation, both of rain and melted snow. The method uses 500 μL or less of liquid sample and does not require sample modification. Number concentrations for the insoluble residues in aqueous precipitation samples ranged from 2.0–3.0(±0.3) × 10⁸ particles cm⁻³, while surface area ranged from 1.8(±0.7)–3.2(±1.0) × 10⁷ μm² cm⁻³. Number size distributions peaked between 133–150 nm, with both single and multi-modal character, while surface area distributions peaked between 173–270 nm. Comparison with electron microscopy of particles up to 10 μm shows that, by number, > 97% of residues are < 1 μm in diameter, the upper limit of the NTA. The range of concentration and distribution properties indicates that insoluble residue properties vary with ambient aerosol concentrations, cloud microphysics, and meteorological dynamics. NTA has great potential for studying the role that insoluble residues play in critical atmospheric processes. PMID:25705069
Barone, Teresa L; Storey, John M E; Domingo, Norberto
2010-08-01
A field-aged, passive diesel particulate filter (DPF) used in a school bus retrofit program was evaluated for emissions of particle mass and number concentration before, during, and after regeneration. For the particle mass measurements, filter samples were collected for gravimetric analysis with a partial flow sampling system, which sampled proportionally to the exhaust flow. A condensation particle counter and scanning mobility particle sizer measured total number concentration and number-size distributions, respectively. The results of the evaluation show that the number concentration emissions decreased as the DPF became loaded with soot. However, after soot removal by regeneration, the number concentration emissions were approximately 20 times greater, which suggests the importance of the soot layer in helping to trap particles. Contrary to the number concentration results, particle mass emissions decreased from 6 +/- 1 mg/hp-hr before regeneration to 3 +/- 2 mg/hp-hr after regeneration. This indicates that nanoparticles with diameters less than 50 nm may have been emitted after regeneration because these particles contribute little to the total mass. Overall, average particle emission reductions of 95% by mass and 10,000-fold by number concentration after 4 yr of use provided evidence of the durability of a field-aged DPF. In contrast to previous reports for new DPFs in which elevated number concentrations occurred during the first 200 sec of a transient cycle, the number concentration emissions were elevated during the second half of the heavy-duty Federal Test Procedure (FTP) when high speed was sustained. This information is relevant for the analysis of mechanisms by which particles are emitted from field-aged DPFs.
Brungs, Daniel; Lynch, David; Luk, Alison Ws; Minaei, Elahe; Ranson, Marie; Aghmesheh, Morteza; Vine, Kara L; Carolan, Martin; Jaber, Mouhannad; de Souza, Paul; Becker, Therese M
2018-02-21
To demonstrate the feasibility of cryopreservation of peripheral blood mononuclear cells (PBMCs) for prognostic circulating tumor cell (CTC) detection in gastroesophageal cancer. Using 7.5 mL blood samples collected in EDTA tubes from patients with gastroesophageal adenocarcinoma, CTCs were isolated by epithelial cell adhesion molecule based immunomagnetic capture using the IsoFlux platform. Paired specimens taken during the same blood draw (n = 15) were used to compare the number of CTCs isolated from fresh and cryopreserved PBMCs. Blood samples were processed within 24 h to recover the PBMC fraction, with PBMCs used for fresh analysis immediately processed for CTC isolation. Cryopreservation of PBMCs lasted from 2 wk to 25.2 mo (median 14.6 mo). CTCs isolated from pre-treatment cryopreserved PBMCs (n = 43) were examined for associations with clinicopathological variables and survival outcomes. While there was a significant trend to a decrease in CTC numbers associated with cryopreserved specimens (mean number of CTCs 34.4 vs 51.5, P = 0.04), this was predominately in samples with a total CTC count of > 50, with low CTC count samples less affected (P = 0.06). There was no significant association between the duration of cryopreservation and the number of CTCs. In cryopreserved PBMCs from patient samples prior to treatment, a high CTC count (> 17) was associated with poorer overall survival (OS) (n = 43, HR = 4.4, 95%CI: 1.7-11.7, P = 0.0013). In multivariate analysis, after controlling for sex, age, stage, ECOG performance status, and primary tumor location, a high CTC count remained significantly associated with poorer OS (HR = 3.7, 95%CI: 1.2-12.4, P = 0.03). PBMC cryopreservation for delayed CTC isolation is a valid strategy to assist with sample collection, transport and processing.
Ultrasonic Acoustic Emissions from the Sapwood of Cedar and Hemlock 1
Tyree, Melvin T.; Dixon, Michael A.; Tyree, E. Loeta; Johnson, Robert
1984-01-01
Measurements are reported of ultrasonic acoustic emissions (AEs) measured from sapwood samples of Thuja occidentalis L. and Tsuga canadensis (L.) Carr. during air dehydration. The measurements were undertaken to test the following three hypotheses: (a) Each cavitation event produces one ultrasonic AE. (b) Large tracheids are more likely to cavitate than small tracheids. (c) When stem water potentials are >−0.4 MPa, a significant fraction of the water content of sapwood is held by 'capillary forces.' The last two hypotheses were recently discussed at length by M. H. Zimmermann. Experimental evidence consistent with all three hypotheses was obtained. The evidence for each hypothesis respectively is: (a) the cumulative number of AEs nearly equals the number of tracheids in small samples; (b) more water is lost per AE event at the beginning of the dehydration process than at the end, and (c) sapwood samples dehydrated from an initial water potential of 0 MPa lost significantly more water before AEs started than lost by samples dehydrated from an initial water potential of about −0.4 MPa. The extra water held by fully hydrated sapwood samples may have been capillary water as defined by Zimmermann. We also report an improved method for the measurement of the 'intensity' of ultrasonic AEs. Intensity is defined here as the area under the positive spikes of the AE signal (plotted as voltage versus time). This method was applied to produce a frequency histogram of the number of AEs versus intensity. A large fraction of the total number of AEs were of low intensity even in small samples (4 mm diameter by 10 mm length). This suggests that the effective 'listening distance' for most AEs was less than 5 to 10 mm. PMID:16663774
Brown, Larry R.; Panshin, Sandra Y.; Kratzer, Charles R.; Zamora, Celia; Gronberg, JoAnn M.
2004-01-01
Water samples were collected from 22 drainage basins for analysis of 48 dissolved pesticides during summer flow conditions in 1994 and 2001. Of the 48 pesticides, 31 were reported applied in the basin in the 28 days preceding the June 1994 sampling, 25 in the 28 days preceding the June 2001 sampling, and 24 in the 28 days preceding the August 2001 sampling. The number of dissolved pesticides detected was similar among sampling periods: 26 were detected in June 1994, 28 in June 2001, and 27 in August 2001. Concentrations of chlorpyrifos exceeded the California criterion for the protection of aquatic life from acute exposure at six sites in June 1994 and at five sites in June 2001. There was a single exceedance of the criterion for diazinon in June 1994. The number of pesticides applied in tributary basins was highly correlated with basin area during each sampling period (Spearman's r = 0.85, 0.70, and 0.84 in June 1994, June 2001, and August 2001, respectively, and p < 0.01 in all cases). Larger areas likely include a wider variety of crops, resulting in more varied pesticide use. Jaccard's similarities, cluster analysis, principal components analysis, and instantaneous load calculations generally indicate that west-side tributary basins were different from east-side tributary basins. In general, west-side basins had higher concentrations, instantaneous loads, and instantaneous yields of dissolved pesticides than east-side basins, although there were a number of exceptions. These differences may be related to a number of factors, including differences in basin size, soil texture, land use, irrigation practices, and stream discharge.
A study of active learning methods for named entity recognition in clinical text.
Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Denny, Joshua C; Xu, Hua
2015-12-01
Named entity recognition (NER), a sequential labeling task, is one of the fundamental tasks for building clinical natural language processing (NLP) systems. Machine learning (ML) based approaches can achieve good performance, but they often require large amounts of annotated samples, which are expensive to build due to the requirement of domain experts in annotation. Active learning (AL), a sample selection approach integrated with supervised ML, aims to minimize the annotation cost while maximizing the performance of ML-based models. In this study, our goal was to develop and evaluate both existing and new AL methods for a clinical NER task to identify concepts of medical problems, treatments, and lab tests from clinical notes. Using the annotated NER corpus from the 2010 i2b2/VA NLP challenge, which contained 349 clinical documents with 20,423 unique sentences, we simulated AL experiments using a number of existing and novel algorithms in three categories: uncertainty-based, diversity-based, and baseline sampling strategies. These were compared with passive learning, which uses random sampling. Learning curves that plot the performance of the NER model against the estimated annotation cost (based on the number of sentences or words in the training set) were generated to evaluate the different active learning methods and passive learning, and the area under the learning curve (ALC) score was computed. Based on the learning curves of F-measure vs. number of sentences, uncertainty sampling algorithms outperformed all other methods in ALC. Most diversity-based methods also performed better than random sampling in ALC. To achieve an F-measure of 0.80, the best method, based on uncertainty sampling, could save 66% of annotations in sentences compared with random sampling. For the learning curves of F-measure vs. number of words, uncertainty sampling methods again outperformed all other methods in ALC. To achieve 0.80 in F-measure, in comparison to random sampling, the best uncertainty-based method saved 42% of annotations in words, but the best diversity-based method reduced annotation effort by only 7%. In the simulated setting, AL methods, particularly uncertainty-sampling based approaches, seemed to significantly save annotation cost for the clinical NER task. The actual benefit of active learning in clinical NER should be further evaluated in a real-time setting. Copyright © 2015 Elsevier Inc. All rights reserved.
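An uncertainty-sampling loop of the kind evaluated here can be sketched with any probabilistic classifier. The example below is a toy stand-in (least-confidence selection on a synthetic classification task via scikit-learn), not the clinical NER setup, which used sequential labeling models on i2b2 sentences.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)

# Toy stand-in for sentence selection: least-confidence uncertainty sampling.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labeled = list(rng.choice(len(X), 20, replace=False))   # small seed set
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(10):                       # 10 simulated annotation rounds
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Least confidence: query the instance whose top class is least certain.
    query = pool[int(np.argmin(proba.max(axis=1)))]
    labeled.append(query)
    pool.remove(query)
print(model.score(X, y))
```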
Aitken, C G
1999-07-01
It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
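Under a uniform Beta(1, 1) prior, the minimum sample size satisfying such a criterion reduces to a one-line posterior-probability check. The sketch below assumes a large consignment, an inspection in which every sampled unit contains illegal material, and illustrative criterion values (proportion 0.5, probability 0.95); it is a minimal illustration of the beta-distribution approach, not the paper's full treatment.

```python
from scipy.stats import beta

def min_sample_size(p0=0.5, prob=0.95, a=1.0, b=1.0, m_max=1000):
    """Smallest sample size m such that, if all m inspected units contain
    drugs, the Beta(a + m, b) posterior gives P(theta > p0) >= prob.
    Beta(1, 1) is a uniform prior on the consignment proportion theta."""
    for m in range(1, m_max + 1):
        if beta.sf(p0, a + m, b) >= prob:
            return m
    raise ValueError("no sample size up to m_max satisfies the criterion")

print(min_sample_size())            # 4 under a uniform prior
print(min_sample_size(prob=0.99))   # a stricter criterion needs a larger m
```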
NASA Astrophysics Data System (ADS)
Javadi, S.; Ouyang, B.; Zhang, Z.; Ghoranneviss, M.; Salar Elahi, A.; Rawat, R. S.
2018-06-01
Tungsten is the leading candidate for plasma facing component (PFC) material for thermonuclear fusion reactors, and various efforts are ongoing to evaluate its performance or response to intense fusion relevant radiation, plasma and thermal loads. This paper investigates the effects of hot dense decaying pinch plasma, highly energetic deuterium ions and fusion neutrons generated in a low-energy (3.0 kJ) plasma focus device on the structure, morphology and hardness of PLANSEE double forged tungsten (W) sample surfaces. The tungsten samples were provided by Forschungszentrum Juelich (FZJ), Germany via the International Atomic Energy Agency, Vienna, Austria. Tungsten samples were irradiated using different numbers of plasma focus (PF) shots (1, 5 and 10) at a fixed axial distance of 5 cm from the anode top, and also at various distances from the top of the anode (5, 7, 9 and 11 cm) using a fixed number (5) of PF shots. The virgin tungsten sample had a bcc structure (α-W phase). After PF irradiation, XRD analysis showed (i) the presence of a new low-intensity diffraction peak corresponding to the β-W phase at the (211) crystalline plane, indicating a partial structural phase transition in some of the samples, (ii) partial amorphization, and (iii) vacancy defect formation and compressive stress in irradiated tungsten samples. Field emission scanning electron microscopy showed distinctive changes to a non-uniform surface with nanometer-sized particles and particle agglomerates, along with large surface cracks at higher numbers of irradiation shots. X-ray photoelectron spectroscopy analysis demonstrated a reduction in relative tungsten oxide content and an increase in metallic tungsten after irradiation. Hardness of irradiated samples initially increased for one-shot exposure due to the reduction in the tungsten oxide phase, but then decreased with increasing number of shots due to the increasing concentration of defects. It is demonstrated that the plasma focus device provides appropriately intense fusion relevant pulses for testing structural, morphological and mechanical changes in irradiated tungsten samples.
Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R
2017-09-14
While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so, balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. ©Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
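The point estimate and its random error can be sketched with the delta method, treating M as known and P as estimated from the survey with an assumed design effect. The numbers below are illustrative, not the Harare estimates, and the simple variance formula is a stand-in for the paper's full approach.

```python
import math

def size_estimate_ci(M, p_hat, n, deff=2.0, z=1.96):
    """Multiplier-method population size estimate with an approximate CI.

    M      : count of unique objects distributed (treated as known exactly)
    p_hat  : proportion in the RDS survey reporting receipt
    n      : RDS survey sample size
    deff   : assumed design effect of the RDS survey
    Delta-method variance for N = M / P, treating only p_hat as random.
    """
    N = M / p_hat
    se_p = math.sqrt(deff * p_hat * (1 - p_hat) / n)
    se_N = M * se_p / p_hat**2
    return N, (N - z * se_N, N + z * se_N)

# Small p_hat inflates the relative width of the interval.
print(size_estimate_ci(M=1000, p_hat=0.25, n=400))
print(size_estimate_ci(M=1000, p_hat=0.05, n=400))
```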
Livingston, Michael; Dietze, Paul; Ferris, Jason; Pennay, Darren; Hayes, Linda; Lenton, Simon
2013-03-16
Telephone surveys based on samples of landline telephone numbers are widely used to measure the prevalence of health risk behaviours such as smoking, drug use and alcohol consumption. An increasing number of households are relying solely on mobile telephones, creating a potential bias for population estimates derived from landline-based sampling frames which do not incorporate mobile phone numbers. Studies in the US have identified significant differences between landline and mobile telephone users in smoking and alcohol consumption, but there has been little work in other settings or focussed on illicit drugs. This study examined Australian prevalence estimates of cannabis use, tobacco smoking and risky alcohol consumption based on samples selected using a dual-frame (mobile and landline) approach. Respondents from the landline sample were compared both to the overall mobile sample (including respondents who had access to a landline) and specifically to respondents who lived in mobile-only households. Bivariate comparisons were complemented with multivariate logistic regression models, controlling for the effects of basic demographic variables. The landline sample reported much lower prevalence of tobacco use, cannabis use and alcohol consumption than the mobile samples. Once demographic variables were adjusted for, there were no significant differences between the landline and mobile respondents on any of the alcohol measures examined. In contrast, the mobile samples had significantly higher rates of cannabis and tobacco use, even after adjustment. Weighted estimates from the dual-frame sample were generally higher than the landline sample across all substances, but only significantly higher for tobacco use. Landline telephone surveys in Australia are likely to substantially underestimate the prevalence of tobacco smoking by excluding potential respondents who live in mobile-only households. In contrast, estimates of alcohol consumption and cannabis use from landline surveys are likely to be broadly accurate, once basic demographic weighting is undertaken.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-26
... DEPARTMENT OF TRANSPORTATION National Highway Traffic Safety Administration [U.S. DOT Docket Number NHTSA-2010-0122] 2009 Fatality Analysis Reporting System (FARS)/National Automotive Sampling... Administration (NHTSA)--2009 Fatality Analysis Reporting System (FARS) & National Automotive Sampling System...
Use of multispectral data in design of forest sample surveys
NASA Technical Reports Server (NTRS)
Titus, S. J.; Wensel, L. C.
1977-01-01
The use of multispectral data in design of forest sample surveys using a computer software package, WILLIAM, is described. The system allows evaluation of a number of alternative sampling systems and, with appropriate cost data, estimates the implementation cost for each.
NASA Astrophysics Data System (ADS)
Herd, C. D. K.; Tornabene, L. L.; Bowling, T. J.; Walton, E. L.; Sharp, T. G.; Melosh, H. J.; Hamilton, J. S.; Viviano, C. E.; Ehlmann, B. L.
2018-04-01
We have made advances in constraining the potential source craters of the martian meteorites to a relatively small number. Our results have implications for Mars chronology and the prioritization of samples for Mars Sample Return.
Automatic bio-sample bacteria detection system
NASA Technical Reports Server (NTRS)
Chappelle, E. W.; Colburn, M.; Kelbaugh, B. N.; Picciolo, G. L.
1971-01-01
Electromechanical device analyzes urine specimens in 15 minutes and processes one sample per minute. Instrument utilizes bioluminescent reaction between luciferase-luciferin mixture and adenosine triphosphate (ATP) to determine number of bacteria present in the sample. Device has potential application to analysis of other body fluids.
Spatio-temporal optimization of sampling for bluetongue vectors (Culicoides) near grazing livestock
2013-01-01
Background Estimating the abundance of Culicoides using light traps is influenced by a large variation in abundance in time and place. This study investigates the optimal trapping strategy to estimate the abundance or presence/absence of Culicoides on a field with grazing animals. We used 45 light traps to sample specimens from the Culicoides obsoletus species complex on a 14 hectare field during 16 nights in 2009. Findings The large number of traps and catch nights enabled us to simulate a series of samples consisting of different numbers of traps (1-15) on each night. We also varied the number of catch nights when simulating the sampling, and sampled with increasing minimum distances between traps. We used resampling to generate a distribution of different mean and median abundance in each sample. Finally, we used the hypergeometric distribution to estimate the probability of falsely detecting absence of vectors on the field. The variation in the estimated abundance decreased steeply when using up to six traps, and was less pronounced when using more traps, although no clear cutoff was found. Conclusions Despite spatial clustering in vector abundance, we found no effect of increasing the distance between traps. We found that 18 traps were generally required to reach 90% probability of a true positive catch when sampling just one night. But when sampling over two nights the same probability level was obtained with just three traps per night. The results are useful for the design of vector monitoring programmes on fields with grazing animals. PMID:23705770
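The false-absence calculation can be sketched directly with the hypergeometric distribution. The counts below (trap positions, positions that would yield a catch, traps deployed) are illustrative assumptions, not the study's field data, and catch nights are treated as independent.

```python
from scipy.stats import hypergeom

def prob_false_absence(n_sites=45, n_positive=20, n_traps=3, nights=2):
    """Probability that every sampled trap misses the vectors, i.e. absence
    is falsely declared, assuming n_positive of n_sites trap positions would
    yield a catch on any given night (illustrative numbers)."""
    miss_one_night = hypergeom.pmf(0, n_sites, n_positive, n_traps)
    return miss_one_night ** nights   # catch nights assumed independent

# Probability of at least one true positive catch over two nights.
print(1 - prob_false_absence())
```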
Classification and authentication of unknown water samples using machine learning algorithms.
Kundu, Palash K; Panchariya, P C; Kundu, Madhusree
2011-07-01
This paper develops machine-learning-based classification and authentication of real-life water samples. The proposed techniques use experimental measurements from a pulse voltammetry method based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. E-tongues comprise arrays of solid-state ion sensors, transducers (possibly of different types), data collectors, and data analysis tools, all oriented to the classification of liquid samples and the authentication of unknown liquid samples. The time-series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for six different ISI (Bureau of Indian Standards) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell), was the data source for developing two types of machine learning algorithms, classification and regression. The water data set consisted of six sample classes with 4402 features each. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A partial least squares (PLS) based classifier, dedicated to authenticating a specific category of water sample, evolved as an integral part of the E-tongue instrumentation system. The developed PCA- and PLS-based E-tongue system achieved encouraging overall authentication accuracy for the aforesaid categories of water samples. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.
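As a rough illustration of the pipeline described above, the following sketch combines PCA-based classification with a PLS-based authenticator. It is assumed, not the authors' implementation: the array shapes, the k-NN classifier on PCA scores, and the 0.5 authentication threshold are stand-ins.

```python
# Hedged sketch of the PCA-classification / PLS-authentication idea.
# Data shapes and classifier choices are hypothetical placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.neighbors import KNeighborsClassifier

X = np.random.rand(120, 4402)      # 120 voltammetric scans, 4402 features
y = np.repeat(np.arange(6), 20)    # 6 water brands, 20 scans each

# PCA compresses the E-tongue signal before classification.
pca = PCA(n_components=10).fit(X)
clf = KNeighborsClassifier(n_neighbors=3).fit(pca.transform(X), y)

# PLS regressed on a one-vs-rest indicator authenticates a single brand.
target = (y == 0).astype(float)    # authenticate class 0
pls = PLSRegression(n_components=5).fit(X, target)
is_authentic = pls.predict(X[:1]).ravel()[0] > 0.5
```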
Optimised cord blood sample selection for small-scale CD34+ cell immunomagnetic isolation.
Perdomo-Arciniegas, Ana-María; Vernot, Jean-Paul
2012-03-01
Haematopoietic stem cells (HSCs) are defined as multipotential cells, capable of self-renewal and of reconstituting the haematopoietic compartment in vivo. The CD34 antigen is considered an important HSC marker in humans, and immunomagnetic isolation targeting the CD34 antigen is widely used for human HSC separation. This method allows the enrichment of human HSCs, which are present at low frequencies in umbilical cord blood (CB). The reproducibility of immunomagnetic CD34+ cell isolation, in terms of cell yield and purity, is affected by the CD34+ cell frequency and the total cell numbers present in a given sample; CB HSC purification may thus yield variable results, which also depend on the CB sample volume and on cell loss during density fractionation. The uncertainty of such an outcome and the associated technical costs call for a cost-effective sample screening strategy. A correlation analysis using clinical and laboratory data from 59 CB samples was performed to establish predictive variables for CD34+ immunomagnetic HSC isolation. This study describes the positive association of CD34+ cell isolation with the white and red cell numbers present after cell fractionation; furthermore, purity correlated with lymphocyte percentages. Cut-off values for the predictive variables, particularly useful when low CB volumes are collected (as under the now-prevalent clinical practice of late umbilical cord clamping), are proposed for HSC isolation sampling. Using the simple and cost-effective CB sample screening criteria described here would avoid costly, inefficient sample purifications, thereby ensuring that pure CD34+ cells are obtained in the desired numbers following CD34 immunomagnetic isolation.
Accounting for imperfect detection in Hill numbers for biodiversity studies
Broms, Kristin M.; Hooten, Mevin B.; Fitzpatrick, Ryan M.
2015-01-01
The occupancy-based Hill number estimators are always at their asymptotic values (i.e. as if an infinite number of samples have been taken for the study region), therefore making it easy to compare biodiversity between different assemblages. In addition, the Hill numbers are computed as derived quantities within a Bayesian hierarchical model, allowing for straightforward inference.
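For reference, the Hill number of order q is ^qD = (Σ p_i^q)^(1/(1−q)), with the Shannon limit at q = 1. A minimal stand-alone sketch follows; it is illustrative only, since the paper computes these quantities inside a Bayesian occupancy model that corrects for imperfect detection.

```python
# Minimal sketch: Hill numbers of order q from relative abundances.
import numpy as np

def hill_number(p, q):
    """Hill number ^qD for relative abundances p (summing to 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return np.exp(-np.sum(p * np.log(p)))  # limit as q -> 1
    return np.sum(p ** q) ** (1.0 / (1.0 - q))

p = [0.5, 0.3, 0.2]
print(hill_number(p, 0), hill_number(p, 1), hill_number(p, 2))
# q=0: species richness; q=1: exp(Shannon); q=2: inverse Simpson
```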
76 FR 66683 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-27
...: National Oceanic and Atmospheric Administration (NOAA). Title: Protocol for Access to Tissue Specimen Samples from the National Marine Mammal Tissue Bank. OMB Control Number: 0648-0468. Form Number(s): NA... The National Marine Mammal Tissue Bank (NMMTB) was established by the National Marine Fisheries...
DNA copy number changes define spatial patterns of heterogeneity in colorectal cancer
Mamlouk, Soulafa; Childs, Liam Harold; Aust, Daniela; Heim, Daniel; Melching, Friederike; Oliveira, Cristiano; Wolf, Thomas; Durek, Pawel; Schumacher, Dirk; Bläker, Hendrik; von Winterfeld, Moritz; Gastl, Bastian; Möhr, Kerstin; Menne, Andrea; Zeugner, Silke; Redmer, Torben; Lenze, Dido; Tierling, Sascha; Möbs, Markus; Weichert, Wilko; Folprecht, Gunnar; Blanc, Eric; Beule, Dieter; Schäfer, Reinhold; Morkel, Markus; Klauschen, Frederick; Leser, Ulf; Sers, Christine
2017-01-01
Genetic heterogeneity between and within tumours is a major factor determining cancer progression and therapy response. Here we examined DNA sequence and DNA copy-number heterogeneity in colorectal cancer (CRC) by targeted high-depth sequencing of the 100 most frequently altered genes. In 97 samples, comprising primary tumours and matched metastases from 27 patients, we observe inter-tumour concordance for coding mutations; in contrast, gene copy numbers are highly discordant between primary tumours and metastases, as validated by fluorescence in situ hybridization. To further investigate intra-tumour heterogeneity, we dissected a single tumour into 68 spatially defined samples and sequenced them separately. We identify evenly distributed coding mutations in APC and TP53 in all tumour areas, yet highly variable gene copy numbers in numerous genes. 3D morpho-molecular reconstruction reveals two clusters with divergent copy number aberrations along the proximal–distal axis, indicating that DNA copy number variations are a major source of tumour heterogeneity in CRC. PMID:28120820
On the importance of incorporating sampling weights in ...
Occupancy models are used extensively to assess wildlife-habitat associations and to predict species distributions across large geographic regions. Occupancy models were developed as a tool to properly account for imperfect detection of a species. Current guidelines on survey design requirements for occupancy models focus on the number of sample units and the pattern of revisits to a sample unit within a season. We focus instead on the sampling design, that is, how the sample units are selected in geographic space (e.g., stratified, simple random, unequal probability, etc.). In a probability design, each sample unit has a sampling weight which quantifies the number of sample units it represents in the finite (oftentimes areal) sampling frame. We demonstrate the importance of including sampling weights in occupancy model estimation when the design is not a simple random sample or equal probability design. We assume a finite areal sampling frame, as proposed for a national bat monitoring program. We compare several unequal and equal probability designs and varying sampling intensity within a simulation study. We found the traditional single-season occupancy model produced biased estimates of occupancy and lower confidence interval coverage rates compared to occupancy models that accounted for the sampling design. We also discuss how our findings inform the analyses proposed for the nascent North American Bat Monitoring Program and other collaborative synthesis efforts that propose h...
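The role of the sampling weights can be seen in a toy calculation. The sketch below is assumed, with perfect detection purely to isolate the design effect (the paper embeds the weights in a full occupancy model); it contrasts a design-weighted, Horvitz-Thompson-style occupancy estimate with the naive unweighted mean.

```python
# Hedged illustration: design-weighted vs unweighted occupancy estimate.
# Detection is assumed perfect here to isolate the design effect.
import numpy as np

z = np.array([1, 1, 0, 0, 1])       # detected occupancy at 5 sample units
w = np.array([10, 10, 50, 50, 10])  # frame units each sample represents

unweighted = z.mean()                  # 0.60
weighted = np.sum(w * z) / np.sum(w)   # ~0.23
print(unweighted, weighted)
# If low-weight units were oversampled where the species occurs, the
# unweighted mean overstates occupancy across the whole sampling frame.
```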
NASA Astrophysics Data System (ADS)
Ogasawara, M.; Kato, K.
2009-04-01
We invented a novel methodology for identifying the origin of archaeological bitumen by use of field-ionization mass spectrometry (FI-MS). In the FI-MS method, fragmentation of molecular ions is minimal and there is a unit charge on each molecule. Thus, the observed mass spectra directly reflect the distribution of the molecular weights of the alkane components in bitumen. This distribution can serve as a molecular criterion for characterizing the bitumen sources from which each bitumen sample was derived. We decomposed the FI-MS spectra by Z-number into several components: the Z-number refers to z in the formula CnH2n+z, so that 2-z is the deficit number of H atoms compared with the corresponding saturated hydrocarbon CnH2n+2, which is in turn correlated with the ring number in alkanes. The integrated intensities of the component spectra corresponding to each Z-number were compared with each other; differences in the observed spectra reflect differences in the concentrations of alkane groups with different Z-numbers. In this way, the intensity data of the component spectra were used as indexes to search for the origin of the bitumen. FI-MS measurements were performed on 67 samples from five different bitumen sources and on 41 bitumen samples excavated from archaeological sites in Honshu and Hokkaido, the largest and second-largest islands of Japan, and on Sakhalin Island in Russia. Using the spectral intensities of the seven alkane components in each sample, multiple discriminant analysis was applied to the data for raw bitumen samples and for samples excavated from archaeological sites. The GC-MS chromatograms obtained from the archaeological samples from the Honshu area were all consistent with the results obtained by multivariate analysis, confirming the validity of the newly developed Z-number analysis. For the archaeological bitumen samples in Hokkaido, it was found that bitumen from Niigata, one of the main sources in Honshu, spread to the north in 2000 BC, reaching a small island near the northern end of Hokkaido. Bitumen from Sakhalin reached the central lowland of Hokkaido, but it did not enter Honshu. Bitumen from Akita, another main source in Honshu, predominated in the northeastern part of Honshu and on the Oshima peninsula located at the southeastern end of Hokkaido. This picture is consistent with a strong cultural tie between the Oshima peninsula and northern Honshu throughout the Jomon period, the long-lasting cultural period of Japanese prehistory. A long trade route along the coast of the Sea of Japan has been argued for on the basis of recent archaeological findings from excavations. Our results will shed more light on the geopolitical situation of the area in the Jomon period.
Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.
2015-01-01
Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
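The numerical core of such a model can be sketched as a regularized linear solve through the SVD of the predictor matrix. The following is an assumed illustration (a ridge-type shrinkage computed via the SVD, not the authors' exact Bayesian estimator; data sizes are made up):

```python
# Sketch: regularized linear model solved through the SVD of the LiDAR
# predictor matrix, taming multicollinearity and overfitting when the
# number of predictors rivals the number of field plots.
import numpy as np

def svd_ridge(X, y, lam):
    """Ridge coefficients via SVD: beta = V diag(s/(s^2+lam)) U^T y."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s ** 2 + lam)
    return Vt.T @ (d * (U.T @ y))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 80))        # 50 field plots, 80 LiDAR metrics
beta_true = np.zeros(80)
beta_true[:5] = 1.0
y = X @ beta_true + rng.normal(scale=0.1, size=50)
beta = svd_ridge(X, y, lam=1.0)
```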
Contaminants in landfill soils - Reliability of prefeasibility studies.
Hölzle, Ingo
2017-05-01
Recent landfill mining studies have researched the potential for resource recovery using samples from core drilling or grab cranes. However, most studies used small sample numbers, which may not represent the heterogeneous landfill composition. As a consequence, there is a high risk of an incorrect economic and/or ecological evaluation. The main objective of this work is to investigate the possibilities and limitations of preliminary investigations concerning the crucial soil composition. The preliminary samples of landfill investigations were compared to the excavation samples from three completely excavated landfills in Germany. In addition, the research compared the reliability of prediction of the two investigation methods, core drilling and grab crane. Sampling using a grab crane led to better results, even for smaller investigations of 10 samples. Analyses of both methods showed sufficiently accurate results to make predictions (standard error 5%, level of confidence 95%) for most heavy metals, cyanide and PAH in the dry substance, and for sulphate, barium, benzo[a]pyrene, pH and the electrical conductivity in leachate analyses of soil-type waste. While chromium and nickel showed less accurate results, the concentrations of hydrocarbons, TOC, DOC, PCB and fluorine (leachate) were not predictable even for sample numbers of up to 59. Overestimations of pollutant concentrations were more frequent in drilling, and underestimations when using a grab crane. The dispersion of the element and elemental composition had no direct impact on the reliability of prediction. Thus, an individual consideration of the particular element or elemental composition for dry substance and leachate analyses is recommended in order to adapt the sampling strategy and calculate an optimum sample number. Copyright © 2016 Elsevier Ltd. All rights reserved.
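A back-of-envelope version of the "optimum sample number" calculation can be written with the classical normal-approximation sample-size formula; this is an assumed stand-in, and the paper's exact procedure may differ.

```python
# Assumed sketch: samples needed so the estimated mean concentration is
# within a relative error E of the true mean at a given confidence level.
from scipy.stats import norm

def required_samples(cv, rel_error=0.05, confidence=0.95):
    """n = (z * CV / E)^2, for coefficient of variation CV."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    return int((z * cv / rel_error) ** 2) + 1

# A weakly varying analyte (CV = 10%) needs far fewer samples than a
# patchy one (CV = 100%), consistent with hydrocarbons being unpredictable.
print(required_samples(0.10), required_samples(1.00))
```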
Molecular Surveillance as Monitoring Tool for Drug-Resistant Plasmodium falciparum in Suriname
Adhin, Malti R.; Labadie-Bracho, Mergiory; Bretas, Gustavo
2013-01-01
The aim of this translational study was to show the use of molecular surveillance for polymorphisms and copy number as a monitoring tool to track the emergence and dynamics of Plasmodium falciparum drug resistance. A molecular baseline for Suriname was established in 2005, with P. falciparum chloroquine resistance transporter (pfcrt) and P. falciparum multidrug resistance (pfmdr1) markers and copy number in 40 samples. The baseline results revealed the existence of a uniformly distributed mutated genotype corresponding with the fully mefloquine-sensitive 7G8-like genotype (Y184F, S1034C, N1042D, and D1246Y) and a fixed pfmdr1 N86 haplotype. All samples harbored the pivotal pfcrtK76T mutation, showing that chloroquine reintroduction should not yet be contemplated in Suriname. After 5 years, 40 samples were assessed to trace temporal changes in the status of pfmdr1 polymorphisms and copy number and showed minor genetic alterations in the pfmdr1 gene and no significant changes in copy number, thus providing scientific support for prolongation of the current drug policy in Suriname. PMID:23836573
Global Characterization of Protein Altering Mutations in Prostate Cancer
2011-08-01
...integrative analyses of somatic mutation with gene expression and copy number change data collected on the same samples. To date, we have performed... implications for resistance to cancer therapeutics. We have also identified a subset of genes that appear to be recurrently mutated in our discovery set, and... This is a "synergy" project...
2006-09-01
Naval Postgraduate School, Monterey, CA 93943-5000. ...mounted in a SATEC load frame. Figure 4 is a schematic of the diffusion bonding chamber and associated equipment. Al/Si and Cu/Quartz samples were...
1979 Reserve Force Studies Surveys: Survey Design, Sample Design and Administrative Procedures,
1981-08-01
three factors: the need for a statistically significant number of usable questionnaires from different groups within the random sample and from... Because of the multipurpose nature of these surveys and the large number of questions needed to fully address some of the topics covered, we... varies. Collection of data at the unit level is needed to accurately estimate actual reserve compensation and benefits and their possible role in both...
Deepening and Extending Channels for Navigation. Charleston Harbor, South Carolina.
1980-04-01
U.S. Army Corps of Engineers, Charleston District. ...marine areas, are written for water quality rather than sediments), a comparison was made between the most recent (1975) sediment samples from... Charleston Harbor and sediment samples taken from locations in the Atlantic Intracoastal Waterway where one would expect to find non-contaminated material...
Dynamic Pressure Induced Transformation Toughening and Strengthening in Bulk Metallic Glasses
2013-11-01
...involved impact of a 303 stainless steel flyer plate on a 303 stainless steel sample holder containing two BMGMC samples, at varying velocities. The Hugoniot... Technology. An aluminum sabot was used as the projectile with a 303 Stainless Steel (SS) flyer plate to impact the DV1 bulk metallic glass composite. As... crystallization; polyamorphism; shear banding; high-strain-rate deformation
Laser nitriding of iron: Nitrogen profiles and phases
NASA Astrophysics Data System (ADS)
Illgner, C.; Schaaf, P.; Lieb, K. P.; Schubert, E.; Queitsch, R.; Bergmann, H.-W.
1995-07-01
Armco iron samples were surface nitrided by irradiating them with pulses of an excimer laser in a nitrogen atmosphere. The resulting nitrogen depth profiles measured by Resonant Nuclear Reaction Analysis (RNRA) and the phase formation determined by Conversion Electron Mössbauer Spectroscopy (CEMS) were investigated as functions of energy density and the number of pulses. The nitrogen content of the samples was found to be independent of the number of pulses in a layer of 50 nm from the surface and to increase in depths exceeding 150 nm. The phase composition did not change with the number of pulses. The nitrogen content can be related to an enhanced nitrogen solubility based on high temperatures and high pressures due to the laser-induced plasma above the sample. With increasing pulse energy density, the phase composition changes towards phases with higher nitrogen contents. Nitrogen diffusion seems to be the limiting factor for the nitriding process.
NASA Astrophysics Data System (ADS)
Revathy, J. S.; Anooja, J.; Krishnaveni, R. B.; Gangadathan, M. P.; Varier, K. M.
2018-06-01
A light-weight multichannel analyser (MCA)-based γ-ray spectrometer, developed earlier at the Inter University Accelerator Centre, New Delhi, has been used as part of the PG curriculum to determine effective atomic numbers for the attenuation of 137Cs γ-rays in different types of samples. The samples used are mixtures of graphite, aluminum and selenium powders in different proportions, commercial and home-made edible powders, fruit and vegetable juices, as well as certain allopathic and ayurvedic medications. A narrow-beam good-geometry set-up has been used in the experiments. The measured attenuation coefficients have been used to extract effective atomic numbers of the samples. The results are consistent with XCOM values wherever available. The present results suggest that the γ-attenuation technique can be used as an effective non-destructive method for detecting adulteration of food materials.
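The measurement itself reduces to Beer-Lambert attenuation. A minimal sketch using the standard formulas follows; the counts, thickness and density below are invented, and Z_eff extraction from μ/ρ is only indicated in a comment.

```python
# Narrow-beam transmission sketch (standard Beer-Lambert; numbers invented).
import numpy as np

def mass_attenuation(I0, I, thickness_cm, density_g_cm3):
    """mu/rho from transmission I/I0 through a slab of known thickness."""
    mu = np.log(I0 / I) / thickness_cm   # linear coefficient, 1/cm
    return mu / density_g_cm3            # cm^2/g

mu_rho = mass_attenuation(I0=15000, I=9200, thickness_cm=2.0,
                          density_g_cm3=1.1)
# Z_eff then follows by interpolating mu/rho at 662 keV against tabulated
# (e.g. XCOM) values for materials of known atomic number.
```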
An investigation on mechanical properties of steel fibre reinforced for underwater welded joint
NASA Astrophysics Data System (ADS)
Navin, K.; Zakaria, M. S.; Zairi, S.
2017-09-01
Underwater pipelines are constantly exposed to water and have a high tendency to corrode, especially at the welded joint. This research uses fibreglass as the steel-fibre coating on the welded joint to determine its effectiveness in preventing corrosion of the joint. The number of coating layers is varied to determine the better number of coats for the pipeline. Some samples were left without immersion in salt water, while others were immersed in salt water with the same salinity as sea water. The samples are prepared in a dog-bone shape so that they can be tested in a universal tensile machine (UTM). The prepared samples are left immersed for the recommended time and then tested in the UTM. The results are used to determine whether breakage occurs at the welded joint or elsewhere, and also the suitable number of coating layers to be used.
The Coalescent Process in Models with Selection
Kaplan, N. L.; Darden, T.; Hudson, R. R.
1988-01-01
Statistical properties of the process describing the genealogical history of a random sample of genes are obtained for a class of population genetics models with selection. For models with selection, in contrast to models without selection, the distribution of this process, the coalescent process, depends on the distribution of the frequencies of alleles in the ancestral generations. If the ancestral frequency process can be approximated by a diffusion, then the mean and the variance of the number of segregating sites due to selectively neutral mutations in random samples can be numerically calculated. The calculations are greatly simplified if the frequencies of the alleles are tightly regulated. If the mutation rates between alleles maintained by balancing selection are low, then the number of selectively neutral segregating sites in a random sample of genes is expected to substantially exceed the number predicted under a neutral model. PMID:3066685
Kondrashova, Olga; Love, Clare J.; Lunke, Sebastian; Hsu, Arthur L.; Waring, Paul M.; Taylor, Graham R.
2015-01-01
Whilst next-generation sequencing can report point mutations in fixed-tissue tumour samples reliably, the accurate determination of copy number is more challenging. The conventional Multiplex Ligation-dependent Probe Amplification (MLPA) assay is an effective tool for measurement of gene dosage, but is restricted to around 50 targets due to the size resolution of the MLPA probes. By switching from a size-resolved format to a sequence-resolved format, we developed a scalable, high-throughput, quantitative assay. MLPA-seq is capable of detecting deletions, duplications, and amplifications in as little as 5 ng of genomic DNA, including from formalin-fixed paraffin-embedded (FFPE) tumour samples. We show that this method can detect BRCA1, BRCA2, ERBB2 and CCNE1 copy number changes in DNA extracted from snap-frozen and FFPE tumour tissue, with 100% sensitivity and >99.5% specificity. PMID:26569395
Detection of image structures using the Fisher information and the Rao metric.
Maybank, Stephen J
2004-12-01
In many detection problems, the structures to be detected are parameterized by the points of a parameter space. If the conditional probability density function for the measurements is known, then detection can be achieved by sampling the parameter space at a finite number of points and checking each point to see if the corresponding structure is supported by the data. The number of samples and the distances between neighboring samples are calculated using the Rao metric on the parameter space. The Rao metric is obtained from the Fisher information which is, in turn, obtained from the conditional probability density function. An upper bound is obtained for the probability of a false detection. The calculations are simplified in the low noise case by making an asymptotic approximation to the Fisher information. An application to line detection is described. Expressions are obtained for the asymptotic approximation to the Fisher information, the volume of the parameter space, and the number of samples. The time complexity for line detection is estimated. An experimental comparison is made with a Hough transform-based method for detecting lines.
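The counting argument can be illustrated in one dimension: place test points a fixed Rao distance apart, so the number of samples is the Riemannian length of the parameter interval divided by the length each point covers. The toy sketch below assumes a Gaussian location parameter, not the paper's line-detection geometry.

```python
# Toy sketch of sample counting under the Rao metric (assumed 1-D case).
import numpy as np

sigma = 0.1                                   # measurement noise (assumed)
theta = np.linspace(0.0, 1.0, 1001)           # parameter interval
fisher = np.full_like(theta, 1.0 / sigma**2)  # I(theta) for a Gaussian mean

# Rao (Riemannian) length of the interval: integral of sqrt(I(theta)).
length = np.sum(np.sqrt(fisher[:-1]) * np.diff(theta))  # ~ 1/sigma = 10

r = 0.5                      # Rao radius each test point is trusted to cover
n_samples = int(np.ceil(length / (2.0 * r)))
print(n_samples)             # 10 test points suffice for this toy problem
```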
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS
Huang, Jian; Horowitz, Joel L.; Wei, Fengrong
2010-01-01
We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739
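The group-Lasso selection step can be sketched with proximal gradient descent and block soft-thresholding, one block per additive component's B-spline coefficients. This is an assumed illustration, not the authors' algorithm (they use the adaptive variant with a group-Lasso initial estimator).

```python
# Assumed sketch: group Lasso via proximal gradient descent.
import numpy as np

def group_soft_threshold(b, t):
    # Block soft-thresholding: shrinks the whole group, possibly to zero.
    nrm = np.linalg.norm(b)
    return np.zeros_like(b) if nrm <= t else (1.0 - t / nrm) * b

def group_lasso(X, y, groups, lam, n_iter=500):
    # Minimizes (1/2n)||y - Xb||^2 + lam * sum_g sqrt(|g|) * ||b_g||.
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant
    beta = np.zeros(p)
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y)) / n
        for g in groups:                   # one group per additive component
            beta[g] = group_soft_threshold(z[g], step * lam * np.sqrt(len(g)))
    return beta

# groups = [np.arange(0, 5), np.arange(5, 10), ...]  # B-spline bases
```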
Perpendicular distance sampling: an alternative method for sampling downed coarse woody debris
Michael S. Williams; Jeffrey H. Gove
2003-01-01
Coarse woody debris (CWD) plays an important role in many forest ecosystem processes. In recent years, a number of new methods have been proposed to sample CWD. These methods select individual logs into the sample using some form of unequal probability sampling. One concern with most of these methods is the difficulty in estimating the volume of each log. A new method...
Monitoring Species of Concern Using Noninvasive Genetic Sampling and Capture-Recapture Methods
2016-11-01
ABBREVIATIONS: AICc, Akaike's Information Criterion with small sample size correction; AZGFD, Arizona Game and Fish Department; BMGR, Barry M. Goldwater...; MNKA, Minimum Number Known Alive; N, Abundance; Ne, Effective Population Size; NGS, Noninvasive Genetic Sampling; NGS-CR, Noninvasive Genetic... parameter estimates from capture-recapture models require sufficient sample sizes, capture probabilities and low capture biases. For NGS-CR, sample...
The influence of incubation time on adenovirus quantitation in A549 cells by most probable number.
Cashdollar, Jennifer L; Huff, Emma; Ryu, Hodon; Grimm, Ann C
2016-11-01
Cell culture based assays used to detect waterborne viruses typically call for incubating the sample for at least two weeks in order to ensure that all the culturable virus present is detected. Historically, this estimate was based, at least in part, on the length of time used for detecting poliovirus. In this study, we have examined A549 cells infected with human adenovirus type 2, and have found that a three week incubation of virus infected cells results in a higher number of detected viruses by quantal assay than what is seen after two weeks of incubation, with an average 955% increase in Most Probable Number (MPN) from 2 weeks to 3 weeks. This increase suggests that the extended incubation time is essential for accurately estimating viral titer, particularly for slow-growing viruses, UV treated samples, or samples with low titers of virus. In addition, we found that for some UV-treated samples, there was no detectable MPN at 2 weeks, but after 3 weeks, MPN values were obtained. For UV-treated samples, the average increase in MPN from 2 weeks to 3 weeks was 1401%, while untreated samples averaged a change in MPN of 674%, leading us to believe that the UV-damaged viral DNA may be able to be repaired such that viral replication then occurs. Published by Elsevier B.V.
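For context, an MPN value is the maximum-likelihood concentration given the pattern of positive wells across a dilution series. Below is a minimal sketch of that standard calculation; it is not the paper's software, and the dilution series is invented.

```python
# Standard MPN estimate from a dilution series by maximum likelihood.
import numpy as np
from scipy.optimize import brentq

def mpn(volumes, n_wells, n_positive):
    """MPN per unit volume; wells positive with prob 1 - exp(-c * v)."""
    v = np.asarray(volumes, float)
    n = np.asarray(n_wells, float)
    x = np.asarray(n_positive, float)

    def score(c):  # derivative of the log-likelihood in concentration c
        p = 1.0 - np.exp(-c * v)
        return np.sum(x * v * np.exp(-c * v) / p - (n - x) * v)

    return brentq(score, 1e-9, 1e6)

# 3 tenfold dilutions, 5 wells each, (5, 3, 1) wells positive:
print(mpn([1.0, 0.1, 0.01], [5, 5, 5], [5, 3, 1]))
```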
Machine learning from computer simulations with applications in rail vehicle dynamics
NASA Astrophysics Data System (ADS)
Taheri, Mehdi; Ahmadian, Mehdi
2016-05-01
The application of stochastic modelling for learning the behaviour of multibody dynamics (MBD) models is investigated. Post-processing data from a simulation run are used to train a stochastic model that estimates the relationship between model inputs (suspension relative displacement and velocity) and the output (sum of suspension forces). The stochastic model can be used to reduce the computational burden of the MBD model by replacing a computationally expensive subsystem in the model (the suspension subsystem). With minor changes, the stochastic modelling technique is able to learn the behaviour of a physical system and integrate that behaviour within MBD models. The technique is highly advantageous for MBD models where real-time simulation is necessary, or for models that have a large number of repeated substructures, e.g. modelling a train with a large number of railcars. Because the training data are acquired prior to the development of the stochastic model, conventional sampling-plan strategies such as Latin hypercube designs, in which simulations are performed at the inputs dictated by the plan, cannot be used. Since the sampling plan greatly influences the overall accuracy and efficiency of the stochastic predictions, a sampling plan suitable for the process is developed in which the most space-filling subset of the acquired data, with a given number of sample points, that best describes the dynamic behaviour of the system under study is selected as the training data.
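One standard way to realize such a space-filling subset selection is greedy maximin (farthest-point) sampling over the logged inputs. The sketch below is an assumed illustration, not necessarily the authors' exact criterion; the input dimensions and sizes are placeholders.

```python
# Assumed sketch: greedy maximin selection of k space-filling training
# points from inputs logged during a prior simulation run.
import numpy as np

def maximin_subset(X, k):
    """Start near the centroid, then repeatedly add the point farthest
    from everything selected so far."""
    X = np.asarray(X, float)
    chosen = [int(np.argmin(np.linalg.norm(X - X.mean(0), axis=1)))]
    d = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))     # farthest remaining point
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)

# inputs: suspension relative displacement and velocity from a prior run
logged = np.random.rand(10000, 2)
train_idx = maximin_subset(logged, 200)
```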