Sample records for large sample numbers

  1. Large sample area and size are needed for forest soil seed bank studies to ensure low discrepancy with standing vegetation.

    PubMed

    Shen, You-xin; Liu, Wei-li; Li, Yu-hui; Guan, Hui-lin

    2014-01-01

    A large number of small-sized samples invariably shows that woody species are absent from forest soil seed banks, leading to a large discrepancy with the seedling bank on the forest floor. We ask: 1) Does this conventional sampling strategy limit the detection of seeds of woody species? 2) Are large sample areas and sample sizes needed for higher recovery of seeds of woody species? We collected 100 samples that were 10 cm (length) × 10 cm (width) × 10 cm (depth), referred to as larger number of small-sized samples (LNSS) in a 1 ha forest plot, and placed them to germinate in a greenhouse, and collected 30 samples that were 1 m × 1 m × 10 cm, referred to as small number of large-sized samples (SNLS) and placed them (10 each) in a nearby secondary forest, shrub land and grass land. Only 15.7% of woody plant species of the forest stand were detected by the 100 LNSS, contrasting with 22.9%, 37.3% and 20.5% woody plant species being detected by SNLS in the secondary forest, shrub land and grassland, respectively. The increased number of species vs. sampled areas confirmed power-law relationships for forest stand, the LNSS and SNLS at all three recipient sites. Our results, although based on one forest, indicate that conventional LNSS did not yield a high percentage of detection for woody species, but SNLS strategy yielded a higher percentage of detection for woody species in the seed bank if samples were exposed to a better field germination environment. A 4 m2 minimum sample area derived from power equations is larger than the sampled area in most studies in the literature. Increased sample size also is needed to obtain an increased sample area if the number of samples is to remain relatively low.

  2. Spreadsheet Simulation of the Law of Large Numbers

    ERIC Educational Resources Information Center

    Boger, George

    2005-01-01

    If larger and larger samples are successively drawn from a population and a running average calculated after each sample has been drawn, the sequence of averages will converge to the mean, [mu], of the population. This remarkable fact, known as the law of large numbers, holds true if samples are drawn from a population of discrete or continuous…
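
    As a rough illustration of the convergence described in this record, the following Python sketch plays the role of the article's spreadsheet simulation; the population, its parameters, and the sample sizes are arbitrary choices, not taken from the article.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    mu, sigma = 10.0, 4.0                        # population mean and spread (arbitrary)
    draws = rng.normal(mu, sigma, size=100_000)

    # Running average after each successive draw.
    running_mean = np.cumsum(draws) / np.arange(1, draws.size + 1)

    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n = {n:>6d}   running average = {running_mean[n - 1]:.3f}")
    # The printed averages drift toward mu = 10 as n grows, as the law of large numbers predicts.
    ```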

  3. The cost of large numbers of hypothesis tests on power, effect size and sample size.

    PubMed

    Lazzeroni, L C; Ray, A

    2012-01-01

    Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands that can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
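
    The scaling reported above can be reproduced approximately with a two-sided z-test power calculation under a Bonferroni correction. The sketch below is a simplified stand-in for the authors' Excel calculator; the normal-approximation formula and the Bonferroni adjustment are assumptions on our part, not necessarily the authors' exact method.

    ```python
    from scipy.stats import norm

    def relative_sample_size(n_tests, alpha=0.05, power=0.80):
        """Required n relative to a single test, for a two-sided z-test
        with Bonferroni-corrected significance level alpha / n_tests."""
        z_beta = norm.ppf(power)
        z_multi = norm.ppf(1 - (alpha / n_tests) / 2)
        z_single = norm.ppf(1 - alpha / 2)
        return (z_multi + z_beta) ** 2 / (z_single + z_beta) ** 2

    print(relative_sample_size(10))         # ~1.7  (about 70% more than a single test)
    print(relative_sample_size(10_000_000) /
          relative_sample_size(1_000_000))  # ~1.13 (about 13% more than one million tests)
    ```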

  4. Sampling large random knots in a confined space

    NASA Astrophysics Data System (ADS)

    Arsuaga, J.; Blackstone, T.; Diao, Y.; Hinson, K.; Karadayi, E.; Saito, M.

    2007-09-01

    DNA knots formed under extreme conditions of condensation, as in bacteriophage P4, are difficult to analyze experimentally and theoretically. In this paper, we propose to use the uniform random polygon model as a supplementary method to the existing methods for generating random knots in confinement. The uniform random polygon model allows us to sample knots with large crossing numbers and also to generate large diagrammatically prime knot diagrams. We show numerically that uniform random polygons sample knots with large minimum crossing numbers and certain complicated knot invariants (such as those observed experimentally). We do this in terms of the knot determinants or colorings. Our numerical results suggest that the average determinant of a uniform random polygon of n vertices grows faster than O(e^{n^2}). We also investigate the complexity of prime knot diagrams. We show rigorously that the probability that a randomly selected 2D uniform random polygon of n vertices is almost diagrammatically prime goes to 1 as n goes to infinity. Furthermore, the average number of crossings in such a diagram is on the order of O(n^2). Therefore, the two-dimensional uniform random polygons offer an effective way of sampling large (prime) knots, which can be useful in various applications.

  5. Estimation of the rain signal in the presence of large surface clutter

    NASA Technical Reports Server (NTRS)

    Ahamad, Atiq; Moore, Richard K.

    1994-01-01

    The principal limitation for the use of a spaceborne imaging SAR as a rain radar is the surface-clutter problem. Signals may be estimated in the presence of noise by averaging large numbers of independent samples. This method was applied to obtain an estimate of the rain echo by averaging a set of N_c samples of the clutter in a separate measurement and subtracting the clutter estimate from the combined estimate. The number of samples required for successful estimation (within 10-20%) for off-vertical angles of incidence appears to be prohibitively large. However, by appropriately degrading the resolution in both range and azimuth, the required number of samples can be obtained. For vertical incidence, the number of samples required for successful estimation is reasonable. In estimating the clutter it was assumed that the surface echo is the same outside the rain volume as it is within the rain volume. This may be true for the forest echo, but for convective storms over the ocean the surface echo outside the rain volume is very different from that within. It is suggested that the experiment be performed with vertical incidence over forest to overcome this limitation.
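
    The averaging-and-subtraction idea can be illustrated with a toy power-domain simulation; the exponential fading statistics and the signal and clutter levels below are illustrative assumptions, not the instrument parameters of the study.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    rain_power, clutter_power = 1.0, 20.0   # illustrative mean powers (clutter ~13 dB above the rain echo)

    def estimate_rain(n_samples):
        # Independent power samples are exponentially distributed about their mean powers.
        combined = rng.exponential(rain_power + clutter_power, n_samples)
        clutter = rng.exponential(clutter_power, n_samples)   # separate clutter-only measurement
        return combined.mean() - clutter.mean()               # clutter-subtracted rain estimate

    for n in (100, 10_000, 1_000_000):
        est = estimate_rain(n)
        rel_err = abs(est - rain_power) / rain_power
        print(f"N = {n:>8d}   rain estimate = {est:6.3f}   relative error = {rel_err:.1%}")
    # With the clutter well above the rain echo, a very large N is needed before the
    # estimate settles within 10-20% of the true rain power.
    ```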

  6. Investigating the Randomness of Numbers

    ERIC Educational Resources Information Center

    Pendleton, Kenn L.

    2009-01-01

    The use of random numbers is pervasive in today's world. Random numbers have practical applications in such far-flung arenas as computer simulations, cryptography, gambling, the legal system, statistical sampling, and even the war on terrorism. Evaluating the randomness of extremely large samples is a complex, intricate process. However, the…

  7. Accelerating root system phenotyping of seedlings through a computer-assisted processing pipeline.

    PubMed

    Dupuy, Lionel X; Wright, Gladys; Thompson, Jacqueline A; Taylor, Anna; Dekeyser, Sebastien; White, Christopher P; Thomas, William T B; Nightingale, Mark; Hammond, John P; Graham, Neil S; Thomas, Catherine L; Broadley, Martin R; White, Philip J

    2017-01-01

    There are numerous systems and techniques to measure the growth of plant roots. However, phenotyping large numbers of plant roots for breeding and genetic analyses remains challenging. One major difficulty is to achieve high throughput and resolution at a reasonable cost per plant sample. Here we describe a cost-effective root phenotyping pipeline, on which we perform time and accuracy benchmarking to identify bottlenecks in such pipelines and strategies for their acceleration. Our root phenotyping pipeline was assembled with custom software and low-cost materials and equipment. Results show that sample preparation and handling of samples during screening are the most time-consuming tasks in root phenotyping. Algorithms can be used to speed up the extraction of root traits from image data, but when applied to large numbers of images, there is a trade-off between time of processing the data and errors contained in the database. Scaling-up root phenotyping to large numbers of genotypes will require not only automation of sample preparation and sample handling, but also efficient algorithms for error detection for more reliable replacement of manual interventions.

  8. An evaluation of sampling and full enumeration strategies for Fisher Jenks classification in big data settings

    USGS Publications Warehouse

    Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.

    2017-01-01

    Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. The impacts of spatial autocorrelation, number of desired classes, and form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
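
    A minimal sketch of the comparison, using a small dynamic-programming implementation of Fisher-Jenks style optimal one-dimensional breaks (written from the standard within-class sum-of-squared-deviations recursion, not taken from the paper) to contrast breaks fitted on a random sample with breaks fitted on the full attribute vector:

    ```python
    import numpy as np

    def jenks_breaks(values, k):
        """Optimal 1D class breaks minimizing within-class sum of squared deviations (O(k n^2))."""
        x = np.sort(np.asarray(values, dtype=float))
        n = x.size
        s1 = np.concatenate(([0.0], np.cumsum(x)))
        s2 = np.concatenate(([0.0], np.cumsum(x * x)))

        def ssd(i, j):                       # cost of one class spanning x[i..j] inclusive
            cnt = j - i + 1
            tot = s1[j + 1] - s1[i]
            return (s2[j + 1] - s2[i]) - tot * tot / cnt

        cost = np.full((n, k), np.inf)
        split = np.zeros((n, k), dtype=int)
        for j in range(n):
            cost[j, 0] = ssd(0, j)
        for c in range(1, k):
            for j in range(c, n):
                for i in range(c, j + 1):    # i = start index of the last class
                    cand = cost[i - 1, c - 1] + ssd(i, j)
                    if cand < cost[j, c]:
                        cost[j, c], split[j, c] = cand, i
        breaks, j = [], n - 1                # recover the upper boundary of each class
        for c in range(k - 1, 0, -1):
            i = split[j, c]
            breaks.append(x[i - 1])
            j = i - 1
        return sorted(breaks)

    rng = np.random.default_rng(2)
    attribute = rng.lognormal(mean=0.0, sigma=1.0, size=1000)   # skewed "map attribute"
    sample = rng.choice(attribute, size=100, replace=False)     # 10% simple random sample

    print("full enumeration breaks:", np.round(jenks_breaks(attribute, 5), 3))
    print("sample-based breaks:    ", np.round(jenks_breaks(sample, 5), 3))
    ```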

  9. Using next-generation sequencing for high resolution multiplex analysis of copy number variation from nanogram quantities of DNA from formalin-fixed paraffin-embedded specimens.

    PubMed

    Wood, Henry M; Belvedere, Ornella; Conway, Caroline; Daly, Catherine; Chalkley, Rebecca; Bickerdike, Melissa; McKinley, Claire; Egan, Phil; Ross, Lisa; Hayward, Bruce; Morgan, Joanne; Davidson, Leslie; MacLennan, Ken; Ong, Thian K; Papagiannopoulos, Kostas; Cook, Ian; Adams, David J; Taylor, Graham R; Rabbitts, Pamela

    2010-08-01

    The use of next-generation sequencing technologies to produce genomic copy number data has recently been described. Most approaches, however, rely on optimal starting DNA, and are therefore unsuitable for the analysis of formalin-fixed paraffin-embedded (FFPE) samples, which largely precludes the analysis of many tumour series. We have sought to challenge the limits of this technique with regard to quality and quantity of starting material and the depth of sequencing required. We confirm that the technique can be used to interrogate DNA from cell lines, fresh frozen material and FFPE samples to assess copy number variation. We show that as little as 5 ng of DNA is needed to generate a copy number karyogram, and follow this up with data from a series of FFPE biopsies and surgical samples. We have used various levels of sample multiplexing to demonstrate the adjustable resolution of the methodology, depending on the number of samples and available resources. We also demonstrate reproducibility by use of replicate samples and comparison with microarray-based comparative genomic hybridization (aCGH) and digital PCR. This technique can be valuable in both the analysis of routine diagnostic samples and in examining large repositories of fixed archival material.

  10. Accuracy or precision: Implications of sample design and methodology on abundance estimation

    USGS Publications Warehouse

    Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.

    2015-01-01

    Sampling by spatially replicated counts (point-count) is an increasingly popular method of estimating population size of organisms. Challenges exist when sampling by the point-count method, and it is often impractical to sample the entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either a few large sample units or many small sample units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than abundance estimates derived from sample scenarios with many sample units of small area. It is important to consider accuracy and precision of abundance estimates during the sample design process with study goals and objectives fully recognized; too often, however, and with consequences, consideration of accuracy and precision of abundance estimates is an afterthought that occurs during the data analysis process.
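
    A toy version of this simulation design is sketched below; it is not the authors' N-mixture analysis (organisms are placed uniformly, detection is perfect, and a simple design-based density estimator is used), but it shows how the two sampling layouts can be set up and compared for bias and spread at equal total sampled area.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    N_ORGANISMS, REPS = 2000, 500           # 10 x 10 landscape, true density = 20 per unit area

    def density_estimate(n_units, unit_side):
        """Place organisms uniformly, count those inside randomly placed square units."""
        pts = rng.uniform(0, 10, size=(N_ORGANISMS, 2))
        counts = []
        for _ in range(n_units):
            x0, y0 = rng.uniform(0, 10 - unit_side, size=2)
            inside = ((pts[:, 0] >= x0) & (pts[:, 0] < x0 + unit_side) &
                      (pts[:, 1] >= y0) & (pts[:, 1] < y0 + unit_side))
            counts.append(inside.sum())
        return np.mean(counts) / unit_side ** 2          # estimated density per unit area

    # Both designs sample the same total area (6.25 square units).
    for label, n_units, side in [("many small units", 25, 0.5), ("few large units", 4, 1.25)]:
        ests = np.array([density_estimate(n_units, side) for _ in range(REPS)])
        print(f"{label:16s}  mean = {ests.mean():6.2f}   sd = {ests.std():5.2f}   (truth = 20)")
    ```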

  11. Advances in metabolomic applications in plant genetics and breeding

    USDA-ARS's Scientific Manuscript database

    Metabolomics is a systems biology discipline wherein abundances of endogenous metabolites from biological samples are identified and quantitatively measured across a large range of metabolites and/or a large number of samples. Since all developmental, physiological and response to the environment ph...

  12. Screen Space Ambient Occlusion Based Multiple Importance Sampling for Real-Time Rendering

    NASA Astrophysics Data System (ADS)

    Zerari, Abd El Mouméne; Babahenini, Mohamed Chaouki

    2018-03-01

    We propose a new approximation technique for accelerating the Global Illumination algorithm for real-time rendering. The proposed approach is based on the Screen-Space Ambient Occlusion (SSAO) method, which approximates the global illumination for large, fully dynamic scenes at interactive frame rates. Current algorithms that are based on the SSAO method suffer from difficulties due to the large number of samples that are required. In this paper, we propose an improvement to the SSAO technique by integrating it with a Multiple Importance Sampling technique that combines a stratified sampling method with an importance sampling method, with the objective of reducing the number of samples. Experimental evaluation demonstrates that our technique can produce high-quality images in real time and is significantly faster than traditional techniques.
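
    The core of multiple importance sampling, combining a uniform strategy and an importance-sampling strategy with the balance heuristic, can be shown on a simple one-dimensional integral; this generic sketch is not the authors' SSAO shader, and the integrand and densities are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    f  = lambda x: x ** 3                        # integrand on [0, 1]; exact integral = 0.25
    p1 = lambda x: np.ones_like(x)               # strategy 1: uniform density on [0, 1]
    p2 = lambda x: 4.0 * x ** 3                  # strategy 2: density proportional to the integrand

    n1 = n2 = 64
    x1 = rng.uniform(0.0, 1.0, n1)               # samples from p1
    x2 = rng.uniform(0.0, 1.0, n2) ** 0.25       # inverse-CDF samples from p2

    def balance_weight(x, n_self, p_self, n_other, p_other):
        # Balance heuristic: w_i(x) = n_i p_i(x) / sum_j n_j p_j(x)
        return n_self * p_self(x) / (n_self * p_self(x) + n_other * p_other(x))

    estimate = (np.sum(balance_weight(x1, n1, p1, n2, p2) * f(x1) / p1(x1)) / n1 +
                np.sum(balance_weight(x2, n2, p2, n1, p1) * f(x2) / p2(x2)) / n2)
    print(estimate)   # close to 0.25 with fewer samples than either strategy alone would need
    ```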

  13. Decomposition and model selection for large contingency tables.

    PubMed

    Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter

    2010-04-01

    Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.

  14. Electrofishing Effort Required to Estimate Biotic Condition in Southern Idaho Rivers

    EPA Science Inventory

    An important issue surrounding biomonitoring in large rivers is the minimum sampling effort required to collect an adequate number of fish for accurate and precise determinations of biotic condition. During the summer of 2002, we sampled 15 randomly selected large-river sites in...

  15. Sample sizes to control error estimates in determining soil bulk density in California forest soils

    Treesearch

    Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber

    2016-01-01

    Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observations (n), for predicting the soil bulk density with a...

  16. A high-throughput microRNA expression profiling system.

    PubMed

    Guo, Yanwen; Mastriano, Stephen; Lu, Jun

    2014-01-01

    As small noncoding RNAs, microRNAs (miRNAs) regulate diverse biological functions, including physiological and pathological processes. The expression and deregulation of miRNA levels contain rich information with diagnostic and prognostic relevance and can reflect pharmacological responses. The increasing interest in miRNA-related research demands global miRNA expression profiling on large numbers of samples. We describe here a robust protocol that supports high-throughput sample labeling and detection on hundreds of samples simultaneously. This method employs 96-well-based miRNA capturing from total RNA samples and on-site biochemical reactions, coupled with bead-based detection in 96-well format for hundreds of miRNAs per sample. With low-cost, high-throughput, high detection specificity, and flexibility to profile both small and large numbers of samples, this protocol can be adapted in a wide range of laboratory settings.

  17. Monte Carlo simulation of induction time and metastable zone width; stochastic or deterministic?

    NASA Astrophysics Data System (ADS)

    Kubota, Noriaki

    2018-03-01

    The induction time and metastable zone width (MSZW) measured for small samples (say 1 mL or less) both scatter widely; thus, these two are observed as stochastic quantities. For large samples (say 1000 mL or more), in contrast, the induction time and MSZW are observed as deterministic quantities. The reason for such experimental differences is investigated with Monte Carlo simulation. In the simulation, the time (under isothermal conditions) and supercooling (under polythermal conditions) at which a first single crystal is detected are defined as the induction time t and the MSZW ΔT for small samples, respectively. The number of crystals just at the moment of t and ΔT is unity. A first crystal emerges at random due to the intrinsic nature of nucleation; accordingly, t and ΔT become stochastic. For large samples, the time and supercooling at which the number density of crystals N/V reaches a detector sensitivity (N/V)_det are defined as t and ΔT for isothermal and polythermal conditions, respectively. The points of t and ΔT are those at which a large number of crystals have accumulated. Consequently, t and ΔT become deterministic according to the law of large numbers. Whether t and ΔT are stochastic or deterministic in actual experiments should not be attributed to a change in nucleation mechanisms at the molecular level. It could be just a problem caused by differences in the experimental definition of t and ΔT.
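
    The contrast described above can be reproduced with a minimal Poisson-nucleation Monte Carlo; the nucleation rate, detector sensitivity, and volumes below are arbitrary illustrative values rather than those of the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    J = 1.0          # nucleation rate per mL per second (arbitrary)
    NV_DET = 0.1     # detector sensitivity: detectable number density, crystals per mL (arbitrary)
    REPS = 20

    def induction_time(volume_ml):
        if volume_ml <= 10.0:
            # Small sample: t is the appearance time of the FIRST crystal,
            # i.e. an exponential waiting time with rate J * V (stochastic).
            return rng.exponential(1.0 / (J * volume_ml))
        # Large sample: t is when the number density N/V reaches the detector sensitivity;
        # many crystals have accumulated by then, so t is essentially fixed.
        n_needed = NV_DET * volume_ml
        return rng.gamma(shape=n_needed, scale=1.0 / (J * volume_ml))

    for v in (1.0, 1000.0):
        times = np.array([induction_time(v) for _ in range(REPS)])
        print(f"V = {v:6.0f} mL   mean t = {times.mean():6.3f} s   "
              f"relative scatter = {times.std() / times.mean():.2f}")
    # The 1 mL sample scatters widely (stochastic); the 1000 mL sample barely does (deterministic).
    ```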

  18. Occurrence of 1153 organic micropollutants in the aquatic environment of Vietnam.

    PubMed

    Chau, H T C; Kadokami, K; Duong, H T; Kong, L; Nguyen, T T; Nguyen, T Q; Ito, Y

    2018-03-01

    The rapid increase in the number and volume of chemical substances being used in modern society has been accompanied by a large number of potentially hazardous chemicals being found in environmental samples. In Vietnam, the monitoring of chemical substances is mainly limited to a small number of known pollutants in spite of rapid economic growth and urbanization, and there is an urgent need to examine a large number of chemicals to prevent impacts from expanding environmental pollution. However, it is difficult to analyze a large number of chemicals using existing methods, because they are time consuming and expensive. In the present study, we determined 1153 substances to obtain an overall picture of micropollutant contamination in the aquatic environment. To achieve this objective, we have used two comprehensive analytical methods: (1) solid-phase extraction (SPE) and LC-TOF-MS analysis, and (2) SPE and GC-MS analysis. We collected 42 samples from northern (the Red River and Hanoi), central (Hue and Danang), and southern (Ho Chi Minh City and Saigon-Dongnai River) Vietnam. One hundred and sixty-five compounds were detected at least once. The compounds detected most frequently (>40% of samples) at μg/L concentrations were sterols (cholesterol, beta-sitosterol, stigmasterol, coprostanol), phthalates (bis(2-ethylhexyl) phthalate and di-n-butyl phthalate), and pharmaceutical and personal care products (caffeine, metformin). These contaminants were detected at almost the same detection frequency as in developed countries. The results reveal that surface waters in Vietnam, particularly in the center of large cities, are polluted by a large number of organic micropollutants, with households and business activities as the major sources. In addition, risk quotients (MEC/PNEC values) for nonylphenol, sulfamethoxazole, ampicillin, acetaminophen, erythromycin and clarithromycin were higher than 1, which indicates a possibility of adverse effects on aquatic ecosystems.

  19. Flexible sampling large-scale social networks by self-adjustable random walk

    NASA Astrophysics Data System (ADS)

    Xu, Xiao-Ke; Zhu, Jonathan J. H.

    2016-12-01

    Online social networks (OSNs) have become an increasingly attractive gold mine for academic and commercial researchers. However, research on OSNs faces a number of difficult challenges. One bottleneck lies in the massive quantity and often unavailability of OSN population data. Sampling perhaps becomes the only feasible solution to the problems. How to draw samples that can represent the underlying OSNs has remained a formidable task for a number of conceptual and methodological reasons. In particular, most of the empirically driven studies on network sampling are confined to simulated data or sub-graph data, which are fundamentally different from real and complete-graph OSNs. In the current study, we propose a flexible sampling method, called Self-Adjustable Random Walk (SARW), and test it against the population data of a real large-scale OSN. We evaluate the strengths of the sampling method in comparison with four prevailing methods, including uniform, breadth-first search (BFS), random walk (RW), and revised RW (i.e., MHRW) sampling. We try to mix both induced-edge and external-edge information of sampled nodes together in the same sampling process. Our results show that the SARW sampling method has been able to generate unbiased samples of OSNs with maximal precision and minimal cost. The study is helpful for the practice of OSN research by providing a highly needed sampling tool, for the methodological development of large-scale network sampling by comparative evaluations of existing sampling methods, and for the theoretical understanding of human networks by highlighting discrepancies and contradictions between existing knowledge/assumptions of large-scale real OSN data.
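
    For readers unfamiliar with the baseline samplers being compared, the sketch below implements plain random-walk (RW) and Metropolis-Hastings random-walk (MHRW) node sampling on a toy adjacency-list graph; the SARW adjustment itself is not reproduced here, only the standard machinery it is evaluated against.

    ```python
    import random

    random.seed(6)

    # Toy undirected graph as an adjacency list (a real OSN graph would be loaded instead).
    graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3, 4], 3: [0, 2], 4: [2, 5], 5: [4]}

    def rw_sample(graph, start, n_steps):
        """Plain random walk: node visits are biased toward high-degree nodes."""
        node, visited = start, []
        for _ in range(n_steps):
            visited.append(node)
            node = random.choice(graph[node])
        return visited

    def mhrw_sample(graph, start, n_steps):
        """Metropolis-Hastings random walk: a move to a neighbor is accepted with
        probability min(1, deg(current)/deg(candidate)), which removes the degree
        bias and targets the uniform distribution over nodes."""
        node, visited = start, []
        for _ in range(n_steps):
            visited.append(node)
            candidate = random.choice(graph[node])
            if random.random() < min(1.0, len(graph[node]) / len(graph[candidate])):
                node = candidate
        return visited

    print("RW   sample:", rw_sample(graph, 0, 15))
    print("MHRW sample:", mhrw_sample(graph, 0, 15))
    ```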

  20. Meta-Storms: efficient search for similar microbial communities based on a novel indexing scheme and similarity score for metagenomic data.

    PubMed

    Su, Xiaoquan; Xu, Jian; Ning, Kang

    2012-10-01

    It has long intrigued scientists to effectively compare different microbial communities (also referred to as 'metagenomic samples' here) at a large scale: given a set of unknown samples, find similar metagenomic samples from a large repository and examine how similar these samples are. With the current metagenomic samples accumulated, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar metagenomic sample(s). However, on one hand, current databases with a large number of metagenomic samples mostly serve as data repositories that offer few functionalities for analysis; and on the other hand, methods to measure the similarity of metagenomic data work well only for a small set of samples by pairwise comparison. It is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we have proposed a novel method, Meta-Storms, that could systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database by a fast scoring function based on quantitative phylogeny and (iv) managing the database by index export, index import, data insertion, data deletion and database merging. We have collected more than 1300 metagenomic datasets from the public domain and in-house facilities, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and it could achieve similar accuracies compared with the current popular significance testing-based methods. The Meta-Storms method would serve as a suitable database management and search system to quickly identify similar metagenomic samples from a large pool of samples. ningkang@qibebt.ac.cn Supplementary data are available at Bioinformatics online.

  1. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/ e-dougherty@ee.tamu.edu.
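
    The error-versus-feature-count behavior (a decrease followed by an increase for a fixed small sample) can be seen in a small Monte Carlo sketch with linear discriminant analysis on a synthetic Gaussian model; this is a toy illustration, not the paper's simulation design or its blocked-covariance model.

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(7)

    N_TRAIN, N_TEST, N_INFORMATIVE, REPS = 30, 2000, 5, 50

    def error_rate(n_features):
        errs = []
        shift = np.zeros(n_features)
        shift[:min(N_INFORMATIVE, n_features)] = 1.0   # only the first few features carry signal
        for _ in range(REPS):
            def draw(n):
                y = rng.integers(0, 2, n)
                x = rng.normal(size=(n, n_features)) + np.outer(y, shift)
                return x, y
            x_tr, y_tr = draw(N_TRAIN)
            x_te, y_te = draw(N_TEST)
            clf = LinearDiscriminantAnalysis().fit(x_tr, y_tr)
            errs.append(np.mean(clf.predict(x_te) != y_te))
        return np.mean(errs)

    for d in (2, 5, 10, 20, 25):
        print(f"features = {d:3d}   mean test error = {error_rate(d):.3f}")
    # With only 30 training samples the error typically falls and then rises
    # as uninformative features are added (the peaking phenomenon).
    ```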

  2. Timoides agassizii Bigelow, 1904, little-known hydromedusa (Cnidaria), appears briefly in large numbers off Oman, March 2011, with additional notes about species of the genus Timoides.

    PubMed

    Purushothaman, Jasmine; Kharusi, Lubna Al; Mills, Claudia E; Ghielani, Hamed; Marzouki, Mohammad Al

    2013-12-11

    A bloom of the hydromedusan jellyfish, Timoides agassizii, occurred in February 2011 off the coast of Sohar, Al Batinah, Sultanate of Oman, in the Gulf of Oman. This species was first observed in 1902 in great numbers off Haddummati Atoll in the Maldive Islands in the Indian Ocean and has rarely been seen since. The species appeared briefly in large numbers off Oman in 2011 and subsequent observation of our 2009 samples of zooplankton from Sohar revealed that it was also present in low numbers (two collected) in one sample in 2009; these are the first records in the Indian Ocean north of the Maldives. Medusae collected off Oman were almost identical to those recorded previously from the Maldive Islands, Papua New Guinea, the Marshall Islands, Guam, the South China Sea, and Okinawa. T. agassizii is a species that likely lives for several months. It was present in our plankton samples together with large numbers of the oceanic siphonophore Physalia physalis only during a single month's samples, suggesting that the temporary bloom off Oman was likely due to the arrival of mature, open ocean medusae into nearshore waters. We see no evidence that T. agassizii has established a new population along Oman, since if so, it would likely have been present in more than one sample period. We are unable to deduce further details of the life cycle of this species from blooms of many mature individuals nearshore, about a century apart. Examination of a single damaged T. agassizii medusa from Guam calls into question the existence of its congener, T. latistyla, known only from a single specimen.

  3. Qualitative Meta-Analysis on the Hospital Task: Implications for Research

    ERIC Educational Resources Information Center

    Noll, Jennifer; Sharma, Sashi

    2014-01-01

    The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…

  4. The Application Law of Large Numbers That Predicts The Amount of Actual Loss in Insurance of Life

    NASA Astrophysics Data System (ADS)

    Tinungki, Georgina Maria

    2018-03-01

    The law of large numbers is a statistical concept that calculates the average number of events or risks in a sample or population in order to predict something. The larger the population used in the calculation, the more accurate the prediction. In the field of insurance, the law of large numbers is used to predict the risk of loss or claims among participants so that the premium can be calculated appropriately. For example, if on average one of every 100 insurance participants files an accident claim, then the premiums of 100 participants should be able to provide the sum assured for at least one accident claim. The more insurance participants are included in the calculation, the more precise the prediction of claims and the calculation of the premium. Life insurance, as a tool for spreading risk, can only work if a life insurance company is able to bear the same risk in large numbers; here the law of large numbers applies. The law of large numbers states that as the amount of exposure to losses increases, the predicted loss will be closer to the actual loss. The use of the law of large numbers allows the number of losses to be predicted better.
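
    A small simulation makes the premium argument concrete; the claim probability and sum assured follow the illustrative one-in-100 example from the abstract, and the portfolio sizes are arbitrary.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)

    CLAIM_PROB, SUM_ASSURED = 0.01, 100_000        # one claim expected per 100 participants
    pure_premium = CLAIM_PROB * SUM_ASSURED        # expected loss per participant = 1000

    for n_participants in (100, 10_000, 1_000_000):
        # Simulate many portfolios and compare the actual average loss with the prediction.
        claims = rng.binomial(n_participants, CLAIM_PROB, size=1000)
        actual_loss_per_person = claims * SUM_ASSURED / n_participants
        rel_err = np.abs(actual_loss_per_person - pure_premium) / pure_premium
        print(f"n = {n_participants:>9d}   mean |actual - predicted| / predicted = {rel_err.mean():.3f}")
    # As exposure grows, the actual loss per participant converges to the pure premium,
    # which is what allows the premium to be set reliably.
    ```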

  5. Multiplex-Ready Technology for mid-throughput genotyping of molecular markers.

    PubMed

    Bonneau, Julien; Hayden, Matthew

    2014-01-01

    Screening molecular markers across large populations in breeding programs is generally time consuming and expensive. The Multiplex-Ready Technology (MRT) (Hayden et al., BMC genomics 9:80, 2008) was created to optimize polymorphism screening and genotyping using standardized PCR reaction conditions. The flexibility of this method maximizes the number of markers (up to 24 SSR or SNP markers, ideally with small PCR products <500 bp and high polymorphism) by using fluorescent dyes (VIC, FAM, NED, and PET) and semiautomated capillary electrophoresis on a DNA fragment analyzer (ABI3730) for large numbers of DNA samples (96 or 384 samples).

  6. Procedures and equipment for staining large numbers of plant root samples for endomycorrhizal assay.

    PubMed

    Kormanik, P P; Bryan, W C; Schultz, R C

    1980-04-01

    A simplified method of clearing and staining large numbers of plant roots for vesicular-arbuscular (VA) mycorrhizal assay is presented. Equipment needed for handling multiple samples is described, and two formulations for the different chemical solutions are presented. Because one formulation contains phenol, its use should be limited to basic studies for which adequate laboratory exhaust hoods are available and great clarity of fungal structures is required. The second staining formulation, utilizing lactic acid instead of phenol, is less toxic, requires less elaborate laboratory facilities, and has proven to be completely satisfactory for VA assays.

  7. Phenotypic Association Analyses With Copy Number Variation in Recurrent Depressive Disorder.

    PubMed

    Rucker, James J H; Tansey, Katherine E; Rivera, Margarita; Pinto, Dalila; Cohen-Woods, Sarah; Uher, Rudolf; Aitchison, Katherine J; Craddock, Nick; Owen, Michael J; Jones, Lisa; Jones, Ian; Korszun, Ania; Barnes, Michael R; Preisig, Martin; Mors, Ole; Maier, Wolfgang; Rice, John; Rietschel, Marcella; Holsboer, Florian; Farmer, Anne E; Craig, Ian W; Scherer, Stephen W; McGuffin, Peter; Breen, Gerome

    2016-02-15

    Defining the molecular genomic basis of the likelihood of developing depressive disorder is a considerable challenge. We previously associated rare, exonic deletion copy number variants (CNV) with recurrent depressive disorder (RDD). Sex chromosome abnormalities also have been observed to co-occur with RDD. In this reanalysis of our RDD dataset (N = 3106 cases; 459 screened control samples and 2699 population control samples), we further investigated the role of larger CNVs and chromosomal abnormalities in RDD and performed association analyses with clinical data derived from this dataset. We found an enrichment of Turner's syndrome among cases of depression compared with the frequency observed in a large population sample (N = 34,910) of live-born infants collected in Denmark (two-sided p = .023, odds ratio = 7.76 [95% confidence interval = 1.79-33.6]), a case of diploid/triploid mosaicism, and several cases of uniparental isodisomy. In contrast to our previous analysis, large deletion CNVs were no more frequent in cases than control samples, although deletion CNVs in cases contained more genes than control samples (two-sided p = .0002). After statistical correction for multiple comparisons, our data do not support a substantial role for CNVs in RDD, although (as has been observed in similar samples) occasional cases may harbor large variants with etiological significance. Genetic pleiotropy and sample heterogeneity suggest that very large sample sizes are required to study conclusively the role of genetic variation in mood disorders. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  8. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class classification with SVM (Support Vector Machine), based on a binary tree, is presented to address the low learning efficiency of SVM when processing large-scale multi-class samples. This paper adopts a bottom-up method to set up the binary tree hierarchy; according to the achieved hierarchy, a sub-classifier learns from the corresponding samples of each node. During learning, several class clusters are generated after the first clustering of the training samples. First, central points are extracted from those class clusters that contain only one type of sample. For those that contain two types of samples, the cluster numbers of their positive and negative samples are set according to their degree of mixture, and a secondary clustering is undertaken, after which central points are extracted from the resulting sub-class clusters. By learning from the reduced samples formed by the integration of the extracted central points, sub-classifiers are obtained. Simulation experiments show that this fast learning method, which is based on multi-level clustering, can guarantee higher classification accuracy, greatly reduce the number of samples, and effectively improve learning efficiency.
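
    The sample-reduction step, replacing each class's training points with cluster centers before fitting the SVM, can be sketched as follows; this is a simplified flat version using scikit-learn, and the paper's binary-tree hierarchy and secondary clustering of mixed clusters are not reproduced.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    rng = np.random.default_rng(9)

    # Synthetic two-class data standing in for a large training set.
    X = np.vstack([rng.normal(0, 1, (5000, 2)), rng.normal(3, 1, (5000, 2))])
    y = np.repeat([0, 1], 5000)

    def reduce_by_centers(X, y, centers_per_class=50):
        """Replace each class's samples by its k-means cluster centers."""
        Xr, yr = [], []
        for label in np.unique(y):
            km = KMeans(n_clusters=centers_per_class, n_init=3, random_state=0).fit(X[y == label])
            Xr.append(km.cluster_centers_)
            yr.append(np.full(centers_per_class, label))
        return np.vstack(Xr), np.concatenate(yr)

    X_small, y_small = reduce_by_centers(X, y)
    full = SVC(kernel="rbf").fit(X, y)                   # trained on all 10,000 samples
    reduced = SVC(kernel="rbf").fit(X_small, y_small)    # trained on 100 cluster centers

    X_test = np.vstack([rng.normal(0, 1, (1000, 2)), rng.normal(3, 1, (1000, 2))])
    y_test = np.repeat([0, 1], 1000)
    print("full-sample accuracy:   ", full.score(X_test, y_test))
    print("reduced-sample accuracy:", reduced.score(X_test, y_test))
    ```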

  9. Does Decision Quality (Always) Increase with the Size of Information Samples? Some Vicissitudes in Applying the Law of Large Numbers

    ERIC Educational Resources Information Center

    Fiedler, Klaus; Kareev, Yaakov

    2006-01-01

    Adaptive decision making requires that contingencies between decision options and their relative assets be assessed accurately and quickly. The present research addresses the challenging notion that contingencies may be more visible from small than from large samples of observations. An algorithmic account for such a seemingly paradoxical effect…

  10. The Relationship between Intelligence and Multiple Domains of Religious Belief: Evidence from a Large Adult US Sample

    ERIC Educational Resources Information Center

    Lewis, Gary J.; Ritchie, Stuart J.; Bates, Timothy C.

    2011-01-01

    High levels of religiosity have been linked to lower levels of intelligence in a number of recent studies. These results have generated both controversy and theoretical interest. Here in a large sample of US adults we address several issues that restricted the generalizability of these previous results. We measured six dimensions of religiosity…

  11. A novel storage system for cryoEM samples.

    PubMed

    Scapin, Giovanna; Prosise, Winifred W; Wismer, Michael K; Strickland, Corey

    2017-07-01

    We present here a new cryoEM grid box storage system designed to simplify sample labeling, tracking and retrieval. The system is based on the crystal pucks widely used by the X-ray crystallographic community for storage and shipping of crystals. This system is suitable for any cryoEM laboratory, but especially for large facilities that will need accurate tracking of large numbers of samples coming from different sources. Copyright © 2017. Published by Elsevier Inc.

  12. The use of single-date MODIS imagery for estimating large-scale urban impervious surface fraction with spectral mixture analysis and machine learning techniques

    NASA Astrophysics Data System (ADS)

    Deng, Chengbin; Wu, Changshan

    2013-12-01

    Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.
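
    The endmember-derivation step can be expressed compactly: with sample pixels whose abundances are known, endmember spectra follow from a least-squares solution, and those spectra can then be used to unmix other pixels. The sketch below is a generic linear-mixing illustration with synthetic numbers, not the authors' MODIS workflow.

    ```python
    import numpy as np

    rng = np.random.default_rng(10)

    N_BANDS, N_ENDMEMBERS, N_SAMPLES = 7, 3, 200     # e.g. impervious / vegetation / soil

    # Synthetic "true" endmember spectra and sample pixels with known abundances.
    E_true = rng.uniform(0.05, 0.6, size=(N_ENDMEMBERS, N_BANDS))
    A_known = rng.dirichlet(np.ones(N_ENDMEMBERS), size=N_SAMPLES)        # rows sum to 1
    pixels = A_known @ E_true + rng.normal(0, 0.005, size=(N_SAMPLES, N_BANDS))

    # Step 1 (LSS): solve pixels ≈ A_known @ E for the endmember spectra E.
    E_est, *_ = np.linalg.lstsq(A_known, pixels, rcond=None)

    # Step 2 (unconstrained SMA): unmix a new pixel against the estimated endmembers.
    new_abund = np.array([0.7, 0.2, 0.1])
    new_pixel = new_abund @ E_true
    abund_est, *_ = np.linalg.lstsq(E_est.T, new_pixel, rcond=None)
    print("estimated abundances:", np.round(abund_est, 3))   # close to [0.7, 0.2, 0.1]
    ```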

  13. Determination of dissolved-phase pesticides in surface water from the Yakima River basin, Washington, using the Goulden large-sample extractor and gas chromatography/mass spectrometer

    USGS Publications Warehouse

    Foster, Gregory D.; Gates, Paul M.; Foreman, William T.; McKenzie, Stuart W.; Rinella, Frank A.

    1993-01-01

    Concentrations of pesticides in the dissolved phase of surface water samples from the Yakima River basin, WA, were determined using preconcentration in the Goulden large-sample extractor (GLSE) and gas chromatography/mass spectrometry (GC/MS) analysis. Sample volumes ranging from 10 to 120 L were processed with the GLSE, and the results from the large-sample analyses were compared to those derived from 1-L continuous liquid-liquid extractions. Few of the 40 target pesticides were detected in 1-L samples, whereas large-sample preconcentration in the GLSE provided detectable levels for many of the target pesticides. The number of pesticides detected in GLSE-processed samples was usually directly proportional to sample volume, although the measured concentrations of the pesticides were generally lower at the larger sample volumes for the same water source. The GLSE can be used to provide lower detection levels relative to conventional liquid-liquid extraction in GC/MS analysis of pesticides in samples of surface water.

  14. Optimal spatial sampling techniques for ground truth data in microwave remote sensing of soil moisture

    NASA Technical Reports Server (NTRS)

    Rao, R. G. S.; Ulaby, F. T.

    1977-01-01

    The paper examines optimal sampling techniques for obtaining accurate spatial averages of soil moisture, at various depths and for cell sizes in the range 2.5-40 acres, with a minimum number of samples. Both simple random sampling and stratified sampling procedures are used to reach a set of recommended sample sizes for each depth and for each cell size. Major conclusions from statistical sampling test results are that (1) the number of samples required decreases with increasing depth; (2) when the total number of samples cannot be prespecified or the moisture in only one single layer is of interest, then a simple random sample procedure should be used which is based on the observed mean and SD for data from a single field; (3) when the total number of samples can be prespecified and the objective is to measure the soil moisture profile with depth, then stratified random sampling based on optimal allocation should be used; and (4) decreasing the sensor resolution cell size leads to fairly large decreases in sample sizes with stratified sampling procedures, whereas only a moderate decrease is obtained in simple random sampling procedures.
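
    The stratified design with optimal allocation mentioned in conclusion (3) follows the standard Neyman formula, sketched below with made-up depth strata rather than the study's field data.

    ```python
    import numpy as np

    # Hypothetical soil-moisture strata: (stratum size N_h, observed standard deviation S_h).
    strata = {"0-5 cm": (400, 6.0), "5-15 cm": (400, 4.0), "15-30 cm": (400, 2.0)}
    n_total = 60                                   # total samples the crew can collect

    N = np.array([v[0] for v in strata.values()], dtype=float)
    S = np.array([v[1] for v in strata.values()], dtype=float)

    # Neyman optimal allocation: n_h proportional to N_h * S_h.
    n_h = n_total * (N * S) / np.sum(N * S)
    for name, alloc in zip(strata, n_h):
        print(f"{name:8s} -> {alloc:4.1f} samples")
    # More variable (shallower) layers receive proportionally more of the sampling effort.
    ```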

  15. Optimization of scat detection methods for a social ungulate, the wild pig, and experimental evaluation of factors affecting detection of scat

    USGS Publications Warehouse

    Keiter, David A.; Cunningham, Fred L.; Rhodes, Olin E.; Irwin, Brian J.; Beasley, James

    2016-01-01

    Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.

  16. Optimization of scat detection methods for a social ungulate, the wild pig, and experimental evaluation of factors affecting detection of scat

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.

    Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.

  17. Optimization of Scat Detection Methods for a Social Ungulate, the Wild Pig, and Experimental Evaluation of Factors Affecting Detection of Scat.

    PubMed

    Keiter, David A; Cunningham, Fred L; Rhodes, Olin E; Irwin, Brian J; Beasley, James C

    2016-01-01

    Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.

  18. Optimization of scat detection methods for a social ungulate, the wild pig, and experimental evaluation of factors affecting detection of scat

    DOE PAGES

    Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.; ...

    2016-05-25

    Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.

  19. Characterization of Aspergillus section Nigri species populations in vineyard soil using droplet digital PCR

    USDA-ARS's Scientific Manuscript database

    Identification of populations of Aspergillus section Nigri species in environmental samples using traditional methods is laborious and impractical for large numbers of samples. We developed species-specific primers and probes for quantitative droplet digital PCR (ddPCR) to improve sample throughput ...

  20. A PRIOR EVALUATION OF TWO-STAGE CLUSTER SAMPLING FOR ACCURACY ASSESSMENT OF LARGE-AREA LAND-COVER MAPS

    EPA Science Inventory

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...

  1. Cryopreservation of Circulating Tumor Cells for Enumeration and Characterization.

    PubMed

    Nejlund, Sarah; Smith, Julie; Kraan, Jaco; Stender, Henrik; Van, Mai N; Langkjer, Sven T; Nielsen, Mikkel T; Sölétormos, György; Hillig, Thore

    2016-08-01

    A blood sample containing circulating tumor cells (CTCs) may serve as a surrogate for metastasis in invasive cancer. Cryopreservation will provide new opportunities in management of clinical samples in the laboratory and allow collection of samples over time for future analysis of existing and upcoming cancer biomarkers. Blood samples from healthy volunteers were spiked with high (∼500) and low (∼50) numbers of tumor cells from culture. The samples were stored at -80°C with the cryopreservative dimethyl sulfoxide mixed with Roswell Park Memorial Institute 1640 medium. Flow cytometry was used to test whether cryopreservation affected specific biomarkers regularly used to detect CTCs, i.e. cytokeratin (CK), epithelial cell adhesion molecule (EpCAM), and the white blood cell-specific lymphocyte common antigen (CD45). After various time intervals (up to 6 months), samples were thawed and tumor cell recovery (enumeration) was examined. Clinical samples may differ from cell line studies, so the cryopreservation protocol was tested on 17 patients with invasive breast cancer and tumor cell recovery was examined. Two blood samples were drawn from each patient. Biomarkers, CK, CD45, and EpCAM, were not affected by the freezing and thawing procedures. Cryopreserved samples (n = 2) spiked with a high number of tumor cells (∼500) had a ∼90% recovery compared with the spiked fresh samples. In samples spiked with lower numbers of tumor cells (median = 43 in n = 5 samples), the recovery was 63% after cryopreservation (median 27 tumor cells), p = 0.03. With an even lower number of spiked tumor cells (median = 3 in n = 8 samples), the recovery rate of tumor cells after cryopreservation did not seem to be affected (median = 8), p = 0.09. Time of cryopreservation did not affect recovery. When testing the effect of cryopreservation on enumeration in clinical samples, no difference was observed in the number of CTCs between the fresh and the cryopreserved samples based on n = 17 pairs, p = 0.83; however, the variation was large. This large variation was confirmed by clinically paired fresh samples (n = 64 pairs), where 95% of the samples (<30 CTCs) varied in number by up to ±15 CTCs, p = 0.18. A small loss of CTCs after cryopreservation may be expected; however, cryopreservation of CTCs for biomarker characterization for clinical applications seems promising.

  2. Measuring discharge with ADCPs: Inferences from synthetic velocity profiles

    USGS Publications Warehouse

    Rehmann, C.R.; Mueller, D.S.; Oberg, K.A.

    2009-01-01

    Synthetic velocity profiles are used to determine guidelines for sampling discharge with acoustic Doppler current profilers (ADCPs). The analysis allows the effects of instrument characteristics, sampling parameters, and properties of the flow to be studied systematically. For mid-section measurements, the averaging time required for a single profile measurement always exceeded the 40 s usually recommended for velocity measurements, and it increased with increasing sample interval and increasing time scale of the large eddies. Similarly, simulations of transect measurements show that discharge error decreases as the number of large eddies sampled increases. The simulations allow sampling criteria that account for the physics of the flow to be developed. ?? 2009 ASCE.

  3. Concentration of Enteroviruses, Adenoviruses, and Noroviruses from Drinking Water by Use of Glass Wool Filters

    PubMed Central

    Lambertini, Elisabetta; Spencer, Susan K.; Bertz, Phillip D.; Loge, Frank J.; Kieke, Burney A.; Borchardt, Mark A.

    2008-01-01

    Available filtration methods to concentrate waterborne viruses are either too costly for studies requiring large numbers of samples, limited to small sample volumes, or not very portable for routine field applications. Sodocalcic glass wool filtration is a cost-effective and easy-to-use method to retain viruses, but its efficiency and reliability are not adequately understood. This study evaluated glass wool filter performance to concentrate the four viruses on the U.S. Environmental Protection Agency contaminant candidate list, i.e., coxsackievirus, echovirus, norovirus, and adenovirus, as well as poliovirus. Total virus numbers recovered were measured by quantitative reverse transcription-PCR (qRT-PCR); infectious polioviruses were quantified by integrated cell culture (ICC)-qRT-PCR. Recovery efficiencies averaged 70% for poliovirus, 14% for coxsackievirus B5, 19% for echovirus 18, 21% for adenovirus 41, and 29% for norovirus. Virus strain and water matrix affected recovery, with significant interaction between the two variables. Optimal recovery was obtained at pH 6.5. No evidence was found that water volume, filtration rate, and number of viruses seeded influenced recovery. The method was successful in detecting indigenous viruses in municipal wells in Wisconsin. Long-term continuous filtration retained viruses sufficiently for their detection for up to 16 days after seeding for qRT-PCR and up to 30 days for ICC-qRT-PCR. Glass wool filtration is suitable for large-volume samples (1,000 liters) collected at high filtration rates (4 liters min−1), and its low cost makes it advantageous for studies requiring large numbers of samples. PMID:18359827

  4. Differences in AMY1 Gene Copy Numbers Derived from Blood, Buccal Cells and Saliva Using Quantitative and Droplet Digital PCR Methods: Flagging the Pitfall.

    PubMed

    Ooi, Delicia Shu Qin; Tan, Verena Ming Hui; Ong, Siong Gim; Chan, Yiong Huak; Heng, Chew Kiat; Lee, Yung Seng

    2017-01-01

    The human salivary (AMY1) gene, encoding salivary α-amylase, has variable copy number variants (CNVs) in the human genome. We aimed to determine if real-time quantitative polymerase chain reaction (qPCR) and the more recently available Droplet Digital PCR (ddPCR) can provide a precise quantification of the AMY1 gene copy number in blood, buccal cells and saliva samples derived from the same individual. Seven participants were recruited and DNA was extracted from the blood, buccal cells and saliva samples provided by each participant. Taqman assay real-time qPCR and ddPCR were conducted to quantify AMY1 gene copy numbers. Statistical analysis was carried out to determine the difference in AMY1 gene copy number between the different biological specimens and different assay methods. We found significant within-individual difference (p<0.01) in AMY1 gene copy number between different biological samples as determined by qPCR. However, there was no significant within-individual difference in AMY1 gene copy number between different biological samples as determined by ddPCR. We also found that AMY1 gene copy number of blood samples were comparable between qPCR and ddPCR, while there is a significant difference (p<0.01) between AMY1 gene copy numbers measured by qPCR and ddPCR for both buccal swab and saliva samples. Despite buccal cells and saliva samples being possible sources of DNA, it is pertinent that ddPCR or a single biological sample, preferably blood sample, be used for determining highly polymorphic gene copy numbers like AMY1, due to the large within-individual variability between different biological samples if real time qPCR is employed.
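
    The record above compares qPCR and ddPCR estimates of AMY1 copy number. As an illustration of why ddPCR lends itself to precise copy-number calls, the sketch below shows the standard Poisson correction used to convert droplet counts into a concentration and a diploid copy-number estimate. The droplet counts, droplet volume, and reference-gene values are hypothetical, and this is not the assay used in the study.

```python
import math

def ddpcr_concentration(positive, total, droplet_volume_ul=0.00085):
    """Estimate copies per microliter from droplet counts using the
    standard Poisson correction: lambda = -ln(1 - p)."""
    p = positive / total
    lam = -math.log(1.0 - p)          # mean copies per droplet
    return lam / droplet_volume_ul    # copies per microliter

# Hypothetical droplet counts for the target (AMY1) and a two-copy reference gene
target_conc = ddpcr_concentration(positive=9000, total=18000)
reference_conc = ddpcr_concentration(positive=3200, total=18000)

# Copy number per diploid genome = 2 * target / reference
copy_number = 2.0 * target_conc / reference_conc
print(f"Estimated AMY1 copies per diploid genome: {copy_number:.1f}")
```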

  5. DAMe: a toolkit for the initial processing of datasets with PCR replicates of double-tagged amplicons for DNA metabarcoding analyses.

    PubMed

    Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P

    2016-05-03

    DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
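
    DAMe filters tagged amplicons by copy number and by reproducibility across PCR replicates. The minimal sketch below illustrates that general idea rather than DAMe itself: a unique sequence is kept for a sample only if it reaches a minimum copy number in a minimum number of replicates. The thresholds and read data are hypothetical.

```python
from collections import Counter

def filter_by_replicates(replicates, min_copies=2, min_replicates=2):
    """replicates: list of per-PCR-replicate lists of sequences (strings).
    Keep a sequence only if it appears with >= min_copies in >= min_replicates
    replicates; return kept sequences with their total copy numbers."""
    counts = [Counter(rep) for rep in replicates]
    kept = {}
    for seq in set().union(*[c.keys() for c in counts]):
        support = sum(1 for c in counts if c[seq] >= min_copies)
        if support >= min_replicates:
            kept[seq] = sum(c[seq] for c in counts)
    return kept

# Hypothetical reads from three PCR replicates of one sample
reps = [
    ["ACGT", "ACGT", "ACGT", "TTGA"],
    ["ACGT", "ACGT", "GGCC", "GGCC"],
    ["ACGT", "ACGT", "GGCC", "GGCC", "TTGA"],
]
print(filter_by_replicates(reps))   # 'ACGT' and 'GGCC' pass; the 'TTGA' singletons are filtered
```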

  6. Multiplex titration RT-PCR: rapid determination of gene expression patterns for a large number of genes

    NASA Technical Reports Server (NTRS)

    Nebenfuhr, A.; Lomax, T. L.

    1998-01-01

    We have developed an improved method for determination of gene expression levels with RT-PCR. The procedure is rapid and does not require extensive optimization or densitometric analysis. Since the detection of individual transcripts is PCR-based, small amounts of tissue samples are sufficient for the analysis of expression patterns in large gene families. Using this method, we were able to rapidly screen nine members of the Aux/IAA family of auxin-responsive genes and identify those genes which vary in message abundance in a tissue- and light-specific manner. While not offering the accuracy of conventional semi-quantitative or competitive RT-PCR, our method allows quick screening of large numbers of genes in a wide range of RNA samples with just a thermal cycler and standard gel analysis equipment.

  7. Exploring high dimensional free energy landscapes: Temperature accelerated sliced sampling

    NASA Astrophysics Data System (ADS)

    Awasthi, Shalini; Nair, Nisanth N.

    2017-03-01

    Biased sampling of collective variables is widely used to accelerate rare events in molecular simulations and to explore free energy surfaces. However, computational efficiency of these methods decreases with increasing number of collective variables, which severely limits the predictive power of the enhanced sampling approaches. Here we propose a method called Temperature Accelerated Sliced Sampling (TASS) that combines temperature accelerated molecular dynamics with umbrella sampling and metadynamics to sample the collective variable space in an efficient manner. The presented method can sample a large number of collective variables and is advantageous for controlled exploration of broad and unbound free energy basins. TASS is also shown to achieve quick free energy convergence and is practically usable with ab initio molecular dynamics techniques.

  8. Validation of Rapid Radiochemical Method for Californium ...

    EPA Pesticide Factsheets

    Technical Brief. In the event of a radiological/nuclear contamination event, the response community would need tools and methodologies to rapidly assess the nature and the extent of contamination. To characterize a radiologically contaminated outdoor area and to inform risk assessment, large numbers of environmental samples would be collected and analyzed over a short period of time. To address the challenge of quickly providing analytical results to the field, the U.S. EPA developed a robust analytical method. This method allows response officials to characterize contaminated areas and to assess the effectiveness of remediation efforts, both rapidly and accurately, in the intermediate and late phases of environmental cleanup. Improvement in sample processing and analysis leads to increased laboratory capacity to handle the analysis of a large number of samples following the intentional or unintentional release of a radiological/nuclear contaminant.

  9. Cryocooler based test setup for high current applications

    NASA Astrophysics Data System (ADS)

    Pradhan, Jedidiah; Das, Nisith Kr.; Roy, Anindya; Duttagupta, Anjan

    2018-04-01

    A cryo-cooler based cryogenic test setup has been designed, fabricated, and tested. The setup incorporates two cryo-coolers, one for sample cooling and the other for cooling the large magnet coil. The performance and versatility of the setup have been tested using large samples of high-temperature superconductor magnet coil as well as short samples with high current. Several uncalibrated temperature sensors have been calibrated using this system. This paper presents the details of the system along with results of different performance tests.

  10. Evaluation of a Class of Simple and Effective Uncertainty Methods for Sparse Samples of Random Variables and Functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin

    When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: central 95% of response; and 10^-4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depend on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large database and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.
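
    A simple way to see the under-estimation problem the report addresses is to draw small samples from a known distribution and check how often a naive interval built from the sample extremes actually covers the true central 95% range. The sketch below is a generic Monte Carlo demonstration under assumed settings, not one of the report's methods.

```python
import numpy as np

rng = np.random.default_rng(0)
true_low, true_high = np.percentile(rng.normal(size=10**6), [2.5, 97.5])

def coverage_of_naive_interval(n_samples, trials=10000):
    """Fraction of trials in which the sample min/max brackets the true
    central 95% range of a standard normal."""
    hits = 0
    for _ in range(trials):
        x = rng.normal(size=n_samples)
        if x.min() <= true_low and x.max() >= true_high:
            hits += 1
    return hits / trials

# Small samples rarely bracket the true central 95% range
for n in (5, 10, 30, 100):
    print(n, coverage_of_naive_interval(n))
```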

  11. 1979 Reserve Force Studies Surveys: Survey Design, Sample Design and Administrative Procedures,

    DTIC Science & Technology

    1981-08-01

    three factors: the need for a statistically significant number of usable questionnaires from different groups within the random sample and from... Because of the multipurpose nature of these surveys and the large number of questions needed to fully address some of the topics covered, we... varies. Collection of data at the unit level is needed to accurately estimate actual reserve compensation and benefits and their possible role in both

  12. Knowing when to trust a teacher: The contribution of category status and sample composition to young children's judgments of informant trustworthiness.

    PubMed

    Lawson, Chris A

    2018-09-01

    Two experiments examined the extent to which category status influences children's attention to the composition of evidence samples provided by different informants. Children were told about two informants, each of whom presented different samples of evidence, and then were asked to judge which informant they would trust to help them learn something new. The composition of evidence samples was manipulated such that one sample included either a large number (n = 5) or a diverse range of exemplars relative to the other sample, which included either a small number (n = 2) or a homogeneous range of exemplars. Experiment 1 revealed that participants (N = 37; M age = 4.76 years) preferred to place their trust in the informant who presented the large or diverse sample when each informant was labeled "teacher" but exhibited no preference when each informant was labeled "child." Experiment 2 revealed developmental differences in responses when labels and sample composition were pitted against each other. Younger children (n = 32; M age = 3.42 years) consistently trusted the "teacher" regardless of the composition of the sample the informant was said to have provided, whereas older children (n = 30; M age = 5.54 years) consistently trusted the informant who provided the large or diverse sample regardless of whether it was provided by a "teacher" or a "child." These results have important implications for understanding the interplay between children's category knowledge and their evaluation of evidence. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. Phylogenetic effective sample size.

    PubMed

    Bartoszek, Krzysztof

    2016-10-21

    In this paper I address the question: how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes: the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Evaluation of the Biological Sampling Kit (BiSKit) for Large-Area Surface Sampling

    PubMed Central

    Buttner, Mark P.; Cruz, Patricia; Stetzenbach, Linda D.; Klima-Comba, Amy K.; Stevens, Vanessa L.; Emanuel, Peter A.

    2004-01-01

    Current surface sampling methods for microbial contaminants are designed to sample small areas and utilize culture analysis. The total number of microbes recovered is low because a small area is sampled, making detection of a potential pathogen more difficult. Furthermore, sampling of small areas requires a greater number of samples to be collected, which delays the reporting of results, taxes laboratory resources and staffing, and increases analysis costs. A new biological surface sampling method, the Biological Sampling Kit (BiSKit), designed to sample large areas and to be compatible with testing with a variety of technologies, including PCR and immunoassay, was evaluated and compared to other surface sampling strategies. In experimental room trials, wood laminate and metal surfaces were contaminated by aerosolization of Bacillus atrophaeus spores, a simulant for Bacillus anthracis, into the room, followed by settling of the spores onto the test surfaces. The surfaces were sampled with the BiSKit, a cotton-based swab, and a foam-based swab. Samples were analyzed by culturing, quantitative PCR, and immunological assays. The results showed that the large surface area (1 m2) sampled with the BiSKit resulted in concentrations of B. atrophaeus in samples that were up to 10-fold higher than the concentrations obtained with the other methods tested. A comparison of wet and dry sampling with the BiSKit indicated that dry sampling was more efficient (efficiency, 18.4%) than wet sampling (efficiency, 11.3%). The sensitivities of detection of B. atrophaeus on metal surfaces were 42 ± 5.8 CFU/m2 for wet sampling and 100.5 ± 10.2 CFU/m2 for dry sampling. These results demonstrate that the use of a sampling device capable of sampling larger areas results in higher sensitivity than that obtained with currently available methods and has the advantage of sampling larger areas, thus requiring collection of fewer samples per site. PMID:15574898

  15. Direct determination of the number-weighted mean radius and polydispersity from dynamic light-scattering data.

    PubMed

    Patty, Philipus J; Frisken, Barbara J

    2006-04-01

    We compare results for the number-weighted mean radius and polydispersity obtained either by directly fitting number distributions to dynamic light-scattering data or by converting results obtained by fitting intensity-weighted distributions. We find that results from fits using number distributions are angle independent and that converting intensity-weighted distributions is not always reliable, especially when the polydispersity of the sample is large. We compare the results of fitting symmetric and asymmetric distributions, as represented by Gaussian and Schulz distributions, respectively, to data for extruded vesicles and find that the Schulz distribution provides a better estimate of the size distribution for these samples.

  16. Importance sampling large deviations in nonequilibrium steady states. I.

    PubMed

    Ray, Ushnish; Chan, Garnet Kin-Lic; Limmer, David T

    2018-03-28

    Large deviation functions contain information on the stability and response of systems driven into nonequilibrium steady states and in such a way are similar to free energies for systems at equilibrium. As with equilibrium free energies, evaluating large deviation functions numerically for all but the simplest systems is difficult because by construction they depend on exponentially rare events. In this first paper of a series, we evaluate different trajectory-based sampling methods capable of computing large deviation functions of time integrated observables within nonequilibrium steady states. We illustrate some convergence criteria and best practices using a number of different models, including a biased Brownian walker, a driven lattice gas, and a model of self-assembly. We show how two popular methods for sampling trajectory ensembles, transition path sampling and diffusion Monte Carlo, suffer from exponentially diverging correlations in trajectory space as a function of the bias parameter when estimating large deviation functions. Improving the efficiencies of these algorithms requires introducing guiding functions for the trajectories.

  17. Importance sampling large deviations in nonequilibrium steady states. I

    NASA Astrophysics Data System (ADS)

    Ray, Ushnish; Chan, Garnet Kin-Lic; Limmer, David T.

    2018-03-01

    Large deviation functions contain information on the stability and response of systems driven into nonequilibrium steady states and in such a way are similar to free energies for systems at equilibrium. As with equilibrium free energies, evaluating large deviation functions numerically for all but the simplest systems is difficult because by construction they depend on exponentially rare events. In this first paper of a series, we evaluate different trajectory-based sampling methods capable of computing large deviation functions of time integrated observables within nonequilibrium steady states. We illustrate some convergence criteria and best practices using a number of different models, including a biased Brownian walker, a driven lattice gas, and a model of self-assembly. We show how two popular methods for sampling trajectory ensembles, transition path sampling and diffusion Monte Carlo, suffer from exponentially diverging correlations in trajectory space as a function of the bias parameter when estimating large deviation functions. Improving the efficiencies of these algorithms requires introducing guiding functions for the trajectories.

  18. Machine learning from computer simulations with applications in rail vehicle dynamics

    NASA Astrophysics Data System (ADS)

    Taheri, Mehdi; Ahmadian, Mehdi

    2016-05-01

    The application of stochastic modelling for learning the behaviour of multibody dynamics (MBD) models is investigated. Post-processing data from a simulation run are used to train the stochastic model that estimates the relationship between model inputs (suspension relative displacement and velocity) and the output (sum of suspension forces). The stochastic model can be used to reduce the computational burden of the MBD model by replacing a computationally expensive subsystem in the model (suspension subsystem). With minor changes, the stochastic modelling technique is able to learn the behaviour of a physical system and integrate its behaviour within MBD models. The technique is highly advantageous for MBD models where real-time simulations are necessary, or with models that have a large number of repeated substructures, e.g. modelling a train with a large number of railcars. Because the training data are acquired prior to the development of the stochastic model, conventional sampling plan strategies such as Latin Hypercube sampling, in which simulations are performed using the inputs dictated by the sampling plan, cannot be applied. Since the sampling plan greatly influences the overall accuracy and efficiency of the stochastic predictions, a sampling plan suitable for the process is developed where the most space-filling subset of the acquired data with ? number of sample points that best describes the dynamic behaviour of the system under study is selected as the training data.
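
    The abstract describes selecting the most space-filling subset of already-acquired simulation data to serve as training points. One common way to do this is a greedy maximin (farthest-point) selection; the sketch below is a generic illustration of that idea on assumed data, not the authors' implementation.

```python
import numpy as np

def greedy_maximin_subset(points, k, seed_index=0):
    """Greedily pick k rows of `points` so that each new point maximizes its
    minimum distance to the points already selected (a space-filling subset)."""
    selected = [seed_index]
    dists = np.linalg.norm(points - points[seed_index], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))          # farthest point from the current subset
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return selected

# Hypothetical post-processing data: suspension displacement and velocity samples
rng = np.random.default_rng(1)
data = rng.normal(size=(5000, 2))
training_rows = greedy_maximin_subset(data, k=50)
print(len(training_rows), "training points selected")
```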

  19. Risk of co-occuring psychopathology: testing a prediction of expectancy theory.

    PubMed

    Capron, Daniel W; Norr, Aaron M; Schmidt, Norman B

    2013-01-01

    Despite the high impact of anxiety sensitivity (AS; a fear of anxiety related sensations) research, almost no research attention has been paid to its parent theory, Reiss' expectancy theory (ET). ET has gone largely unexamined to this point, including the prediction that AS is a better predictor of number of fears than current anxiety. To test Reiss' prediction, we used a large (N = 317) clinical sample of anxiety outpatients. Specifically, we examined whether elevated AS predicted number of comorbid anxiety and non-anxiety disorder diagnoses in this sample. Consistent with ET, findings indicated that AS predicted number of comorbid anxiety disorder diagnoses above and beyond current anxiety symptoms. Also, AS did not predict the number of comorbid non-anxiety diagnoses when current anxiety symptoms were accounted for. These findings represent an important examination of a prediction of Reiss' ET and are consistent with the idea that AS may be a useful transdiagnostic treatment target. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Protocol for Detection of Yersinia pestis in Environmental ...

    EPA Pesticide Factsheets

    Methods Report. This is the first ever open-access and detailed protocol available to all government departments and agencies, and their contractors, to detect Yersinia pestis, the pathogen that causes plague, from multiple environmental sample types including water. Each analytical method includes a sample processing procedure for each sample type in a step-by-step manner. It includes real-time PCR, traditional microbiological culture, and the Rapid Viability PCR (RV-PCR) analytical methods. For large volume water samples it also includes an ultrafiltration-based sample concentration procedure. Because of such a non-restrictive availability of this protocol to all government departments and agencies, and their contractors, the nation will now have increased laboratory capacity to analyze a large number of samples during a wide-area plague incident.

  1. Optimal Budget Allocation for Sample Average Approximation

    DTIC Science & Technology

    2011-06-01

    an optimization algorithm applied to the sample average problem. We examine the convergence rate of the estimator as the computing budget tends to... regime for the optimization algorithm. 1 Introduction Sample average approximation (SAA) is a frequently used approach to solving stochastic programs... appealing due to its simplicity and the fact that a large number of standard optimization algorithms are often available to optimize the resulting sample

  2. Image subsampling and point scoring approaches for large-scale marine benthic monitoring programs

    NASA Astrophysics Data System (ADS)

    Perkins, Nicholas R.; Foster, Scott D.; Hill, Nicole A.; Barrett, Neville S.

    2016-07-01

    Benthic imagery is an effective tool for quantitative description of ecologically and economically important benthic habitats and biota. The recent development of autonomous underwater vehicles (AUVs) allows surveying of spatial scales that were previously unfeasible. However, an AUV collects a large number of images, the scoring of which is time and labour intensive. There is a need to optimise the way that subsamples of imagery are chosen and scored to gain meaningful inferences for ecological monitoring studies. We examine the trade-off between the number of images selected within transects and the number of random points scored within images on the percent cover of target biota, the typical output of such monitoring programs. We also investigate the efficacy of various image selection approaches, such as systematic or random, on the bias and precision of cover estimates. We use simulated biotas that have varying size, abundance and distributional patterns. We find that a relatively small sampling effort is required to minimise bias. An increased precision for groups that are likely to be the focus of monitoring programs is best gained through increasing the number of images sampled rather than the number of points scored within images. For rare species, sampling using point count approaches is unlikely to provide sufficient precision, and alternative sampling approaches may need to be employed. The approach by which images are selected (simple random sampling, regularly spaced etc.) had no discernible effect on mean and variance estimates, regardless of the distributional pattern of biota. Field validation of our findings is provided through Monte Carlo resampling analysis of a previously scored benthic survey from temperate waters. We show that point count sampling approaches are capable of providing relatively precise cover estimates for candidate groups that are not overly rare. The amount of sampling required, in terms of both the number of images and number of points, varies with the abundance, size and distributional pattern of target biota. Therefore, we advocate either the incorporation of prior knowledge or the use of baseline surveys to establish key properties of intended target biota in the initial stages of monitoring programs.
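
    The trade-off the study examines, more images versus more points per image, can be explored with a small simulation: generate per-image true cover with between-image variation, score a fixed number of random points per image, and look at the spread of the resulting cover estimates. The sketch below uses assumed parameter values and is only a schematic of that design question, not the authors' simulation.

```python
import numpy as np

rng = np.random.default_rng(42)

def cover_estimate_sd(n_images, points_per_image, mean_cover=0.10,
                      between_image_sd=0.05, trials=2000):
    """Standard deviation of the estimated percent cover across simulated surveys."""
    estimates = []
    for _ in range(trials):
        true_cover = np.clip(rng.normal(mean_cover, between_image_sd, n_images), 0, 1)
        hits = rng.binomial(points_per_image, true_cover)    # points scored per image
        estimates.append(hits.sum() / (n_images * points_per_image))
    return np.std(estimates)

# More images vs. more points per image, with the same total scoring effort (500 points)
print("50 images x 10 points :", cover_estimate_sd(50, 10))
print("10 images x 50 points :", cover_estimate_sd(10, 50))
```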

  3. A comparison of liver sampling techniques in dogs.

    PubMed

    Kemp, S D; Zimmerman, K L; Panciera, D L; Monroe, W E; Leib, M S; Lanz, O I

    2015-01-01

    The liver sampling technique in dogs that consistently provides samples adequate for accurate histopathologic interpretation is not known. To compare histopathologic results of liver samples obtained by punch, cup, and 14 gauge needle to large wedge samples collected at necropsy. Seventy dogs undergoing necropsy. Prospective study. Liver specimens were obtained from the left lateral liver lobe with an 8 mm punch, a 5 mm cup, and a 14 gauge needle. After sample acquisition, two larger tissue samples were collected near the center of the left lateral lobe to be used as a histologic standard for comparison. Histopathologic features and numbers of portal triads in each sample were recorded. The mean number of portal triads obtained by each sampling method was 2.9 in needle samples, 3.4 in cup samples, 12 in punch samples, and 30.7 in the necropsy samples. The diagnoses in 66% of needle samples, 60% of cup samples, and 69% of punch samples were in agreement with the necropsy samples, and these proportions were not significantly different from each other. The corresponding kappa coefficients were 0.59 for needle biopsies, 0.52 for cup biopsies, and 0.62 for punch biopsies. The histopathologic interpretation of a liver sample in the dog is unlikely to vary if the liver biopsy specimen contains at least 3-12 portal triads. However, in comparison with large necropsy samples, the accuracy of all tested methods was relatively low. Copyright © 2014 by the American College of Veterinary Internal Medicine.
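
    The agreement statistics reported above (kappa coefficients for each biopsy method against the necropsy standard) can be computed from a square table of paired diagnoses. The sketch below computes Cohen's kappa from a hypothetical confusion matrix; the counts are illustrative, not the study's data.

```python
import numpy as np

def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: method A, cols: method B)."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    observed = np.trace(confusion) / n
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
    return (observed - expected) / (1.0 - expected)

# Hypothetical agreement between punch-biopsy and necropsy diagnoses (3 categories)
table = [[20, 3, 1],
         [4, 25, 2],
         [2, 3, 10]]
print(round(cohens_kappa(table), 2))
```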

  4. Species-area relationships and extinction forecasts.

    PubMed

    Halley, John M; Sgardeli, Vasiliki; Monokrousos, Nikolaos

    2013-05-01

    The species-area relationship (SAR) predicts that smaller areas contain fewer species. This is the basis of the SAR method that has been used to forecast large numbers of species committed to extinction every year due to deforestation. The method has a number of issues that must be handled with care to avoid error. These include the functional form of the SAR, the choice of equation parameters, the sampling procedure used, extinction debt, and forest regeneration. Concerns about the accuracy of the SAR technique often cite errors not much larger than the natural scatter of the SAR itself. Such errors do not undermine the credibility of forecasts predicting large numbers of extinctions, although they may be a serious obstacle in other SAR applications. Very large errors can arise from misinterpretation of extinction debt, inappropriate functional form, and ignoring forest regeneration. Major challenges remain to understand better the relationship between sampling protocol and the functional form of SARs and the dynamics of relaxation, especially in continental areas, and to widen the testing of extinction forecasts. © 2013 New York Academy of Sciences.
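
    Extinction forecasts with the SAR method are commonly based on a power-law species-area curve, S = cA^z: if habitat area shrinks from A0 to A1, the number of species predicted to be eventually lost is S0(1 - (A1/A0)^z). The sketch below applies that formula with assumed values of z, area loss, and species richness; it is only an illustration of the calculation, not a forecast.

```python
def sar_extinction_forecast(s0, area_fraction_remaining, z=0.25):
    """Species predicted to be committed to extinction under a power-law SAR,
    S = c * A**z, when area drops to a fraction of its original extent."""
    s_remaining = s0 * area_fraction_remaining ** z
    return s0 - s_remaining

# Hypothetical region: 1000 species, 30% of the original area remaining, z = 0.25
print(round(sar_extinction_forecast(1000, 0.30), 1), "species committed to extinction")
```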

  5. Precision Timing Calorimeter for High Energy Physics

    DOE PAGES

    Anderson, Dustin; Apresyan, Artur; Bornheim, Adolf; ...

    2016-04-01

    Here, we present studies on the performance and characterization of the time resolution of LYSO-based calorimeters. Results for an LYSO sampling calorimeter and an LYSO-tungsten Shashlik calorimeter are presented. We also demonstrate that a time resolution of 30 ps is achievable for the LYSO sampling calorimeter. Timing calorimetry is described as a tool for mitigating the effects due to the large number of simultaneous interactions in the high luminosity environment foreseen for the Large Hadron Collider.

  6. Opposed-flow flame spread and extinction in mixed-convection boundary layers

    NASA Technical Reports Server (NTRS)

    Altenkirch, R. A.; Wedha-Nayagam, M.

    1989-01-01

    Experimental data for flame spread down thin fuel samples in an opposing, mixed-convection, boundary-layer flow are analyzed to determine the gas-phase velocity that characterizes how the flame reacts as it spreads toward the leading edge of the fuel sample into a thinning boundary layer. In the forced-flow limit where the cube of the Reynolds number divided by the Grashof number, Re exp 3/Gr, is large, L(q)/L(e), where L(q) is a theoretical flame standoff distance at extinction and L(e) is the measured distance from the leading edge of the sample where extinction occurs, is found to be proportional to Re exp n with n = -0.874 and Re based on L(e). The value of n is established by the character of the flow field near the leading edge of the flame. The Re dependence is used, along with a correction for the mixed-convection situation where Re exp 3/Gr is not large, to construct a Damkohler number with which the measured spread rates correlate for all values of Re exp 3/Gr.

  7. Big Data and Large Sample Size: A Cautionary Note on the Potential for Bias

    PubMed Central

    Chambers, David A.; Glasgow, Russell E.

    2014-01-01

    Abstract A number of commentaries have suggested that large studies are more reliable than smaller studies and there is a growing interest in the analysis of “big data” that integrates information from many thousands of persons and/or different data sources. We consider a variety of biases that are likely in the era of big data, including sampling error, measurement error, multiple comparisons errors, aggregation error, and errors associated with the systematic exclusion of information. Using examples from epidemiology, health services research, studies on determinants of health, and clinical trials, we conclude that it is necessary to exercise greater caution to be sure that big sample size does not lead to big inferential errors. Despite the advantages of big studies, large sample size can magnify the bias associated with error resulting from sampling or study design. Clin Trans Sci 2014; Volume #: 1–5 PMID:25043853

  8. Analysis of Environmental Contamination resulting from Catastrophic Incidents: Part two: Building Laboratory Capability by Selecting and Developing Analytical Methodologies

    EPA Science Inventory

    Catastrophic incidents can generate a large number of samples with analytically diverse types including forensic, clinical, environmental, food, and others. Environmental samples include water, wastewater, soil, air, urban building and infrastructure materials, and surface resid...

  9. Intervention for First Graders with Limited Number Knowledge: Large-Scale Replication of a Randomized Controlled Trial

    ERIC Educational Resources Information Center

    Gersten, Russell; Rolfhus, Eric; Clarke, Ben; Decker, Lauren E.; Wilkins, Chuck; Dimino, Joseph

    2015-01-01

    Replication studies are extremely rare in education. This randomized controlled trial (RCT) is a scale-up replication of Fuchs et al., which in a sample of 139 found a statistically significant positive impact for Number Rockets, a small-group intervention for at-risk first graders that focused on building understanding of number operations. The…

  10. Modification of the Mantel-Haenszel and Logistic Regression DIF Procedures to Incorporate the SIBTEST Regression Correction

    ERIC Educational Resources Information Center

    DeMars, Christine E.

    2009-01-01

    The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…

  11. Improved argument-FFT frequency offset estimation for QPSK coherent optical Systems

    NASA Astrophysics Data System (ADS)

    Han, Jilong; Li, Wei; Yuan, Zhilin; Li, Haitao; Huang, Liyan; Hu, Qianggao

    2016-02-01

    A frequency offset estimation (FOE) algorithm based on the fast Fourier transform (FFT) of the signal's argument is investigated, which does not require removing the modulated data phase. In this paper, we analyze the flaw of the argument-FFT algorithm and propose a combined FOE algorithm, in which the absolute value of the frequency offset (FO) is accurately calculated by the argument-FFT algorithm with a relatively large number of samples and the sign of the FO is determined by an FFT-based interpolation discrete Fourier transform (DFT) algorithm with a relatively small number of samples. Compared with previous algorithms based on the argument-FFT, the proposed one has low complexity and can still work effectively with a relatively small number of samples.
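
    For context on FFT-based frequency-offset estimation, the sketch below implements the conventional fourth-power FFT estimator for QPSK: raising the signal to the fourth power strips the modulation, and the spectral peak then sits at four times the offset. This is a standard baseline, not the argument-FFT algorithm of the record above, which works on the signal's argument without removing the data phase; the symbol rate, offset, and sample count are assumed.

```python
import numpy as np

rng = np.random.default_rng(7)

def fourth_power_foe(samples, sample_rate):
    """Classic QPSK frequency-offset estimate: peak of the FFT of the 4th power."""
    n = len(samples)
    spectrum = np.abs(np.fft.fft(samples ** 4))
    freqs = np.fft.fftfreq(n, d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)] / 4.0

# Hypothetical one-sample-per-symbol QPSK burst with a 100 kHz offset at 1 Gbaud
symbol_rate = 1e9
true_offset = 100e3
n = 65536
data = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
t = np.arange(n) / symbol_rate
rx = data * np.exp(2j * np.pi * true_offset * t)
# The estimate is quantized to the FFT bin spacing divided by 4 (about 3.8 kHz here)
print(f"Estimated offset: {fourth_power_foe(rx, symbol_rate):.0f} Hz")
```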

  12. Efficiently sampling conformations and pathways using the concurrent adaptive sampling (CAS) algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Surl-Hee; Grate, Jay W.; Darve, Eric F.

    Molecular dynamics (MD) simulations are useful in obtaining thermodynamic and kinetic properties of bio-molecules but are limited by the timescale barrier, i.e., we may be unable to efficiently obtain properties because we need to run microseconds or longer simulations using femtosecond time steps. While there are several existing methods to overcome this timescale barrier and efficiently sample thermodynamic and/or kinetic properties, problems remain in regard to being able to sample unknown systems, deal with high-dimensional spaces of collective variables, and focus the computational effort on slow timescales. Hence, a new sampling method, called the "Concurrent Adaptive Sampling (CAS) algorithm," has been developed to tackle these three issues and efficiently obtain conformations and pathways. The method is not constrained to use only one or two collective variables, unlike most reaction coordinate-dependent methods. Instead, it can use a large number of collective variables and uses macrostates (a partition of the collective variable space) to enhance the sampling. The exploration is done by running a large number of short simulations, and a clustering technique is used to accelerate the sampling. In this paper, we introduce the new methodology and show results from two-dimensional models and bio-molecules, such as penta-alanine and triazine polymer.

  13. Spatio-temporal optimization of sampling for bluetongue vectors (Culicoides) near grazing livestock

    PubMed Central

    2013-01-01

    Background Estimating the abundance of Culicoides using light traps is influenced by a large variation in abundance in time and place. This study investigates the optimal trapping strategy to estimate the abundance or presence/absence of Culicoides on a field with grazing animals. We used 45 light traps to sample specimens from the Culicoides obsoletus species complex on a 14 hectare field during 16 nights in 2009. Findings The large number of traps and catch nights enabled us to simulate a series of samples consisting of different numbers of traps (1-15) on each night. We also varied the number of catch nights when simulating the sampling, and sampled with increasing minimum distances between traps. We used resampling to generate a distribution of different mean and median abundance in each sample. Finally, we used the hypergeometric distribution to estimate the probability of falsely detecting absence of vectors on the field. The variation in the estimated abundance decreased steeply when using up to six traps, and was less pronounced when using more traps, although no clear cutoff was found. Conclusions Despite spatial clustering in vector abundance, we found no effect of increasing the distance between traps. We found that 18 traps were generally required to reach 90% probability of a true positive catch when sampling just one night. But when sampling over two nights the same probability level was obtained with just three traps per night. The results are useful for the design of vector monitoring programmes on fields with grazing animals. PMID:23705770
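
    The probability of falsely declaring absence used in the study above can be computed from the hypergeometric distribution: given a set of possible trap positions of which some would yield a catch, the chance that a random subset of traps catches nothing is the hypergeometric probability of drawing zero successes. The sketch below uses scipy with hypothetical numbers, not the field data.

```python
from scipy.stats import hypergeom

def prob_false_absence(total_positions, positive_positions, traps_used):
    """P(no trap catches a vector) when traps_used positions are drawn at random
    without replacement from total_positions, of which positive_positions would catch."""
    return hypergeom.pmf(0, total_positions, positive_positions, traps_used)

# Hypothetical field: 45 possible trap positions, 20 of which would yield a catch
for n_traps in (1, 3, 6, 18):
    p_miss = prob_false_absence(45, 20, n_traps)
    print(n_traps, "traps -> P(true positive catch) =", round(1 - p_miss, 3))
```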

  14. Buccal Swabbing as a Noninvasive Method To Determine Bacterial, Archaeal, and Eukaryotic Microbial Community Structures in the Rumen

    PubMed Central

    Kirk, Michelle R.; Jonker, Arjan; McCulloch, Alan

    2015-01-01

    Analysis of rumen microbial community structure based on small-subunit rRNA marker genes in metagenomic DNA samples provides important insights into the dominant taxa present in the rumen and allows assessment of community differences between individuals or in response to treatments applied to ruminants. However, natural animal-to-animal variation in rumen microbial community composition can limit the power of a study considerably, especially when only subtle differences are expected between treatment groups. Thus, trials with large numbers of animals may be necessary to overcome this variation. Because ruminants pass large amounts of rumen material to their oral cavities when they chew their cud, oral samples may contain good representations of the rumen microbiota and be useful in lieu of rumen samples to study rumen microbial communities. We compared bacterial, archaeal, and eukaryotic community structures in DNAs extracted from buccal swabs to those in DNAs from samples collected directly from the rumen by use of a stomach tube for sheep on four different diets. After bioinformatic depletion of potential oral taxa from libraries of samples collected via buccal swabs, bacterial communities showed significant clustering by diet (R = 0.37; analysis of similarity [ANOSIM]) rather than by sampling method (R = 0.07). Archaeal, ciliate protozoal, and anaerobic fungal communities also showed significant clustering by diet rather than by sampling method, even without adjustment for potentially orally associated microorganisms. These findings indicate that buccal swabs may in future allow quick and noninvasive sampling for analysis of rumen microbial communities in large numbers of ruminants. PMID:26276109

  15. Course Shopping in Urban Community Colleges: An Analysis of Student Drop and Add Activities

    ERIC Educational Resources Information Center

    Hagedorn, Linda Serra; Maxwell, William E.; Cypers, Scott; Moon, Hye Sun; Lester, Jaime

    2007-01-01

    This study examined the course shopping behaviors among a sample of approximately 5,000 community college students enrolled across nine campuses of a large urban district. The sample was purposely designed as an analytic, rather than a random, sample that sought to obtain adequate numbers of students in course areas that were of theoretical and of…

  16. Factors Affecting the Adoption of R&D Project Selection Techniques at the Air Force Wright Aeronautical Laboratories

    DTIC Science & Technology

    1988-09-01

    tested. To measure the adequacy of the sample, the Kaiser-Meyer-Olkin measure of sampling adequacy was used. This technique is described in Factor... the relatively large number of variables, there was concern about the adequacy of the sample size. A Kaiser-Meyer-Olkin

  17. Some limit theorems for ratios of order statistics from uniform random variables.

    PubMed

    Xu, Shou-Fang; Miao, Yu

    2017-01-01

    In this paper, we study the ratios of order statistics based on samples drawn from uniform distribution and establish some limit properties such as the almost sure central limit theorem, the large deviation principle, the Marcinkiewicz-Zygmund law of large numbers and complete convergence.

  18. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.

    PubMed

    Hero, Alfred O; Rajaratnam, Bala

    2016-01-01

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
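
    The sample-starved regime described above (n fixed, p growing) can be illustrated with a small experiment: even when all variables are independent, the largest absolute pairwise sample correlation grows toward 1 as p increases while n stays fixed. The sketch below is a generic demonstration with assumed sizes, not the authors' framework.

```python
import numpy as np

rng = np.random.default_rng(3)

def max_spurious_correlation(n, p):
    """Largest absolute off-diagonal sample correlation among p independent
    variables observed on only n samples."""
    x = rng.normal(size=(n, p))
    corr = np.corrcoef(x, rowvar=False)
    np.fill_diagonal(corr, 0.0)
    return np.abs(corr).max()

# Fixed sample size, growing variable dimension
for p in (10, 100, 1000):
    print(f"n=20, p={p}: max |correlation| = {max_spurious_correlation(20, p):.2f}")
```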

  19. Asymptotic Distributions of Coalescence Times and Ancestral Lineage Numbers for Populations with Temporally Varying Size

    PubMed Central

    Chen, Hua; Chen, Kun

    2013-01-01

    The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n−1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference. PMID:23666939

  20. Asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size.

    PubMed

    Chen, Hua; Chen, Kun

    2013-07-01

    The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n - An(t) follows a Poisson distribution, and as m → n, n(n - 1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference.
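
    For the constant-size case referred to in both records, coalescence times are easy to simulate directly: while k lineages remain, the waiting time to the next coalescence is exponential with rate k(k - 1)/(4N) generations for a diploid population of size N. The sketch below simulates the TMRCA for a sample and compares its mean to the textbook expectation 4N(1 - 1/n); it uses assumed parameter values and is a standard simulation, not the asymptotic machinery of the paper.

```python
import numpy as np

rng = np.random.default_rng(11)

def simulate_tmrca(n, N, reps=5000):
    """Simulate the time to the most recent common ancestor (in generations)
    for a sample of n lineages in a diploid population of constant size N."""
    tmrca = np.zeros(reps)
    for r in range(reps):
        t = 0.0
        for k in range(n, 1, -1):
            rate = k * (k - 1) / (4.0 * N)   # coalescence rate while k lineages remain
            t += rng.exponential(1.0 / rate)
        tmrca[r] = t
    return tmrca

n, N = 20, 10000
sims = simulate_tmrca(n, N)
print("simulated mean TMRCA:", round(sims.mean()))
print("theoretical 4N(1 - 1/n):", round(4 * N * (1 - 1 / n)))
```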

  1. Leadership Coaching for Principals: A National Study

    ERIC Educational Resources Information Center

    Wise, Donald; Cavazos, Blanca

    2017-01-01

    Surveys were sent to a large representative sample of public school principals in the United States asking if they had received leadership coaching. Comparison of responses to actual numbers of principals indicates that the sample represents the first national study of principal leadership coaching. Results indicate that approximately 50% of all…

  2. A high-throughput core sampling device for the evaluation of maize stalk composition

    PubMed Central

    2012-01-01

    Background A major challenge in the identification and development of superior feedstocks for the production of second generation biofuels is the rapid assessment of biomass composition in a large number of samples. Currently, highly accurate and precise robotic analysis systems are available for the evaluation of biomass composition, on a large number of samples, with a variety of pretreatments. However, the lack of an inexpensive and high-throughput process for large scale sampling of biomass resources is still an important limiting factor. Our goal was to develop a simple mechanical maize stalk core sampling device that can be utilized to collect uniform samples of a dimension compatible with robotic processing and analysis, while allowing the collection of hundreds to thousands of samples per day. Results We have developed a core sampling device (CSD) to collect maize stalk samples compatible with robotic processing and analysis. The CSD facilitates the collection of thousands of uniform tissue cores consistent with high-throughput analysis required for breeding, genetics, and production studies. With a single CSD operated by one person with minimal training, more than 1,000 biomass samples were obtained in an eight-hour period. One of the main advantages of using cores is the high level of homogeneity of the samples obtained and the minimal opportunity for sample contamination. In addition, the samples obtained with the CSD can be placed directly into a bath of ice, dry ice, or liquid nitrogen maintaining the composition of the biomass sample for relatively long periods of time. Conclusions The CSD has been demonstrated to successfully produce homogeneous stalk core samples in a repeatable manner with a throughput substantially superior to the currently available sampling methods. Given the variety of maize developmental stages and the diversity of stalk diameter evaluated, it is expected that the CSD will have utility for other bioenergy crops as well. PMID:22548834

  3. Annealing Increases Stability Of Iridium Thermocouples

    NASA Technical Reports Server (NTRS)

    Germain, Edward F.; Daryabeigi, Kamran; Alderfer, David W.; Wright, Robert E.; Ahmed, Shaffiq

    1989-01-01

    Metallurgical studies carried out on samples of iridium versus iridium/40-percent rhodium thermocouples in condition received from manufacturer. Metallurgical studies included x-ray, macroscopic, resistance, and metallographic studies. Revealed large amount of internal stress caused by cold-working during manufacturing, and large number of segregations and inhomogeneities. Samples annealed in furnace at temperatures from 1,000 to 2,000 degree C for intervals up to 1 h to study effects of heat treatment. Wire annealed by this procedure found to be ductile.

  4. Successful collection of stool samples for microbiome analyses from a large community-based population of elderly men.

    PubMed

    Abrahamson, Melanie; Hooker, Elizabeth; Ajami, Nadim J; Petrosino, Joseph F; Orwoll, Eric S

    2017-09-01

    The relationship of the gastrointestinal microbiome to health and disease is of major research interest, including the effects of the gut microbiota on age related conditions. Here we report on the outcome of a project to collect stool samples on a large number of community dwelling elderly men using the OMNIgene-GUT stool/feces collection kit (OMR-200, DNA Genotek, Ottawa, Canada). Among 1,328 men who were eligible for stool collection, 982 (74%) agreed to participate and 951 submitted samples. The collection process was reported to be acceptable, almost all samples obtained were adequate, the process of sample handling by mail was uniformly successful. The DNA obtained provided excellent results in microbiome analyses, yielding an abundance of species and a diversity of taxa as would be predicted. Our results suggest that population studies of older participants involving remote stool sample collection are feasible. These approaches would allow large scale research projects of the association of the gut microbiota with important clinical outcomes.

  5. The coverage of a random sample from a biological community.

    PubMed

    Engen, S

    1975-03-01

    A taxonomic group will frequently have a large number of species with small abundances. When a sample is drawn at random from this group, one is therefore faced with the problem that a large proportion of the species will not be discovered. A general definition of quantitative measures of "sample coverage" is proposed, and the problem of statistical inference is considered for two special cases, (1) the actual total relative abundance of those species that are represented in the sample, and (2) their relative contribution to the information index of diversity. The analysis is based on an extended version of the negative binomial species frequency model. The results are tabulated.
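
    A widely used estimator of sample coverage in sense (1) above, the total relative abundance of the species actually observed, is the Good-Turing estimate C ≈ 1 - f1/n, where f1 is the number of species seen exactly once and n is the number of individuals sampled. The sketch below applies it to hypothetical count data; this is the classical estimator, not necessarily the negative-binomial-based inference developed in the paper.

```python
def good_turing_coverage(counts):
    """Good-Turing estimate of sample coverage: 1 - (singletons / total individuals)."""
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / n

# Hypothetical species abundance counts from a random sample of a community
species_counts = [40, 22, 15, 9, 5, 3, 2, 1, 1, 1, 1]
print(round(good_turing_coverage(species_counts), 3))
```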

  6. Two phase sampling for wheat acreage estimation. [large area crop inventory experiment

    NASA Technical Reports Server (NTRS)

    Thomas, R. W.; Hay, C. M.

    1977-01-01

    A two-phase LANDSAT-based sample allocation and wheat proportion estimation method was developed. This technique employs manual, LANDSAT full-frame-based wheat or cultivated land proportion estimates from a large number of segments comprising a first sample phase to optimally allocate a smaller, phase-two sample of computer- or manually-processed segments. Application to the Kansas Southwest CRD for 1974 produced a wheat acreage estimate for that CRD within 2.42 percent of the USDA SRS-based estimate using a lower CRD inventory budget than for a simulated reference LACIE system. Improvements in cost or precision by a factor of 2 or greater relative to the reference system were obtained.
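
    The abstract does not spell out the estimator, so the sketch below only illustrates the general two-phase (double-sampling) regression idea: cheap phase-one proportion estimates on many segments are combined with accurate phase-two measurements on a subsample. All numbers, and the simple random phase-two selection, are hypothetical rather than the LACIE allocation rules.

        # Generic two-phase (double-sampling) regression estimator, hypothetical data.
        import numpy as np

        rng = np.random.default_rng(0)
        x_phase1 = rng.uniform(0.1, 0.6, size=200)               # cheap manual estimates, n1 = 200
        idx = rng.choice(200, size=30, replace=False)             # phase-two subsample, n2 = 30
        x_phase2 = x_phase1[idx]
        y_phase2 = x_phase2 + rng.normal(0.0, 0.03, size=30)      # accurate measurements

        b = np.polyfit(x_phase2, y_phase2, 1)[0]                  # slope of y on x
        y_reg = y_phase2.mean() + b * (x_phase1.mean() - x_phase2.mean())
        print(f"two-phase regression estimate of the wheat proportion: {y_reg:.3f}")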

  7. A highly addressable static droplet array enabling digital control of a single droplet at pico-volume resolution.

    PubMed

    Jeong, Heon-Ho; Lee, Byungjin; Jin, Si Hyung; Jeong, Seong-Geun; Lee, Chang-Soo

    2016-04-26

    Droplet-based microfluidics enabling exquisite liquid-handling has been developed for diagnosis, drug discovery and quantitative biology. Compartmentalization of samples into a large number of tiny droplets is a great approach to perform multiplex assays and to improve reliability and accuracy using a limited volume of samples. Despite significant advances in microfluidic technology, individual droplet handling in pico-volume resolution is still a challenge in obtaining more efficient and varying multiplex assays. We present a highly addressable static droplet array (SDA) enabling individual digital manipulation of a single droplet using a microvalve system. In a conventional single-layer microvalve system, the number of microvalves required is dictated by the number of operation objects; thus, individual trap-and-release on a large-scale 2D array format is highly challenging. By integrating double-layer microvalves, we achieve a "balloon" valve that preserves the pressure-on state under released pressure; this valve can allow the selective releasing and trapping of 7200 multiplexed pico-droplets using only 1 μL of sample without volume loss. This selectivity and addressability completely arranged only single-cell encapsulated droplets from a mixture of droplet compositions via repetitive selective trapping and releasing. Thus, it will be useful for efficient handling of miniscule volumes of rare or clinical samples in multiplex or combinatory assays, and the selective collection of samples.

  8. Effects of Sample Selection Bias on the Accuracy of Population Structure and Ancestry Inference

    PubMed Central

    Shringarpure, Suyash; Xing, Eric P.

    2014-01-01

    Population stratification is an important task in genetic analyses. It provides information about the ancestry of individuals and can be an important confounder in genome-wide association studies. Public genotyping projects have made a large number of datasets available for study. However, practical constraints dictate that of a geographical/ethnic population, only a small number of individuals are genotyped. The resulting data are a sample from the entire population. If the distribution of sample sizes is not representative of the populations being sampled, the accuracy of population stratification analyses of the data could be affected. We attempt to understand the effect of biased sampling on the accuracy of population structure analysis and individual ancestry recovery. We examined two commonly used methods for analyses of such datasets, ADMIXTURE and EIGENSOFT, and found that the accuracy of recovery of population structure is affected to a large extent by the sample used for analysis and how representative it is of the underlying populations. Using simulated data and real genotype data from cattle, we show that sample selection bias can affect the results of population structure analyses. We develop a mathematical framework for sample selection bias in models for population structure and also propose a correction for sample selection bias using auxiliary information about the sample. We demonstrate that such a correction is effective in practice using simulated and real data. PMID:24637351

  9. Estimating the Size of a Large Network and its Communities from a Random Sample

    PubMed Central

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W.

    2017-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924

  10. Estimating the Size of a Large Network and its Communities from a Random Sample.

    PubMed

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.
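
    PULSE itself is not described in enough detail in the abstract to reproduce, but a much cruder moment estimator built from the same observations (the induced subgraph plus each sampled vertex's total degree) conveys the idea: under uniform sampling, a sampled vertex's expected within-sample degree is roughly its total degree times (n - 1)/(N - 1), which can be solved for N. The sketch below applies that naive estimator to a hypothetical random graph; it is not the PULSE algorithm.

        # Naive moment estimator for the population size N (not PULSE).
        import networkx as nx

        def naive_population_size(G_W, total_degree):
            n = G_W.number_of_nodes()
            within = sum(dict(G_W.degree()).values())     # 2 * (edges inside the sample)
            total = sum(total_degree.values())            # sum of true degrees of sampled vertices
            return 1 + (n - 1) * total / within           # from E[within] ~ total * (n-1)/(N-1)

        # Hypothetical example: sample 200 of 2000 vertices of an Erdos-Renyi graph.
        G = nx.gnp_random_graph(2000, 0.01, seed=1)
        W = list(range(200))
        degrees = {v: G.degree(v) for v in W}
        print(round(naive_population_size(G.subgraph(W), degrees)))   # should land near 2000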

  11. Ultrasonic acoustic emissions from the sapwood of cedar and hemlock : an examination of three hypotheses regarding cavitations.

    PubMed

    Tyree, M T; Dixon, M A; Tyree, E L; Johnson, R

    1984-08-01

    Measurements are reported of ultrasonic acoustic emissions (AEs) measured from sapwood samples of Thuja occidentalis L. and Tsuga canadensis (L.) Carr. during air dehydration. The measurements were undertaken to test the following three hypotheses: (a) Each cavitation event produces one ultrasonic AE. (b) Large tracheids are more likely to cavitate than small tracheids. (c) When stem water potentials are >-0.4 MPa, a significant fraction of the water content of sapwood is held by 'capillary forces.' The last two hypotheses were recently discussed at length by M. H. Zimmermann. Experimental evidence consistent with all three hypotheses was obtained. The evidence for each hypothesis respectively is: (a) the cumulative number of AEs nearly equals the number of tracheids in small samples; (b) more water is lost per AE event at the beginning of the dehydration process than at the end, and (c) sapwood samples dehydrated from an initial water potential of 0 MPa lost significantly more water before AEs started than lost by samples dehydrated from an initial water potential of about -0.4 MPa. The extra water held by fully hydrated sapwood samples may have been capillary water as defined by Zimmermann. We also report an improved method for the measurement of the 'intensity' of ultrasonic AEs. Intensity is defined here as the area under the positive spikes of the AE signal (plotted as voltage versus time). This method was applied to produce a frequency histogram of the number of AEs versus intensity. A large fraction of the total number of AEs were of low intensity even in small samples (4 mm diameter by 10 mm length). This suggests that the effective 'listening distance' for most AEs was less than 5 to 10 mm.

  12. Ultrasonic Acoustic Emissions from the Sapwood of Cedar and Hemlock 1

    PubMed Central

    Tyree, Melvin T.; Dixon, Michael A.; Tyree, E. Loeta; Johnson, Robert

    1984-01-01

    Measurements are reported of ultrasonic acoustic emissions (AEs) measured from sapwood samples of Thuja occidentalis L. and Tsuga canadensis (L.) Carr. during air dehydration. The measurements were undertaken to test the following three hypotheses: (a) Each cavitation event produces one ultrasonic AE. (b) Large tracheids are more likely to cavitate than small tracheids. (c) When stem water potentials are >−0.4 MPa, a significant fraction of the water content of sapwood is held by `capillary forces.' The last two hypotheses were recently discussed at length by M. H. Zimmermann. Experimental evidence consistent with all three hypotheses was obtained. The evidence for each hypothesis respectively is: (a) the cumulative number of AEs nearly equals the number of tracheids in small samples; (b) more water is lost per AE event at the beginning of the dehydration process than at the end, and (c) sapwood samples dehydrated from an initial water potential of 0 MPa lost significantly more water before AEs started than lost by samples dehydrated from an initial water potential of about −0.4 MPa. The extra water held by fully hydrated sapwood samples may have been capillary water as defined by Zimmerman. We also report an improved method for the measurement of the `intensity' of ultrasonic AEs. Intensity is defined here as the area under the positive spikes of the AE signal (plotted as voltage versus time). This method was applied to produce a frequency histogram of the number of AEs versus intensity. A large fraction of the total number of AEs were of low intensity even in small samples (4 mm diameter by 10 mm length). This suggests that the effective `listening distance' for most AEs was less than 5 to 10 mm. PMID:16663774
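
    The 'intensity' measure defined in these two records is straightforward to reproduce numerically: integrate the positive part of the digitized voltage-versus-time trace. The original work used analog instrumentation, so the digitized trace below is purely hypothetical.

        # Area under the positive spikes of an AE voltage signal (V * s).
        import numpy as np

        def ae_intensity(voltage, dt):
            positive = np.clip(voltage, 0.0, None)        # keep only the positive spikes
            return positive.sum() * dt                    # rectangle-rule integration

        t = np.linspace(0.0, 1e-3, 1001)                                   # 1 ms hypothetical trace
        signal = 0.5 * np.sin(2 * np.pi * 150e3 * t) * np.exp(-t / 2e-4)   # decaying burst
        print(f"intensity = {ae_intensity(signal, t[1] - t[0]):.3e} V*s")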

  13. Accurate, high-throughput typing of copy number variation using paralogue ratios from dispersed repeats

    PubMed Central

    Armour, John A. L.; Palla, Raquel; Zeeuwen, Patrick L. J. M.; den Heijer, Martin; Schalkwijk, Joost; Hollox, Edward J.

    2007-01-01

    Recent work has demonstrated an unexpected prevalence of copy number variation in the human genome, and has highlighted the part this variation may play in predisposition to common phenotypes. Some important genes vary in number over a high range (e.g. DEFB4, which commonly varies between two and seven copies), and have posed formidable technical challenges for accurate copy number typing, so that there are no simple, cheap, high-throughput approaches suitable for large-scale screening. We have developed a simple comparative PCR method based on dispersed repeat sequences, using a single pair of precisely designed primers to amplify products simultaneously from both test and reference loci, which are subsequently distinguished and quantified via internal sequence differences. We have validated the method for the measurement of copy number at DEFB4 by comparison of results from >800 DNA samples with copy number measurements by MAPH/REDVR, MLPA and array-CGH. The new Paralogue Ratio Test (PRT) method can require as little as 10 ng genomic DNA, appears to be comparable in accuracy to the other methods, and for the first time provides a rapid, simple and inexpensive method for copy number analysis, suitable for application to typing thousands of samples in large case-control association studies. PMID:17175532
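
    The copy-number arithmetic behind a paralogue ratio test is simple once the co-amplified test and reference products have been quantified separately: the ratio of the two signals is scaled by the known copy number of the reference locus (two per diploid genome), optionally with a calibration factor derived from samples of known copy number. The peak areas below are hypothetical, and the published assay involves additional normalization and quality checks not shown here.

        # Copy number from a paralogue ratio, hypothetical peak areas.
        def prt_copy_number(test_area, reference_area, calibration=1.0):
            ratio = test_area / reference_area            # test signal vs. 2-copy reference locus
            return 2.0 * ratio * calibration

        samples = {"NA001": (5400.0, 2650.0), "NA002": (8200.0, 2700.0)}
        for name, (test, ref) in samples.items():
            print(name, round(prt_copy_number(test, ref), 2))   # ~4.08 and ~6.07 copies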

  14. Violation of the Sphericity Assumption and Its Effect on Type-I Error Rates in Repeated Measures ANOVA and Multi-Level Linear Models (MLM).

    PubMed

    Haverkamp, Nicolas; Beauducel, André

    2017-01-01

    We investigated the effects of violations of the sphericity assumption on Type I error rates for different methodical approaches of repeated measures analysis using a simulation approach. In contrast to previous simulation studies on this topic, up to nine measurement occasions were considered. Effects of the level of inter-correlations between measurement occasions on Type I error rates were considered for the first time. Two populations with non-violation of the sphericity assumption, one with uncorrelated measurement occasions and one with moderately correlated measurement occasions, were generated. One population with violation of the sphericity assumption combines uncorrelated with highly correlated measurement occasions. A second population with violation of the sphericity assumption combines moderately correlated and highly correlated measurement occasions. From these four populations without any between-group effect or within-subject effect 5,000 random samples were drawn. Finally, the mean Type I error rates for multilevel linear models (MLM) with an unstructured covariance matrix (MLM-UN), MLM with compound symmetry (MLM-CS) and for repeated measures analysis of variance (rANOVA) models (without correction, with Greenhouse-Geisser correction, and with Huynh-Feldt correction) were computed. To examine the effect of both the sample size and the number of measurement occasions, sample sizes of n = 20, 40, 60, 80, and 100 were considered as well as measurement occasions of m = 3, 6, and 9. With respect to rANOVA, the results argue for the use of rANOVA with Huynh-Feldt correction, especially when the sphericity assumption is violated, the sample size is rather small and the number of measurement occasions is large. For MLM-UN, the results illustrate a massive progressive bias for small sample sizes (n = 20) and m = 6 or more measurement occasions. This effect could not be found in previous simulation studies with a smaller number of measurement occasions. The proportionality of bias and number of measurement occasions should be considered when MLM-UN is used. The good news is that this proportionality can be compensated for by large sample sizes. Accordingly, MLM-UN can be recommended even for small sample sizes with about three measurement occasions and for large sample sizes with about nine measurement occasions.

  15. Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.

    PubMed

    Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides

    2010-01-01

    We investigated sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to the adequacy or not of the sample size to estimate HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate HIV prevalence rate in 35 studies. The use of convenience sample limits statistical inference for the whole group. It was observed that there was an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and HIV prevalence rate in this group.
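
    The abstract classifies studies by whether their sample size was adequate for estimating a prevalence; the exact criterion used is not given, but the textbook requirement for a simple random sample is n >= z^2 p (1 - p) / d^2 for anticipated prevalence p, absolute precision d and normal quantile z. A minimal calculator, assuming that standard formula:

        # Textbook sample size for estimating a proportion (simple random sampling).
        import math

        def sample_size_for_prevalence(p, d, confidence=0.95):
            z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}[confidence]
            return math.ceil(z**2 * p * (1 - p) / d**2)

        print(sample_size_for_prevalence(p=0.05, d=0.02))   # 457 participants
        print(sample_size_for_prevalence(p=0.10, d=0.03))   # 385 participants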

  16. Electrofishing effort required to estimate biotic condition in southern Idaho Rivers

    USGS Publications Warehouse

    Maret, Terry R.; Ott, Douglas S.; Herlihy, Alan T.

    2007-01-01

    An important issue surrounding biomonitoring in large rivers is the minimum sampling effort required to collect an adequate number of fish for accurate and precise determinations of biotic condition. During the summer of 2002, we sampled 15 randomly selected large-river sites in southern Idaho to evaluate the effects of sampling effort on an index of biotic integrity (IBI). Boat electrofishing was used to collect sample populations of fish in river reaches representing 40 and 100 times the mean channel width (MCW; wetted channel) at base flow. Minimum sampling effort was assessed by comparing the relation between reach length sampled and change in IBI score. Thirty-two species of fish in the families Catostomidae, Centrarchidae, Cottidae, Cyprinidae, Ictaluridae, Percidae, and Salmonidae were collected. Of these, 12 alien species were collected at 80% (12 of 15) of the sample sites; alien species represented about 38% of all species (N = 32) collected during the study. A total of 60% (9 of 15) of the sample sites had poor IBI scores. A minimum reach length of about 36 times MCW was determined to be sufficient for collecting an adequate number of fish for estimating biotic condition based on an IBI score. For most sites, this equates to collecting 275 fish at a site. Results may be applicable to other semiarid, fifth-order through seventh-order rivers sampled during summer low-flow conditions.

  17. Improvements in medical quality and patient safety through implementation of a case bundle management strategy in a large outpatient blood collection center.

    PubMed

    Zhao, Shuzhen; He, Lujia; Feng, Chenchen; He, Xiaoli

    2018-06-01

    Laboratory errors in a blood collection center (BCC) are most common in the preanalytical phase. It is, therefore, of vital importance for administrators to take measures to improve healthcare quality and patient safety. In 2015, a case bundle management strategy was applied in a large outpatient BCC to improve its medical quality and patient safety. Unqualified blood sampling, complications, patient waiting time, largest number of patients waiting during peak hours, patient complaints, and patient satisfaction were compared over the period from 2014 to 2016. The strategy reduced unqualified blood sampling, complications, patient waiting time, largest number of patients waiting during peak hours, and patient complaints, while improving patient satisfaction. This strategy was effective in improving BCC healthcare quality and patient safety.

  18. An improved filter elution and cell culture assay procedure for evaluating public groundwater systems for culturable enteroviruses.

    PubMed

    Dahling, Daniel R

    2002-01-01

    Large-scale virus studies of groundwater systems require practical and sensitive procedures for both sample processing and viral assay. Filter adsorption-elution procedures have traditionally been used to process large-volume water samples for viruses. In this study, five filter elution procedures using cartridge filters were evaluated for their effectiveness in processing samples. Of the five procedures tested, the third method, which incorporated two separate beef extract elutions (one being an overnight filter immersion in beef extract), recovered 95% of seeded poliovirus compared with recoveries of 36 to 70% for the other methods. For viral enumeration, an expanded roller bottle quantal assay was evaluated using seeded poliovirus. This cytopathic-based method was considerably more sensitive than the standard plaque assay method. The roller bottle system was more economical than the plaque assay for the evaluation of comparable samples. Using roller bottles required less time and manipulation than the plaque procedure and greatly facilitated the examination of large numbers of samples. The combination of the improved filter elution procedure and the roller bottle assay for viral analysis makes large-scale virus studies of groundwater systems practical. This procedure was subsequently field tested during a groundwater study in which large-volume samples (exceeding 800 L) were processed through the filters.

  19. It's a Girl! Random Numbers, Simulations, and the Law of Large Numbers

    ERIC Educational Resources Information Center

    Goodwin, Chris; Ortiz, Enrique

    2015-01-01

    Modeling using mathematics and making inferences about mathematical situations are becoming more prevalent in most fields of study. Descriptive statistics cannot be used to generalize about a population or make predictions of what can occur. Instead, inference must be used. Simulation and sampling are essential in building a foundation for…
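
    The abstract is truncated, but the kind of simulation it refers to is easy to sketch: draw simulated births and watch the running proportion of girls settle near the underlying probability as the sample grows. The probability of 0.49 below is an assumption chosen only for illustration.

        # Running proportion of girls in simulated births (law of large numbers).
        import random

        random.seed(42)
        p_girl = 0.49                      # assumed probability of a girl (illustrative)
        births = girls = 0
        checkpoints = {10, 100, 1_000, 10_000, 100_000}
        while births < 100_000:
            births += 1
            girls += random.random() < p_girl
            if births in checkpoints:
                print(f"{births:>7} births: proportion of girls = {girls / births:.4f}")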

  20. EFL Students' Perceptions of Corpus-Tools as Writing References

    ERIC Educational Resources Information Center

    Lai, Shu-Li

    2015-01-01

    A number of studies have suggested the potentials of corpus tools in vocabulary learning. However, there are still some concerns. Corpus tools might be too complicated to use; example sentences retrieved from corpus tools might be too difficult to understand; processing large number of sample sentences could be challenging and time-consuming;…

  1. Assays for the activities of polyamine biosynthetic enzymes using intact tissues

    Treesearch

    Rakesh Minocha; Stephanie Long; Hisae Maki; Subhash C. Minocha

    1999-01-01

    Traditionally, most enzyme assays utilize homogenized cell extracts with or without dialysis. Homogenization and centrifugation of large numbers of samples for screening of mutants and transgenic cell lines is quite cumbersome and generally requires sufficiently large amounts (hundreds of milligrams) of tissue. However, in situations where the tissue is available in...

  2. Incidental Learning of Melodic Structure of North Indian Music

    ERIC Educational Resources Information Center

    Rohrmeier, Martin; Widdess, Richard

    2017-01-01

    Musical knowledge is largely implicit. It is acquired without awareness of its complex rules, through interaction with a large number of samples during musical enculturation. Whereas several studies explored implicit learning of mostly abstract and less ecologically valid features of Western music, very little work has been done with respect to…

  3. MGIS: Managing banana (Musa spp.) genetic resources information and high-throughput genotyping data

    USDA-ARS?s Scientific Manuscript database

    Unraveling genetic diversity held in genebanks on a large scale is underway, due to the advances in Next-generation sequence-based technologies that produce high-density genetic markers for a large number of samples at low cost. Genebank users should be in a position to identify and select germplasm...

  4. Development and validation of a low-density SNP panel related to prolificacy in sheep

    USDA-ARS?s Scientific Manuscript database

    High-density SNP panels (e.g., 50,000 and 600,000 markers) have been used in exploratory population genetic studies with commercial and minor breeds of sheep. However, routine genetic diversity evaluations of large numbers of samples with large panels are in general cost-prohibitive for gene banks. ...

  5. Variational Approach to Enhanced Sampling and Free Energy Calculations

    NASA Astrophysics Data System (ADS)

    Valsson, Omar; Parrinello, Michele

    2014-08-01

    The ability of widely used sampling methods, such as molecular dynamics or Monte Carlo simulations, to explore complex free energy landscapes is severely hampered by the presence of kinetic bottlenecks. A large number of solutions have been proposed to alleviate this problem. Many are based on the introduction of a bias potential which is a function of a small number of collective variables. However, constructing such a bias is not simple. Here we introduce a functional of the bias potential and an associated variational principle. The bias that minimizes the functional relates in a simple way to the free energy surface. This variational principle can be turned into a practical, efficient, and flexible sampling method. A number of numerical examples are presented which include the determination of a three-dimensional free energy surface. We argue that, besides being numerically advantageous, our variational approach provides a convenient and novel standpoint for looking at the sampling problem.
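
    The abstract states the variational principle without reproducing the functional. In the published form of this approach it is usually written as follows, where F(s) is the free energy over the chosen collective variables s, V(s) is the bias, beta is the inverse temperature and p(s) is a preselected target distribution (notation ours):

        \Omega[V] \;=\; \frac{1}{\beta}\,
          \ln \frac{\int d\mathbf{s}\; e^{-\beta\left[F(\mathbf{s}) + V(\mathbf{s})\right]}}
                   {\int d\mathbf{s}\; e^{-\beta F(\mathbf{s})}}
          \;+\; \int d\mathbf{s}\; p(\mathbf{s})\, V(\mathbf{s})

    The functional is convex, and its minimizer satisfies V(s) = -F(s) - (1/beta) ln p(s) up to an additive constant, which is the sense in which the minimizing bias "relates in a simple way to the free energy surface."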

  6. Improved technique that allows the performance of large-scale SNP genotyping on DNA immobilized by FTA technology.

    PubMed

    He, Hongbin; Argiro, Laurent; Dessein, Helia; Chevillard, Christophe

    2007-01-01

    FTA technology is a novel method designed to simplify the collection, shipment, archiving and purification of nucleic acids from a wide variety of biological sources. The number of punches that can normally be obtained from a single specimen card is often, however, insufficient for the testing of the large numbers of loci required to identify genetic factors that control human susceptibility or resistance to multifactorial diseases. In this study, we propose an improved technique to perform large-scale SNP genotyping. We applied a whole genome amplification method to amplify DNA from buccal cell samples stabilized using FTA technology. The results show that using the improved technique it is possible to perform up to 15,000 genotypes from one buccal cell sample. Furthermore, the procedure is simple. We consider this improved technique to be a promising method for performing large-scale SNP genotyping because the FTA technology simplifies the collection, shipment, archiving and purification of DNA, while whole genome amplification of FTA card bound DNA produces sufficient material for the determination of thousands of SNP genotypes.

  7. Rayleigh- and Prandtl-number dependence of the large-scale flow-structure in weakly-rotating turbulent thermal convection

    NASA Astrophysics Data System (ADS)

    Weiss, Stephan; Wei, Ping; Ahlers, Guenter

    2015-11-01

    Turbulent thermal convection under rotation shows a remarkable variety of different flow states. The Nusselt number (Nu) at slow rotation rates (expressed as the dimensionless inverse Rossby number 1/Ro), for example, is not a monotonic function of 1/Ro. Different 1/Ro-ranges can be observed with different slopes ∂Nu/∂(1/Ro). Some of these ranges are connected by sharp transitions where ∂Nu/∂(1/Ro) changes discontinuously. We investigate different regimes in cylindrical samples of aspect ratio Γ = 1 by measuring temperatures at the sidewall of the sample for various Prandtl numbers in the range 3 < Pr < 35 and Rayleigh numbers in the range 10^8 < Ra < 4 × 10^11. From these measurements we deduce changes of the flow structure. We learn about the stability and dynamics of the large-scale circulation (LSC), as well as about its breakdown and the onset of vortex formation close to the top and bottom plate. We shall examine correlations between these measurements and changes in the heat transport. This work was supported by NSF grant DRM11-58514. SW acknowledges support by the Deutsche Forschungsgemeinschaft.

  8. Search for large extra dimensions in final states containing one photon or jet and large missing transverse energy produced in pp collisions at square root[s]=1.96 TeV.

    PubMed

    Aaltonen, T; Adelman, J; Akimoto, T; Albrow, M G; Alvarez González, B; Amerio, S; Amidei, D; Anastassov, A; Annovi, A; Antos, J; Apollinari, G; Apresyan, A; Arisawa, T; Artikov, A; Ashmanskas, W; Attal, A; Aurisano, A; Azfar, F; Azzurri, P; Badgett, W; Barbaro-Galtieri, A; Barnes, V E; Barnett, B A; Bartsch, V; Bauer, G; Beauchemin, P-H; Bedeschi, F; Bednar, P; Beecher, D; Behari, S; Bellettini, G; Bellinger, J; Benjamin, D; Beretvas, A; Beringer, J; Bhatti, A; Binkley, M; Bisello, D; Bizjak, I; Blair, R E; Blocker, C; Blumenfeld, B; Bocci, A; Bodek, A; Boisvert, V; Bolla, G; Bortoletto, D; Boudreau, J; Boveia, A; Brau, B; Bridgeman, A; Brigliadori, L; Bromberg, C; Brubaker, E; Budagov, J; Budd, H S; Budd, S; Burkett, K; Busetto, G; Bussey, P; Buzatu, A; Byrum, K L; Cabrera, S; Calancha, C; Campanelli, M; Campbell, M; Canelli, F; Canepa, A; Carlsmith, D; Carosi, R; Carrillo, S; Carron, S; Casal, B; Casarsa, M; Castro, A; Catastini, P; Cauz, D; Cavaliere, V; Cavalli-Sforza, M; Cerri, A; Cerrito, L; Chang, S H; Chen, Y C; Chertok, M; Chiarelli, G; Chlachidze, G; Chlebana, F; Cho, K; Chokheli, D; Chou, J P; Choudalakis, G; Chuang, S H; Chung, K; Chung, W H; Chung, Y S; Ciobanu, C I; Ciocci, M A; Clark, A; Clark, D; Compostella, G; Convery, M E; Conway, J; Copic, K; Cordelli, M; Cortiana, G; Cox, D J; Crescioli, F; Cuenca Almenar, C; Cuevas, J; Culbertson, R; Cully, J C; Dagenhart, D; Datta, M; Davies, T; de Barbaro, P; De Cecco, S; Deisher, A; De Lorenzo, G; Dell'orso, M; Deluca, C; Demortier, L; Deng, J; Deninno, M; Derwent, P F; di Giovanni, G P; Dionisi, C; Di Ruzza, B; Dittmann, J R; D'Onofrio, M; Donati, S; Dong, P; Donini, J; Dorigo, T; Dube, S; Efron, J; Elagin, A; Erbacher, R; Errede, D; Errede, S; Eusebi, R; Fang, H C; Farrington, S; Fedorko, W T; Feild, R G; Feindt, M; Fernandez, J P; Ferrazza, C; Field, R; Flanagan, G; Forrest, R; Franklin, M; Freeman, J C; Furic, I; Gallinaro, M; Galyardt, J; Garberson, F; Garcia, J E; Garfinkel, A F; Genser, K; Gerberich, H; Gerdes, D; Gessler, A; Giagu, S; Giakoumopoulou, V; Giannetti, P; Gibson, K; Gimmell, J L; Ginsburg, C M; Giokaris, N; Giordani, M; Giromini, P; Giunta, M; Giurgiu, G; Glagolev, V; Glenzinski, D; Gold, M; Goldschmidt, N; Golossanov, A; Gomez, G; Gomez-Ceballos, G; Goncharov, M; González, O; Gorelov, I; Goshaw, A T; Goulianos, K; Gresele, A; Grinstein, S; Grosso-Pilcher, C; Grundler, U; Guimaraes da Costa, J; Gunay-Unalan, Z; Haber, C; Hahn, K; Hahn, S R; Halkiadakis, E; Han, B-Y; Han, J Y; Handler, R; Happacher, F; Hara, K; Hare, D; Hare, M; Harper, S; Harr, R F; Harris, R M; Hartz, M; Hatakeyama, K; Hauser, J; Hays, C; Heck, M; Heijboer, A; Heinemann, B; Heinrich, J; Henderson, C; Herndon, M; Heuser, J; Hewamanage, S; Hidas, D; Hill, C S; Hirschbuehl, D; Hocker, A; Hou, S; Houlden, M; Hsu, S-C; Huffman, B T; Hughes, R E; Husemann, U; Huston, J; Incandela, J; Introzzi, G; Iori, M; Ivanov, A; James, E; Jayatilaka, B; Jeon, E J; Jha, M K; Jindariani, S; Johnson, W; Jones, M; Joo, K K; Jun, S Y; Jung, J E; Junk, T R; Kamon, T; Kar, D; Karchin, P E; Kato, Y; Kephart, R; Keung, J; Khotilovich, V; Kilminster, B; Kim, D H; Kim, H S; Kim, J E; Kim, M J; Kim, S B; Kim, S H; Kim, Y K; Kimura, N; Kirsch, L; Klimenko, S; Knuteson, B; Ko, B R; Koay, S A; Kondo, K; Kong, D J; Konigsberg, J; Korytov, A; Kotwal, A V; Kreps, M; Kroll, J; Krop, D; Krumnack, N; Kruse, M; Krutelyov, V; Kubo, T; Kuhr, T; Kulkarni, N P; Kurata, M; Kusakabe, Y; Kwang, S; Laasanen, A T; Lami, S; Lammel, S; Lancaster, M; Lander, R L; Lannon, K; Lath, A; Latino, 
G; Lazzizzera, I; Lecompte, T; Lee, E; Lee, H S; Lee, S W; Leone, S; Lewis, J D; Lin, C S; Linacre, J; Lindgren, M; Lipeles, E; Lister, A; Litvintsev, D O; Liu, C; Liu, T; Lockyer, N S; Loginov, A; Loreti, M; Lovas, L; Lu, R-S; Lucchesi, D; Lueck, J; Luci, C; Lujan, P; Lukens, P; Lungu, G; Lyons, L; Lys, J; Lysak, R; Lytken, E; Mack, P; Macqueen, D; Madrak, R; Maeshima, K; Makhoul, K; Maki, T; Maksimovic, P; Malde, S; Malik, S; Manca, G; Manousakis-Katsikakis, A; Margaroli, F; Marino, C; Marino, C P; Martin, A; Martin, V; Martínez, M; Martínez-Ballarín, R; Maruyama, T; Mastrandrea, P; Masubuchi, T; Mattson, M E; Mazzanti, P; McFarland, K S; McIntyre, P; McNulty, R; Mehta, A; Mehtala, P; Menzione, A; Merkel, P; Mesropian, C; Miao, T; Miladinovic, N; Miller, R; Mills, C; Milnik, M; Mitra, A; Mitselmakher, G; Miyake, H; Moggi, N; Moon, C S; Moore, R; Morello, M J; Morlok, J; Movilla Fernandez, P; Mülmenstädt, J; Mukherjee, A; Muller, Th; Mumford, R; Murat, P; Mussini, M; Nachtman, J; Nagai, Y; Nagano, A; Naganoma, J; Nakamura, K; Nakano, I; Napier, A; Necula, V; Neu, C; Neubauer, M S; Nielsen, J; Nodulman, L; Norman, M; Norniella, O; Nurse, E; Oakes, L; Oh, S H; Oh, Y D; Oksuzian, I; Okusawa, T; Orava, R; Osterberg, K; Pagan Griso, S; Pagliarone, C; Palencia, E; Papadimitriou, V; Papaikonomou, A; Paramonov, A A; Parks, B; Pashapour, S; Patrick, J; Pauletta, G; Paulini, M; Paus, C; Pellett, D E; Penzo, A; Phillips, T J; Piacentino, G; Pianori, E; Pinera, L; Pitts, K; Plager, C; Pondrom, L; Poukhov, O; Pounder, N; Prakoshyn, F; Pronko, A; Proudfoot, J; Ptohos, F; Pueschel, E; Punzi, G; Pursley, J; Rademacker, J; Rahaman, A; Ramakrishnan, V; Ranjan, N; Redondo, I; Reisert, B; Rekovic, V; Renton, P; Rescigno, M; Richter, S; Rimondi, F; Ristori, L; Robson, A; Rodrigo, T; Rodriguez, T; Rogers, E; Rolli, S; Roser, R; Rossi, M; Rossin, R; Roy, P; Ruiz, A; Russ, J; Rusu, V; Saarikko, H; Safonov, A; Sakumoto, W K; Saltó, O; Santi, L; Sarkar, S; Sartori, L; Sato, K; Savard, P; Savoy-Navarro, A; Scheidle, T; Schlabach, P; Schmidt, A; Schmidt, E E; Schmidt, M A; Schmidt, M P; Schmitt, M; Schwarz, T; Scodellaro, L; Scott, A L; Scribano, A; Scuri, F; Sedov, A; Seidel, S; Seiya, Y; Semenov, A; Sexton-Kennedy, L; Sfyrla, A; Shalhout, S Z; Shears, T; Shepard, P F; Sherman, D; Shimojima, M; Shiraishi, S; Shochet, M; Shon, Y; Shreyber, I; Sidoti, A; Sinervo, P; Sisakyan, A; Slaughter, A J; Slaunwhite, J; Sliwa, K; Smith, J R; Snider, F D; Snihur, R; Soha, A; Somalwar, S; Sorin, V; Spalding, J; Spreitzer, T; Squillacioti, P; Stanitzki, M; St Denis, R; Stelzer, B; Stelzer-Chilton, O; Stentz, D; Strologas, J; Stuart, D; Suh, J S; Sukhanov, A; Suslov, I; Suzuki, T; Taffard, A; Takashima, R; Takeuchi, Y; Tanaka, R; Tecchio, M; Teng, P K; Terashi, K; Thom, J; Thompson, A S; Thompson, G A; Thomson, E; Tipton, P; Tiwari, V; Tkaczyk, S; Toback, D; Tokar, S; Tollefson, K; Tomura, T; Tonelli, D; Torre, S; Torretta, D; Totaro, P; Tourneur, S; Tu, Y; Turini, N; Ukegawa, F; Vallecorsa, S; van Remortel, N; Varganov, A; Vataga, E; Vázquez, F; Velev, G; Vellidis, C; Veszpremi, V; Vidal, M; Vidal, R; Vila, I; Vilar, R; Vine, T; Vogel, M; Volobouev, I; Volpi, G; Würthwein, F; Wagner, P; Wagner, R G; Wagner, R L; Wagner-Kuhr, J; Wagner, W; Wakisaka, T; Wallny, R; Wang, S M; Warburton, A; Waters, D; Weinberger, M; Wester, W C; Whitehouse, B; Whiteson, D; Wicklund, A B; Wicklund, E; Williams, G; Williams, H H; Wilson, P; Winer, B L; Wittich, P; Wolbers, S; Wolfe, C; Wright, T; Wu, X; Wynne, S M; Xie, S; Yagil, A; Yamamoto, K; 
Yamaoka, J; Yang, U K; Yang, Y C; Yao, W M; Yeh, G P; Yoh, J; Yorita, K; Yoshida, T; Yu, G B; Yu, I; Yu, S S; Yun, J C; Zanello, L; Zanetti, A; Zaw, I; Zhang, X; Zheng, Y; Zucchelli, S

    2008-10-31

    We present the results of searches for large extra dimensions in samples of events with large missing transverse energy E_T and either a photon or a jet produced in ppbar collisions at sqrt(s) = 1.96 TeV collected with the Collider Detector at Fermilab II. For gamma+E_T and jet+E_T candidate samples corresponding to 2.0 and 1.1 fb^-1 of integrated luminosity, respectively, we observe good agreement with standard model expectations and obtain a combined lower limit on the fundamental parameter of the large extra dimensions model M_D as a function of the number of extra dimensions in the model.

  9. Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    PubMed Central

    Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

    2013-01-01

    The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. These parameters cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of the MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction by improving classifier reliability. PMID:23861920

  10. Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin

    PubMed Central

    Shaffer, Patrick; Valsson, Omar; Parrinello, Michele

    2016-01-01

    The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments there are still many problems which remain out of reach for these methods which has led to a vigorous effort in this area. One of the most important problems that remains unsolved is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin. PMID:26787868

  11. The large bright quasar survey. 6: Quasar catalog and survey parameters

    NASA Astrophysics Data System (ADS)

    Hewett, Paul C.; Foltz, Craig B.; Chaffee, Frederic H.

    1995-04-01

    Positions, redshifts, and magnitudes for the 1055 quasars in the Large Bright Quasar Survey (LBQS) are presented in a single catalog. Celestial positions have been derived using the PPM catalog to provide an improved reference frame. J2000.0 coordinates are given together with improved B1950.0 positions. Redshifts calculated via cross correlation with a high signal-to-noise ratio composite quasar spectrum are included, and the small number of typographic and redshift misidentifications in the discovery papers are corrected. Spectra of the 12 quasars added to the sample since the publication of the discovery papers are included. Descriptions of the plate material, magnitude calibration, quasar candidate selection procedures, and the identification spectroscopy are given. Calculation of the effective area of the survey for the 1055 quasars comprising the well-defined LBQS sample is specified in detail. Number-redshift and number-magnitude relations for the quasars are derived, and the strengths and limitations of the LBQS sample are summarized. Comparison with existing surveys is made and a qualitative assessment of the effectiveness of the LBQS is undertaken. Positions, magnitudes, and optical spectra of the eight objects (less than 1%) in the survey that remain unidentified are also presented.

  12. An improved methodology of asymmetric flow field flow fractionation hyphenated with inductively coupled mass spectrometry for the determination of size distribution of gold nanoparticles in dietary supplements.

    PubMed

    Mudalige, Thilak K; Qu, Haiou; Linder, Sean W

    2015-11-13

    Engineered nanoparticles are available in large numbers of commercial products claiming various health benefits. Nanoparticle absorption, distribution, metabolism, excretion, and toxicity in a biological system are dependent on particle size, thus the determination of size and size distribution is essential for full characterization. Number-based average size and size distribution are major parameters for full characterization of the nanoparticle. In the case of polydispersed samples, large numbers of particles are needed to obtain accurate size distribution data. Herein, we report a rapid methodology, demonstrating improved nanoparticle recovery and excellent size resolution, for the characterization of gold nanoparticles in dietary supplements using asymmetric flow field flow fractionation coupled with visible absorption spectrometry and inductively coupled plasma mass spectrometry. A linear relationship between gold nanoparticle size and retention times was observed, and used for characterization of unknown samples. The particle size results from unknown samples were compared to results from traditional size analysis by transmission electron microscopy, and found to have less than a 5% deviation in size for unknown products over the size range from 7 to 30 nm. Published by Elsevier B.V.
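
    The size calibration described above (a linear relationship between retention time and particle size, inverted to size unknown peaks) can be sketched as an ordinary least-squares fit. The standards and retention times below are hypothetical and are not the paper's data.

        # Linear size calibration for AF4 retention times, hypothetical standards.
        import numpy as np

        std_size_nm = np.array([7.0, 10.0, 15.0, 20.0, 30.0])        # gold nanoparticle standards
        std_retention_min = np.array([8.1, 9.0, 10.6, 12.2, 15.3])   # their retention times

        slope, intercept = np.polyfit(std_size_nm, std_retention_min, 1)

        def size_from_retention(t_min):
            return (t_min - intercept) / slope

        for t in (9.5, 13.0):
            print(f"retention {t:.1f} min -> approx. {size_from_retention(t):.1f} nm")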

  13. Buccal swabbing as a noninvasive method to determine bacterial, archaeal, and eukaryotic microbial community structures in the rumen.

    PubMed

    Kittelmann, Sandra; Kirk, Michelle R; Jonker, Arjan; McCulloch, Alan; Janssen, Peter H

    2015-11-01

    Analysis of rumen microbial community structure based on small-subunit rRNA marker genes in metagenomic DNA samples provides important insights into the dominant taxa present in the rumen and allows assessment of community differences between individuals or in response to treatments applied to ruminants. However, natural animal-to-animal variation in rumen microbial community composition can limit the power of a study considerably, especially when only subtle differences are expected between treatment groups. Thus, trials with large numbers of animals may be necessary to overcome this variation. Because ruminants pass large amounts of rumen material to their oral cavities when they chew their cud, oral samples may contain good representations of the rumen microbiota and be useful in lieu of rumen samples to study rumen microbial communities. We compared bacterial, archaeal, and eukaryotic community structures in DNAs extracted from buccal swabs to those in DNAs from samples collected directly from the rumen by use of a stomach tube for sheep on four different diets. After bioinformatic depletion of potential oral taxa from libraries of samples collected via buccal swabs, bacterial communities showed significant clustering by diet (R = 0.37; analysis of similarity [ANOSIM]) rather than by sampling method (R = 0.07). Archaeal, ciliate protozoal, and anaerobic fungal communities also showed significant clustering by diet rather than by sampling method, even without adjustment for potentially orally associated microorganisms. These findings indicate that buccal swabs may in future allow quick and noninvasive sampling for analysis of rumen microbial communities in large numbers of ruminants. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  14. Correlated Observations, the Law of Small Numbers and Bank Runs

    PubMed Central

    2016-01-01

    Empirical descriptions and studies suggest that depositors generally observe a sample of previous decisions before deciding whether to keep their funds deposited or to withdraw them. These observed decisions may exhibit different degrees of correlation across depositors. In our model depositors decide sequentially and are assumed to follow the law of small numbers in the sense that they believe that a bank run is underway if the number of observed withdrawals in their sample is large. Theoretically, with highly correlated samples and infinitely many depositors runs occur with certainty, while with random samples this need not be the case, as for many parameter settings the likelihood of bank runs is zero. We investigate the intermediate cases and find that i) decreasing the correlation and ii) increasing the sample size reduces the likelihood of bank runs, ceteris paribus. Interestingly, the multiplicity of equilibria, a feature of the canonical Diamond-Dybvig model that we also use, disappears almost completely in our setup. Our results have relevant policy implications. PMID:27035435

  15. Correlated Observations, the Law of Small Numbers and Bank Runs.

    PubMed

    Horváth, Gergely; Kiss, Hubert János

    2016-01-01

    Empirical descriptions and studies suggest that depositors generally observe a sample of previous decisions before deciding whether to keep their funds deposited or to withdraw them. These observed decisions may exhibit different degrees of correlation across depositors. In our model depositors decide sequentially and are assumed to follow the law of small numbers in the sense that they believe that a bank run is underway if the number of observed withdrawals in their sample is large. Theoretically, with highly correlated samples and infinitely many depositors runs occur with certainty, while with random samples this need not be the case, as for many parameter settings the likelihood of bank runs is zero. We investigate the intermediate cases and find that i) decreasing the correlation and ii) increasing the sample size reduces the likelihood of bank runs, ceteris paribus. Interestingly, the multiplicity of equilibria, a feature of the canonical Diamond-Dybvig model that we also use, disappears almost completely in our setup. Our results have relevant policy implications.
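
    The decision rule described in these two records (withdraw if the number of observed withdrawals in one's sample is large) lends itself to a quick simulation. The sketch below uses a majority rule, a fixed share of impatient depositors and purely random (uncorrelated) samples; all of these are simplifying assumptions rather than the paper's calibration, but comparing run frequencies across sample sizes gives the flavor of the result.

        # Sequential depositors following a "law of small numbers" withdrawal rule.
        import random

        def run_occurs(n_depositors=1000, sample_size=3, p_impatient=0.1):
            decisions = []                                   # True = withdraw
            for _ in range(n_depositors):
                if random.random() < p_impatient:            # impatient types always withdraw
                    decisions.append(True)
                    continue
                k = min(sample_size, len(decisions))
                observed = random.sample(decisions, k) if k else []
                decisions.append(k > 0 and sum(observed) / k >= 0.5)
            return sum(decisions) / n_depositors > 0.5       # most depositors withdrew

        random.seed(7)
        for m in (1, 3, 9):
            runs = sum(run_occurs(sample_size=m) for _ in range(100))
            print(f"sample size {m}: bank run in {runs}/100 simulated economies")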

  16. SU-F-J-193: Efficient Dose Extinction Method for Water Equivalent Path Length (WEPL) of Real Tissue Samples for Validation of CT HU to Stopping Power Conversion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, R; Baer, E; Jee, K

    Purpose: For proton therapy, an accurate model of CT HU to relative stopping power (RSP) conversion is essential. In current practice, validation of these models relies solely on measurements of tissue substitutes with standard compositions. Validation based on real tissue samples would be much more direct and can address variations between patients. This study intends to develop an efficient and accurate system based on the concept of dose extinction to measure WEPL and retrieve RSP in a large number of biological tissue types. Methods: A broad AP proton beam delivering a spread out Bragg peak (SOBP) is used to irradiate the samples, with a Matrixx detector positioned immediately below. A water tank was placed on top of the samples, with the water level controllable to sub-millimeter precision by a remotely controlled dosing pump. While gradually lowering the water level with the beam on, the transmission dose was recorded at 1 frame/sec. The WEPL was determined as the difference between the known beam range of the delivered SOBP (80%) and the water level corresponding to 80% of the measured dose profiles in time. A Gammex 467 phantom was used to test the system, and various types of biological tissue were measured. Results: RSP values for all Gammex inserts, except the one made with lung-450 material (<2% error), were determined within ±0.5% error. Depending on the WEPL of the investigated phantom, a measurement takes around 10 min, which can be accelerated by a faster pump. Conclusion: Based on the concept of dose extinction, a system was explored to measure WEPL efficiently and accurately for a large number of samples. This allows the validation of CT HU to stopping power conversions based on large numbers of samples and real tissues. It also allows the assessment of beam uncertainties due to variations across patients, an issue that has never been sufficiently studied before.
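
    The WEPL extraction step described in the Methods can be reproduced numerically: interpolate the water level at which the transmitted dose reaches 80% of its plateau value and subtract it from the known 80% range of the delivered SOBP. The dose-versus-water-level points below are hypothetical; the real system records a full 2D dose frame every second.

        # WEPL from a dose-extinction curve, hypothetical data.
        import numpy as np

        def water_level_at_80(levels_cm, dose):
            dose = np.asarray(dose, float)
            levels = np.asarray(levels_cm, float)
            target = 0.8 * dose.max()
            order = np.argsort(dose)                         # np.interp needs ascending x
            return float(np.interp(target, dose[order], levels[order]))

        levels = np.array([7.5, 7.0, 6.9, 6.8, 6.7, 6.6, 6.5, 6.0])          # cm, falling in time
        dose = np.array([0.05, 0.35, 0.55, 0.70, 0.80, 0.88, 0.93, 1.00])    # relative dose
        beam_range80_cm = 15.0                               # known 80% range of the SOBP
        print(f"estimated sample WEPL = {beam_range80_cm - water_level_at_80(levels, dose):.1f} cm")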

  17. Modified Technique For Chemisorption Measurements

    NASA Technical Reports Server (NTRS)

    Schryer, David R.; Brown, Kenneth G.; Schryer, Jacqueline

    1989-01-01

    In measurements of chemisorption of CO on Pt/SnO2 catalyst observed that if small numbers of relatively large volumes of adsorbate gas are passed through sample, very little removal of CO detected. In these cases little or no CO has been chemisorbed on Pt/SnO2. Technique of using large number of small volumes of adsorbate gas to measure chemisorption applicable to many gas/material combinations other than CO on Pt/SnO2. Volume used chosen so that at least 10 percent of adsorbate gas removed during each exposure.

  18. The effect of numbers of noise events on people's reactions to noise - An analysis of existing survey data

    NASA Technical Reports Server (NTRS)

    Fields, J. M.

    1984-01-01

    Even though there are surveys in which annoyance decreases as the number of events increases above about 150 a day, the available evidence is not considered strong enough to reject the conventional assumption that reactions are related to the logarithm of the number of events. The data do not make it possible to reject the conventional assumption that the effects of the number of events and the peak noise level are additive. It is found that even when equivalent questionnaire items and definitions of noise events could be used, differences between the surveys' estimates of the effect of the number of events remained large. Three explanations are suggested for inconsistent estimates. The first has to do with errors in specifying the values of noise parameters, the second with the effects of unmeasured acoustical and area characteristics that are correlated with noise level or number, and the third with large sampling errors deriving from community differences in response to noise. It is concluded that significant advances in the knowledge about the effects of the number of noise events can be made only if surveys include large numbers of study areas.

  19. Paleobiology and comparative morphology of a late Neandertal sample from El Sidron, Asturias, Spain.

    PubMed

    Rosas, Antonio; Martínez-Maza, Cayetana; Bastir, Markus; García-Tabernero, Antonio; Lalueza-Fox, Carles; Huguet, Rosa; Ortiz, José Eugenio; Julià, Ramón; Soler, Vicente; de Torres, Trinidad; Martínez, Enrique; Cañaveras, Juan Carlos; Sánchez-Moral, Sergio; Cuezva, Soledad; Lario, Javier; Santamaría, David; de la Rasilla, Marco; Fortea, Javier

    2006-12-19

    Fossil evidence from the Iberian Peninsula is essential for understanding Neandertal evolution and history. Since 2000, a new sample approximately 43,000 years old has been systematically recovered at the El Sidrón cave site (Asturias, Spain). Human remains almost exclusively compose the bone assemblage. All of the skeletal parts are preserved, and there is a moderate occurrence of Middle Paleolithic stone tools. A minimum number of eight individuals are represented, and ancient mtDNA has been extracted from dental and osteological remains. Paleobiology of the El Sidrón archaic humans fits the pattern found in other Neandertal samples: a high incidence of dental hypoplasia and interproximal grooves, yet no traumatic lesions are present. Moreover, unambiguous evidence of human-induced modifications has been found on the human remains. Morphologically, the El Sidrón humans show a large number of Neandertal lineage-derived features even though certain traits place the sample at the limits of Neandertal variation. Integrating the El Sidrón human mandibles into the larger Neandertal sample reveals a north-south geographic patterning, with southern Neandertals showing broader faces with increased lower facial heights. The large El Sidrón sample therefore augments the European evolutionary lineage fossil record and supports ecogeographical variability across Neandertal populations.

  20. Evaluation of single and two-stage adaptive sampling designs for estimation of density and abundance of freshwater mussels in a large river

    USGS Publications Warehouse

    Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.

    2011-01-01

    Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.

  1. Short-term variability and long-term change in the composition of the littoral zone fish community in Spirit Lake, Iowa

    USGS Publications Warehouse

    Pierce, C.L.; Sexton, M.D.; Pelham, M.E.; Larscheid, J.G.

    2001-01-01

    We assessed short-term variability and long-term change in the composition of the littoral fish community in Spirit Lake, Iowa. Fish were sampled in several locations at night with large beach seines during spring, summer and fall of 1995-1998. Long-term changes were inferred from comparison with a similar study conducted over 70 y earlier in Spirit Lake. We found 26 species in the littoral zone. The number of species per sample ranged from 4 to 18, averaging 11.8. The average number of species per sample was higher at stations with greater vegetation density. A distinct seasonal pattern was evident in the number of species collected per sample in most years, increasing steadily from spring to fall. Patterns of variability within our 1995-1998 study period suggest that: (1) numerous samples are necessary to adequately characterize a littoral fish community, (2) sampling should be done when vegetation and young-of-year densities are highest and (3) sampling during a single year is inadequate to reveal the full community. The number of native species has declined by approximately 25% over the last 70 y. A coincident decline in littoral vegetation and associated habitat changes during the same period are likely causes of the long-term community change.

  2. Generation and analysis of chemical compound libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gregoire, John M.; Jin, Jian; Kan, Kevin S.

    2017-10-03

    Various samples are generated on a substrate. The samples each include or consist of one or more analytes. In some instances, the samples are generated through the use of gels or through vapor deposition techniques. The samples are used in an instrument for screening large numbers of analytes by locating the samples between a working electrode and a counter electrode assembly. The instrument also includes one or more light sources for illuminating each of the samples. The instrument is configured to measure the photocurrent formed through a sample as a result of the illumination of the sample.

  3. Robust estimation of microbial diversity in theory and in practice

    PubMed Central

    Haegeman, Bart; Hamelin, Jérôme; Moriarty, John; Neal, Peter; Dushoff, Jonathan; Weitz, Joshua S

    2013-01-01

    Quantifying diversity is of central importance for the study of structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao's estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics (‘Hill diversities'), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao's estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically-sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity. PMID:23407313
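
    The plug-in versions of the metrics recommended above (Shannon and Simpson diversity), together with the Chao1 richness estimator discussed in the text, can be computed directly from a vector of species counts. This is only the naive observed-sample calculation, not the paper's lower/upper estimation framework, and the counts are hypothetical.

        # Observed Shannon, inverse Simpson and bias-corrected Chao1 from species counts.
        import math

        def diversity_summaries(counts):
            n = sum(counts)
            p = [c / n for c in counts if c > 0]
            shannon = -sum(pi * math.log(pi) for pi in p)          # Shannon entropy H
            inv_simpson = 1.0 / sum(pi * pi for pi in p)           # inverse Simpson index
            f1 = sum(1 for c in counts if c == 1)                  # singletons
            f2 = sum(1 for c in counts if c == 2)                  # doubletons
            chao1 = len(p) + f1 * (f1 - 1) / (2 * (f2 + 1))        # bias-corrected Chao1
            return shannon, inv_simpson, chao1

        counts = [120, 80, 40, 20, 10, 5, 2, 2, 1, 1, 1]           # hypothetical OTU counts
        h, d2, chao = diversity_summaries(counts)
        print(f"Shannon H = {h:.2f}, inverse Simpson = {d2:.2f}, Chao1 = {chao:.1f}")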

  4. Collaborative filtering for brain-computer interaction using transfer learning and active class selection.

    PubMed

    Wu, Dongrui; Lance, Brent J; Parsons, Thomas D

    2013-01-01

    Brain-computer interaction (BCI) and physiological computing are terms that refer to using processed neural or physiological signals to influence human interaction with computers, environment, and each other. A major challenge in developing these systems arises from the large individual differences typically seen in the neural/physiological responses. As a result, many researchers use individually-trained recognition algorithms to process this data. In order to minimize time, cost, and barriers to use, there is a need to minimize the amount of individual training data required, or equivalently, to increase the recognition accuracy without increasing the number of user-specific training samples. One promising method for achieving this is collaborative filtering, which combines training data from the individual subject with additional training data from other, similar subjects. This paper describes a successful application of a collaborative filtering approach intended for a BCI system. This approach is based on transfer learning (TL), active class selection (ACS), and a mean squared difference user-similarity heuristic. The resulting BCI system uses neural and physiological signals for automatic task difficulty recognition. TL improves the learning performance by combining a small number of user-specific training samples with a large number of auxiliary training samples from other similar subjects. ACS optimally selects the classes to generate user-specific training samples. Experimental results on 18 subjects, using both k nearest neighbors and support vector machine classifiers, demonstrate that the proposed approach can significantly reduce the number of user-specific training data samples. This collaborative filtering approach will also be generalizable to handling individual differences in many other applications that involve human neural or physiological data, such as affective computing.
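
    The general idea of padding a small user-specific training set with auxiliary samples from other subjects can be sketched as follows; the synthetic data and the plain k-nearest-neighbors classifier are stand-ins and do not reproduce the paper's TL, ACS, or similarity heuristic.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(0)

        # Hypothetical data: a few labelled samples from the target user and a
        # larger pool of auxiliary samples from other, similar subjects.
        X_user, y_user = rng.normal(size=(10, 5)), rng.integers(0, 2, 10)
        X_aux, y_aux = rng.normal(size=(200, 5)), rng.integers(0, 2, 200)

        # Combine the small user-specific set with the auxiliary pool; in the paper
        # the auxiliary subjects are chosen by a similarity heuristic, omitted here.
        X_train = np.vstack([X_user, X_aux])
        y_train = np.concatenate([y_user, y_aux])

        clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
        print(clf.predict(rng.normal(size=(3, 5))))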

  5. Collaborative Filtering for Brain-Computer Interaction Using Transfer Learning and Active Class Selection

    PubMed Central

    Wu, Dongrui; Lance, Brent J.; Parsons, Thomas D.

    2013-01-01

    Brain-computer interaction (BCI) and physiological computing are terms that refer to using processed neural or physiological signals to influence human interaction with computers, environment, and each other. A major challenge in developing these systems arises from the large individual differences typically seen in the neural/physiological responses. As a result, many researchers use individually-trained recognition algorithms to process this data. In order to minimize time, cost, and barriers to use, there is a need to minimize the amount of individual training data required, or equivalently, to increase the recognition accuracy without increasing the number of user-specific training samples. One promising method for achieving this is collaborative filtering, which combines training data from the individual subject with additional training data from other, similar subjects. This paper describes a successful application of a collaborative filtering approach intended for a BCI system. This approach is based on transfer learning (TL), active class selection (ACS), and a mean squared difference user-similarity heuristic. The resulting BCI system uses neural and physiological signals for automatic task difficulty recognition. TL improves the learning performance by combining a small number of user-specific training samples with a large number of auxiliary training samples from other similar subjects. ACS optimally selects the classes to generate user-specific training samples. Experimental results on 18 subjects, using both nearest neighbors and support vector machine classifiers, demonstrate that the proposed approach can significantly reduce the number of user-specific training data samples. This collaborative filtering approach will also be generalizable to handling individual differences in many other applications that involve human neural or physiological data, such as affective computing. PMID:23437188

  6. Identification and Classification of OFDM Based Signals Using Preamble Correlation and Cyclostationary Feature Extraction

    DTIC Science & Technology

    2009-09-01

    rapidly advancing technologies of wireless communication networks are providing enormous opportunities. A large number of users in emerging markets ...base element of the 802.16 frame is the physical slot, having the duration t_ps = 4/f_s (2.10), where f_s is the sampling frequency. The number of

  7. Differential Relations between Facets of Complex Problem Solving and Students' Immigration Background

    ERIC Educational Resources Information Center

    Sonnleitner, Philipp; Brunner, Martin; Keller, Ulrich; Martin, Romain

    2014-01-01

    Whereas the assessment of complex problem solving (CPS) has received increasing attention in the context of international large-scale assessments, its fairness in regard to students' cultural background has gone largely unexplored. On the basis of a student sample of 9th-graders (N = 299), including a representative number of immigrant students (N…

  8. Bloodmeal Host Congregation and Landscape Structure Impact the Estimation of Female Mosquito (Diptera: Culicidae) Abundance Using Dry Ice-Baited Traps

    PubMed Central

    THIEMANN, TARA; NELMS, BRITTANY; REISEN, WILLIAM K.

    2011-01-01

    Vegetation patterns and the presence of large numbers of nesting herons and egrets significantly altered the number of host-seeking Culex tarsalis Coquillett (Diptera: Culicidae) collected at dry ice-baited traps. The numbers of females collected per trap night at traps along the ecotone of Eucalyptus stands with and without a heron colony were always greater than or equal to numbers collected at traps within or under canopy. No Cx. tarsalis were collected within or under Eucalyptus canopy during the peak heron nesting season, even though these birds frequently were infected with West Nile virus and large numbers of engorged females could be collected at resting boxes. These data indicate a diversion of host-seeking females from traps to nesting birds, reducing sampling efficiency. PMID:21661310

  9. Improvement of High-throughput Genotype Analysis After Implementation of a Dual-curve Sybr Green I-based Quantification and Normalization Procedure

    USDA-ARS?s Scientific Manuscript database

    The ability to rapidly screen a large number of individuals is the key to any successful plant breeding program. One of the primary bottlenecks in high throughput screening is the preparation of DNA samples, particularly the quantification and normalization of samples for downstream processing. A ...

  10. Getting something for nothing: Regeneration of peptide signals from apparently exhausted MALDI samples by “waterboarding”

    USDA-ARS?s Scientific Manuscript database

    An often cited advantage of MALDI-MS is the ability to archive and reuse sample plates after the initial analysis is complete. However, experience demonstrates that the peptide ion signals decay rapidly as the number of laser shots becomes large. Thus, the signal level obtainable from an archived sa...

  11. Detecting spatial structures in throughfall data: The effect of extent, sample size, sampling design, and variogram estimation method

    NASA Astrophysics Data System (ADS)

    Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander

    2016-09-01

    In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes ≪200, currently available data are prone to large uncertainties.
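
    For readers unfamiliar with the method-of-moments estimator referred to above, the following sketch computes a classical (Matheron-type) empirical variogram for a one-dimensional transect; the bin edges and synthetic data are illustrative assumptions, not the study's design.

        import numpy as np

        def empirical_variogram(coords, values, bins):
            """Method-of-moments (Matheron) estimate of the semivariance per lag bin."""
            coords, values = np.asarray(coords, float), np.asarray(values, float)
            d = np.abs(coords[:, None] - coords[None, :])        # pairwise distances
            sq = (values[:, None] - values[None, :]) ** 2        # squared differences
            iu = np.triu_indices(len(values), k=1)               # count each pair once
            d, sq = d[iu], sq[iu]
            gamma = []
            for lo, hi in zip(bins[:-1], bins[1:]):
                m = (d >= lo) & (d < hi)
                gamma.append(0.5 * sq[m].mean() if m.any() else np.nan)
            return np.array(gamma)

        # Hypothetical throughfall transect: 150 points along a 100 m line.
        rng = np.random.default_rng(1)
        x = np.sort(rng.uniform(0, 100, 150))
        z = np.sin(x / 10) + rng.normal(scale=0.3, size=x.size)
        print(empirical_variogram(x, z, bins=np.arange(0, 50, 5)))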

  12. SAChES: Scalable Adaptive Chain-Ensemble Sampling.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swiler, Laura Painton; Ray, Jaideep; Ebeida, Mohamed Salah

    We present the development of a parallel Markov Chain Monte Carlo (MCMC) method called SAChES, Scalable Adaptive Chain-Ensemble Sampling. This capability is targeted to Bayesian calibration of computationally expensive simulation models. SAChES involves a hybrid of two methods: Differential Evolution Monte Carlo followed by Adaptive Metropolis. Both methods involve parallel chains. Differential evolution allows one to explore high-dimensional parameter spaces using loosely coupled (i.e., largely asynchronous) chains. Loose coupling allows the use of large chain ensembles, with far more chains than the number of parameters to explore. This reduces per-chain sampling burden, enables high-dimensional inversions and the use of computationally expensive forward models. The large number of chains can also ameliorate the impact of silent errors, which may affect only a few chains. The chain ensemble can also be sampled to provide an initial condition when an aberrant chain is re-spawned. Adaptive Metropolis takes the best points from the differential evolution and efficiently hones in on the posterior density. The multitude of chains in SAChES is leveraged to (1) enable efficient exploration of the parameter space; and (2) ensure robustness to silent errors which may be unavoidable in extreme-scale computational platforms of the future. This report outlines SAChES, describes four papers that are the result of the project, and discusses some additional results.

  13. Galaxy And Mass Assembly: evolution of the Hα luminosity function and star formation rate density up to z < 0.35

    NASA Astrophysics Data System (ADS)

    Gunawardhana, M. L. P.; Hopkins, A. M.; Bland-Hawthorn, J.; Brough, S.; Sharp, R.; Loveday, J.; Taylor, E.; Jones, D. H.; Lara-López, M. A.; Bauer, A. E.; Colless, M.; Owers, M.; Baldry, I. K.; López-Sánchez, A. R.; Foster, C.; Bamford, S.; Brown, M. J. I.; Driver, S. P.; Drinkwater, M. J.; Liske, J.; Meyer, M.; Norberg, P.; Robotham, A. S. G.; Ching, J. H. Y.; Cluver, M. E.; Croom, S.; Kelvin, L.; Prescott, M.; Steele, O.; Thomas, D.; Wang, L.

    2013-08-01

    Measurements of the low-z Hα luminosity function, Φ, have a large dispersion in the local number density of sources (~0.5-1 Mpc^-3 dex^-1), and correspondingly in the star formation rate density (SFRD). The possible causes for these discrepancies include limited volume sampling, biases arising from survey sample selection, different methods of correcting for dust obscuration and active galactic nucleus contamination. The Galaxy And Mass Assembly (GAMA) survey and Sloan Digital Sky Survey (SDSS) provide deep spectroscopic observations over a wide sky area enabling detection of a large sample of star-forming galaxies spanning 0.001 < SFR_Hα (M⊙ yr^-1) < 100 with which to robustly measure the evolution of the SFRD in the low-z Universe. The large number of high-SFR galaxies present in our sample allows an improved measurement of the bright end of the luminosity function, indicating that the decrease in Φ at bright luminosities is best described by a Saunders functional form rather than the traditional Schechter function. This result is consistent with other published luminosity functions in the far-infrared and radio. For GAMA and SDSS, we find the r-band apparent magnitude limit, combined with the subsequent requirement for Hα detection, leads to an incompleteness due to missing bright Hα sources with faint r-band magnitudes.

  14. A Structure-Adaptive Hybrid RBF-BP Classifier with an Optimized Learning Strategy

    PubMed Central

    Wen, Hui; Xie, Weixin; Pei, Jihong

    2016-01-01

    This paper presents a structure-adaptive hybrid RBF-BP (SAHRBF-BP) classifier with an optimized learning strategy. SAHRBF-BP is composed of a structure-adaptive RBF network cascaded with a BP network, where the number of RBF hidden nodes is adjusted adaptively according to the distribution of the sample space; the adaptive RBF network is used for nonlinear kernel mapping and the BP network for nonlinear classification. The optimized learning strategy is as follows: first, a potential function is introduced into the training sample space to adaptively determine the number of initial RBF hidden nodes and their parameters, and a heterogeneous-sample repulsive force is designed to further optimize the parameters of each generated RBF hidden node; the optimized structure-adaptive RBF network then performs adaptive nonlinear mapping of the sample space. Next, the number of adaptively generated RBF hidden nodes determines the number of subsequent BP input nodes, and the overall SAHRBF-BP classifier is built up. Finally, different training sample sets are used to train the BP network parameters in SAHRBF-BP. Experiments comparing SAHRBF-BP with other algorithms on different data sets show its superiority; in particular, on most low-dimensional data sets with large numbers of samples, the classification performance of SAHRBF-BP outperforms that of other SLFN training algorithms. PMID:27792737

  15. Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

    PubMed Central

    Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

    2012-01-01

    Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
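
    The counting step described above can be illustrated with a simple sketch that tallies nominally significant SNPs in a segment and applies a plain binomial test; the real analysis used permutations to account for linkage disequilibrium, which this toy example ignores, and all numbers are invented.

        import numpy as np
        from scipy.stats import binomtest

        rng = np.random.default_rng(2)

        # Hypothetical per-SNP association p-values for one chromosome segment.
        pvals = rng.uniform(size=5000)
        alpha = 0.05
        n_sig = int(np.sum(pvals < alpha))

        # Under a naive null of independent SNPs, the count of nominally significant
        # tests is Binomial(n, alpha); the paper instead uses a permutation-based
        # binomial test so that linkage between SNPs is respected.
        result = binomtest(n_sig, n=len(pvals), p=alpha, alternative='greater')
        print(n_sig, result.pvalue)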

  16. Rotor assembly and method for automatically processing liquids

    DOEpatents

    Burtis, Carl A.; Johnson, Wayne F.; Walker, William A.

    1992-01-01

    A rotor assembly for performing a relatively large number of processing steps upon a sample, such as a whole blood sample, and a diluent, such as water, includes a rotor body for rotation about an axis, containing a network of chambers within which various processing steps are performed upon the sample and diluent, and passageways through which the sample and diluent are transferred. A transfer mechanism is movable through the rotor body by the influence of a magnetic field generated adjacent to the transfer mechanism and movable along the rotor body, and the assembly utilizes centrifugal force, a transfer of momentum and capillary action to perform any of a number of processing steps such as separation, aliquoting, transference, washing, reagent addition and mixing of the sample and diluent within the rotor body. The rotor body is particularly suitable for automatic immunoassay analyses.

  17. Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion

    NASA Astrophysics Data System (ADS)

    Scott, Charles J. C.; Thom, Alex J. W.

    2017-09-01

    We consider the sampling of the coupled cluster expansion within stochastic coupled cluster theory. Observing the limitations of previous approaches due to the inherently non-linear behavior of a coupled cluster wavefunction representation, we propose new approaches based on an intuitive, well-defined condition for sampling weights and on sampling the expansion in cluster operators of different excitation levels. We term these modifications even and truncated selections, respectively. Utilising both approaches demonstrates dramatically improved calculation stability as well as reduced computational and memory costs. These modifications are particularly effective at higher truncation levels owing to the large number of terms within the cluster expansion that can be neglected, as demonstrated by the reduction of the number of terms to be sampled when truncating at triple excitations by 77% and hextuple excitations by 98%.

  18. Random sampling of elementary flux modes in large-scale metabolic networks.

    PubMed

    Machado, Daniel; Soons, Zita; Patil, Kiran Raosaheb; Ferreira, Eugénio C; Rocha, Isabel

    2012-09-15

    The description of a metabolic network in terms of elementary (flux) modes (EMs) provides an important framework for metabolic pathway analysis. However, their application to large networks has been hampered by the combinatorial explosion in the number of modes. In this work, we develop a method for generating random samples of EMs without computing the whole set. Our algorithm is an adaptation of the canonical basis approach, where we add an additional filtering step which, at each iteration, selects a random subset of the new combinations of modes. In order to obtain an unbiased sample, all candidates are assigned the same probability of getting selected. This approach avoids the exponential growth of the number of modes during computation, thus generating a random sample of the complete set of EMs within reasonable time. We generated samples of different sizes for a metabolic network of Escherichia coli, and observed that they preserve several properties of the full EM set. It is also shown that EM sampling can be used for rational strain design. A well distributed sample, that is representative of the complete set of EMs, should be suitable to most EM-based methods for analysis and optimization of metabolic networks. Source code for a cross-platform implementation in Python is freely available at http://code.google.com/p/emsampler. dmachado@deb.uminho.pt Supplementary data are available at Bioinformatics online.

  19. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    PubMed Central

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
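
    As a generic illustration of pool-based active learning (not the specific strategies evaluated in the paper), the sketch below repeatedly queries the pool sample about which a logistic regression model is least certain; the synthetic features, seed set, and query budget are arbitrary assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(3)
        X_pool = rng.normal(size=(1000, 10))
        y_pool = (X_pool[:, 0] + 0.5 * X_pool[:, 1] > 0).astype(int)

        # Seed set: five labelled samples from each class.
        labeled = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])

        for _ in range(20):                                   # 20 annotation queries
            clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
            proba = clf.predict_proba(X_pool)[:, 1]
            uncertainty = -np.abs(proba - 0.5)                # closest to the boundary
            uncertainty[labeled] = -np.inf                    # never re-query a label
            labeled.append(int(np.argmax(uncertainty)))       # ask for this label next

        print("labeled samples used:", len(labeled))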

  20. A prototype splitter apparatus for dividing large catches of small fish

    USGS Publications Warehouse

    Stapanian, Martin A.; Edwards, William H.

    2012-01-01

    Due to financial and time constraints, it is often necessary in fisheries studies to divide large samples of fish and estimate total catch from the subsample. The subsampling procedure may involve potential human biases or may be difficult to perform in rough conditions. We present a prototype gravity-fed splitter apparatus for dividing large samples of small fish (30–100 mm TL). The apparatus features a tapered hopper with a sliding and removable shutter. The apparatus provides a comparatively stable platform for objectively obtaining subsamples, and it can be modified to accommodate different sizes of fish and different sample volumes. The apparatus is easy to build, inexpensive, and convenient to use in the field. To illustrate the performance of the apparatus, we divided three samples (total N = 2,000 fish) composed of four fish species. Our results indicated no significant bias in estimating either the number or proportion of each species from the subsample. Use of this apparatus or a similar apparatus can help to standardize subsampling procedures in large surveys of fish. The apparatus could be used for other applications that require dividing a large amount of material into one or more smaller subsamples.

  1. Sampling--how big a sample?

    PubMed

    Aitken, C G

    1999-07-01

    It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
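
    A minimal sketch of the kind of calculation the abstract describes, assuming a uniform Beta(1, 1) prior and the simplest case in which every inspected unit turns out to contain illegal material: it searches for the smallest sample size whose posterior probability that the consignment proportion exceeds a stated value meets the required criterion. The function name and the example numbers are illustrative, not taken from the paper.

        from scipy.stats import beta

        def min_sample_size(p0, prob, a=1.0, b=1.0):
            """Smallest n such that, if all n sampled units are positive, the posterior
            probability that the true proportion exceeds p0 is at least prob.
            Assumes a Beta(a, b) prior on the proportion (Beta(1, 1) = uniform)."""
            n = 0
            while True:
                n += 1
                posterior = beta(a + n, b)       # posterior after n positives out of n
                if posterior.sf(p0) >= prob:     # P(theta > p0 | data)
                    return n

        # e.g. require 95% probability that more than half the consignment is illicit
        print(min_sample_size(p0=0.5, prob=0.95))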

  2. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hero, Alfred O.; Rajaratnam, Bala

    When can reliable inference be drawn in the ‘‘Big Data’’ context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for ‘‘Big Data.’’ Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  3. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    PubMed Central

    Hero, Alfred O.; Rajaratnam, Bala

    2015-01-01

    When can reliable inference be drawn in the “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics, the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data”. Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700

  4. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE PAGES

    Hero, Alfred O.; Rajaratnam, Bala

    2015-12-09

    When can reliable inference be drawn in the ‘‘Big Data’’ context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for ‘‘Big Data.’’ Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  5. Airport trial of a system for the mass screening of baggage or cargo

    NASA Astrophysics Data System (ADS)

    Bennett, Gordon; Sleeman, Richard; Davidson, William R.; Stott, William R.

    1994-10-01

    An eight-month trial of a system capable of checking every bag from a particular flight for the presence of narcotics has been carried out at a major UK airport. The British Aerospace CONDOR tandem mass-spectrometer system, fitted with a real-time sampler, was used to check incoming baggage for a range of illegal drugs. Because of the rapid sampling and analysis capability of this instrument, it was possible to check every bag from a flight without delay to the passengers. During the trial a very large number of bags, from flights arriving from various parts of the world, were sampled. A number of detections were made, resulting in several seizures and the apprehension of several smugglers.

  6. Estimating and comparing microbial diversity in the presence of sequencing errors

    PubMed Central

    Chiu, Chun-Huo

    2016-01-01

    Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. This approach aims to compare diversity estimates for equally-large or equally-complete samples; it is based on the seamless rarefaction and extrapolation sampling curves of Hill numbers, specifically for q = 0, 1 and 2. (2) An asymptotic approach refers to the comparison of the estimated asymptotic diversity profiles. That is, this approach compares the estimated profiles for complete samples or samples whose size tends to be sufficiently large. It is based on statistical estimation of the true Hill number of any order q ≥ 0. In the two approaches, replacing the spurious singleton count by our estimated count, we can greatly remove the positive biases associated with diversity estimates due to spurious singletons and also make fair comparisons across microbial communities, as illustrated in our simulation results and in applying our method to analyze sequencing data from viral metagenomes. PMID:26855872
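
    For reference, the plug-in (empirical) Hill number of order q can be computed as below; this is the naive estimator, not the singleton-corrected non-asymptotic or asymptotic estimators developed in the paper, and the abundance vector is invented.

        import numpy as np

        def hill_number(counts, q):
            """Plug-in Hill number (effective number of taxa) of order q."""
            p = np.asarray(counts, float)
            p = p[p > 0] / p.sum()
            if np.isclose(q, 1.0):
                return np.exp(-np.sum(p * np.log(p)))    # exponential of Shannon entropy
            return np.sum(p ** q) ** (1.0 / (1.0 - q))

        abund = [50, 30, 10, 5, 2, 1, 1, 1]              # hypothetical taxa counts
        for q in (0, 1, 2):                              # richness, Shannon, Simpson diversity
            print(q, hill_number(abund, q))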

  7. The Effect of Number of Ability Intervals on the Stability of Item Bias Detection.

    ERIC Educational Resources Information Center

    Loyd, Brenda

    The chi-square procedure has been suggested as a viable index of test bias because it provides the best agreement with the three parameter item characteristic curve without the large sample requirement, computer complexity, and cost. This study examines the effect of using different numbers of ability intervals on the reliability of chi-square…

  8. An algorithm for deciding the number of clusters and validating using simulated data with application to exploring crop population structure

    USDA-ARS?s Scientific Manuscript database

    A first step in exploring population structure in crop plants and other organisms is to define the number of subpopulations that exist for a given data set. The genetic marker data sets being generated have become increasingly large over time and commonly are the high-dimension, low sample size (HDL...

  9. CHRONICITY OF DEPRESSION AND MOLECULAR MARKERS IN A LARGE SAMPLE OF HAN CHINESE WOMEN.

    PubMed

    Edwards, Alexis C; Aggen, Steven H; Cai, Na; Bigdeli, Tim B; Peterson, Roseann E; Docherty, Anna R; Webb, Bradley T; Bacanu, Silviu-Alin; Flint, Jonathan; Kendler, Kenneth S

    2016-04-25

    Major depressive disorder (MDD) has been associated with changes in mean telomere length and mitochondrial DNA (mtDNA) copy number. This study investigates if clinical features of MDD differentially impact these molecular markers. Data from a large, clinically ascertained sample of Han Chinese women with recurrent MDD were used to examine whether symptom presentation, severity, and comorbidity were related to salivary telomere length and/or mtDNA copy number (maximum N = 5,284 for both molecular and phenotypic data). Structural equation modeling revealed that duration of longest episode was positively associated with mtDNA copy number, while earlier age of onset of most severe episode and a history of dysthymia were associated with shorter telomeres. Other factors, such as symptom presentation, family history of depression, and other comorbid internalizing disorders, were not associated with these molecular markers. Chronicity of depressive symptoms is related to more pronounced telomere shortening and increased mtDNA copy number among individuals with a history of recurrent MDD. As these molecular markers have previously been implicated in physiological aging and morbidity, individuals who experience prolonged depressive symptoms are potentially at greater risk of adverse medical outcomes. © 2016 Wiley Periodicals, Inc.

  10. The use of DRG for identifying clinical trials centers with high recruitment potential: a feasibility study.

    PubMed

    Aegerter, Philippe; Bendersky, Noelle; Tran, Thi-Chien; Ropers, Jacques; Taright, Namik; Chatellier, Gilles

    2014-01-01

    Recruitment of large samples of patients is crucial for the evidence level and efficacy of clinical trials (CT). Clinical Trial Recruitment Support Systems (CTRSS) used to estimate patient recruitment are generally specific to Hospital Information Systems, and few have been evaluated on a large number of trials. Our aim was to assess, on a large number of CT, the usefulness of commonly available data such as Diagnosis Related Groups (DRG) databases in order to estimate potential recruitment. We used the DRG database of a large French multicenter medical institution (1.2 million inpatient stays and 400 new trials each year). Eligibility criteria of protocols were broken down into atomic entities (diagnosis, procedures, treatments...), then translated into codes and operators recorded in a standardized form. A program parsed the forms and generated requests on the DRG database. A large majority of selection criteria could be coded, and final estimates of the number of eligible patients were close to observed ones (median difference = 25). Such a system could be part of the feasibility evaluation and center selection process before the start of the clinical trial.
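
    A toy sketch of the overall idea, translating a coded eligibility form into a count query against a DRG-style table of inpatient stays, is shown below; the column names, codes, and pandas-based implementation are hypothetical assumptions and not the system described in the paper.

        import pandas as pd

        # Hypothetical stay-level table and coded eligibility criteria.
        stays = pd.DataFrame({
            "patient_id": [1, 1, 2, 3, 4],
            "diagnosis": ["E11", "I10", "E11", "C50", "E11"],   # ICD-10-style codes
            "age": [54, 54, 67, 48, 72],
        })
        criteria = {"diagnosis": {"E11"}, "min_age": 60}        # toy standardized form

        eligible = stays[
            stays["diagnosis"].isin(criteria["diagnosis"]) & (stays["age"] >= criteria["min_age"])
        ]
        print(eligible["patient_id"].nunique(), "potentially eligible patients")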

  11. An empirical study using permutation-based resampling in meta-regression

    PubMed Central

    2012-01-01

    Background In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality that may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large sample approaches versus permutation-based methods for meta-regression. Methods We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large sample methods and permutation-based methods in our backwards stepwise regression, the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for a relatively small number of trials. PMID:22587815
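
    As a generic illustration of permutation-based P values (not the study's meta-regression model), the sketch below compares an observed covariate-outcome correlation against its permutation distribution; the trial count and effect size are invented.

        import numpy as np

        rng = np.random.default_rng(11)
        x = rng.normal(size=8)                      # covariate (e.g. a quality score) for 8 trials
        y = 0.3 * x + rng.normal(size=8)            # hypothetical per-trial effect estimates

        obs = np.corrcoef(x, y)[0, 1]
        # Permutation distribution: shuffle the covariate, recompute the statistic.
        perm = np.array([np.corrcoef(rng.permutation(x), y)[0, 1] for _ in range(10000)])
        p_perm = (np.abs(perm) >= np.abs(obs)).mean()
        print(round(obs, 3), round(p_perm, 3))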

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wollaber, Allan Benton

    This is a PowerPoint presentation that serves as lecture material for the Parallel Computing summer school. It goes over the fundamentals of the Monte Carlo calculation method. The material is presented according to the following outline: Introduction (background, a simple example: estimating π), Why does this even work? (the Law of Large Numbers, the Central Limit Theorem), How to sample (inverse transform sampling, rejection sampling), and An example from particle transport.
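
    The simple example named in the outline, estimating π by the law of large numbers, can be sketched in a few lines; the sample count is arbitrary.

        import numpy as np

        rng = np.random.default_rng(42)
        n = 1_000_000
        x, y = rng.uniform(size=n), rng.uniform(size=n)
        inside = (x * x + y * y) <= 1.0      # points inside the quarter circle
        pi_hat = 4.0 * inside.mean()         # area ratio times 4
        print(pi_hat)                        # converges to pi by the law of large numbers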

  13. JPRS Report, Science & Technology, Japan

    DTIC Science & Technology

    1988-10-05

    collagen, we are conducting research on the immobilization, through chemical bond rather than physical absorption, of collagen on synthetic material...of a large number of samples are conducted by using automated apparatus and enzymatic reagents, it is natural to devise a method to use natural...improvement of enzymatic analytical methods; 3) development of reaction system and instrumentation system; 4) research on sample treatment methods; and

  14. Quantitative Research on the Outcomes of China's Inland Tibetan Classes and Schools Policy: A Survey of Graduates

    ERIC Educational Resources Information Center

    Xiaorong, Wu

    2015-01-01

    Under the Inland Tibetan Classes and Schools Policy, China has trained a large number of personnel to facilitate the social, economic, and cultural development of Tibet. This study used a multistage, random sample survey to collect data on the comprehensive qualities of two sample groups of personnel in Tibet: graduates and nongraduates of inland…

  15. Ion Beam Analyses Of Bark And Wood In Environmental Studies

    NASA Astrophysics Data System (ADS)

    Lill, J.-O.; Saarela, K.-E.; Harju, L.; Rajander, J.; Lindroos, A.; Heselius, S.-J.

    2011-06-01

    A large number of wood and bark samples have been analysed utilizing particle-induced X-ray emission (PIXE) and particle-induced gamma-ray emission (PIGE) techniques. Samples of common tree species like Scots Pine, Norway Spruce and birch were collected from a large number of sites in Southern and Southwestern Finland. Some of the samples were from a heavily polluted area in the vicinity of a copper-nickel smelter. The samples were dry ashed at 550 °C for the removal of the organic matrix in order to increase the analytical sensitivity of the method. The sensitivity was enhanced by a factor of 50 for wood and slightly less for bark. The ashed samples were pressed into pellets and irradiated as thick targets with a millimetre-sized proton beam. By including the ashing procedure in the method, the statistical dispersion due to elemental heterogeneities in wood material could be reduced. As a by-product, information about the elemental composition of ashes was obtained. By comparing the concentration of an element in bark ash to the concentration in wood ash of the same tree, useful information from an environmental point of view was obtained. The obtained ratio of the ashes was used to distinguish between elemental contributions from anthropogenic atmospheric sources and natural geochemical sources, like soil and bedrock.

  16. Comparison of water-quality samples collected by siphon samplers and automatic samplers in Wisconsin

    USGS Publications Warehouse

    Graczyk, David J.; Robertson, Dale M.; Rose, William J.; Steur, Jeffrey J.

    2000-01-01

    In small streams, flow and water-quality concentrations often change quickly in response to meteorological events. Hydrologists, field technicians, or locally hired stream observers involved in water-data collection are often unable to reach streams quickly enough to observe or measure these rapid changes. Therefore, in hydrologic studies designed to describe changes in water quality, a combination of manual and automated sampling methods has commonly been used: manual methods when flow is relatively stable and automated methods when flow is rapidly changing. Automated sampling, which makes use of equipment programmed to collect samples in response to changes in stage and flow of a stream, has been shown to be an effective method of sampling to describe the rapid changes in water quality (Graczyk and others, 1993). Because of the high cost of automated sampling, however, especially for studies examining a large number of sites, alternative methods have been considered for collecting samples during rapidly changing stream conditions. One such method employs the siphon sampler (fig. 1), also referred to as the "single-stage sampler." Siphon samplers are inexpensive to build (about $25-$50 per sampler), operate, and maintain, so they are cost effective to use at a large number of sites. Their ability to collect samples representing the average quality of water passing through the entire cross section of a stream, however, has not been fully demonstrated for many types of stream sites.

  17. Microbial secondary metabolites in school buildings inspected for moisture damage in Finland, The Netherlands and Spain.

    PubMed

    Peitzsch, Mirko; Sulyok, Michael; Täubel, Martin; Vishwanath, Vinay; Krop, Esmeralda; Borràs-Santos, Alicia; Hyvärinen, Anne; Nevalainen, Aino; Krska, Rudolf; Larsson, Lennart

    2012-08-01

    Secondary metabolites produced by fungi and bacteria are among the potential agents that contribute to adverse health effects observed in occupants of buildings affected by moisture damage, dampness and associated microbial growth. However, few attempts have been made to assess the occurrence of these compounds in relation to moisture damage and dampness in buildings. This study, conducted in the context of the HITEA project (Health Effects of Indoor Pollutants: Integrating microbial, toxicological and epidemiological approaches), aimed at providing systematic information on the prevalence of microbial secondary metabolites in a large number of school buildings in three European countries, considering both buildings with and without moisture damage and/or dampness observations. In order to address the multitude and diversity of secondary metabolites, more than 180 analytes were targeted in settled dust and surface swab samples using liquid chromatography/mass spectrometry (LC/MS) based methodology. While 42%, 58% and 44% of all samples collected in Spanish, Dutch and Finnish schools, respectively, were positive for at least one of the metabolites analyzed, the frequency of detection for individual microbial secondary metabolites - with the exceptions of emodin, certain enniatins and physcion - was low, typically with 10% or fewer of the samples being positive. In total, 30 different fungal and bacterial secondary metabolites were found in the samples. Some differences in the metabolite profiles were observed between countries and between index and reference school buildings. A major finding in this study was that settled dust derived from moisture damaged, damp schools contained larger numbers of microbial secondary metabolites at higher levels compared to respective dust samples from schools not affected by moisture damage and dampness. This observation was true for schools in each of the three countries, but became statistically significant only when combining schools from all countries and thus increasing the sample number in the statistical analyses.

  18. Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling

    PubMed Central

    2006-01-01

    Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
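
    As a generic illustration of bootstrap confidence intervals (ignoring the recruitment-chain structure that respondent-driven sampling requires), a percentile bootstrap for a simple prevalence estimate might look like the sketch below; the data are invented.

        import numpy as np

        rng = np.random.default_rng(7)
        sample = rng.integers(0, 2, size=300)          # hypothetical 0/1 trait indicators

        # Percentile bootstrap: resample with replacement, collect the statistic.
        boot_means = np.array([
            rng.choice(sample, size=sample.size, replace=True).mean()
            for _ in range(5000)
        ])
        lo, hi = np.percentile(boot_means, [2.5, 97.5])
        print(sample.mean(), (lo, hi))

        # A design effect of about 2, as recommended in the paper, implies a target
        # sample size roughly twice that needed under simple random sampling.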

  19. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands on resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements in RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.

  20. House dust fungal communities' characterization: a double take on the six by sixty by six (6 × 60 × 6) project

    NASA Astrophysics Data System (ADS)

    Amaro, Raquel; Coelho, Sónia D.; Pastorinho, M. Ramiro; Taborda-Barata, Luís; Vaz-Patto, Maria A.; Monteiro, Marisa; Nepomuceno, Miguel C. S.; Lanzinha, João C. G.; Teixeira, João P.; Pereira, Cristiana C.; Sousa, Ana C. A.

    2016-11-01

    Fungi are a group of microbes that are found with particular incidence in the indoor environment. Their direct toxicity or capability of generating toxic compounds has been associated with a large number of adverse health effects, such as infectious diseases and allergies. Given that in modern society people spend a large part of their time indoors, the characterization of fungal communities in this environmental compartment assumes paramount importance in the comprehension of health effects. House dust is an easy to obtain, time-integrative matrix, and its use in epidemiological studies on human exposure to environmental contaminants is highly recommended. Furthermore, dust can carry a great variety of fungal content that undergoes a large number of processes that modulate and further complexify human exposure. Our study aims to identify and quantify the fungal community in house dust samples collected using two different methodologies (an approach not often seen in the literature): active (vacuum cleaner bags) and passive sampling (dust settled in petri dishes). Sampling was performed as part of the ongoing 6 × 60 × 6 Project in which six houses from Covilhã (Portugal), with building dates representative of six decades, were studied for a period of sixty days.

  1. Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities.

    PubMed

    Taghavi, Zeinab; Movahedi, Narjes S; Draghici, Sorin; Chitsaz, Hamidreza

    2013-10-01

    Identification of every single genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because there are typically millions of cells in a microbial sample, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is simply intractable for a large number of cells. However, there is hope for breaking this barrier, as the number of different cell types with distinct genome sequences is usually much smaller than the number of cells. Here, we present a novel divide and conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample with a sequencing cost and computational complexity proportional to the number of genome types, rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide and conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach. Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/.

  2. Simplified pupal surveys of Aedes aegypti (L.) for entomologic surveillance and dengue control.

    PubMed

    Barrera, Roberto

    2009-07-01

    Pupal surveys of Aedes aegypti (L.) are useful indicators of risk for dengue transmission, although sample sizes for reliable estimations can be large. This study explores two methods for making pupal surveys more practical yet reliable and used data from 10 pupal surveys conducted in Puerto Rico during 2004-2008. The number of pupae per person for each sampling followed a negative binomial distribution, thus showing aggregation. One method found a common aggregation parameter (k) for the negative binomial distribution, a finding that enabled the application of a sequential sampling method requiring few samples to determine whether the number of pupae/person was above a vector density threshold for dengue transmission. A second approach used the finding that the mean number of pupae/person is correlated with the proportion of pupa-infested households and calculated equivalent threshold proportions of pupa-positive households. A sequential sampling program was also developed for this method to determine whether observed proportions of infested households were above threshold levels. These methods can be used to validate entomological thresholds for dengue transmission.
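
    As a hedged illustration of the sequential idea (not the exact procedure or parameter values used in the study), the sketch below applies a Wald-style sequential probability ratio test to negative binomial pupae-per-person counts with a common aggregation parameter k, stopping as soon as the evidence favours the count being above or below a density threshold; m0, m1, k and the counts are assumed values.

      from math import lgamma, log

      def nb_logpmf(x, mean, k):
          # negative binomial log-probability with mean `mean` and aggregation k
          p = k / (k + mean)
          return (lgamma(x + k) - lgamma(k) - lgamma(x + 1)
                  + k * log(p) + x * log(1 - p))

      def sprt_pupae(counts, m0=0.5, m1=1.5, k=0.35, alpha=0.05, beta=0.05):
          lower, upper = log(beta / (1 - alpha)), log((1 - beta) / alpha)
          llr = 0.0
          for n, x in enumerate(counts, start=1):
              llr += nb_logpmf(x, m1, k) - nb_logpmf(x, m0, k)
              if llr >= upper:
                  return n, "above threshold"
              if llr <= lower:
                  return n, "below threshold"
          return len(counts), "continue sampling"

      print(sprt_pupae([0, 2, 1, 3, 0, 4, 2, 5]))  # (samples used, decision)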

  3. Antarctic Meteorite Newsletter, Volume 11, Number 2, August 1988

    NASA Technical Reports Server (NTRS)

    1988-01-01

    Presented are classifications and descriptions of a large number of meteorites, which include the last samples from the 1984 collection and the first samples from the 1987 collection. There is a particularly good selection of meteorites of special petrologic type in the 1987 collection. The achondrites include aubrites, ureilites, howardites, eucrites, and a diogenite. The howardites are particularly notable because of their size and previous scarcity in the Antarctic collection. Noteworthy among the 7 irons and 3 mesosiderites are 2 anomalous irons and 2 large mesosiderites. The carbonaceous chondrites include good suites of C2 and C4 meteorites, and 2 highly equilibrated carbonaceous chondrites tentatively identified as C5 and C6 meteorites. Also included are surveys of numerous meteorites for Al-26 and thermoluminescence. These studies provide information on the thermal and radiation histories of the meteorites and can be used as measures of their terrestrial ages.

  4. Radiation sensitivity of foodborne pathogens in meat byproducts with different packaging

    NASA Astrophysics Data System (ADS)

    Yong, Hae In; Kim, Hyun-Joo; Nam, Ki Chang; Kwon, Joong Ho; Jo, Cheorun

    2015-10-01

    The aim of this study was to determine the radiation sensitivity of Escherichia coli O157:H7 and Listeria monocytogenes in edible meat byproducts. Seven beef byproducts (heart, liver, lung, lumen, omasum, large intestine, and small intestine) and four pork byproducts (heart, large intestine, liver, and small intestine) were used. Electron beam irradiation significantly reduced the numbers of pathogenic microorganisms in meat byproducts, and no viable cells were detected in either aerobically- or vacuum-packaged samples irradiated at 4 kGy. Meat byproducts packed under vacuum had higher D10 values than those packed aerobically. No significant difference was observed between the D10 values of E. coli O157:H7 and L. monocytogenes inoculated in either aerobically- or vacuum-packaged samples. These results suggest that low-dose electron beam irradiation can significantly decrease microbial numbers and reduce the risk of meat byproduct contamination by foodborne pathogens.
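
    For readers unfamiliar with the metric, a D10 value is the dose producing a one-log (90%) reduction in viable counts; it is typically estimated as the negative reciprocal of the slope of log10 survivors versus dose. The sketch below uses illustrative numbers, not the study's data.

      import numpy as np

      doses = np.array([0.0, 0.5, 1.0, 1.5, 2.0])        # kGy
      log_counts = np.array([6.8, 5.9, 4.7, 3.8, 2.9])    # log10 CFU/g (illustrative)

      slope, intercept = np.polyfit(doses, log_counts, 1)
      d10 = -1.0 / slope
      print(f"D10 = {d10:.2f} kGy")   # dose giving a 1-log reduction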

  5. Detecting a Weak Association by Testing its Multiple Perturbations: a Data Mining Approach

    NASA Astrophysics Data System (ADS)

    Lo, Min-Tzu; Lee, Wen-Chung

    2014-05-01

    Many risk factors/interventions in epidemiologic/biomedical studies have minuscule effects. To detect such weak associations, one needs a study with a very large sample size (the number of subjects, n). The n of a study can be increased, but unfortunately only to an extent. Here, we propose a novel method which hinges on increasing sample size in a different direction: the total number of variables (p). We construct a p-based 'multiple perturbation test', and conduct power calculations and computer simulations to show that it can achieve a very high power to detect weak associations when p can be made very large. As a demonstration, we apply the method to analyze a genome-wide association study on age-related macular degeneration and identify two novel genetic variants that are significantly associated with the disease. The p-based method may set the stage for a new paradigm of statistical tests.

  6. Direct PCR Offers a Fast and Reliable Alternative to Conventional DNA Isolation Methods for Gut Microbiomes.

    PubMed

    Videvall, Elin; Strandh, Maria; Engelbrecht, Anel; Cloete, Schalk; Cornwallis, Charlie K

    2017-01-01

    The gut microbiome of animals is emerging as an important factor influencing ecological and evolutionary processes. A major bottleneck in obtaining microbiome data from large numbers of samples is the time-consuming laboratory procedures required, specifically the isolation of DNA and generation of amplicon libraries. Recently, direct PCR kits have been developed that circumvent conventional DNA extraction steps, thereby streamlining the laboratory process by reducing preparation time and costs. However, the reliability and efficacy of direct PCR for measuring host microbiomes have not yet been investigated other than in humans with 454 sequencing. Here, we conduct a comprehensive evaluation of the microbial communities obtained with direct PCR and the widely used Mo Bio PowerSoil DNA extraction kit in five distinct gut sample types (ileum, cecum, colon, feces, and cloaca) from 20 juvenile ostriches, using 16S rRNA Illumina MiSeq sequencing. We found that direct PCR was highly comparable over a range of measures to the DNA extraction method in cecal, colon, and fecal samples. However, the two methods significantly differed in samples with comparably low bacterial biomass: cloacal and especially ileal samples. We also sequenced 100 replicate sample pairs to evaluate repeatability during both extraction and PCR stages and found that both methods were highly consistent for cecal, colon, and fecal samples (r_s > 0.7) but had low repeatability for cloacal (r_s = 0.39) and ileal (r_s = -0.24) samples. This study indicates that direct PCR provides a fast, cheap, and reliable alternative to conventional DNA extraction methods for retrieving 16S rRNA data, which can aid future gut microbiome studies. IMPORTANCE The microbial communities of animals can have large impacts on their hosts, and the number of studies using high-throughput sequencing to measure gut microbiomes is rapidly increasing. However, the library preparation procedure in microbiome research is both costly and time-consuming, especially for large numbers of samples. We investigated a cheaper and faster direct PCR method designed to bypass the DNA isolation steps during 16S rRNA library preparation and compared it with a standard DNA extraction method. We used both techniques on five different gut sample types collected from 20 juvenile ostriches and sequenced samples with Illumina MiSeq. The methods were highly comparable and highly repeatable in three sample types with high microbial biomass (cecum, colon, and feces), but larger differences and low repeatability were found in the microbiomes obtained from the ileum and cloaca. These results will help microbiome researchers assess library preparation procedures and plan their studies accordingly.

  7. The x ray properties of a large, uniform QSO sample: Einstein observations of the LBQS

    NASA Technical Reports Server (NTRS)

    Margon, B.; Anderson, S. F.; Xu, X.; Green, P. J.; Foltz, C. B.

    1992-01-01

    Although there are large numbers of Quasi Stellar Objects (QSO's) now observed in X rays, extensive X-ray observations of uniformly selected, 'complete' QSO samples are more rare. The Large Bright QSO Survey (LBQS) consists of about 1000 objects with well understood properties, most brighter than B = 18.8 and thus amenable to X-ray detections in relatively brief exposures. The sample is thought to be highly complete in the range 0.2 < z < 3.3, a significantly broader interval than many other surveys. The Einstein IPC observed 150 of these objects, mostly serendipitously, during its lifetime. We report the results of an analysis of these IPC data, considering not only the 20 percent of the objects we find to have positive X-ray detections, but also the ensemble X-ray properties derived by 'image stacking'.

  8. GARN: Sampling RNA 3D Structure Space with Game Theory and Knowledge-Based Scoring Strategies.

    PubMed

    Boudard, Mélanie; Bernauer, Julie; Barth, Dominique; Cohen, Johanne; Denise, Alain

    2015-01-01

    Cellular processes involve large numbers of RNA molecules. The functions of these RNA molecules and their binding to molecular machines are highly dependent on their 3D structures. One of the key challenges in RNA structure prediction and modeling is predicting the spatial arrangement of the various structural elements of RNA. As RNA folding is generally hierarchical, methods involving coarse-grained models hold great promise for this purpose. We present here a novel coarse-grained method for sampling, based on game theory and knowledge-based potentials. This strategy, GARN (Game Algorithm for RNa sampling), is often much faster than previously described techniques and generates large sets of solutions closely resembling the native structure. GARN is thus a suitable starting point for the molecular modeling of large RNAs, particularly those with experimental constraints. GARN is available from: http://garn.lri.fr/.

  9. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    PubMed

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.

  10. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA1

    PubMed Central

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011

  11. On the use of total aerobic spore bacteria to make treatment decisions due to Cryptosporidium risk at public water system wells.

    PubMed

    Berger, Philip; Messner, Michael J; Crosby, Jake; Vacs Renwick, Deborah; Heinrich, Austin

    2018-05-01

    Spore reduction can be used as a surrogate measure of Cryptosporidium natural filtration efficiency. Estimates of log10 (log) reduction were derived from spore measurements in paired surface and well water samples in Casper, Wyoming and Kearney, Nebraska. We found that these data were suitable for testing the hypothesis (H0) that the average reduction at each site was 2 log or less, using a one-sided Student's t-test. After establishing data quality objectives for the test (expressed as tolerable Type I and Type II error rates), we evaluated the test's performance as a function of the (a) true log reduction, (b) number of paired samples assayed and (c) variance of observed log reductions. We found that 36 paired spore samples are sufficient to achieve the objectives over a wide range of variance, including the variances observed in the two data sets. We also explored the feasibility of using smaller numbers of paired spore samples to supplement bioparticle counts for screening purposes in alluvial aquifers, to differentiate wells with large volume surface water induced recharge from wells with negligible surface water induced recharge. With key assumptions, we propose a normal statistical test of the same hypothesis (H0), but with different performance objectives. As few as six paired spore samples appear adequate as a screening metric to supplement bioparticle counts to differentiate wells in alluvial aquifers with large volume surface water induced recharge. For the case when all available information (including failure to reject H0 based on the limited paired spore data) leads to the conclusion that wells have large surface water induced recharge, we recommend further evaluation using additional paired biweekly spore samples. Published by Elsevier GmbH.
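
    A minimal sketch of the hypothesis test described, under the assumption that each pair of surface and well measurements has first been converted to a per-pair log reduction: a one-sided one-sample Student's t-test of H0 that the mean reduction is 2 log or less. The values are illustrative, not the Casper or Kearney data.

      import numpy as np
      from scipy import stats   # `alternative=` requires SciPy >= 1.6

      # per-pair log10 reductions (surface minus well), illustrative only
      log_reduction = np.array([2.4, 2.9, 1.8, 3.1, 2.6, 2.2, 2.7, 3.0, 2.5, 2.1])

      t, p = stats.ttest_1samp(log_reduction, popmean=2.0, alternative="greater")
      print(f"t = {t:.2f}, one-sided p = {p:.3f}")   # reject H0 if p < alpha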

  12. Rotor assembly and method for automatically processing liquids

    DOEpatents

    Burtis, C.A.; Johnson, W.F.; Walker, W.A.

    1992-12-22

    A rotor assembly is described for performing a relatively large number of processing steps upon a sample, such as a whole blood sample, and a diluent, such as water. It includes a rotor body for rotation about an axis and includes a network of chambers within which various processing steps are performed upon the sample and diluent and passageways through which the sample and diluent are transferred. A transfer mechanism is movable through the rotor body by the influence of a magnetic field generated adjacent the transfer mechanism and movable along the rotor body, and the assembly utilizes centrifugal force, a transfer of momentum and capillary action to perform any of a number of processing steps such as separation, aliquoting, transference, washing, reagent addition and mixing of the sample and diluent within the rotor body. The rotor body is particularly suitable for automatic immunoassay analyses. 34 figs.

  13. Isotope analysis of crystalline impact melt rocks from Apollo 16 stations 11 and 13, North Ray Crater

    NASA Technical Reports Server (NTRS)

    Reimold, W. U.; Nyquist, L. E.; Bansal, B. M.; Shih, C.-Y.; Weismann, H.; Wooden, J. L.; Mackinnon, I. D. R.

    1985-01-01

    The North Ray Crater Target Rock Consortium was formed to study a large number of rake samples collected at Apollo 16 stations 11 and 13 with comparative chemical, mineralogical, and chronological techniques in order to provide a larger data base for the discussion of lunar highland evolution in the vicinity of the Apollo 16 landing region. The present investigation is concerned with Rb-Sr and Sm-Nd isotopic analyses of a number of whole-rock samples of feldspathic microporphyritic (FM) impact melt, a sample type especially abundant among the North Ray crater (station 11) sample collection. Aspects of sample mineralogy and analytical procedures are discussed, taking into account FM impact melt rocks 6715 and 63538, intergranular impact melt rock 67775, subophitic impact melt rock 67747, subophitic impact melt rock 67559, and studies based on the utilization of electron microscopy and mass spectroscopy.

  14. Best Practices and Joint Calling of the HumanExome BeadChip: The CHARGE Consortium

    PubMed Central

    Grove, Megan L.; Yu, Bing; Cochran, Barbara J.; Haritunians, Talin; Bis, Joshua C.; Taylor, Kent D.; Hansen, Mark; Borecki, Ingrid B.; Cupples, L. Adrienne; Fornage, Myriam; Gudnason, Vilmundur; Harris, Tamara B.; Kathiresan, Sekar; Kraaij, Robert; Launer, Lenore J.; Levy, Daniel; Liu, Yongmei; Mosley, Thomas; Peloso, Gina M.; Psaty, Bruce M.; Rich, Stephen S.; Rivadeneira, Fernando; Siscovick, David S.; Smith, Albert V.; Uitterlinden, Andre; van Duijn, Cornelia M.; Wilson, James G.; O’Donnell, Christopher J.; Rotter, Jerome I.; Boerwinkle, Eric

    2013-01-01

    Genotyping arrays are a cost-effective approach for typing previously identified genetic polymorphisms in large numbers of samples. One limitation of genotyping arrays with rare variants (e.g., minor allele frequency [MAF] <0.01) is the difficulty automated clustering algorithms have in accurately detecting and assigning genotype calls. Combining intensity data from large numbers of samples may increase the ability to accurately call the genotypes of rare variants. Approximately 62,000 ethnically diverse samples from eleven Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium cohorts were genotyped with the Illumina HumanExome BeadChip across seven genotyping centers. The raw data files for the samples were assembled into a single project for joint calling. To assess the quality of the joint calling, concordance of genotypes in a subset of individuals having both exome chip and exome sequence data was analyzed. After exclusion of low performing SNPs on the exome chip and non-overlap of SNPs derived from sequence data, genotypes of 185,119 variants (11,356 were monomorphic) were compared in 530 individuals that had whole exome sequence data. A total of 98,113,070 pairs of genotypes were tested and 99.77% were concordant, 0.14% had missing data, and 0.09% were discordant. We report that joint calling makes it possible to accurately genotype rare variation using array technology when large sample sizes are available and best practices are followed. The cluster file from this experiment is available at www.chargeconsortium.com/main/exomechip. PMID:23874508
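
    The concordance check reduces to comparing the two genotype calls for every individual-variant pair and tallying concordant, discordant and missing pairs; a minimal sketch with made-up calls (coded 0/1/2, with -1 marking a missing call) is shown below.

      import numpy as np

      chip = np.array([0, 1, 2, 1, -1, 0, 2, 1])   # exome chip calls (illustrative)
      seq  = np.array([0, 1, 2, 0, 1, 0, -1, 1])   # exome sequence calls (illustrative)

      missing = (chip == -1) | (seq == -1)
      concordant = (~missing) & (chip == seq)
      discordant = (~missing) & (chip != seq)
      n = len(chip)
      print(f"concordant {concordant.sum()/n:.2%}, "
            f"discordant {discordant.sum()/n:.2%}, missing {missing.sum()/n:.2%}")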

  15. Divergent estimation error in portfolio optimization and in linear regression

    NASA Astrophysics Data System (ADS)

    Kondor, I.; Varga-Haszonits, I.

    2008-08-01

    The problem of estimation error in portfolio optimization is discussed, in the limit where the portfolio size N and the sample size T go to infinity such that their ratio is fixed. The estimation error strongly depends on the ratio N/T and diverges for a critical value of this parameter. This divergence is the manifestation of an algorithmic phase transition; it is accompanied by a number of critical phenomena and displays universality. As the structure of a large number of multidimensional regression and modelling problems is very similar to portfolio optimization, the scope of the above observations extends far beyond finance, and covers a large number of problems in operations research, machine learning, bioinformatics, medical science, economics, and technology.
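
    For orientation, a commonly quoted form of this divergence for the minimum-variance problem with independent returns (a simplifying assumption, not necessarily the exact setting of this paper) is that the ratio of the realised risk of the sample-optimised portfolio to the true optimal risk grows without bound as r = N/T approaches its critical value:

      q_0(r) = \frac{1}{\sqrt{1 - r}}, \qquad r = \frac{N}{T} < 1,
      \qquad q_0(r) \to \infty \quad \text{as } r \to 1^{-}.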

  16. The reliability and stability of visual working memory capacity.

    PubMed

    Xu, Z; Adam, K C S; Fang, X; Vogel, E K

    2018-04-01

    Because of the central role of working memory capacity in cognition, many studies have used short measures of working memory capacity to examine its relationship to other domains. Here, we measured the reliability and stability of visual working memory capacity, measured using a single-probe change detection task. In Experiment 1, the participants (N = 135) completed a large number of trials of a change detection task (540 in total, 180 each of set sizes 4, 6, and 8). With large numbers of both trials and participants, reliability estimates were high (α > .9). We then used an iterative down-sampling procedure to create a look-up table for expected reliability in experiments with small sample sizes. In Experiment 2, the participants (N = 79) completed 31 sessions of single-probe change detection. The first 30 sessions took place over 30 consecutive days, and the last session took place 30 days later. This unprecedented number of sessions allowed us to examine the effects of practice on stability and internal reliability. Even after much practice, individual differences were stable over time (average between-session r = .76).
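
    A hedged sketch of the down-sampling idea (the published look-up table procedure may differ in detail): repeatedly draw subsets of participants and trials from simulated change-detection accuracies, compute split-half reliability, apply the Spearman-Brown correction, and average across iterations.

      import numpy as np

      rng = np.random.default_rng(1)
      n_subj, n_trials = 135, 540
      ability = rng.normal(0.8, 0.08, n_subj).clip(0.5, 1.0)
      acc = rng.random((n_subj, n_trials)) < ability[:, None]   # simulated accuracy

      def downsampled_reliability(acc, n_sub, n_tri, n_iter=200):
          rs = []
          for _ in range(n_iter):
              subj = rng.choice(acc.shape[0], n_sub, replace=False)
              tri = rng.choice(acc.shape[1], n_tri, replace=False)
              sample = acc[np.ix_(subj, tri)]
              half1 = sample[:, 0::2].mean(axis=1)
              half2 = sample[:, 1::2].mean(axis=1)
              r = np.corrcoef(half1, half2)[0, 1]
              rs.append(2 * r / (1 + r))            # Spearman-Brown correction
          return float(np.mean(rs))

      # expected reliability for a hypothetical 25-participant, 120-trial study
      print(downsampled_reliability(acc, n_sub=25, n_tri=120))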

  17. Measuring mercury and other elemental components in tree rings

    USGS Publications Warehouse

    Gillan, C.; Hollerman, W.A.; Doyle, T.W.; Lewis, T.E.

    2004-01-01

    There has been considerable interest in measuring heavy metal pollution, such as mercury, using tree ring analysis. Since 1970, this method has provided a historical snapshot of pollutant concentrations near hazardous waste sites. Traditional methods of analysis have long been used with heavy metal pollutants such as mercury. These methods, such as atomic fluorescence and laser ablation, are sometimes time consuming and expensive to implement. In recent years, ion beam techniques, such as Particle Induced X-Ray Emission (PIXE), have been used to measure large numbers of elements. Most of the existing research in this area has been completed for low to medium atomic number pollutants, such as titanium, cobalt, nickel, and copper. Due to the reduction of sensitivity, it is often difficult or impossible to use traditional low energy (few MeV) PIXE analysis for pollutants with large atomic numbers. For example, the PIXE detection limit for mercury was recently measured to be about 1 ppm for a spiked Southern Magnolia wood sample [ref. 1]. This presentation will compare PIXE and standard chemical concentration results for a variety of wood samples.

  18. Measuring mercury and other elemental components in tree rings

    USGS Publications Warehouse

    Gillan, C.; Hollerman, W.A.; Doyle, T.W.; Lewis, T.E.

    2004-01-01

    There has been considerable interest in measuring heavy metal pollution, such as mercury, using tree ring analysis. Since 1970, this method has provided a historical snapshot of pollutant concentrations near hazardous waste sites. Traditional methods of analysis have long been used with heavy metal pollutants such as mercury. These methods, such as atomic fluorescence and laser ablation, are sometimes time consuming and expensive to implement. In recent years, ion beam techniques, such as Particle Induced X-Ray Emission (PIXE), have been used to measure large numbers of elements. Most of the existing research in this area has been completed for low to medium atomic number pollutants, such as titanium, cobalt, nickel, and copper. Due to the reduction of sensitivity, it is often difficult or impossible to use traditional low energy (few MeV) PIXE analysis for pollutants with large atomic numbers. For example, the PIXE detection limit for mercury was recently measured to be about 1 ppm for a spiked Southern Magnolia wood sample [ref. 1]. This presentation will compare PIXE and standard chemical concentration results for a variety of wood samples. Copyright 2004 by ISA.

  19. Clonal evolution in relapsed and refractory diffuse large B-cell lymphoma is characterized by high dynamics of subclones.

    PubMed

    Melchardt, Thomas; Hufnagl, Clemens; Weinstock, David M; Kopp, Nadja; Neureiter, Daniel; Tränkenschuh, Wolfgang; Hackl, Hubert; Weiss, Lukas; Rinnerthaler, Gabriel; Hartmann, Tanja N; Greil, Richard; Weigert, Oliver; Egle, Alexander

    2016-08-09

    Little information is available about the role of certain mutations for clonal evolution and the clinical outcome during relapse in diffuse large B-cell lymphoma (DLBCL). Therefore, we analyzed formalin-fixed-paraffin-embedded tumor samples from first diagnosis, relapsed or refractory disease from 28 patients using next-generation sequencing of the exons of 104 coding genes. Non-synonymous mutations were present in 74 of the 104 genes tested. Primary tumor samples showed a median of 8 non-synonymous mutations (range: 0-24) with the used gene set. Lower numbers of non-synonymous mutations in the primary tumor were associated with a better median OS compared with higher numbers (28 versus 15 months, p=0.031). We observed three patterns of clonal evolution during relapse of disease: large global change, subclonal selection and no or minimal change possibly suggesting preprogrammed resistance. We conclude that targeted re-sequencing is a feasible and informative approach to characterize the molecular pattern of relapse and it creates novel insights into the role of dynamics of individual genes.

  20. The mean density and two-point correlation function for the CfA redshift survey slices

    NASA Technical Reports Server (NTRS)

    De Lapparent, Valerie; Geller, Margaret J.; Huchra, John P.

    1988-01-01

    The effect of large-scale inhomogeneities on the determination of the mean number density and the two-point spatial correlation function was investigated for two complete slices of the extension of the Center for Astrophysics (CfA) redshift survey (de Lapparent et al., 1986). It was found that the mean galaxy number density for the two strips is uncertain by 25 percent, more so than previously estimated. The large uncertainty in the mean density introduces substantial uncertainty in the determination of the two-point correlation function, particularly at large scales; thus, for the 12-deg slice of the CfA redshift survey, the amplitude of the correlation function at intermediate scales is uncertain by a factor of 2. The large uncertainties in the correlation functions might reflect the lack of a fair sample.
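
    For reference, the two-point correlation function is conventionally defined through the excess probability, relative to a uniform distribution with mean number density, of finding a galaxy in a volume element dV at separation r from a given galaxy:

      dP = \bar{n}\,\bigl[\,1 + \xi(r)\,\bigr]\,dV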

  1. A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification.

    PubMed

    Jiang, Wenyu; Simon, Richard

    2007-12-20

    This paper first provides a critical review of some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from the large upward bias of the leave-one-out bootstrap and the 0.632+ bootstrap, nor from the large variability of leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.
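
    As a point of reference for the methods being modified, the sketch below implements a plain leave-one-out bootstrap error estimate: train on bootstrap resamples and score only the specimens left out of each resample. The classifier and the simulated expression matrix are illustrative stand-ins, and the RLOOB/ABS refinements described above are not reproduced.

      import numpy as np
      from sklearn.neighbors import KNeighborsClassifier

      def loo_bootstrap_error(X, y, n_boot=200, seed=0):
          rng = np.random.default_rng(seed)
          n = len(y)
          errs = [[] for _ in range(n)]
          for _ in range(n_boot):
              idx = rng.integers(0, n, n)              # bootstrap learning set
              oob = np.setdiff1d(np.arange(n), idx)    # specimens left out
              if oob.size == 0:
                  continue
              clf = KNeighborsClassifier(n_neighbors=3).fit(X[idx], y[idx])
              pred = clf.predict(X[oob])
              for i, p in zip(oob, pred):
                  errs[i].append(p != y[i])
          per_specimen = [np.mean(e) for e in errs if e]
          return float(np.mean(per_specimen))

      X = np.random.default_rng(1).normal(size=(40, 500))   # 40 specimens, 500 genes
      y = np.repeat([0, 1], 20)
      print(loo_bootstrap_error(X, y))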

  2. A Radio-Map Automatic Construction Algorithm Based on Crowdsourcing

    PubMed Central

    Yu, Ning; Xiao, Chenxian; Wu, Yinfeng; Feng, Renjian

    2016-01-01

    Traditional radio-map-based localization methods need to sample a large number of location fingerprints offline, which requires a huge amount of human and material resources. To solve the high sampling cost problem, an automatic radio-map construction algorithm based on crowdsourcing is proposed. The algorithm employs the crowd-sourced information provided by a large number of users as they walk through the buildings as the source of location fingerprint data. Through the variation characteristics of users’ smartphone sensors, the indoor anchors (doors) are identified and their locations are regarded as reference positions of the whole radio-map. The AP-Cluster method is used to cluster the crowdsourced fingerprints to acquire the representative fingerprints. According to the reference positions and the similarity between fingerprints, the representative fingerprints are linked to their corresponding physical locations and the radio-map is generated. Experimental results demonstrate that the proposed algorithm reduces the cost of fingerprint sampling and radio-map construction and guarantees localization accuracy. The proposed method does not require users’ explicit participation, which effectively solves the resource-consumption problem when a location fingerprint database is established. PMID:27070623
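
    Assuming "AP-Cluster" denotes affinity propagation (an interpretation, not confirmed by the abstract), a minimal sketch of the clustering step is shown below: simulated received-signal-strength fingerprints are clustered and the resulting exemplars serve as representative fingerprints.

      import numpy as np
      from sklearn.cluster import AffinityPropagation

      rng = np.random.default_rng(0)
      # 5 true locations observed by 6 access points (illustrative RSSI values, dBm)
      centers = rng.uniform(-90, -40, size=(5, 6))
      fingerprints = np.vstack([c + rng.normal(0, 2, size=(50, 6)) for c in centers])

      ap = AffinityPropagation(random_state=0).fit(fingerprints)
      representatives = ap.cluster_centers_        # exemplar fingerprints
      print(len(representatives), "representative fingerprints")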

  3. High and uneven levels of 45S rDNA site-number variation across wild populations of a diploid plant genus (Anacyclus, Asteraceae)

    PubMed Central

    Rosato, Marcela; Álvarez, Inés; Nieto Feliner, Gonzalo

    2017-01-01

    The nuclear genome harbours hundreds to several thousand copies of ribosomal DNA. Despite their essential role in cellular ribogenesis few studies have addressed intrapopulation, interpopulation and interspecific levels of rDNA variability in wild plants. Some studies have assessed the extent of rDNA variation at the sequence and copy-number level with large sampling in several species. However, comparable studies on rDNA site number variation in plants, assessed with extensive hierarchical sampling at several levels (individuals, populations, species) are lacking. In exploring the possible causes for ribosomal loci dynamism, we have used the diploid genus Anacyclus (Asteraceae) as a suitable system to examine the evolution of ribosomal loci. To this end, the number and chromosomal position of 45S rDNA sites have been determined in 196 individuals from 47 populations in all Anacyclus species using FISH. The 45S rDNA site-number has been assessed in a significant sample of seed plants, which usually exhibit rather consistent features, except for polyploid plants. In contrast, the level of rDNA site-number variation detected in Anacyclus is outstanding in the context of angiosperms particularly regarding populations of the same species. The number of 45S rDNA sites ranged from four to 11, accounting for 14 karyological ribosomal phenotypes. Our results are not even across species and geographical areas, and show that there is no clear association between the number of 45S rDNA loci and the life cycle in Anacyclus. A single rDNA phenotype was detected in several species, but a more complex pattern that included intra-specific and intra-population polymorphisms was recorded in A. homogamos, A. clavatus and A. valentinus, three weedy species showing large and overlapping distribution ranges. It is likely that part of the cytogenetic changes and inferred dynamism found in these species have been triggered by genomic rearrangements resulting from contemporary hybridisation. PMID:29088249

  4. High and uneven levels of 45S rDNA site-number variation across wild populations of a diploid plant genus (Anacyclus, Asteraceae).

    PubMed

    Rosato, Marcela; Álvarez, Inés; Nieto Feliner, Gonzalo; Rosselló, Josep A

    2017-01-01

    The nuclear genome harbours hundreds to several thousand copies of ribosomal DNA. Despite their essential role in cellular ribogenesis few studies have addressed intrapopulation, interpopulation and interspecific levels of rDNA variability in wild plants. Some studies have assessed the extent of rDNA variation at the sequence and copy-number level with large sampling in several species. However, comparable studies on rDNA site number variation in plants, assessed with extensive hierarchical sampling at several levels (individuals, populations, species) are lacking. In exploring the possible causes for ribosomal loci dynamism, we have used the diploid genus Anacyclus (Asteraceae) as a suitable system to examine the evolution of ribosomal loci. To this end, the number and chromosomal position of 45S rDNA sites have been determined in 196 individuals from 47 populations in all Anacyclus species using FISH. The 45S rDNA site-number has been assessed in a significant sample of seed plants, which usually exhibit rather consistent features, except for polyploid plants. In contrast, the level of rDNA site-number variation detected in Anacyclus is outstanding in the context of angiosperms particularly regarding populations of the same species. The number of 45S rDNA sites ranged from four to 11, accounting for 14 karyological ribosomal phenotypes. Our results are not even across species and geographical areas, and show that there is no clear association between the number of 45S rDNA loci and the life cycle in Anacyclus. A single rDNA phenotype was detected in several species, but a more complex pattern that included intra-specific and intra-population polymorphisms was recorded in A. homogamos, A. clavatus and A. valentinus, three weedy species showing large and overlapping distribution ranges. It is likely that part of the cytogenetic changes and inferred dynamism found in these species have been triggered by genomic rearrangements resulting from contemporary hybridisation.

  5. Of Small Beauties and Large Beasts: The Quality of Distractors on Multiple-Choice Tests Is More Important than Their Quantity

    ERIC Educational Resources Information Center

    Papenberg, Martin; Musch, Jochen

    2017-01-01

    In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…

  6. North American vegetation model for land-use planning in a changing climate: A solution to large classification problems

    Treesearch

    Gerald E. Rehfeldt; Nicholas L. Crookston; Cuauhtemoc Saenz-Romero; Elizabeth M. Campbell

    2012-01-01

    Data points intensively sampling 46 North American biomes were used to predict the geographic distribution of biomes from climate variables using the Random Forests classification tree. Techniques were incorporated to accommodate a large number of classes and to predict the future occurrence of climates beyond the contemporary climatic range of the biomes. Errors of...

  7. COMPARISON OF TWO METHODS FOR THE ISOLATION OF SALMONELLAE FROM IMPORTED FOODS.

    PubMed

    TAYLOR, W I; HOBBS, B C; SMITH, M E

    1964-01-01

    Two methods for the detection of salmonellae in foods were compared in 179 imported meat and egg samples. The number of positive samples and replications, and the number of strains and kinds of serotypes were statistically comparable by both the direct enrichment method of the Food Hygiene Laboratory in England, and the pre-enrichment method devised for processed foods in the United States. Boneless frozen beef, veal, and horsemeat imported from five countries for consumption in England were found to have salmonellae present in 48 of 116 (41%) samples. Dried egg products imported from three countries were observed to have salmonellae in 10 of 63 (16%) samples. The high incidence of salmonellae isolated from imported foods illustrated the existence of an international health hazard resulting from the continuous introduction of exogenous strains of pathogenic microorganisms on a large scale.

  8. Spelling Equivalency Awareness

    ERIC Educational Resources Information Center

    Berk, Barbara; Mazurkiewicz, Albert J.

    1976-01-01

    Concludes that despite instructional emphasis on one correct spelling, a large segment of the sample populations in this study spell differently from that usually thought correct and that a number of students, teachers, and parents recognize the existence of equally correct alternatives. (RB)

  9. HUMAN EXPOSURE ASSESSMENT USING IMMUNOASSAY

    EPA Science Inventory

    The National Exposure Research Laboratory-Las Vegas is developing analytical methods for human exposure assessment studies. Critical exposure studies generate a large number of samples which must be analyzed in a reliable, cost-effective and timely manner. TCP (3,5,6-trichlor...

  10. Microgravity

    NASA Image and Video Library

    1995-09-15

    The Large Isothermal Furnace (LIF) was flown on a mission in cooperation with the National Space Development Agency (NASDA) of Japan. LIF is a vacuum-heating furnace designed to heat large samples uniformly. The furnace consists of a sample container and heating element surrounded by a vacuum chamber. A crewmember will insert a sample cartridge into the furnace. The furnace will be activated and operations will be controlled automatically by a computer in response to an experiment number entered on the control panel. At the end of operations, helium will be discharged into the furnace, allowing cooling to start. Cooling will occur through the use of a water jacket while rapid cooling of samples can be accomplished through a controlled flow of helium. Data from experiments will help scientists better understand this important process which is vital to the production of high-quality semiconductor crystals.

  11. Ultra-broadband ptychography with self-consistent coherence estimation from a high harmonic source

    NASA Astrophysics Data System (ADS)

    Odstrčil, M.; Baksh, P.; Kim, H.; Boden, S. A.; Brocklesby, W. S.; Frey, J. G.

    2015-09-01

    With the aim of improving imaging using table-top extreme ultraviolet sources, we demonstrate coherent diffraction imaging (CDI) with a relative bandwidth of 20%. The coherence properties of the illumination probe are identified using the same imaging setup. The presented method allows the use of fewer monochromating optics, obtaining higher flux at the sample and thus reaching higher resolution or shorter exposure times. This is important in the case of ptychography, where a large number of diffraction patterns need to be collected. Our microscopy setup was tested on a reconstruction of an extended sample to show the quality of the reconstruction. We show that a high-harmonic-generation-based EUV tabletop microscope can provide reconstructions of samples with a large field of view and high resolution without additional prior knowledge about the sample or illumination.

  12. Using LUCAS topsoil database to estimate soil organic carbon content in local spectral libraries

    NASA Astrophysics Data System (ADS)

    Castaldi, Fabio; van Wesemael, Bas; Chabrillat, Sabine; Chartin, Caroline

    2017-04-01

    The quantification of the soil organic carbon (SOC) content over large areas is mandatory to obtain accurate soil characterization and classification, which can improve site-specific management at local or regional scale by exploiting the strong relationship between SOC and crop growth. Estimating SOC is not only important for agricultural purposes: in recent years, the increasing attention towards global warming has highlighted the crucial role of the soil in the global carbon cycle. In this context, soil spectroscopy is a well-consolidated and widespread method to estimate soil variables by exploiting the interaction between chromophores and electromagnetic radiation. The importance of spectroscopy in soil science is reflected by the increasing number of large soil spectral libraries collected in the world. These large libraries contain soil samples derived from a consistent number of pedological regions and thus from different parent material and soil types; this heterogeneity entails, in turn, a large variability in terms of mineralogical and organic composition. In the light of the huge variability of the spectral responses to SOC content and composition, a rigorous classification process is necessary to subset large spectral libraries and to avoid the calibration of global models failing to predict local variation in SOC content. In this regard, this study proposes a method to subset the European LUCAS topsoil database into soil classes using a clustering analysis based on a large number of soil properties. The LUCAS database was chosen to apply a standardized multivariate calibration approach valid for large areas without the need for extensive field and laboratory work for calibration of local models. Seven soil classes were detected by the clustering analyses and the samples belonging to each class were used to calibrate specific partial least squares regression (PLSR) models to estimate the SOC content of three local libraries collected in Belgium (Loam belt and Wallonia) and Luxembourg. The three local libraries only consist of spectral data (199 samples) acquired using the same protocol as the one used for the LUCAS database. SOC was estimated with good accuracy both within each local library (RMSE: 1.2-5.4 g kg-1; RPD: 1.41-2.06) and for the samples of the three libraries together (RMSE: 3.9 g kg-1; RPD: 2.47). The proposed approach could allow SOC to be estimated anywhere in Europe from spectra alone, without the need for chemical laboratory analyses, exploiting the potential of the LUCAS database and specific PLSR models.
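
    A minimal sketch of the calibration step, with simulated spectra rather than LUCAS or local library data: fit a PLSR model relating spectra to SOC, then report RMSE and RPD (standard deviation of the reference values divided by RMSE) on held-out samples.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      n_samples, n_bands = 400, 200
      spectra = rng.normal(size=(n_samples, n_bands))                      # fake reflectance
      soc = spectra[:, :10].sum(axis=1) * 2.0 + 15 + rng.normal(0, 1.5, n_samples)  # g/kg

      X_tr, X_te, y_tr, y_te = train_test_split(spectra, soc, random_state=0)
      pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
      pred = pls.predict(X_te).ravel()

      rmse = np.sqrt(np.mean((pred - y_te) ** 2))
      rpd = np.std(y_te, ddof=1) / rmse        # ratio of performance to deviation
      print(f"RMSE = {rmse:.2f} g/kg, RPD = {rpd:.2f}")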

  13. Exploiting Multi-Step Sample Trajectories for Approximate Value Iteration

    DTIC Science & Technology

    2013-09-01

    Approximate value iteration methods for reinforcement learning (RL) generalize experience from limited samples across large state-action spaces.

  14. Variability of Hormonal Stress Markers and Stress Responses in a Large Cross-Sectional Sample of Elephant Seals

    DTIC Science & Technology

    2011-09-30

    LONG-TERM GOALS: Physiological indicators of stress in wild marine mammals, the interrelationships between...hormones (GC), aldosterone (A), thyroid hormones (TH), and catecholamines within a free-ranging northern elephant seal population and its...additional individuals per year). Serum samples will be processed for ACTH, cortisol, aldosterone, catecholamines (epinephrine, norepinephrine), and

  15. Similar frequency of the McGurk effect in large samples of native Mandarin Chinese and American English speakers.

    PubMed

    Magnotti, John F; Basu Mallick, Debshila; Feng, Guo; Zhou, Bin; Zhou, Wen; Beauchamp, Michael S

    2015-09-01

    Humans combine visual information from mouth movements with auditory information from the voice to recognize speech. A common method for assessing multisensory speech perception is the McGurk effect: When presented with particular pairings of incongruent auditory and visual speech syllables (e.g., the auditory speech sounds for "ba" dubbed onto the visual mouth movements for "ga"), individuals perceive a third syllable, distinct from the auditory and visual components. Chinese and American cultures differ in the prevalence of direct facial gaze and in the auditory structure of their languages, raising the possibility of cultural- and language-related group differences in the McGurk effect. There is no consensus in the literature about the existence of these group differences, with some studies reporting less McGurk effect in native Mandarin Chinese speakers than in English speakers and others reporting no difference. However, these studies sampled small numbers of participants tested with a small number of stimuli. Therefore, we collected data on the McGurk effect from large samples of Mandarin-speaking individuals from China and English-speaking individuals from the USA (total n = 307) viewing nine different stimuli. Averaged across participants and stimuli, we found similar frequencies of the McGurk effect between Chinese and American participants (48 vs. 44 %). In both groups, we observed a large range of frequencies both across participants (range from 0 to 100 %) and stimuli (15 to 83 %) with the main effect of culture and language accounting for only 0.3 % of the variance in the data. High individual variability in perception of the McGurk effect necessitates the use of large sample sizes to accurately estimate group differences.

  16. Microbiological and mycological beach sand quality in a volcanic environment: Madeira archipelago, Portugal.

    PubMed

    Pereira, Elisabete; Figueira, Celso; Aguiar, Nuno; Vasconcelos, Rita; Vasconcelos, Sílvia; Calado, Graça; Brandão, João; Prada, Susana

    2013-09-01

    Madeira forms a mid-Atlantic volcanic archipelago, whose economy is largely dependent on tourism. There, one can encounter different types of sand beach: natural basaltic, natural calcareous and artificial calcareous. Microbiological and mycological quality of the sand was analyzed in two different years. Bacterial indicators were detected in higher numbers in 2010 (36.7% of the samples) than in 2011 (9.1%). Mycological indicators were detected in a similar percentage of samples in 2010 (68.3%) and 2011 (75%), even though the total number of colonies detected in 2010 was much higher (827 in 41 samples) than in 2011 (427 in 66 samples). Enterococci and potentially pathogenic and allergenic fungi (particularly Penicillium sp.) were the most common indicators detected in both years. Candida sp. yeast was also commonly detected in the samples. The analysis of the 3rd quartile and maximum numbers of all indicators in samples showed that artificial beaches tend to be more contaminated than the natural ones; however, the differences were not statistically significant. More monitoring data (number of bathers, sea birds, radiation intensity variation, and a greater number of samples) should be collected in order to confirm whether these differences are significant. In general, the sand quality in the archipelago's beaches was good. As the sand may be a vector of diseases, a common international set of indicators and values and compatible methodologies for assessing sand contamination should be defined, in order to provide bathers with an indication of beach sand quality, rather than only water quality. Copyright © 2013 Elsevier B.V. All rights reserved.

  17. Quantification of hygiene indicators and Salmonella in the tonsils, oral cavity and rectal content samples of pigs during slaughter.

    PubMed

    Van Damme, Inge; Mattheus, Wesley; Bertrand, Sophie; De Zutter, Lieven

    2018-05-01

    The tonsils, oral cavity and faeces of 94 pigs at slaughter were sampled to assess the numbers of total aerobic bacteria, Enterobacteriaceae and Escherichia coli in the rectal content, tonsils and oral cavity of pigs at time of evisceration. Moreover, the prevalence, numbers and types of Salmonella spp. were determined. Mean numbers of Enterobacteriaceae in tonsils and the oral cavity differed between slaughterhouses. The proportion of Enterobacteriaceae relative to total aerobic bacteria differed between the different tissues, though large variations were observed between animals. Salmonella spp. were mostly detected in oral cavity swabs (n = 51, 54%), of which six samples were contaminated in numbers over 2.0 log CFU/100 cm2. Salmonella spp. were also recovered from 17 tonsillar tissue samples (18%) and 12 tonsillar swabs (13%). Out of the 29 rectal content samples from which Salmonella was recovered (31%), most were only lightly contaminated, in the range between -1 and 0 log CFU/g. The predominant serotypes were S. Typhimurium and its monophasic variant, which were recovered from 33 and 13 pigs, respectively. In most cases, the same serotypes and MLVA profiles were found in pigs slaughtered during the same day, thus suggesting a common source of contamination. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Counting glomeruli and podocytes: rationale and methodologies

    PubMed Central

    Puelles, Victor G.; Bertram, John F.

    2015-01-01

    Purpose of review: There is currently much interest in the numbers of both glomeruli and podocytes. This interest stems from greater understanding of the effects of suboptimal fetal events on nephron endowment, the associations between low nephron number and chronic cardiovascular and kidney disease in adults, and the emergence of the podocyte depletion hypothesis. Recent findings: Obtaining accurate and precise estimates of glomerular and podocyte number has proven surprisingly difficult. When whole kidneys or large tissue samples are available, design-based stereological methods are considered gold-standard because they are based on principles that negate systematic bias. However, these methods are often tedious and time-consuming, and oftentimes inapplicable when dealing with small samples such as biopsies. Therefore, novel methods suitable for small tissue samples, and innovative approaches to facilitate high-throughput measurements, such as magnetic resonance imaging (MRI) to estimate glomerular number and flow cytometry to estimate podocyte number, have recently been described. Summary: This review describes current gold-standard methods for estimating glomerular and podocyte number, as well as methods developed in the past 3 years. We are now better placed than ever before to accurately and precisely estimate glomerular and podocyte number, and to examine relationships between these measurements and kidney health and disease. PMID:25887899

  19. Accuracy and differential bias in copy number measurement of CCL3L1 in association studies with three auto-immune disorders.

    PubMed

    Carpenter, Danielle; Walker, Susan; Prescott, Natalie; Schalkwijk, Joost; Armour, John Al

    2011-08-18

    Copy number variation (CNV) contributes to the variation observed between individuals and can influence human disease progression, but the accurate measurement of individual copy numbers is technically challenging. In the work presented here we describe a modification to a previously described paralogue ratio test (PRT) method for genotyping the CCL3L1/CCL4L1 copy variable region, which we use to ascertain CCL3L1/CCL4L1 copy number in 1581 European samples. As the products of CCL3L1 and CCL4L1 potentially play a role in autoimmunity we performed case control association studies with Crohn's disease, rheumatoid arthritis and psoriasis clinical cohorts. We evaluate the PRT methodology used, paying particular attention to accuracy and precision, and highlight the problems of differential bias in copy number measurements. Our PRT methods for measuring copy number were of sufficient precision to detect very slight but systematic differential bias between results from case and control DNA samples in one study. We find no evidence for an association between CCL3L1 copy number and Crohn's disease, rheumatoid arthritis or psoriasis. Differential bias of this small magnitude, but applied systematically across large numbers of samples, would create a serious risk of false positive associations in copy number, if measured using methods of lower precision, or methods relying on single uncorroborated measurements. In this study the small differential bias detected by PRT in one sample set was resolved by a simple pre-treatment by restriction enzyme digestion.

  20. Accuracy and differential bias in copy number measurement of CCL3L1 in association studies with three auto-immune disorders

    PubMed Central

    2011-01-01

    Background Copy number variation (CNV) contributes to the variation observed between individuals and can influence human disease progression, but the accurate measurement of individual copy numbers is technically challenging. In the work presented here we describe a modification to a previously described paralogue ratio test (PRT) method for genotyping the CCL3L1/CCL4L1 copy variable region, which we use to ascertain CCL3L1/CCL4L1 copy number in 1581 European samples. As the products of CCL3L1 and CCL4L1 potentially play a role in autoimmunity we performed case control association studies with Crohn's disease, rheumatoid arthritis and psoriasis clinical cohorts. Results We evaluate the PRT methodology used, paying particular attention to accuracy and precision, and highlight the problems of differential bias in copy number measurements. Our PRT methods for measuring copy number were of sufficient precision to detect very slight but systematic differential bias between results from case and control DNA samples in one study. We find no evidence for an association between CCL3L1 copy number and Crohn's disease, rheumatoid arthritis or psoriasis. Conclusions Differential bias of this small magnitude, but applied systematically across large numbers of samples, would create a serious risk of false positive associations in copy number, if measured using methods of lower precision, or methods relying on single uncorroborated measurements. In this study the small differential bias detected by PRT in one sample set was resolved by a simple pre-treatment by restriction enzyme digestion. PMID:21851606

  1. Robust sampling of decision information during perceptual choice

    PubMed Central

    Vandormael, Hildward; Herce Castañón, Santiago; Balaguer, Jan; Li, Vickie; Summerfield, Christopher

    2017-01-01

    Humans move their eyes to gather information about the visual world. However, saccadic sampling has largely been explored in paradigms that involve searching for a lone target in a cluttered array or natural scene. Here, we investigated the policy that humans use to overtly sample information in a perceptual decision task that required information from across multiple spatial locations to be combined. Participants viewed a spatial array of numbers and judged whether the average was greater or smaller than a reference value. Participants preferentially sampled items that were less diagnostic of the correct answer (“inlying” elements; that is, elements closer to the reference value). This preference to sample inlying items was linked to decisions, enhancing the tendency to give more weight to inlying elements in the final choice (“robust averaging”). These findings contrast with a large body of evidence indicating that gaze is directed preferentially to deviant information during natural scene viewing and visual search, and suggest that humans may sample information “robustly” with their eyes during perceptual decision-making. PMID:28223519

  2. Determining optimal parameters of the self-referent encoding task: A large-scale examination of self-referent cognition and depression.

    PubMed

    Dainer-Best, Justin; Lee, Hae Yeon; Shumake, Jason D; Yeager, David S; Beevers, Christopher G

    2018-06-07

    Although the self-referent encoding task (SRET) is commonly used to measure self-referent cognition in depression, many different SRET metrics can be obtained. The current study used best subsets regression with cross-validation and independent test samples to identify the SRET metrics most reliably associated with depression symptoms in three large samples: a college student sample (n = 572), a sample of adults from Amazon Mechanical Turk (n = 293), and an adolescent sample from a school field study (n = 408). Across all 3 samples, SRET metrics associated most strongly with depression severity included number of words endorsed as self-descriptive and rate of accumulation of information required to decide whether adjectives were self-descriptive (i.e., drift rate). These metrics had strong intratask and split-half reliability and high test-retest reliability across a 1-week period. Recall of SRET stimuli and traditional reaction time (RT) metrics were not robustly associated with depression severity. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  3. U.S. Food safety and Inspection Service testing for Salmonella in selected raw meat and poultry products in the United States, 1998 through 2003: an establishment-level analysis.

    PubMed

    Eblen, Denise R; Barlow, Kristina E; Naugle, Alecia Larew

    2006-11-01

    The U.S. Food Safety and Inspection Service (FSIS) pathogen reduction-hazard analysis critical control point systems final rule, published in 1996, established Salmonella performance standards for broiler chicken, cow and bull, market hog, and steer and heifer carcasses and for ground beef, chicken, and turkey meat. In 1998, the FSIS began testing to verify that establishments are meeting performance standards. Samples are collected in sets in which the number of samples is defined but varies according to product class. A sample set fails when the number of positive Salmonella samples exceeds the maximum number of positive samples allowed under the performance standard. Salmonella sample sets collected at 1,584 establishments from 1998 through 2003 were examined to identify factors associated with failure of one or more sets. Overall, 1,282 (80.9%) of establishments never had failed sets. In establishments that did experience set failure(s), generally the failed sets were collected early in the establishment testing history, with the exception of broiler establishments where failure(s) occurred both early and late in the course of testing. Small establishments were more likely to have experienced a set failure than were large or very small establishments, and broiler establishments were more likely to have failed than were ground beef, market hog, or steer-heifer establishments. Agency response to failed Salmonella sample sets in the form of in-depth verification reviews and related establishment-initiated corrective actions have likely contributed to declines in the number of establishments that failed sets. A focus on food safety measures in small establishments and broiler processing establishments should further reduce the number of sample sets that fail to meet the Salmonella performance standard.
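
    The pass/fail logic behind these sets is a simple threshold: a set fails when the count of Salmonella-positive samples exceeds the maximum allowed for that product class. A minimal sketch of that rule follows; the set sizes and allowed maxima are placeholders, not the regulatory performance standards.

    ```python
    # Hedged sketch of the sample-set pass/fail logic. The set sizes and maximum
    # allowed positives below are illustrative placeholders, not FSIS standards.
    ILLUSTRATIVE_STANDARDS = {
        "broilers": {"set_size": 51, "max_positives": 12},
        "ground beef": {"set_size": 53, "max_positives": 5},
    }

    def set_fails(product_class: str, positive_count: int) -> bool:
        """A set fails when positives exceed the allowed maximum for the class."""
        return positive_count > ILLUSTRATIVE_STANDARDS[product_class]["max_positives"]

    print(set_fails("broilers", 13))     # True: exceeds the allowed maximum
    print(set_fails("ground beef", 5))   # False: at, but not above, the maximum
    ```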

  4. Particle Filter Based Tracking in a Detection Sparse Discrete Event Simulation Environment

    DTIC Science & Technology

    2007-03-01

    [Only fragments of this record were extracted: a note that estimates were obtained by disqualifying a large number of particles, a figure caption ("Particle Disqualification via Sanitization"), and table-of-contents entries for "Research Approach," "Thesis Organization," "Detection Distribution Sampling," and "Estimated Position Calculation."]

  5. Fluid sample collection and distribution system. [qualitative analysis of aqueous samples from several points

    NASA Technical Reports Server (NTRS)

    Brooks, R. L. (Inventor)

    1979-01-01

    A multipoint fluid sample collection and distribution system is provided wherein the sample inputs are made through one or more of a number of sampling valves to a progressive cavity pump which is not susceptible to damage by large unfiltered particles. The pump output is through a filter unit that can provide a filtered multipoint sample. An unfiltered multipoint sample is also provided. An effluent sample can be taken and applied to a second progressive cavity pump for pumping to a filter unit that can provide one or more filtered effluent samples. The second pump can also provide an unfiltered effluent sample. Means are provided to periodically back flush each filter unit without shutting off the whole system.

  6. Automatic sample Dewar for MX beam-line

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Charignon, T.; Tanchon, J.; Trollier, T.

    2014-01-29

    It is very common for crystals of large biological macromolecules to show considerable variation in the quality of their diffraction. In order to increase the number of samples tested for diffraction quality before any full data collection at the ESRF*, an automatic sample Dewar has been implemented. The design and performance of the Dewar are reported in this paper. The automatic sample Dewar has a 240-sample capacity with automatic loading/unloading ports. The storage Dewar is capable of working with robots and can be integrated into a fully automatic MX** beam-line. The samples are positioned in front of the loading/unloading ports by an automatic rotating plate. A view port has been implemented for data-matrix camera reading of each sample loaded in the Dewar. The Dewar is insulated with polyurethane foam that keeps liquid nitrogen consumption below 1.6 L/h, and the static insulation also makes vacuum equipment and maintenance unnecessary. This Dewar will be useful for increasing the number of samples tested at synchrotrons.

  7. Application of a luminescent bacterial biosensor for the detection of tetracyclines in routine analysis of poultry muscle samples.

    PubMed

    Pikkemaat, M G; Rapallini, M L B A; Karp, M T; Elferink, J W A

    2010-08-01

    Tetracyclines are extensively used in veterinary medicine. For the detection of tetracycline residues in animal products, a broad array of methods is available. Luminescent bacterial biosensors represent an attractive inexpensive, simple and fast method for screening large numbers of samples. A previously developed cell-biosensor method was subjected to an evaluation study using over 300 routine poultry samples and the results were compared with a microbial inhibition test. The cell-biosensor assay yielded many more suspect samples, 10.2% versus 2% with the inhibition test, all of which could be confirmed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Only one sample contained a concentration above the maximum residue limit (MRL) of 100 microg kg(-1), while residue levels in most of the suspect samples were very low (<10 microg kg(-1)). The method appeared to be specific and robust. Using an experimental set-up comprising the analysis of a series of three sample dilutions allowed an appropriate cut-off to be set for confirmatory analysis, limiting the number of samples requiring further analysis to a minimum.

  8. Comparing the accuracy and precision of three techniques used for estimating missing landmarks when reconstructing fossil hominin crania.

    PubMed

    Neeser, Rudolph; Ackermann, Rebecca Rogers; Gain, James

    2009-09-01

    Various methodological approaches have been used for reconstructing fossil hominin remains in order to increase sample sizes and to better understand morphological variation. Among these, morphometric quantitative techniques for reconstruction are increasingly common. Here we compare the accuracy of three approaches--mean substitution, thin plate splines, and multiple linear regression--for estimating missing landmarks of damaged fossil specimens. Comparisons are made varying the number of missing landmarks, sample sizes, and the reference species of the population used to perform the estimation. The testing is performed on landmark data from individuals of Homo sapiens, Pan troglodytes and Gorilla gorilla, and nine hominin fossil specimens. Results suggest that when a small, same-species fossil reference sample is available to guide reconstructions, thin plate spline approaches perform best. However, if no such sample is available (or if the species of the damaged individual is uncertain), estimates of missing morphology based on a single individual (or even a small sample) of close taxonomic affinity are less accurate than those based on a large sample of individuals drawn from more distantly related extant populations using a technique (such as a regression method) able to leverage the information (e.g., variation/covariation patterning) contained in this large sample. Thin plate splines also show an unexpectedly large amount of error in estimating landmarks, especially over large areas. Recommendations are made for estimating missing landmarks under various scenarios. Copyright 2009 Wiley-Liss, Inc.

  9. Massive processing of pyro-chromatogram mass spectra (py-GCMS) of soil samples using the PARAFAC2 algorithm

    NASA Astrophysics Data System (ADS)

    Cécillon, Lauric; Quénéa, Katell; Anquetil, Christelle; Barré, Pierre

    2015-04-01

    Due to the large heterogeneity of soil at all scales (from soil core to the globe), several measurements are often mandatory to get a meaningful value of a measured soil property. A large number of measurements can therefore be needed to study a soil property whatever the scale of the study. Moreover, several soil investigation techniques produce large and complex datasets, such as pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS), which produces complex 3-way data. In this context, straightforward methods designed to speed up data treatment are needed to deal with large datasets. Py-GC-MS is a powerful and frequently used tool to characterize soil organic matter (SOM). However, the treatment of the results of a Py-GC-MS analysis of a soil sample is time consuming (number of peaks, co-elution, etc.), and the treatment of large data sets of Py-GC-MS results is rather laborious. Moreover, peak position shifts and baseline drifts between analyses make automated treatment of GC-MS data difficult. These problems can be addressed using Parallel Factor Analysis 2 (PARAFAC2; Kiers et al., 1999; Bro et al., 1999). This algorithm has been applied frequently to chromatography data but has never been applied to analyses of SOM. We developed a Matlab routine based on existing Matlab packages dedicated to the simultaneous treatment of dozens of pyro-chromatogram mass spectra. We applied this routine to 40 soil samples. The benefits and expected improvements of our method will be discussed in our poster. References Kiers et al. (1999) PARAFAC2 - Part I. A direct fitting algorithm for the PARAFAC2 model. Journal of Chemometrics, 13: 275-294. Bro et al. (1999) PARAFAC2 - Part II. Modeling chromatographic data with retention time shifts. Journal of Chemometrics, 13: 295-309.

  10. Operational Evaluation of the Rapid Viability PCR Method for ...

    EPA Pesticide Factsheets

    This research work has a significant impact on the use of the RV-PCR method to analyze post-decontamination environmental samples during an anthrax event. The method has shown 98% agreement with the traditional culture-based method. With such success, this method, upon validation, will significantly increase laboratory throughput and capacity to analyze a large number of anthrax event samples in a relatively short time.

  11. Liquid chromatographic determination of sennosides in Cassia angustifolia leaves.

    PubMed

    Srivastava, Alpuna; Pandey, Richa; Verma, Ram K; Gupta, Madan M

    2006-01-01

    A simple liquid chromatographic method was developed for the determination of sennosides B and A in leaves of Cassia angustifolia. These compounds were extracted from leaves with a mixture of methanol-water (70 + 30, v/v) after defatting with hexane. Analyte separation and quantitation were achieved by gradient reversed-phase liquid chromatography and UV absorbance at 270 nm using a photodiode array detector. The method involves the use of an RP-18 Lichrocart reversed-phase column (5 microm, 125 x 4.0 mm id) and a binary gradient mobile-phase profile. The various other aspects of analysis, namely, peak purity, similarity, recovery, repeatability, and robustness, were validated. Average recoveries of 98.5 and 98.6%, with a coefficient of variation of 0.8 and 0.3%, were obtained by spiking sample solution with 3 different concentration solutions of standards (60, 100, and 200 microg/mL). Detection limits were 10 microg/mL for sennoside B and 35 microg/mL for sennoside A, present in the sample solution. The quantitation limits were 28 and 100 microg/mL. The analytical method was applied to a large number of senna leaf samples. The new method provides a reliable tool for rapid screening of C. angustifolia samples in large numbers, which is needed in breeding/genetic engineering and genetic mapping experiments.

  12. Mesoscale spatial variability of selected aquatic invertebrate community metrics from a minimally impaired stream segment

    USGS Publications Warehouse

    Gebler, J.B.

    2004-01-01

    The related topics of spatial variability of aquatic invertebrate community metrics, implications of spatial patterns of metric values to distributions of aquatic invertebrate communities, and ramifications of natural variability to the detection of human perturbations were investigated. Four metrics commonly used for stream assessment were computed for 9 stream reaches within a fairly homogeneous, minimally impaired stream segment of the San Pedro River, Arizona. Metric variability was assessed for differing sampling scenarios using simple permutation procedures. Spatial patterns of metric values suggest that aquatic invertebrate communities are patchily distributed on subsegment and segment scales, which causes metric variability. Wide ranges of metric values resulted in wide ranges of metric coefficients of variation (CVs) and minimum detectable differences (MDDs), and both CVs and MDDs often increased as sample size (number of reaches) increased, suggesting that any particular set of sampling reaches could yield misleading estimates of population parameters and effects that can be detected. Mean metric variabilities were substantial, with the result that only fairly large differences in metrics would be declared significant at α = 0.05 and β = 0.20. The number of reaches required to obtain MDDs of 10% and 20% varied with significance level and power, and differed for different metrics, but were generally large, ranging into tens and hundreds of reaches. Study results suggest that metric values from one or a small number of stream reach(es) may not be adequate to represent a stream segment, depending on effect sizes of interest, and that larger sample sizes are necessary to obtain reasonable estimates of metrics and sample statistics. For bioassessment to progress, spatial variability may need to be investigated in many systems and should be considered when designing studies and interpreting data.

  13. The K2 Galactic Archaeology Program Data Release. I. Asteroseismic Results from Campaign 1

    NASA Astrophysics Data System (ADS)

    Stello, Dennis; Zinn, Joel; Elsworth, Yvonne; Garcia, Rafael A.; Kallinger, Thomas; Mathur, Savita; Mosser, Benoit; Sharma, Sanjib; Chaplin, William J.; Davies, Guy; Huber, Daniel; Jones, Caitlin D.; Miglio, Andrea; Silva Aguirre, Victor

    2017-01-01

    NASA's K2 mission is observing tens of thousands of stars along the ecliptic, providing data suitable for large-scale asteroseismic analyses to inform galactic archaeology studies. Its first campaign covered a field near the north Galactic cap, a region never covered before by large asteroseismic-ensemble investigations, and was therefore of particular interest for exploring this part of our Galaxy. Here we report the asteroseismic analysis of all stars selected by the K2 Galactic Archaeology Program during the mission's “north Galactic cap” campaign 1. Our consolidated analysis uses six independent methods to measure the global seismic properties, in particular the large frequency separation and the frequency of maximum power. From the full target sample of 8630 stars we find about 1200 oscillating red giants, a number comparable with estimates from galactic synthesis modeling. Thus, as a valuable by-product we find roughly 7500 stars to be dwarfs, which provide a sample well suited for galactic exoplanet occurrence studies because they originate from our simple and easily reproducible selection function. In addition, to facilitate the full potential of the data set for galactic archaeology, we assess the detection completeness of our sample of oscillating red giants. We find that the sample is at least nearly complete for stars with 40 ≲ ν_max/μHz ≲ 270 and ν_max,detect < 2.6 × 10^6 · 2^(-Kp) μHz. There is a detection bias against helium-core-burning stars with ν_max ~ 30 μHz, affecting the number of measurements of Δν and possibly also ν_max. Although we can detect oscillations down to Kp = 15, our campaign 1 sample lacks enough faint giants to assess the detection completeness for stars fainter than Kp ~ 14.5.
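
    The completeness criterion quoted in the abstract combines a ν_max range with a magnitude-dependent detection limit, ν_max,detect < 2.6 × 10^6 · 2^(-Kp) μHz. The short sketch below simply evaluates that expression for a few Kp magnitudes; the formula is taken from the abstract, while the sample magnitudes are arbitrary.

    ```python
    # Evaluate the magnitude-dependent detection limit quoted in the abstract:
    # nu_max,detect < 2.6e6 * 2**(-Kp) microHz.
    def nu_max_detect_limit(kp_mag: float) -> float:
        return 2.6e6 * 2.0 ** (-kp_mag)

    for kp in (11.0, 13.0, 14.5):
        print(f"Kp = {kp:4.1f}: detection limit ~ {nu_max_detect_limit(kp):7.1f} microHz")
    # Kp = 11.0: ~1269.5 microHz; Kp = 13.0: ~317.4 microHz; Kp = 14.5: ~112.2 microHz
    ```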

  14. A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Bhaduri, Budhendra L

    2011-01-01

    Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. The ML classifier relies exclusively on spectral characteristics of thematic classes whose statistical distributions (class conditional probability densities) are often overlapping. The spectral response distributions of thematic classes are dependent on many factors including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement of a large number of accurate training samples (10 to 30 times the number of dimensions), which are often costly and time consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, it is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise, newer semi-supervised techniques can be adopted to improve the parameter estimates of the statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately, there is no convenient multivariate statistical model that can be employed for multisource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows over 25 to 35% improvement in overall classification accuracy over conventional classification schemes.

  15. Rare behavior of growth processes via umbrella sampling of trajectories

    NASA Astrophysics Data System (ADS)

    Klymko, Katherine; Geissler, Phillip L.; Garrahan, Juan P.; Whitelam, Stephen

    2018-03-01

    We compute probability distributions of trajectory observables for reversible and irreversible growth processes. These results reveal a correspondence between reversible and irreversible processes, at particular points in parameter space, in terms of their typical and atypical trajectories. Thus key features of growth processes can be insensitive to the precise form of the rate constants used to generate them, recalling the insensitivity to microscopic details of certain equilibrium behavior. We obtained these results using a sampling method, inspired by the "s -ensemble" large-deviation formalism, that amounts to umbrella sampling in trajectory space. The method is a simple variant of existing approaches, and applies to ensembles of trajectories controlled by the total number of events. It can be used to determine large-deviation rate functions for trajectory observables in or out of equilibrium.

  16. Comparison of Two Methods for the Isolation of Salmonellae From Imported Foods

    PubMed Central

    Taylor, Welton I.; Hobbs, Betty C.; Smith, Muriel E.

    1964-01-01

    Two methods for the detection of salmonellae in foods were compared in 179 imported meat and egg samples. The number of positive samples and replications, and the number of strains and kinds of serotypes were statistically comparable by both the direct enrichment method of the Food Hygiene Laboratory in England, and the pre-enrichment method devised for processed foods in the United States. Boneless frozen beef, veal, and horsemeat imported from five countries for consumption in England were found to have salmonellae present in 48 of 116 (41%) samples. Dried egg products imported from three countries were observed to have salmonellae in 10 of 63 (16%) samples. The high incidence of salmonellae isolated from imported foods illustrated the existence of an international health hazard resulting from the continuous introduction of exogenous strains of pathogenic microorganisms on a large scale. PMID:14106941

  17. Efficient Sample Tracking With OpenLabFramework

    PubMed Central

    List, Markus; Schmidt, Steffen; Trojnar, Jakub; Thomas, Jochen; Thomassen, Mads; Kruse, Torben A.; Tan, Qihua; Baumbach, Jan; Mollenhauer, Jan

    2014-01-01

    The advance of new technologies in biomedical research has led to a dramatic growth in experimental throughput. Projects therefore steadily grow in size and involve a larger number of researchers. Spreadsheets traditionally used are thus no longer suitable for keeping track of the vast amounts of samples created and need to be replaced with state-of-the-art laboratory information management systems. Such systems have been developed in large numbers, but they are often limited to specific research domains and types of data. One domain so far neglected is the management of libraries of vector clones and genetically engineered cell lines. OpenLabFramework is a newly developed web-application for sample tracking, particularly laid out to fill this gap, but with an open architecture allowing it to be extended for other biological materials and functional data. Its sample tracking mechanism is fully customizable and aids productivity further through support for mobile devices and barcoded labels. PMID:24589879

  18. Software engineering the mixed model for genome-wide association studies on large samples.

    PubMed

    Zhang, Zhiwu; Buckler, Edward S; Casstevens, Terry M; Bradbury, Peter J

    2009-11-01

    Mixed models improve the ability to detect phenotype-genotype associations in the presence of population stratification and multiple levels of relatedness in genome-wide association studies (GWAS), but for large data sets the resource consumption becomes impractical. At the same time, the sample size and number of markers used for GWAS is increasing dramatically, resulting in greater statistical power to detect those associations. The use of mixed models with increasingly large data sets depends on the availability of software for analyzing those models. While multiple software packages implement the mixed model method, no single package provides the best combination of fast computation, ability to handle large samples, flexible modeling and ease of use. Key elements of association analysis with mixed models are reviewed, including modeling phenotype-genotype associations using mixed models, population stratification, kinship and its estimation, variance component estimation, use of best linear unbiased predictors or residuals in place of raw phenotype, improving efficiency and software-user interaction. The available software packages are evaluated, and suggestions made for future software development.
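
    A common formulation of the mixed model for association testing treats relatedness through a kinship matrix K, with phenotype covariance of the form sigma_g^2 * K + sigma_e^2 * I; once the variance components are known, each marker can be tested by generalized least squares under that covariance. Below is a minimal sketch of the per-marker GLS step with the variance components taken as given; variance-component estimation, which the abstract identifies as the computationally expensive part, is omitted, and the kinship matrix and data are synthetic.

    ```python
    import numpy as np

    def gls_marker_test(y, x, K, sigma_g2, sigma_e2):
        """GLS effect estimate and SE for one marker x under V = sg2*K + se2*I."""
        n = len(y)
        V = sigma_g2 * K + sigma_e2 * np.eye(n)
        Vinv = np.linalg.inv(V)
        X = np.column_stack([np.ones(n), x])        # intercept + marker genotype
        XtVinv = X.T @ Vinv
        beta_cov = np.linalg.inv(XtVinv @ X)
        beta = beta_cov @ (XtVinv @ y)
        return beta[1], np.sqrt(beta_cov[1, 1])     # marker effect and its SE

    # Tiny illustration with a random kinship-like matrix (synthetic, not real data).
    rng = np.random.default_rng(0)
    n = 50
    A = rng.normal(size=(n, n))
    K = A @ A.T / n                                  # positive semidefinite stand-in
    x = rng.integers(0, 3, size=n).astype(float)     # genotypes coded 0/1/2
    y = 0.5 * x + rng.multivariate_normal(np.zeros(n), 0.4 * K + 0.6 * np.eye(n))
    print(gls_marker_test(y, x, K, sigma_g2=0.4, sigma_e2=0.6))
    ```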

  19. Level of endogenous formaldehyde in maple syrup as determined by spectrofluorimetry.

    PubMed

    Lagacé, Luc; Guay, Stéphane; Martin, Nathalie

    2003-01-01

    The level of endogenous formaldehyde in maple syrup was established from a large number (n = 300) of authentic maple syrup samples collected during 2000 and 2001 in the province of Quebec, Canada. The average level of formaldehyde from these authentic samples was measured at 0.18 mg/kg in 2000 and 0.28 mg/kg in 2001, which is lower than previously published. These average values can be attributed to the improved spectrofluorimetric method used for the determination. However, the formaldehyde values obtained demonstrate a relatively large distribution with maximums observed at 1.04 and 1.54 mg/kg. These values are still under the maximum tolerance level of 2.0 mg/kg paraformaldehyde pesticide residue. Extensive heat treatment of maple syrup samples greatly enhanced the formaldehyde concentration of the samples, suggesting that extensive heat degradation of the sap constituents during evaporation could be responsible for the highest formaldehyde values in maple syrup.

  20. Microbial community analysis using MEGAN.

    PubMed

    Huson, Daniel H; Weber, Nico

    2013-01-01

    Metagenomics, the study of microbes in the environment using DNA sequencing, depends upon dedicated software tools for processing and analyzing very large sequencing datasets. One such tool is MEGAN (MEtaGenome ANalyzer), which can be used to interactively analyze and compare metagenomic and metatranscriptomic data, both taxonomically and functionally. To perform a taxonomic analysis, the program places the reads onto the NCBI taxonomy, while functional analysis is performed by mapping reads to the SEED, COG, and KEGG classifications. Samples can be compared taxonomically and functionally, using a wide range of different charting and visualization techniques. PCoA analysis and clustering methods allow high-level comparison of large numbers of samples. Different attributes of the samples can be captured and used within analysis. The program supports various input formats for loading data and can export analysis results in different text-based and graphical formats. The program is designed to work with very large samples containing many millions of reads. It is written in Java and installers for the three major computer operating systems are available from http://www-ab.informatik.uni-tuebingen.de. © 2013 Elsevier Inc. All rights reserved.

  1. SMN1 and SMN2 copy numbers in cell lines derived from patients with spinal muscular atrophy as measured by array digital PCR.

    PubMed

    Stabley, Deborah L; Harris, Ashlee W; Holbrook, Jennifer; Chubbs, Nicholas J; Lozo, Kevin W; Crawford, Thomas O; Swoboda, Kathryn J; Funanage, Vicky L; Wang, Wenlan; Mackenzie, William; Scavina, Mena; Sol-Church, Katia; Butchbach, Matthew E R

    2015-07-01

    Proximal spinal muscular atrophy (SMA) is an early-onset motor neuron disease characterized by loss of α-motor neurons and associated muscle atrophy. SMA is caused by deletion or other disabling mutation of survival motor neuron 1 (SMN1). In the human genome, a large duplication of the SMN-containing region gives rise to a second copy of this gene (SMN2) that is distinguishable by a single nucleotide change in exon 7. Within the SMA population, there is substantial variation in SMN2 copy number; in general, those individuals with SMA who have a high SMN2 copy number have a milder disease. Because SMN2 functions as a disease modifier, its accurate copy number determination may have clinical relevance. In this study, we describe the development of an assay to assess SMN1 and SMN2 copy numbers in DNA samples using an array-based digital PCR (dPCR) system. This dPCR assay can accurately and reliably measure the number of SMN1 and SMN2 copies in DNA samples. In a cohort of SMA patient-derived cell lines, the assay confirmed a strong inverse correlation between SMN2 copy number and disease severity. Array dPCR is a practical technique to determine, accurately and reliably, SMN1 and SMN2 copy numbers from SMA samples.
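
    Digital PCR platforms typically estimate target concentration from the fraction of positive partitions via a Poisson correction, and a copy number then follows from the ratio of target to a two-copy reference locus. The sketch below shows that standard calculation only; the partition counts and the diploid-reference assumption are illustrative, not the study's assay or software.

    ```python
    import math

    def mean_copies_per_partition(positive: int, total: int) -> float:
        """Poisson correction: lambda = -ln(1 - fraction of positive partitions)."""
        return -math.log(1.0 - positive / total)

    def copy_number(target_pos, ref_pos, total, reference_copies=2):
        """Copy number as the target/reference concentration ratio scaled by the
        known copy number of the reference locus (assumed diploid here)."""
        lam_target = mean_copies_per_partition(target_pos, total)
        lam_ref = mean_copies_per_partition(ref_pos, total)
        return reference_copies * lam_target / lam_ref

    # Hypothetical chip with 20,000 partitions.
    print(round(copy_number(target_pos=5200, ref_pos=3600, total=20000)))  # -> 3
    ```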

  2. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

    PubMed Central

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets. PMID:25937948

  3. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

    PubMed

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.

  4. Experimental layout, data analysis, and thresholds in ELISA testing of maize for aphid-borne viruses.

    PubMed

    Caciagli, P; Verderio, A

    2003-06-30

    Several aspects of enzyme-linked immunosorbent assay (ELISA) procedures and data analysis have been examined in an attempt to find a rapid and reliable method for discriminating between 'positive' and 'negative' results when testing a large number of samples. A layout of ELISA plates was designed to reduce uncontrolled variation and to optimize the number of negative and positive controls. A transformation using the fourth root (A(1/4)) of the optical density readings corrected for the blank (A) stabilized the variance of most ELISA data examined. Transformed A values were used to calculate the true limits, at a set protection level, for false positive (C) and false negative (D) results. Methods are discussed to reduce the number of undifferentiated samples, i.e. the samples with responses falling between C and D. The whole procedure was set up for use with an electronic spreadsheet. With the addition of a few instructions of the type 'if ... then ... else' in the spreadsheet, the ELISA results were obtained in the simple trichotomous form 'negative/undefined/positive'. This allowed rapid analysis of more than 1100 maize samples tested for the presence of seven aphid-borne viruses, in fact almost 8000 ELISA samples.
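
    The essential steps are the fourth-root transformation of blank-corrected absorbance and a comparison against the two limits, C (false-positive control) and D (false-negative control), which yields the negative/undefined/positive call. Below is a minimal sketch of that classification step; it assumes the limits have already been derived from the plate controls and are expressed on the transformed scale, which is an assumption of this sketch rather than a restatement of the paper's spreadsheet.

    ```python
    def classify_elisa(absorbance_minus_blank: float, limit_c: float, limit_d: float) -> str:
        """Trichotomous ELISA call on the fourth-root scale.

        limit_c: upper limit for declaring 'negative' (false-positive protection)
        limit_d: lower limit for declaring 'positive' (false-negative protection)
        Both limits are assumed to come from the plate controls and to be on the
        transformed (A ** 0.25) scale.
        """
        a_transformed = max(absorbance_minus_blank, 0.0) ** 0.25
        if a_transformed <= limit_c:
            return "negative"
        if a_transformed >= limit_d:
            return "positive"
        return "undefined"

    for a in (0.02, 0.10, 0.60):                      # blank-corrected readings
        print(a, classify_elisa(a, limit_c=0.45, limit_d=0.70))
    # 0.02 -> negative, 0.10 -> undefined, 0.60 -> positive (illustrative limits)
    ```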

  5. Inverse sampling regression for pooled data.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Eskridge, Kent; Crossa, José

    2017-06-01

    Because pools are tested instead of individuals in group testing, this technique is helpful for estimating prevalence in a population or for classifying a large number of individuals into two groups at a low cost. For this reason, group testing is a well-known means of saving costs and producing precise estimates. In this paper, we developed a mixed-effect group testing regression that is useful when the data-collecting process is performed using inverse sampling. This model allows including covariate information at the individual level to incorporate heterogeneity among individuals and identify which covariates are associated with positive individuals. We present an approach to fit this model using maximum likelihood and we performed a simulation study to evaluate the quality of the estimates. Based on the simulation study, we found that the proposed regression method for inverse sampling with group testing produces parameter estimates with low bias when the pre-specified number of positive pools (r) to stop the sampling process is at least 10 and the number of clusters in the sample is also at least 10. We performed an application with real data and we provide an NLMIXED code that researchers can use to implement this method.

  6. Preparation of highly multiplexed small RNA sequencing libraries.

    PubMed

    Persson, Helena; Søkilde, Rolf; Pirona, Anna Chiara; Rovira, Carlos

    2017-08-01

    MicroRNAs (miRNAs) are ~22-nucleotide-long small non-coding RNAs that regulate the expression of protein-coding genes by base pairing to partially complementary target sites, preferentially located in the 3´ untranslated region (UTR) of target mRNAs. The expression and function of miRNAs have been extensively studied in human disease, as well as the possibility of using these molecules as biomarkers for prognostication and treatment guidance. To identify and validate miRNAs as biomarkers, their expression must be screened in large collections of patient samples. Here, we develop a scalable protocol for the rapid and economical preparation of a large number of small RNA sequencing libraries using dual indexing for multiplexing. Combined with the use of off-the-shelf reagents, more samples can be sequenced simultaneously on large-scale sequencing platforms at a considerably lower cost per sample. Sample preparation is simplified by pooling libraries prior to gel purification, which allows for the selection of a narrow size range while minimizing sample variation. A comparison with publicly available data from benchmarking of miRNA analysis platforms showed that this method captures absolute and differential expression as effectively as commercially available alternatives.

  7. Internal pilots for a class of linear mixed models with Gaussian and compound symmetric data

    PubMed Central

    Gurka, Matthew J.; Coffey, Christopher S.; Muller, Keith E.

    2015-01-01

    SUMMARY An internal pilot design uses interim sample size analysis, without interim data analysis, to adjust the final number of observations. The approach helps to choose a sample size sufficiently large (to achieve the statistical power desired), but not too large (which would waste money and time). We report on recent research in cerebral vascular tortuosity (curvature in three dimensions) which would benefit greatly from internal pilots due to uncertainty in the parameters of the covariance matrix used for study planning. Unfortunately, observations correlated across the four regions of the brain and small sample sizes preclude using existing methods. However, as in a wide range of medical imaging studies, tortuosity data have no missing or mistimed data, a factorial within-subject design, the same between-subject design for all responses, and a Gaussian distribution with compound symmetry. For such restricted models, we extend exact, small sample univariate methods for internal pilots to linear mixed models with any between-subject design (not just two groups). Planning a new tortuosity study illustrates how the new methods help to avoid sample sizes that are too small or too large while still controlling the type I error rate. PMID:17318914

  8. High-precision 40Ar/39Ar dating of Quaternary basalts from Auckland Volcanic Field, New Zealand, with implications for eruption rates and paleomagnetic correlations

    NASA Astrophysics Data System (ADS)

    Leonard, Graham S.; Calvert, Andrew T.; Hopkins, Jenni L.; Wilson, Colin J. N.; Smid, Elaine R.; Lindsay, Jan M.; Champion, Duane E.

    2017-09-01

    The Auckland Volcanic Field (AVF), which last erupted ca. 550 years ago, is a late Quaternary monogenetic basaltic volcanic field (ca. 500 km2) in the northern North Island of New Zealand. Prior to this study only 12 out of the 53 identified eruptive centres of the AVF had been reliably dated. Careful sample preparation and 40Ar/39Ar analysis has increased the number of well-dated centres in the AVF to 35. The high precision of the results is attributed to selection of fresh, non-vesicular, non-glassy samples from lava flow interiors. Sample selection was coupled with separation techniques that targeted only the groundmass of samples with < 5% glass and with groundmass feldspars > 10 μm wide, coupled with ten-increment furnace step-heating of large quantities (up to 200 mg) of material. The overall AVF age data indicate an onset at 193.2 ± 2.8 ka, an apparent six-eruption flare-up from 30 to 34 ka, and a ≤ 10 kyr hiatus between the latest and second-to-latest eruptions. Such non-uniformity shows that averaging the number of eruptions over the life-span of the AVF to yield a mean eruption rate is overly simplistic. Together with large variations in eruption volumes, and the large sizes and unusual chemistry within the latest eruptions (Rangitoto 1 and Rangitoto 2), our results illuminate a complex episodic eruption history. In particular, the rate of volcanism in AVF has increased since 60 ka, suggesting that the field is still in its infancy. Multiple centres with unusual paleomagnetic inclination and declination orientations are confirmed to fit into a number of geomagnetic excursions, with five identified in the Mono Lake, two within the Laschamp, one within the post-Blake or Blake, and two possibly within the Hilina Pali.

  9. Survey of Large Methane Emitters in North America

    NASA Astrophysics Data System (ADS)

    Deiker, S.

    2017-12-01

    It has been theorized that methane emissions in the oil and gas industry follow log-normal or "fat tail" distributions, with large numbers of small sources for every very large source. Such distributions would have significant policy and operational implications. Unfortunately, by their very nature such distributions would require large sample sizes to verify. Until recently, such large-scale studies would have been prohibitively expensive. The largest public study to date sampled 450 wells, an order of magnitude too few to effectively constrain these models. During 2016 and 2017, Kairos Aerospace conducted a series of surveys using the LeakSurveyor imaging spectrometer, mounted on light aircraft. This small, lightweight instrument was designed to rapidly locate large emission sources. The resulting survey covers over three million acres of oil and gas production. This includes over 100,000 wells, thousands of storage tanks and over 7,500 miles of gathering lines. This data set allows us to now probe the distribution of large methane emitters. Results of this survey, and implications for methane emission distribution, methane policy and LDAR will be discussed.
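
    A "fat tail" of emitters means that a small fraction of sources accounts for most of the emissions, which is exactly why small surveys estimate that share poorly. The quick simulation below, with an arbitrary log-normal parameterization chosen purely for illustration, shows how the estimated share of emissions from the top 5% of sources fluctuates for a 450-well sample compared with a 100,000-well sample.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def top_share(sample_size: int, top_fraction: float = 0.05, sigma: float = 2.0) -> float:
        """Fraction of total emissions from the top `top_fraction` of emitters,
        drawn from an arbitrary log-normal emission-rate distribution."""
        rates = rng.lognormal(mean=0.0, sigma=sigma, size=sample_size)
        cutoff = np.quantile(rates, 1.0 - top_fraction)
        return rates[rates >= cutoff].sum() / rates.sum()

    # Repeated draws: the small-sample estimates scatter much more widely.
    print("450 wells:    ", [round(top_share(450), 2) for _ in range(3)])
    print("100000 wells: ", [round(top_share(100_000), 2) for _ in range(3)])
    ```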

  10. Collective behavior of large-scale neural networks with GPU acceleration.

    PubMed

    Qu, Jingyi; Wang, Rubin

    2017-12-01

    In this paper, the collective behaviors of a small-world neuronal network motivated by the anatomy of a mammalian cortex, based on both the Izhikevich model and the Rulkov model, are studied. The Izhikevich model can not only reproduce the rich behaviors of biological neurons but also has only two equations and one nonlinear term. The Rulkov model is in the form of difference equations that generate a sequence of membrane potential samples in discrete moments of time to improve computational efficiency. These two models are suitable for the construction of large-scale neural networks. By varying some key parameters, such as the connection probability and the number of nearest neighbors of each node, the coupled neurons exhibit various types of temporal and spatial characteristics. It is demonstrated that the GPU implementation achieves increasing acceleration over the CPU as the number of neurons and iterations grows. These two small-world network models and GPU acceleration give us a new opportunity to reproduce a real biological network containing a large number of neurons.
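
    The Izhikevich model referred to here has two state variables and a single quadratic nonlinearity, plus a reset rule after each spike, which is why it scales well to large networks. Below is a minimal single-neuron sketch using the commonly cited regular-spiking parameters; the small-world coupling and GPU kernels of the study are not reproduced, and the forward-Euler step size is an assumption of this sketch.

    ```python
    import numpy as np

    def izhikevich(i_ext, a=0.02, b=0.2, c=-65.0, d=8.0, dt=0.5, steps=2000):
        """Forward-Euler integration of the Izhikevich neuron.

        dv/dt = 0.04 v^2 + 5 v + 140 - u + I,   du/dt = a (b v - u),
        with the reset v <- c, u <- u + d whenever v reaches +30 mV.
        """
        v, u = -65.0, b * -65.0
        spike_times = []
        for step in range(steps):
            v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_ext)
            u += dt * a * (b * v - u)
            if v >= 30.0:
                spike_times.append(step * dt)
                v, u = c, u + d
        return np.array(spike_times)

    print(len(izhikevich(i_ext=10.0)))  # number of spikes in a ~1 s simulated run
    ```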

  11. Classical boson sampling algorithms with superior performance to near-term experiments

    NASA Astrophysics Data System (ADS)

    Neville, Alex; Sparrow, Chris; Clifford, Raphaël; Johnston, Eric; Birchall, Patrick M.; Montanaro, Ashley; Laing, Anthony

    2017-12-01

    It is predicted that quantum computers will dramatically outperform their conventional counterparts. However, large-scale universal quantum computers are yet to be built. Boson sampling is a rudimentary quantum algorithm tailored to the platform of linear optics, which has sparked interest as a rapid way to demonstrate such quantum supremacy. Photon statistics are governed by intractable matrix functions, which suggests that sampling from the distribution obtained by injecting photons into a linear optical network could be solved more quickly by a photonic experiment than by a classical computer. The apparently low resource requirements for large boson sampling experiments have raised expectations of a near-term demonstration of quantum supremacy by boson sampling. Here we present classical boson sampling algorithms and theoretical analyses of prospects for scaling boson sampling experiments, showing that near-term quantum supremacy via boson sampling is unlikely. Our classical algorithm, based on Metropolised independence sampling, allowed the boson sampling problem to be solved for 30 photons with standard computing hardware. Compared to current experiments, a demonstration of quantum supremacy over a successful implementation of these classical methods on a supercomputer would require the number of photons and experimental components to increase by orders of magnitude, while tackling exponentially scaling photon loss.

  12. Accounting for sampling patterns reverses the relative importance of trade and climate for the global sharing of exotic plants

    USGS Publications Warehouse

    Sofaer, Helen R.; Jarnevich, Catherine S.

    2017-01-01

    Aim: The distributions of exotic species reflect patterns of human-mediated dispersal, species climatic tolerances and a suite of other biotic and abiotic factors. The relative importance of each of these factors will shape how the spread of exotic species is affected by ongoing economic globalization and climate change. However, patterns of trade may be correlated with variation in scientific sampling effort globally, potentially confounding studies that do not account for sampling patterns. Location: Global. Time period: Museum records, generally from the 1800s up to 2015. Major taxa studied: Plant species exotic to the United States. Methods: We used data from the Global Biodiversity Information Facility (GBIF) to summarize the number of plant species with exotic occurrences in the United States that also occur in each other country world-wide. We assessed the relative importance of trade and climatic similarity for explaining variation in the number of shared species while evaluating several methods to account for variation in sampling effort among countries. Results: Accounting for variation in sampling effort reversed the relative importance of trade and climate for explaining numbers of shared species. Trade was strongly correlated with numbers of shared U.S. exotic plants between the United States and other countries before, but not after, accounting for sampling variation among countries. Conversely, accounting for sampling effort strengthened the relationship between climatic similarity and species sharing. Using the number of records as a measure of sampling effort provided a straightforward approach for the analysis of occurrence data, whereas species richness estimators and rarefaction were less effective at removing sampling bias. Main conclusions: Our work provides support for broad-scale climatic limitation on the distributions of exotic species, illustrates the need to account for variation in sampling effort in large biodiversity databases, and highlights the difficulty in inferring causal links between the economic drivers of invasion and global patterns of exotic species occurrence.

  13. Fuzzy support vector machine for microarray imbalanced data classification

    NASA Astrophysics Data System (ADS)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarrays are data containing gene expression with small sample sizes and a high number of features. Furthermore, imbalanced classes are a common problem in microarray data. This occurs when a dataset is dominated by a class that has significantly more instances than the other, minority classes. Therefore, a classification method is needed that solves the problems of high-dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods capable of handling large or small samples, nonlinearity, high dimensionality, overlearning and local minimum issues. SVM has been widely applied to DNA microarray data classification, and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data remain a problem because SVM treats all samples with the same importance, so the results are biased toward the majority class. To overcome the imbalance, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy memberships, so FSVM can pay more attention to the samples with larger fuzzy membership. Given that DNA microarray data are high dimensional with a very large number of features, it is necessary to perform feature selection first using the Fast Correlation Based Filter (FCBF). In this study, SVM, FSVM, and both methods combined with FCBF are evaluated and their classification performance is compared. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.
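
    The core FSVM idea described above is that each training point carries a fuzzy membership so minority-class points contribute more to the loss. The sketch below approximates that idea by passing class-frequency-based weights to a standard SVM through scikit-learn's sample_weight argument; this is a simplification of FSVM (not its exact reformulation), the weighting rule is an assumption of the sketch, and the FCBF feature-selection step is omitted.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    def frequency_based_memberships(y):
        """Larger membership for points in rarer classes (illustrative rule)."""
        classes, counts = np.unique(y, return_counts=True)
        freq = dict(zip(classes, counts / counts.sum()))
        return np.array([1.0 / freq[label] for label in y])

    # Toy imbalanced data (the study uses high-dimensional microarray features).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (95, 5)), rng.normal(1.5, 1.0, (5, 5))])
    y = np.array([0] * 95 + [1] * 5)

    weights = frequency_based_memberships(y)
    clf = SVC(kernel="rbf").fit(X, y, sample_weight=weights)
    print(clf.predict(rng.normal(1.5, 1.0, (3, 5))))  # minority-like test points
    ```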

  14. Sensitive and Rapid Detection of Viable Giardia Cysts and Cryptosporidium parvum Oocysts in Large-Volume Water Samples with Wound Fiberglass Cartridge Filters and Reverse Transcription-PCR

    PubMed Central

    Kaucner, Christine; Stinear, Timothy

    1998-01-01

    We recently described a reverse transcription-PCR (RT-PCR) for detecting low numbers of viable Cryptosporidium parvum oocysts spiked into clarified environmental water concentrates. We have now modified the assay for direct analysis of primary sample concentrates with simultaneous detection of viable C. parvum oocysts, Giardia cysts, and a novel type of internal positive control (IPC). The IPC was designed to assess both efficiency of mRNA isolation and potential RT-PCR inhibition. Sensitivity testing showed that low numbers of organisms, in the range of a single viable cyst and oocyst, could be detected when spiked into 100-μl packed pellet volumes of concentrates from creek and river water samples. The RT-PCR was compared with an immunofluorescence (IF) assay by analyzing 29 nonspiked environmental water samples. Sample volumes of 20 to 1,500 liters were concentrated with a wound fiberglass cartridge filter. Frequency of detection for viable Giardia cysts increased from 24% by IF microscopy to 69% by RT-PCR. Viable C. parvum oocysts were detected only once by RT-PCR (3%) in contrast to detection of viable Cryptosporidium spp. in four samples by IF microscopy (14%), suggesting that Cryptosporidium species other than C. parvum were present in the water. This combination of the large-volume sampling method with RT-PCR represents a significant advance in terms of protozoan pathogen monitoring and in the wider application of PCR technology to this field of microbiology. PMID:9572946

  15. A comparative analysis of whole genome sequencing of esophageal adenocarcinoma pre- and post-chemotherapy

    PubMed Central

    Noorani, Ayesha; Lynch, Andy G.; Achilleos, Achilleas; Eldridge, Matthew; Bower, Lawrence; Weaver, Jamie M.J.; Crawte, Jason; Ong, Chin-Ann; Shannon, Nicholas; MacRae, Shona; Grehan, Nicola; Nutzinger, Barbara; O'Donovan, Maria; Hardwick, Richard; Tavaré, Simon; Fitzgerald, Rebecca C.

    2017-01-01

    The scientific community has avoided using tissue samples from patients that have been exposed to systemic chemotherapy to infer the genomic landscape of a given cancer. Esophageal adenocarcinoma is a heterogeneous, chemoresistant tumor for which the availability and size of pretreatment endoscopic samples are limiting. This study compares whole-genome sequencing data obtained from chemo-naive and chemo-treated samples. The quality of whole-genomic sequencing data is comparable across all samples regardless of chemotherapy status. Inclusion of samples collected post-chemotherapy increased the proportion of late-stage tumors. When comparing matched pre- and post-chemotherapy samples from 10 cases, the mutational signatures, copy number, and SNV mutational profiles reflect the expected heterogeneity in this disease. Analysis of SNVs in relation to allele-specific copy-number changes pinpoints the common ancestor to a point prior to chemotherapy. For cases in which pre- and post-chemotherapy samples do show substantial differences, the timing of the divergence is near-synchronous with endoreduplication. Comparison across a large prospective cohort (62 treatment-naive, 58 chemotherapy-treated samples) reveals no significant differences in the overall mutation rate, mutation signatures, specific recurrent point mutations, or copy-number events in respect to chemotherapy status. In conclusion, whole-genome sequencing of samples obtained following neoadjuvant chemotherapy is representative of the genomic landscape of esophageal adenocarcinoma. Excluding these samples reduces the material available for cataloging and introduces a bias toward the earlier stages of cancer. PMID:28465312

  16. An investigation of the measurement properties of the Spot-the-Word test in a community sample.

    PubMed

    Mackinnon, Andrew; Christensen, Helen

    2007-12-01

    Intellectual ability is assessed with the Spot-the-Word (STW) test (A. Baddeley, H. Emslie, & I. Nimmo Smith, 1993) by asking respondents to identify a word in a word-nonword item pair. Results in moderate-sized samples suggest this ability is resistant to decline due to dementia. The authors used a 3-parameter item response theory model to investigate the measurement properties of the STW in a large community-dwelling sample (n=2,480) 60 to 64 years of age. A number of poorly performing items were identified. Substantial guessing was present; however, the number of words correctly identified was found to be an accurate index of ability. Performance was moderately related to a number of tests of cognitive performance and was effectively unrelated to visual acuity and to physical or mental health status. The STW is a promising test of ability that, in the future, may be refined by the deletion or replacement of poorly functioning items.
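
    The 3-parameter item response theory model used here expresses the probability of a correct response as a function of ability (theta), item discrimination (a), difficulty (b) and a guessing floor (c). The sketch below evaluates the standard 3PL item response function with arbitrary item parameters, simply to make the role of the guessing parameter concrete; the parameter values are not those estimated for the STW items.

    ```python
    import math

    def three_pl(theta: float, a: float, b: float, c: float) -> float:
        """3PL item response function: P(correct) = c + (1 - c) / (1 + exp(-a (theta - b)))."""
        return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

    # Arbitrary item: moderate discrimination, average difficulty, 25% guessing floor.
    for theta in (-2.0, 0.0, 2.0):
        print(theta, round(three_pl(theta, a=1.2, b=0.0, c=0.25), 3))
    # Even very low-ability respondents answer correctly about 25% of the time.
    ```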

  17. An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions

    DOE PAGES

    Li, Weixuan; Lin, Guang

    2015-03-21

    Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes’ rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample accurately and efficiently representing the posterior with a limited number of forward simulations.
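
    The two ingredients named above are a Gaussian-mixture proposal and importance weights that correct for the mismatch between proposal and posterior. The sketch below shows only the weighting step for a fixed two-component mixture proposal and a toy bimodal target; the adaptive refitting of the mixture and the polynomial-chaos surrogate described in the paper are omitted, and all densities and parameters are illustrative.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def target_pdf(x):
        """Toy bimodal target standing in for a multimodal posterior."""
        return 0.5 * stats.norm.pdf(x, -3.0, 0.7) + 0.5 * stats.norm.pdf(x, 4.0, 1.0)

    # Fixed Gaussian-mixture proposal (adaptively refit in the actual algorithm).
    means, sds, mix = np.array([-3.0, 4.0]), np.array([1.0, 1.5]), np.array([0.5, 0.5])

    def proposal_pdf(x):
        return sum(w * stats.norm.pdf(x, m, s) for w, m, s in zip(mix, means, sds))

    def sample_proposal(n):
        comp = rng.choice(len(mix), size=n, p=mix)
        return rng.normal(means[comp], sds[comp])

    x = sample_proposal(5000)
    w = target_pdf(x) / proposal_pdf(x)   # importance weights
    w /= w.sum()                          # self-normalize
    print("posterior mean estimate:", float(np.sum(w * x)))  # ~0.5 for this toy target
    ```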

  18. Bremsstrahlung-Based Imaging and Assays of Radioactive, Mixed and Hazardous Waste

    NASA Astrophysics Data System (ADS)

    Kwofie, J.; Wells, D. P.; Selim, F. A.; Harmon, F.; Duttagupta, S. P.; Jones, J. L.; White, T.; Roney, T.

    2003-08-01

    A new nondestructive accelerator-based x-ray fluorescence (AXRF) approach has been developed to identify heavy metals in large-volume samples. Such samples are an important part of the process and waste streams of U.S. Department of Energy sites, as well as other industries such as mining and milling. Distributions of heavy metal impurities in these process and waste samples can range from homogeneous to highly inhomogeneous, and non-destructive assays and imaging that can address both are urgently needed. Our approach is based on using high-energy, pulsed bremsstrahlung beams (3-6.5 MeV) from small electron accelerators to produce K-shell atomic fluorescence x-rays. In addition, we exploit pair-production, Compton scattering and x-ray transmission measurements from these beams to probe locations of high density and high atomic number. The excellent penetrability of these beams allows assays and imaging of soil-like samples at least 15 g/cm2 thick, with elemental impurities of atomic number greater than approximately 50. Fluorescence yield of a variety of targets was measured as a function of impurity atomic number, impurity homogeneity, and sample thickness. We report on actual and potential detection limits of heavy metal impurities in a soil matrix for a variety of samples, and on the potential for imaging, using AXRF and these related probes.

  19. An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Weixuan; Lin, Guang, E-mail: guanglin@purdue.edu

    2015-08-01

    Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes' rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample accurately and efficiently representing the posterior with a limited number of forward simulations.

  20. Sampled-data chain-observer design for a class of delayed nonlinear systems

    NASA Astrophysics Data System (ADS)

    Kahelras, M.; Ahmed-Ali, T.; Giri, F.; Lamnabhi-Lagarrigue, F.

    2018-05-01

    The problem of observer design is addressed for a class of triangular nonlinear systems with not-necessarily small delay and sampled output measurements. A further difficulty is that the system state matrix depends on the un-delayed output signal, which is not accessible to measurement, making existing observers inapplicable. A new chain observer, composed of m elementary observers in series, is designed to compensate for output sampling and arbitrarily large delays. The larger the time delay, the larger the number m. Each elementary observer includes an output predictor that is conceived to compensate for the effects of output sampling and a fractional delay. The predictors are defined by first-order ordinary differential equations (ODEs), much simpler than those of existing predictors, which involve both output and state predictors. Using a small-gain type analysis, sufficient conditions for the observer to be exponentially convergent are established in terms of the minimal number m of elementary observers and the maximum sampling interval.

  1. An efficient reliability algorithm for locating design point using the combination of importance sampling concepts and response surface method

    NASA Astrophysics Data System (ADS)

    Shayanfar, Mohsen Ali; Barkhordari, Mohammad Ali; Roudak, Mohammad Amin

    2017-06-01

    Monte Carlo simulation (MCS) is a useful tool for computing the probability of failure in reliability analysis. However, the large number of required random samples makes it time-consuming. Response surface method (RSM) is another common method in reliability analysis. Although RSM is widely used for its simplicity, it cannot be trusted in highly nonlinear problems due to its linear nature. In this paper, a new efficient algorithm is proposed that combines importance sampling, as a class of MCS, with RSM. In the proposed algorithm, the analysis starts with importance sampling concepts, using a proposed two-step rule for updating the design point. This part finishes after a small number of samples have been generated. Then RSM starts to work using Bucher's experimental design, with the last design point and a proposed effective length as the center point and radius of Bucher's approach, respectively. Through illustrative numerical examples, the simplicity and efficiency of the proposed algorithm and the effectiveness of the proposed rules are shown.

  2. A new low-cost procedure for detecting nucleic acids in low-incidence samples: a case study of detecting spores of Paenibacillus larvae from bee debris.

    PubMed

    Ryba, Stepan; Kindlmann, Pavel; Titera, Dalibor; Haklova, Marcela; Stopka, Pavel

    2012-10-01

    American foulbrood, because of its virulence and worldwide spread, is currently one of the most dangerous diseases of honey bees. Quick diagnosis of this disease is therefore vitally important. For its successful eradication, however, all the hives in the region must be tested. This is time consuming and costly. Therefore, a fast and sensitive method of detecting American foulbrood is needed. Here we present a method that significantly reduces the number of tests needed by combining batches of samples from different hives. The results of this method were verified by testing each sample. A simulation study was used to compare the efficiency of the new method with testing all the samples and to develop a decision tool for determining when best to use the new method. The method is suitable for testing large numbers of samples (over 100) when the incidence of the disease is low (10% or less).
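
    The saving from testing combined batches can be illustrated with a short simulation of two-stage pooled testing; the pool size, prevalence and retest rule below are illustrative assumptions, not the authors' protocol.

```python
# Sketch: expected number of tests under two-stage pooled (batch) testing
# versus testing every hive individually. Pool size and prevalence are
# illustrative values only.
import numpy as np

def tests_needed(n_samples, prevalence, pool_size, rng):
    status = rng.random(n_samples) < prevalence        # True = positive hive
    tests = 0
    for start in range(0, n_samples, pool_size):
        pool = status[start:start + pool_size]
        tests += 1                                      # test the pooled batch
        if pool.any():
            tests += len(pool)                          # retest each member of a positive pool
    return tests

rng = np.random.default_rng(1)
n, p, k = 200, 0.05, 10
runs = [tests_needed(n, p, k, rng) for _ in range(1000)]
print("individual testing:", n, "tests")
print("pooled testing (mean):", np.mean(runs), "tests")
```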

  3. Deriving photometric redshifts using fuzzy archetypes and self-organizing maps - I. Methodology

    NASA Astrophysics Data System (ADS)

    Speagle, Joshua S.; Eisenstein, Daniel J.

    2017-07-01

    We propose a method to substantially increase the flexibility and power of template fitting-based photometric redshifts by transforming a large number of galaxy spectral templates into a corresponding collection of 'fuzzy archetypes' using a suitable set of perturbative priors designed to account for empirical variation in dust attenuation and emission-line strengths. To bypass widely separated degeneracies in parameter space (e.g. the redshift-reddening degeneracy), we train self-organizing maps (SOMs) on large 'model catalogues' generated from Monte Carlo sampling of our fuzzy archetypes to cluster the predicted observables in a topologically smooth fashion. Subsequent sampling over the SOM then allows full reconstruction of the relevant probability distribution functions (PDFs). This combined approach enables the multimodal exploration of known variation among galaxy spectral energy distributions with minimal modelling assumptions. We demonstrate the power of this approach to recover full redshift PDFs using discrete Markov chain Monte Carlo sampling methods combined with SOMs constructed from Large Synoptic Survey Telescope ugrizY and Euclid YJH mock photometry.

  4. Photometric Redshifts for the Large-Area Stripe 82X Multiwavelength Survey

    NASA Astrophysics Data System (ADS)

    Tasnim Ananna, Tonima; Salvato, Mara; Urry, C. Megan; LaMassa, Stephanie M.; STRIPE 82X

    2016-06-01

    The Stripe 82X survey currently includes 6000 X-ray sources in 31.3 square degrees of XMM-Newton and Chandra X-ray coverage, most of which are AGN. Using a maximum-likelihood approach, we identified optical and infrared counterparts in the SDSS, VHS K-band and WISE W1-band catalogs. 1200 objects which had different best associations in different catalogs were checked by eye. Our most recent paper provided the multiwavelength catalogs for this sample. More than 1000 counterparts have spectroscopic redshifts, either from SDSS spectroscopy or our own follow-up program. Using the extensive multiwavelength data in this field, we provide photometric redshift estimates for most of the remaining sources, which are 80-90% accurate according to the training set. Our sample has a large number of candidates that are very faint in optical and bright in IR. We expect a large fraction of these objects to be the obscured AGN sample we need to complete the census on black hole growth at a range of redshifts.

  5. Performance of maximum likelihood mixture models to estimate nursery habitat contributions to fish stocks: a case study on sea bream Sparus aurata

    PubMed Central

    Darnaude, Audrey M.

    2016-01-01

    Background Mixture models (MM) can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some or all nursery-signatures, may also need to be estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM), under several incomplete sampling and nursery-signature separation scenarios. Methods We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011), from four distinct nursery habitats (Mediterranean lagoons). Artificial nursery-source and mixed-stock datasets were produced considering: five different sampling scenarios where 0–4 lagoons were excluded from the nursery-source dataset and six nursery-signature separation scenarios that simulated data separated 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI) and uncertainty (SE) were computed to assess reliability for each of the three sets of MM parameters. Results Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06) when all nursery-sources were sampled but exhibited large variability among cohorts and increased with the number of non-sampled sources up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but these estimates tended to be less biased and more uncertain than the mixing proportion estimates across all sampling scenarios (BI < 0.13, SE < 0.29). Increasing separation among nursery signatures improved reliability of mixing proportion estimates, but led to non-linear responses in baseline signature parameters. Low uncertainty but a consistent underestimation bias affected the estimated number of nursery sources across all incomplete sampling scenarios. Discussion ML-MM produced reliable estimates of mixing proportions and nursery-signatures under an important range of incomplete sampling and nursery-signature separation scenarios. This method failed, however, in estimating the true number of nursery sources, reflecting a pervasive issue affecting mixture models, within and beyond the ML framework. Large differences in bias and uncertainty found among cohorts were linked to differences in separation of chemical signatures among nursery habitats. Simulation approaches, such as those presented here, could be useful to evaluate sensitivity of MM results to separation and variability in nursery-signatures for other species, habitats or cohorts. PMID:27761305

  6. Galaxy evolution and large-scale structure in the far-infrared. I - IRAS pointed observations

    NASA Astrophysics Data System (ADS)

    Lonsdale, Carol J.; Hacking, Perry B.

    1989-04-01

    Redshifts for 66 galaxies were obtained from a sample of 93 60-micron sources detected serendipitously in 22 IRAS deep pointed observations, covering a total area of 18.4 sq deg. The flux density limit of this survey is 150 mJy, 4 times fainter than the IRAS Point Source Catalog (PSC). The luminosity function is similar in shape with those previously published for samples selected from the PSC, with a median redshift of 0.048 for the fainter sample, but shifted to higher space densities. There is evidence that some of the excess number counts in the deeper sample can be explained in terms of a large-scale density enhancement beyond the Pavo-Indus supercluster. In addition, the faintest counts in the new sample confirm the result of Hacking et al. (1989) that faint IRAS 60-micron source counts lie significantly in excess of an extrapolation of the PSC counts assuming no luminosity or density evolution.

  7. Galaxy evolution and large-scale structure in the far-infrared. I. IRAS pointed observations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lonsdale, C.J.; Hacking, P.B.

    1989-04-01

    Redshifts for 66 galaxies were obtained from a sample of 93 60-micron sources detected serendipitously in 22 IRAS deep pointed observations, covering a total area of 18.4 sq deg. The flux density limit of this survey is 150 mJy, 4 times fainter than the IRAS Point Source Catalog (PSC). The luminosity function is similar in shape with those previously published for samples selected from the PSC, with a median redshift of 0.048 for the fainter sample, but shifted to higher space densities. There is evidence that some of the excess number counts in the deeper sample can be explained in terms of a large-scale density enhancement beyond the Pavo-Indus supercluster. In addition, the faintest counts in the new sample confirm the result of Hacking et al. (1989) that faint IRAS 60-micron source counts lie significantly in excess of an extrapolation of the PSC counts assuming no luminosity or density evolution. 81 refs.

  8. Galaxy evolution and large-scale structure in the far-infrared. I - IRAS pointed observations

    NASA Technical Reports Server (NTRS)

    Lonsdale, Carol J.; Hacking, Perry B.

    1989-01-01

    Redshifts for 66 galaxies were obtained from a sample of 93 60-micron sources detected serendipitously in 22 IRAS deep pointed observations, covering a total area of 18.4 sq deg. The flux density limit of this survey is 150 mJy, 4 times fainter than the IRAS Point Source Catalog (PSC). The luminosity function is similar in shape with those previously published for samples selected from the PSC, with a median redshift of 0.048 for the fainter sample, but shifted to higher space densities. There is evidence that some of the excess number counts in the deeper sample can be explained in terms of a large-scale density enhancement beyond the Pavo-Indus supercluster. In addition, the faintest counts in the new sample confirm the result of Hacking et al. (1989) that faint IRAS 60-micron source counts lie significantly in excess of an extrapolation of the PSC counts assuming no luminosity or density evolution.

  9. ARTS: automated randomization of multiple traits for study design.

    PubMed

    Maienschein-Cline, Mark; Lei, Zhengdeng; Gardeux, Vincent; Abbasi, Taimur; Machado, Roberto F; Gordeuk, Victor; Desai, Ankit A; Saraf, Santosh; Bahroos, Neil; Lussier, Yves

    2014-06-01

    Collecting data from large studies on high-throughput platforms, such as microarray or next-generation sequencing, typically requires processing samples in batches. There are often systematic but unpredictable biases from batch-to-batch, so proper randomization of biologically relevant traits across batches is crucial for distinguishing true biological differences from experimental artifacts. When a large number of traits are biologically relevant, as is common for clinical studies of patients with varying sex, age, genotype and medical background, proper randomization can be extremely difficult to prepare by hand, especially because traits may affect biological inferences, such as differential expression, in a combinatorial manner. Here we present ARTS (automated randomization of multiple traits for study design), which aids researchers in study design by automatically optimizing batch assignment for any number of samples, any number of traits and any batch size. ARTS is implemented in Perl and is available at github.com/mmaiensc/ARTS. ARTS is also available in the Galaxy Tool Shed, and can be used at the Galaxy installation hosted by the UIC Center for Research Informatics (CRI) at galaxy.cri.uic.edu. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
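
    A minimal sketch of the underlying idea, not the ARTS implementation: repeatedly shuffle samples into batches and keep the assignment that best balances a single categorical trait (ARTS optimizes any number of traits, samples and batch sizes jointly).

```python
# Sketch: randomize samples into batches so one categorical trait stays balanced
# (illustrative only; ARTS handles many traits and uses its own scoring).
import random
from collections import Counter

def imbalance(batches, trait):
    # Sum, over batches and trait levels, of deviations from the overall share.
    n = sum(len(b) for b in batches)
    overall = Counter(trait[i] for b in batches for i in b)
    score = 0.0
    for b in batches:
        counts = Counter(trait[i] for i in b)
        for level, total in overall.items():
            score += abs(counts[level] / len(b) - total / n)
    return score

def randomize(n_samples, trait, batch_size, n_tries=2000, seed=0):
    rng = random.Random(seed)
    idx = list(range(n_samples))
    best, best_score = None, float("inf")
    for _ in range(n_tries):
        rng.shuffle(idx)
        batches = [idx[i:i + batch_size] for i in range(0, n_samples, batch_size)]
        score = imbalance(batches, trait)
        if score < best_score:
            best, best_score = [list(b) for b in batches], score
    return best

trait = ["case"] * 30 + ["control"] * 30       # hypothetical binary trait
print(randomize(60, trait, batch_size=12)[0])  # sample indices assigned to the first batch
```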

  10. Applications of species accumulation curves in large-scale biological data analysis.

    PubMed

    Deng, Chao; Daley, Timothy; Smith, Andrew D

    2015-09-01

    The species accumulation curve, or collector's curve, of a population gives the expected number of observed species or distinct classes as a function of sampling effort. Species accumulation curves allow researchers to assess and compare diversity across populations or to evaluate the benefits of additional sampling. Traditional applications have focused on ecological populations but emerging large-scale applications, for example in DNA sequencing, are orders of magnitude larger and present new challenges. We developed a method to estimate accumulation curves for predicting the complexity of DNA sequencing libraries. This method uses rational function approximations to a classical non-parametric empirical Bayes estimator due to Good and Toulmin [Biometrika, 1956, 43, 45-63]. Here we demonstrate how the same approach can be highly effective in other large-scale applications involving biological data sets. These include estimating microbial species richness, immune repertoire size, and k-mer diversity for genome assembly applications. We show how the method can be modified to address populations containing an effectively infinite number of species where saturation cannot practically be attained. We also introduce a flexible suite of tools implemented as an R package that make these methods broadly accessible.

  11. Applications of species accumulation curves in large-scale biological data analysis

    PubMed Central

    Deng, Chao; Daley, Timothy; Smith, Andrew D

    2016-01-01

    The species accumulation curve, or collector’s curve, of a population gives the expected number of observed species or distinct classes as a function of sampling effort. Species accumulation curves allow researchers to assess and compare diversity across populations or to evaluate the benefits of additional sampling. Traditional applications have focused on ecological populations but emerging large-scale applications, for example in DNA sequencing, are orders of magnitude larger and present new challenges. We developed a method to estimate accumulation curves for predicting the complexity of DNA sequencing libraries. This method uses rational function approximations to a classical non-parametric empirical Bayes estimator due to Good and Toulmin [Biometrika, 1956, 43, 45–63]. Here we demonstrate how the same approach can be highly effective in other large-scale applications involving biological data sets. These include estimating microbial species richness, immune repertoire size, and k-mer diversity for genome assembly applications. We show how the method can be modified to address populations containing an effectively infinite number of species where saturation cannot practically be attained. We also introduce a flexible suite of tools implemented as an R package that make these methods broadly accessible. PMID:27252899
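
    As a toy illustration of the collector's curve itself (not the rational-function extrapolation described above), one can repeatedly subsample a synthetic population and count the distinct classes observed at each sampling effort:

```python
# Sketch: empirical species accumulation curve from a synthetic skewed-abundance
# population, by repeated subsampling with replacement (illustrative only; the
# paper extrapolates such curves with Good-Toulmin-based estimators).
import numpy as np

rng = np.random.default_rng(2)
abundances = rng.geometric(0.05, size=500)            # 500 species, skewed abundances
population = np.repeat(np.arange(500), abundances)    # individual -> species label

def accumulation_curve(population, efforts, n_rep=50):
    curve = []
    for m in efforts:
        distinct = [len(np.unique(rng.choice(population, size=m, replace=True)))
                    for _ in range(n_rep)]
        curve.append(float(np.mean(distinct)))
    return curve

efforts = [100, 500, 1000, 2000, 5000]
print(dict(zip(efforts, accumulation_curve(population, efforts))))
```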

  12. Using High-Throughput Sequencing to Leverage Surveillance of Genetic Diversity and Oseltamivir Resistance: A Pilot Study during the 2009 Influenza A(H1N1) Pandemic

    PubMed Central

    Téllez-Sosa, Juan; Rodríguez, Mario Henry; Gómez-Barreto, Rosa E.; Valdovinos-Torres, Humberto; Hidalgo, Ana Cecilia; Cruz-Hervert, Pablo; Luna, René Santos; Carrillo-Valenzo, Erik; Ramos, Celso; García-García, Lourdes; Martínez-Barnetche, Jesús

    2013-01-01

    Background Influenza viruses display a high mutation rate and complex evolutionary patterns. Next-generation sequencing (NGS) has been widely used for qualitative and semi-quantitative assessment of genetic diversity in complex biological samples. The “deep sequencing” approach, enabled by the enormous throughput of current NGS platforms, allows the identification of rare genetic viral variants in targeted genetic regions, but is usually limited to a small number of samples. Methodology and Principal Findings We designed a proof-of-principle study to test whether redistributing sequencing throughput from a high depth-small sample number towards a low depth-large sample number approach is feasible and contributes to influenza epidemiological surveillance. Using 454-Roche sequencing, we sequenced, at a rather low depth, a 307 bp amplicon of the neuraminidase gene of the Influenza A(H1N1) pandemic (A(H1N1)pdm) virus from cDNA amplicons pooled in 48 barcoded libraries obtained from nasal swab samples of infected patients (n = 299) taken during the May to November 2009 pandemic period in Mexico. This approach revealed that during the transition from the first (May-July) to the second wave (September-November) of the pandemic, the initial genetic variants were replaced by the N248D mutation in the NA gene, and enabled the establishment of temporal and geographic associations with genetic diversity and the identification of mutations associated with oseltamivir resistance. Conclusions NGS sequencing of a short amplicon from the NA gene at low sequencing depth allowed genetic screening of a large number of samples, providing insights into viral genetic diversity dynamics and the identification of genetic variants associated with oseltamivir resistance. Further research is needed to explain the observed replacement of the genetic variants seen during the second wave. As sequencing throughput rises and library multiplexing and automation improve, we foresee that the approach presented here can be scaled up for global genetic surveillance of influenza and other infectious diseases. PMID:23843978

  13. EXPANDING THE ROLE OF ENVIRONMENTAL IMMUNOASSAYS: TECHNICAL CAPABILITIES REGULATORY ISSUES AND COMMUNICATION VEHICLES

    EPA Science Inventory

    Large numbers of samples are commonplace in environmental monitoring and human exposure assessment studies. When the goals of the US Environmental Protection Agency (EPA) Office of Research and Development (sound methods, integrated with human and ecological health, common sense...

  14. Bar-Code System for a Microbiological Laboratory

    NASA Technical Reports Server (NTRS)

    Law, Jennifer; Kirschner, Larry

    2007-01-01

    A bar-code system has been assembled for a microbiological laboratory that must examine a large number of samples. The system includes a commercial bar-code reader, computer hardware and software components, plus custom-designed database software. The software generates a user-friendly, menu-driven interface.

  15. Random sampling of constrained phylogenies: conducting phylogenetic analyses when the phylogeny is partially known.

    PubMed

    Housworth, E A; Martins, E P

    2001-01-01

    Statistical randomization tests in evolutionary biology often require a set of random, computer-generated trees. For example, earlier studies have shown how large numbers of computer-generated trees can be used to conduct phylogenetic comparative analyses even when the phylogeny is uncertain or unknown. These methods were limited, however, in that (in the absence of molecular sequence or other data) they allowed users to assume that no phylogenetic information was available or that all possible trees were known. Intermediate situations where only a taxonomy or other limited phylogenetic information (e.g., polytomies) are available are technically more difficult. The current study describes a procedure for generating random samples of phylogenies while incorporating limited phylogenetic information (e.g., four taxa belong together in a subclade). The procedure can be used to conduct comparative analyses when the phylogeny is only partially resolved or can be used in other randomization tests in which large numbers of possible phylogenies are needed.

  16. Microplate-based filter paper assay to measure total cellulase activity.

    PubMed

    Xiao, Zhizhuang; Storms, Reginald; Tsang, Adrian

    2004-12-30

    The standard filter paper assay (FPA) published by the International Union of Pure and Applied Chemistry (IUPAC) is widely used to determine total cellulase activity. However, the IUPAC method is not suitable for the parallel analyses of large sample numbers. We describe here a microplate-based method for assaying large sample numbers. To achieve this, we reduced the enzymatic reaction volume to 60 microl from the 1.5 ml used in the IUPAC method. The modified 60-microl format FPA can be carried out in 96-well assay plates. Statistical analyses showed that the cellulase activities of commercial cellulases from Trichoderma reesei and Aspergillus species determined with our 60-microl format FPA were not significantly different from the activities measured with the standard FPA. Our results also indicate that the 60-microl format FPA is quantitative and highly reproducible. Moreover, the addition of excess beta-glucosidase increased the sensitivity of the assay by up to 60%. 2004 Wiley Periodicals, Inc.

  17. Minimum Sobolev norm interpolation of scattered derivative data

    NASA Astrophysics Data System (ADS)

    Chandrasekaran, S.; Gorman, C. H.; Mhaskar, H. N.

    2018-07-01

    We study the problem of reconstructing a function on a manifold satisfying some mild conditions, given data of the values and some derivatives of the function at arbitrary points on the manifold. While the problem of finding a polynomial of two variables with total degree ≤n given the values of the polynomial and some of its derivatives at exactly the same number of points as the dimension of the polynomial space is sometimes impossible, we show that such a problem always has a solution in a very general situation if the degree of the polynomials is sufficiently large. We give estimates on how large the degree should be, and give explicit constructions for such a polynomial even in a far more general case. As the number of sampling points at which the data is available increases, our polynomials converge to the target function on the set where the sampling points are dense. Numerical examples in single and double precision show that this method is stable, efficient, and of high-order.

  18. The Stratigraphy and Evolution of the Lunar Crust

    NASA Technical Reports Server (NTRS)

    McCallum, I. Stewart

    1998-01-01

    Reconstruction of stratigraphic relationships in the ancient lunar crust has proved to be a formidable task. The intense bombardment during the first 700 m.y. of lunar history has severely perturbed the original stratigraphy and destroyed the primary textures of all but a few nonmare rocks. However, a knowledge of the crustal stratigraphy as it existed prior to the cataclysmic bombardment about 3.9 Ga is essential to test the major models proposed for crustal origin, i.e., crystal fractionation in a global magmasphere or serial magmatism in a large number of smaller bodies. Despite the large difference in scale implicit in these two models, both require an efficient separation of plagioclase and mafic minerals to form the anorthositic crust and the mafic mantle. Despite the havoc wreaked by the large body impactors, these same impact processes have brought to the lunar surface crystalline samples derived from at least the upper half of the lunar crust, thereby providing an opportunity to reconstruct the stratigraphy in areas sampled by the Apollo missions. As noted, ejecta from the large multiring basins are dominantly, or even exclusively, of crustal origin. Given the most recent determinations of crustal thicknesses, this implies an upper limit to the depth of excavation of about 60 km. Of all the lunar samples studied, a small set has been recognized as "pristine", and within this pristine group, a small fraction have retained some vestiges of primary features formed during the earliest stages of crystallization or recrystallization prior to 4.0 Ga. We have examined a number of these samples that have retained some record of primary crystallization to deduce thermal histories from an analysis of structural, textural, and compositional features in minerals from these samples. Specifically, by quantitative modeling of (1) the growth rate and development of compositional profiles of exsolution lamellae in pyroxenes and (2) the rate of Fe-Mg ordering in orthopyroxenes, we can constrain the cooling rates of appropriate lunar samples. These cooling rates are used to compute depths of burial at the time of crystallization, which enable us to reconstruct parts of the crustal stratigraphy as it existed during the earliest stages of lunar history.

  19. Identification of missing variants by combining multiple analytic pipelines.

    PubMed

    Ren, Yingxue; Reddy, Joseph S; Pottier, Cyril; Sarangi, Vivekananda; Tian, Shulan; Sinnwell, Jason P; McDonnell, Shannon K; Biernacka, Joanna M; Carrasquillo, Minerva M; Ross, Owen A; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hudson, Matthew; Mainzer, Liudmila Sergeevna; Asmann, Yan W

    2018-04-16

    After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variant discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percent of the total. We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000, and 10,000 samples. We found that using a single pipeline missed an increasing number of high-quality variants as sample size grew. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at a sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are precisely the type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously published rare pathogenic and protective mutations in the APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach. Identification of the complete variant set from sequencing data is the prerequisite of genetic association analyses. The current analytic practice of calling genetic variants from sequencing data using a single bioinformatics pipeline is no longer adequate for increasingly large projects. The number and percentage of variants that passed quality filters but were missed by the one-pipeline approach rapidly increased with sample size.
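
    A minimal sketch of the combining step (a set union of pass-QC calls from two hypothetical pipelines); the file names and the simple (chromosome, position, ref, alt) key are assumptions, and a real workflow would also reconcile genotypes and QC annotations.

```python
# Sketch: union of pass-QC variant calls produced by two hypothetical pipelines.
def load_calls(path):
    calls = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            chrom, pos, _, ref, alt, _, flt = line.split("\t")[:7]
            if flt == "PASS":
                calls.add((chrom, int(pos), ref, alt))
    return calls

bwa_calls = load_calls("cohort.bwa.vcf")        # hypothetical file names
novo_calls = load_calls("cohort.novo.vcf")

combined = bwa_calls | novo_calls
rescued = combined - bwa_calls                  # variants a single pipeline would miss
print(len(combined), "combined variants;", len(rescued), "rescued by the second pipeline")
```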

  20. Microbiological corrosion of ASTM SA105 carbon steel pipe for industrial fire water usage

    NASA Astrophysics Data System (ADS)

    Chidambaram, S.; Ashok, K.; Karthik, V.; Venkatakrishnan, P. G.

    2018-02-01

    A large number of metallic systems have been developed over the last few decades to resist both general uniform corrosion and localized corrosion. Among these, microbiologically induced corrosion (MIC) is of particular interest, being multidisciplinary and complex in nature. Many chemical processing industries utilize fresh water for fire service to control major and minor fires. One such fire water service line pipe was attacked by micro-organisms, leading to leakage, which is industrially important from a safety point of view. Large numbers of leaks have also been reported in the similar fire water services of a nearby food processing plant, paper and pulp plant, steel plant, electricity board, etc. In the present investigation, the failure of one such industrial fire water service carbon steel line pipe was analyzed to determine the cause of failure. The water sample was subjected to various chemical and bacterial analyses. Turbidity, pH, calcium hardness, free chlorine, oxidation-reduction potential, fungi, yeasts, sulphide-reducing bacteria (SRB) and total bacteria (TB) were measured in the water sample analysis. The corrosion rate was measured on steel samples, and corrosion coupons were installed in the fire water to validate non-flow-assisted localized corrosion. The sulphide-reducing bacteria (SRB) present in the fire water caused a localized microbiological corrosion attack on the line pipe.

  1. Estimating the duration of geologic intervals from a small number of age determinations: A challenge common to petrology and paleobiology

    NASA Astrophysics Data System (ADS)

    Glazner, Allen F.; Sadler, Peter M.

    2016-12-01

    The duration of a geologic interval, such as the time over which a given volume of magma accumulated to form a pluton, or the lifespan of a large igneous province, is commonly determined from a relatively small number of geochronologic determinations (e.g., 4-10) within that interval. Such sample sets can underestimate the true length of the interval by a significant amount. For example, the average interval determined from a sample of size n = 5, drawn from a uniform random distribution, will underestimate the true interval by 50%. Even for n = 10, the average sample only captures ~80% of the interval. If the underlying distribution is known then a correction factor can be determined from theory or Monte Carlo analysis; for a uniform random distribution, this factor is (n + 1)/(n - 1). Systematic undersampling of interval lengths can have a large effect on calculated magma fluxes in plutonic systems. The problem is analogous to determining the duration of an extinct species from its fossil occurrences. Confidence interval statistics developed for species origination and extinction times are applicable to the onset and cessation of magmatic events.
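
    A quick Monte Carlo check of the quoted correction factor, under the stated assumption that the dated samples fall uniformly at random within the true interval:

```python
# Sketch: the observed age range of n uniform-random dates covers, on average,
# (n - 1)/(n + 1) of the true interval, matching the (n + 1)/(n - 1) correction.
import numpy as np

rng = np.random.default_rng(3)
for n in (5, 10, 20):
    ages = rng.uniform(0.0, 1.0, size=(100_000, n))   # true interval normalized to 1
    observed = ages.max(axis=1) - ages.min(axis=1)
    print(n, round(observed.mean(), 3), round((n - 1) / (n + 1), 3))
```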

  2. Soil moisture optimal sampling strategy for Sentinel 1 validation super-sites in Poland

    NASA Astrophysics Data System (ADS)

    Usowicz, Boguslaw; Lukowski, Mateusz; Marczewski, Wojciech; Lipiec, Jerzy; Usowicz, Jerzy; Rojek, Edyta; Slominska, Ewa; Slominski, Jan

    2014-05-01

    Soil moisture (SM) exhibits high temporal and spatial variability that depends not only on the rainfall distribution, but also on the topography of the area, the physical properties of the soil and the vegetation characteristics. This large variability does not allow reliable estimation of SM in the surface layer from ground point measurements, especially at large spatial scales. Remote sensing measurements estimate the spatial distribution of SM in the surface layer better than point measurements; however, they require validation. This study attempts to characterize the SM distribution by determining its spatial variability in relation to the number and location of ground point measurements. The strategy takes into account gravimetric and TDR measurements with different sampling steps, abundance and distribution of measuring points at the scales of an arable field, a wetland and a commune (areas: 0.01, 1 and 140 km2, respectively), taking into account different SM status. Mean values of SM were only weakly sensitive to changes in the number and arrangement of sampling points; however, parameters describing the dispersion responded more strongly. Spatial analysis showed autocorrelations of the SM, whose lengths depended on the number and the distribution of points within the adopted grids. Directional analysis revealed a differentiated anisotropy of SM for different grids and numbers of measuring points. It can therefore be concluded that both the number of samples and their layout over the experimental area were reflected in the parameters characterizing the SM distribution. This suggests the need to use at least two sampling variants, differing in the number and positioning of the measurement points, with at least 20 points in each. This is because the standard error and the range of spatial variability show little change as the number of samples increases above this figure. The gravimetric method gives a more varied distribution of SM than that derived from TDR measurements. It should be noted that reducing the number of samples in the measuring grid flattens the distribution of SM from both methods and increases the estimation error at the same time. A grid of sensors for permanent measurement points should include points that have similar distributions of SM in their vicinity. Results of the analysis, including the number of points, the maximum correlation ranges and the acceptable estimation error, should be taken into account when choosing the measurement points. Adoption or possible adjustment of the distribution of the measurement points should be verified by performing additional measuring campaigns during dry and wet periods. The presented approach seems appropriate for the creation of regional-scale test (super) sites to validate products of satellites equipped with SAR (Synthetic Aperture Radar), operating in C-band, with spatial resolution suited to the single-field scale, such as ERS-1, ERS-2, Radarsat and Sentinel-1, which is going to be launched in the next few months. The work was partially funded by the Government of Poland through an ESA Contract under the PECS ELBARA_PD project No. 4000107897/13/NL/KML.

  3. Review of PCBs in US schools: a brief history, an estimate of the number of impacted schools, and an approach for evaluating indoor air samples.

    PubMed

    Herrick, Robert F; Stewart, James H; Allen, Joseph G

    2016-02-01

    PCBs in building materials such as caulks and sealants are a largely unrecognized source of contamination in the building environment. Schools are of particular interest, as the period of extensive school construction (about 1950 to 1980) coincides with the time of greatest use of PCBs as plasticizers in building materials. In the USA, we estimate that the number of schools with PCB in building caulk ranges from 12,960 to 25,920 based upon the number of schools built in the time of PCB use and the proportion of buildings found to contain PCB caulk and sealants. Field and laboratory studies have demonstrated that PCBs from both interior and exterior caulking can be the source of elevated PCB air concentrations in these buildings, at levels that exceed health-based PCB exposure guidelines for building occupants. Air sampling in buildings containing PCB caulk has shown that the airborne PCB concentrations can be highly variable, even in repeat samples collected within a room. Sampling and data analysis strategies that recognize this variability can provide the basis for informed decision making about compliance with health-based exposure limits, even in cases where small numbers of samples are taken. The health risks posed by PCB exposures, particularly among children, mandate precautionary approaches to managing PCBs in building materials.

  4. Sample size guidelines for fitting a lognormal probability distribution to censored most probable number data with a Markov chain Monte Carlo method.

    PubMed

    Williams, Michael S; Cao, Yong; Ebel, Eric D

    2013-07-15

    Levels of pathogenic organisms in food and water have steadily declined in many parts of the world. A consequence of this reduction is that the proportion of samples that test positive for the most contaminated product-pathogen pairings has fallen to less than 0.1. While this is unequivocally beneficial to public health, datasets with very few enumerated samples present an analytical challenge because a large proportion of the observations are censored values. One application of particular interest to risk assessors is the fitting of a statistical distribution function to datasets collected at some point in the farm-to-table continuum. The fitted distribution forms an important component of an exposure assessment. A number of studies have compared different fitting methods and proposed lower limits on the proportion of samples where the organisms of interest are identified and enumerated, with the recommended lower limit of enumerated samples being 0.2. This recommendation may not be applicable to food safety risk assessments for a number of reasons, which include the development of new Bayesian fitting methods, the use of highly sensitive screening tests, and the generally larger sample sizes found in surveys of food commodities. This study evaluates the performance of a Markov chain Monte Carlo fitting method when used in conjunction with a screening test and enumeration of positive samples by the Most Probable Number technique. The results suggest that levels of contamination for common product-pathogen pairs, such as Salmonella on poultry carcasses, can be reliably estimated with the proposed fitting method and sample sizes in excess of 500 observations. The results do, however, demonstrate that simple guidelines for this application, such as the proportion of positive samples, cannot be provided. Published by Elsevier B.V.
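
    A maximum-likelihood analogue of the censored-data fitting problem can be sketched as follows; the paper's method is a Bayesian MCMC fit to screening and Most Probable Number data, whereas this toy example fits a lognormal to concentrations left-censored at a made-up limit of detection:

```python
# Sketch: fit a lognormal to left-censored concentration data by maximum
# likelihood (toy analogue; values and the detection limit are made up).
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(4)
true_mu, true_sigma, lod = 0.0, 1.0, 1.5           # lod = limit of detection
conc = rng.lognormal(true_mu, true_sigma, size=500)
observed = np.where(conc >= lod, conc, np.nan)      # NaN marks censored samples

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    detected = observed[~np.isnan(observed)]
    n_censored = int(np.isnan(observed).sum())
    # Density contribution for detected values, CDF contribution for censored ones.
    ll = stats.lognorm.logpdf(detected, s=sigma, scale=np.exp(mu)).sum()
    ll += n_censored * stats.lognorm.logcdf(lod, s=sigma, scale=np.exp(mu))
    return -ll

res = optimize.minimize(neg_log_lik, x0=[0.0, 0.0])
print("mu ~", round(res.x[0], 3), "sigma ~", round(np.exp(res.x[1]), 3))
```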

  5. VizieR Online Data Catalog: The CLASS blazar survey. I. (Marcha+, 2001)

    NASA Astrophysics Data System (ADS)

    Marcha, M. J.; Caccianiga, A.; Browne, I. W. A.; Jackson, N.

    2002-04-01

    This paper presents a new complete and well-defined sample of flat-spectrum radio sources (FSRS) selected from the Cosmic Lens All-Sky Survey (CLASS), with the further constraint of a bright (mag<=17.5) optical counterpart. The sample has been designed to produce a large number of low-luminosity blazars in order to test the current unifying models in the low-luminosity regime. In this first paper the new sample is presented and the radio properties of the 325 sources contained therein are discussed. (1 data file).

  6. Changes in numbers of large ovarian follicles, plasma luteinizing hormone and estradiol-17beta concentrations and egg production figures in farmed ostriches throughout the year.

    PubMed

    Bronneberg, R G G; Stegeman, J A; Vernooij, J C M; Dieleman, S J; Decuypere, E; Bruggeman, V; Taverne, M A M

    2007-06-01

    In this study we described and analysed changes in the numbers of large ovarian follicles (diameter 6.1-9.0 cm) and in the plasma concentrations of luteinizing hormone (LH) and estradiol-17beta (E(2)beta) in relation to individual egg production figures of farmed ostriches (Struthio camelus spp.) throughout one year. Ultrasound scanning and blood sampling for plasma hormone analysis were performed in 9 hens on a monthly basis during the breeding season and in two periods of the non-breeding season. Our data demonstrated that: (1) large follicles were detected and LH concentrations were elevated already 1 month before the first ovipositions of the egg production season took place; (2) E(2)beta concentrations increased as soon as the egg production season started; (3) numbers of large follicles, LH and E(2)beta concentrations were elevated during the entire egg production season; and that (4) numbers of large follicles, LH and E(2)beta concentrations decreased simultaneously with or following the last ovipositions of the egg production season. By comparing these parameters during the egg production season with their pre- and post-seasonal values, significant differences were found in the numbers of large follicles and E(2)beta concentrations between the pre-seasonal, seasonal and post-seasonal period; while LH concentrations were significantly different between the seasonal and post-seasonal period. In conclusion, our data demonstrate that changes in numbers of large follicles and in concentrations of LH and E(2)beta closely parallel individual egg production figures and provide some new cues that egg production in ostriches is confined to a marked reproductive season. Moreover, our data provide indications that the mechanisms initiating, maintaining and terminating the egg production season in farmed breeding ostriches are quite similar to those already known for other seasonal breeding bird species.

  7. Practicability of monitoring soil Cd, Hg, and Pb pollution based on a geochemical survey in China.

    PubMed

    Xia, Xueqi; Yang, Zhongfang; Li, Guocheng; Yu, Tao; Hou, Qingye; Mutelo, Admire Muchimamui

    2017-04-01

    Repeated visiting, i.e., sampling and analysis at two or more temporal points, is one of the important ways of monitoring soil heavy metal contamination. However, given concerns about cost, determining the number of samples and the temporal interval, and their capability to detect a certain change, is a key technical problem to be solved. This depends on the spatial variation of the parameters in the monitoring units. The "National Multi-Purpose Regional Geochemical Survey" (NMPRGS) project in China acquired the spatial distribution of heavy metals using a high-density sampling method in most of the arable regions in China. Based on soil Cd, Hg, and Pb data and taking administrative regions as the monitoring units, the number of samples and temporal intervals that may be used for monitoring soil heavy metal contamination were determined. It was found that the spatial variation of the elements differs widely among the NMPRGS regions. This makes it difficult to determine the minimum detectable changes (MDC), the number of samples, and the temporal intervals for revisiting. This paper recommends a suitable number of samples (n_r) for each region, balancing cost, practicability, and monitoring precision. Under n_r, MDC values are acceptable for all the regions, and the minimum temporal intervals are practical, ranging from 3.3 to 13.3 years. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Microbiological sampling of swine carcasses: a comparison of data obtained by swabbing with medical gauze and data collected routinely by excision at Swedish abattoirs.

    PubMed

    Lindblad, M

    2007-09-15

    Swab sample data from a 13-month microbiological baseline study of swine carcasses at Swedish abattoirs were combined with excision sample data collected routinely at five abattoirs. The aim was to compare the numbers of total aerobic counts, Enterobacteriaceae, and Escherichia coli, recovered by swabbing four carcass sites with gauze (total area 400 cm2) with those obtained by excision at equivalent sites (total area 20 cm2). The results are considered in relation to the process hygiene criteria that are stated in Commission Regulation (EC) No 2073/2005. These criteria apply only to destructive sampling of total aerobic counts and Enterobacteriaceae, but alternative sampling schemes, as well as alternative indicator organisms such as E. coli, are allowed if equivalent guarantees of food safety can be provided. Swab sampling resulted in higher mean log numbers of total aerobic counts at four of the five abattoirs, compared with excision, and lower or equal standard deviations at all abattoirs. The percentage of swab and excision samples positive for Enterobacteriaceae at the different abattoirs ranged from 68 to 100% and 15 to 24%, respectively. Similarly, the percentages of swab samples that were positive for E. coli were higher than the percentages of positive excision samples (range 52 to 84% and 3 to 14%, respectively). Due to the low percentage of positive excision results, the mean log numbers of Enterobacteriaceae and E. coli were only compared at two and one abattoirs, respectively, using log probability regression to substitute censored observations. Higher mean log numbers of Enterobacteriaceae were recovered by swabbing compared with excision at one abattoir, whereas the numbers of Enterobacteriaceae and E. coli did not differ significantly between sampling methods at one abattoir. This study suggests that the same process hygiene criteria as those stipulated for excision can be used for swabbing with gauze without compromising food safety. For monitoring of low numbers of Enterobacteriaceae and E. coli, like those found on swine carcasses at Swedish abattoirs, the results also show that swabbing of a relatively large area is superior to excision of a smaller area.

  9. IMa2p - Parallel MCMC and inference of ancient demography under the Isolation with Migration (IM) model

    PubMed Central

    Sethuraman, Arun; Hey, Jody

    2015-01-01

    IMa2 and related programs are used to study the divergence of closely related species and of populations within species. These methods are based on the sampling of genealogies using MCMC, and they can proceed quite slowly for larger data sets. We describe a parallel implementation, called IMa2p, that provides a nearly linear increase in genealogy sampling rate with the number of processors in use. IMa2p is written in OpenMPI and C++, and scales well for demographic analyses of a large number of loci and populations, which are difficult to study using the serial version of the program. PMID:26059786

  10. Reconnaissance geochemical survey of the Farah Garan-Kutam mineral belt, Kingdom of Saudi Arabia

    USGS Publications Warehouse

    Samater, R.M.; Johnson, P.R.; Bookstrom, A.A.

    1991-01-01

    In the present survey, geochemical anomalies locate all the sites of mineralization known from previous work. The survey is therefore technically a success. However, a large number of these anomalies probably result from contamination of the wadi systems by metal dispersed from ancient mine workings, and this particular survey, overall, may be of limited value as a guide to the discovery of hitherto unknown mineralization. Nevertheless, the survey outlines two areas that may mark extensions to known mineralization, and a number of other areas in which no mineralization is known. Based on a consideration of the character of the bedrock geology, the value of each reported analytical result in relation to the respective element thresholds, and the number of anomalous samples that cluster in any given area, four areas are recommended for high-priority follow-up sampling.

  11. Apparatus and process for microbial detection and enumeration

    NASA Technical Reports Server (NTRS)

    Wilkins, J. R.; Grana, D. (Inventor)

    1982-01-01

    An apparatus and process for detecting and enumerating specific microorganisms from large-volume samples containing small numbers of the microorganisms is presented. The large-volume samples are filtered through a membrane filter to concentrate the microorganisms. The filter, previously moistened with a growth medium for the microorganisms, is positioned between two absorbent pads. A pair of electrodes is disposed against the filter, and the pad-electrode-filter assembly is retained within a petri dish by a retainer ring. The cover is positioned on the base of the petri dish and sealed at the edges with a parafilm seal before being electrically connected via connectors to a strip chart recorder for detecting and enumerating the microorganisms collected on the filter.

  12. Open star clusters and Galactic structure

    NASA Astrophysics Data System (ADS)

    Joshi, Yogesh C.

    2018-04-01

    In order to understand the Galactic structure, we perform a statistical analysis of the distribution of various cluster parameters based on an almost complete sample of Galactic open clusters yet available. The geometrical and physical characteristics of a large number of open clusters given in the MWSC catalogue are used to study the spatial distribution of clusters in the Galaxy and determine the scale height, solar offset, local mass density and distribution of reddening material in the solar neighbourhood. We also explored the mass-radius and mass-age relations in the Galactic open star clusters. We find that the estimated parameters of the Galactic disk are largely influenced by the choice of cluster sample.

  13. Large-area synthesis of high-quality and uniform monolayer WS2 on reusable Au foils

    PubMed Central

    Gao, Yang; Liu, Zhibo; Sun, Dong-Ming; Huang, Le; Ma, Lai-Peng; Yin, Li-Chang; Ma, Teng; Zhang, Zhiyong; Ma, Xiu-Liang; Peng, Lian-Mao; Cheng, Hui-Ming; Ren, Wencai

    2015-01-01

    Large-area monolayer WS2 is a desirable material for applications in next-generation electronics and optoelectronics. However, the chemical vapour deposition (CVD) with rigid and inert substrates for large-area sample growth suffers from a non-uniform number of layers, small domain size and many defects, and is not compatible with the fabrication process of flexible devices. Here we report the self-limited catalytic surface growth of uniform monolayer WS2 single crystals of millimetre size and large-area films by ambient-pressure CVD on Au. The weak interaction between the WS2 and Au enables the intact transfer of the monolayers to arbitrary substrates using the electrochemical bubbling method without sacrificing Au. The WS2 shows high crystal quality and optical and electrical properties comparable or superior to mechanically exfoliated samples. We also demonstrate the roll-to-roll/bubbling production of large-area flexible films of uniform monolayer, double-layer WS2 and WS2/graphene heterostructures, and batch fabrication of large-area flexible monolayer WS2 film transistor arrays. PMID:26450174

  14. Large-Angular-Scale Clustering as a Clue to the Source of UHECRs

    NASA Astrophysics Data System (ADS)

    Berlind, Andreas A.; Farrar, Glennys R.

    We explore what can be learned about the sources of UHECRs from their large-angular-scale clustering (referred to as their "bias" by the cosmology community). Exploiting the clustering on large scales has the advantage over small-scale correlations of being insensitive to uncertainties in source direction from magnetic smearing or measurement error. In a Cold Dark Matter cosmology, the amplitude of large-scale clustering depends on the mass of the system, with more massive systems such as galaxy clusters clustering more strongly than less massive systems such as ordinary galaxies or AGN. Therefore, studying the large-scale clustering of UHECRs can help determine a mass scale for their sources, given the assumption that their redshift depth is as expected from the GZK cutoff. We investigate the constraining power of a given UHECR sample as a function of its cutoff energy and number of events. We show that current and future samples should be able to distinguish between the cases of their sources being galaxy clusters, ordinary galaxies, or sources that are uncorrelated with the large-scale structure of the universe.

  15. Partial Least Square Analyses of Landscape and Surface Water Biota Associations in the Savannah River Basin

    EPA Science Inventory

    Ecologists are often faced with the problems of small sample sizes, large numbers of correlated predictors, and high noise-to-signal ratios. This necessitates excluding important variables from the model when applying standard multiple or multivariate regression analyses. In ...

  16. Supercolor coding methods for large-scale multiplexing of biochemical assays.

    PubMed

    Rajagopal, Aditya; Scherer, Axel; Homyk, Andrew; Kartalov, Emil

    2013-08-20

    We present a novel method for the encoding and decoding of multiplexed biochemical assays. The method enables a theoretically unlimited number of independent targets to be detected and uniquely identified in any combination in the same sample. For example, the method offers easy access to 12-plex and larger PCR assays, as contrasted to the current 4-plex assays. This advancement would allow for large panels of tests to be run simultaneously in the same sample, saving reagents, time, consumables, and manual labor, while also avoiding the traditional loss of sensitivity due to sample aliquoting. Thus, the presented method is a major technological breakthrough with far-reaching impact on biotechnology, biomedical science, and clinical diagnostics. Herein, we present the mathematical theory behind the method as well as its experimental proof of principle using Taqman PCR on sequences specific to infectious diseases.
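
    The combinatorial idea can be sketched simply: with k distinguishable dyes, each target is assigned a unique non-empty subset of dyes, giving 2^k - 1 addressable codes. The dye and target names below are illustrative, and the paper's actual encoding and decoding scheme is more involved.

```python
# Sketch: assign each target a unique combination of dyes (3 dyes -> 7 codes).
from itertools import combinations

def assign_codes(targets, dyes):
    codes = [c for r in range(1, len(dyes) + 1) for c in combinations(dyes, r)]
    if len(targets) > len(codes):
        raise ValueError("need more dyes for this many targets")
    return dict(zip(targets, codes))

dyes = ["FAM", "HEX", "Cy5"]                         # example fluorophore names
targets = [f"target_{i}" for i in range(1, 8)]       # hypothetical targets
for target, code in assign_codes(targets, dyes).items():
    print(target, "->", "+".join(code))
```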

  17. [Distributions of the numbers of monitoring stations in the surveillance of infectious diseases in Japan].

    PubMed

    Murakami, Y; Hashimoto, S; Taniguchi, K; Nagai, M

    1999-12-01

    To describe the characteristics of monitoring stations for the infectious disease surveillance system in Japan, we compared the distributions of the number of monitoring stations in terms of population, region, size of medical institution, and medical specialty. The distributions of annual number of reported cases in terms of the type of diseases, the size of medical institution, and medical specialty were also compared. We conducted a nationwide survey of the pediatrics stations (16 diseases), ophthalmology stations (3 diseases) and the stations of sexually transmitted diseases (STD) (5 diseases) in Japan. In the survey, we collected the data of monitoring stations and the annual reported cases of diseases. We also collected the data on the population, served by the health center where the monitoring stations existed, from the census. First, we compared the difference between the present number of monitoring stations and the current standard established by the Ministry of Health and Welfare (MHW). Second, we compared the distribution of all medical institutions in Japan and the monitoring stations in terms of the size of the medical institution. Third, we compared the average number of annual reported cases of diseases in terms of the size of medical institution and the medical specialty. In most health centers, the number of monitoring stations achieved the current standard of MHW, while a few health centers had no monitoring station, although they had a large population. Most prefectures also achieved the current standard of MHW, but some prefectures were well below the standard. Among pediatric stations, the sampling proportion of large hospitals was higher than other categories. Among the ophthalmology stations, the sampling proportion of hospitals was higher than other categories. Among the STD stations, the sampling proportion of clinics of obstetrics and gynecology was lower than other categories. Except for some diseases, it made little difference in the average number of annual reported cases of diseases in terms of the type of medical institution. Among STD, there was a great difference in the average number of annual reported cases of diseases in terms of medical specialty.

  18. A New Stratified Sampling Procedure which Decreases Error Estimation of Varroa Mite Number on Sticky Boards.

    PubMed

    Kretzschmar, A; Durand, E; Maisonnasse, A; Vallon, J; Le Conte, Y

    2015-06-01

    A new stratified sampling procedure is proposed in order to establish an accurate estimate of Varroa destructor populations on the sticky bottom boards of the hive. It is based on spatial sampling theory, which recommends regular grid stratification for spatially structured processes. Because the distribution of varroa mites on sticky boards was observed to be spatially structured, we designed a sampling scheme based on a regular grid with circles centered on each grid element. This new procedure is then compared with a former method using partially random sampling. Relative error improvements are reported on the basis of a large sample of simulated sticky boards (n=20,000), which provides a complete range of spatial structures, from a random structure to a highly frame-driven structure. The improvement in varroa mite number estimation is then measured by the percentage of counts with an error greater than a given level. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
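
    As a rough illustration of why grid stratification helps for spatially structured counts, the following sketch simulates a sticky board with a frame-driven density gradient and compares a regular-grid (stratified) estimate of the total mite count with a simple random sample of the same size. The board dimensions, gradient, and grid spacing are assumptions made for illustration, not the published simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sticky board: 40 x 20 grid cells with a column-wise (frame-driven)
# gradient in mite density, mimicking a spatially structured distribution.
lam = np.outer(np.ones(40), np.linspace(1, 10, 20))
board = rng.poisson(lam)
true_total = board.sum()

def estimate_total(sampled_cells, n_cells_total):
    # Expand the mean count per sampled cell to the whole board.
    return sampled_cells.mean() * n_cells_total

# Stratified: the centre cell of every 4 x 4 block of the regular grid.
strat_cells = board[2::4, 2::4].ravel()

# Simple random sample of the same size, for comparison.
srs_cells = rng.choice(board.ravel(), size=strat_cells.size, replace=False)

print("true total:         ", true_total)
print("stratified estimate:", round(estimate_total(strat_cells, board.size)))
print("random estimate:    ", round(estimate_total(srs_cells, board.size)))
```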

  19. Automated design of paralogue ratio test assays for the accurate and rapid typing of copy number variation

    PubMed Central

    Veal, Colin D.; Xu, Hang; Reekie, Katherine; Free, Robert; Hardwick, Robert J.; McVey, David; Brookes, Anthony J.; Hollox, Edward J.; Talbot, Christopher J.

    2013-01-01

    Motivation: Genomic copy number variation (CNV) can influence susceptibility to common diseases. High-throughput measurement of gene copy number on large numbers of samples is a challenging, yet critical, stage in confirming observations from sequencing or array Comparative Genome Hybridization (CGH). The paralogue ratio test (PRT) is a simple, cost-effective method of accurately determining copy number by quantifying the amplification ratio between a target and reference amplicon. PRT has been successfully applied to several studies analyzing common CNV. However, its use has not been widespread because of difficulties in assay design. Results: We present PRTPrimer (www.prtprimer.org) software for automated PRT assay design. In addition to stand-alone software, the web site includes a database of pre-designed assays for the human genome at an average spacing of 6 kb and a web interface for custom assay design. Other reference genomes can also be analyzed through local installation of the software. The usefulness of PRTPrimer was tested within known CNV, and showed reproducible quantification. This software and database provide assays that can rapidly genotype CNV, cost-effectively, on a large number of samples and will enable the widespread adoption of PRT. Availability: PRTPrimer is available in two forms: a Perl script (version 5.14 and higher) that can be run from the command line on Linux systems and as a service on the PRTPrimer web site (www.prtprimer.org). Contact: cjt14@le.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23742985
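
    The core of a paralogue ratio test is that, with equal amplification efficiency for the two amplicons, the ratio of target to reference signal is proportional to the copy-number ratio. The sketch below is a minimal illustration of that arithmetic, assuming a reference locus present at two copies per diploid genome; the peak-area values and function names are hypothetical and are not part of PRTPrimer.

```python
def prt_copy_number(target_peak_area, reference_peak_area, reference_copies=2):
    """Estimate an integer copy number from a paralogue-ratio-test readout.

    Assumes the reference amplicon has a known, fixed copy number (2 in a
    normal diploid genome) and identical amplification efficiency for target
    and reference, so the peak-area ratio reflects the copy-number ratio.
    Illustrative only.
    """
    ratio = target_peak_area / reference_peak_area
    estimate = ratio * reference_copies
    return round(estimate), estimate

copies, raw = prt_copy_number(target_peak_area=1480.0, reference_peak_area=1010.0)
print(copies, round(raw, 2))   # e.g. 3 copies from a ratio of ~1.47
```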

  20. A priori evaluation of two-stage cluster sampling for accuracy assessment of large-area land-cover maps

    USGS Publications Warehouse

    Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.

    2004-01-01

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priori evaluation was used as a decision-making tool when implementing the NLCD design.

  1. Evaluating information content of SNPs for sample-tagging in re-sequencing projects.

    PubMed

    Hu, Hao; Liu, Xiang; Jin, Wenfei; Hilger Ropers, H; Wienker, Thomas F

    2015-05-15

    Sample-tagging is designed for identification of accidental sample mix-up, which is a major issue in re-sequencing studies. In this work, we develop a model to measure the information content of SNPs, so that we can optimize a panel of SNPs that approach the maximal information for discrimination. The analysis shows that as few as 60 optimized SNPs can differentiate the individuals in a population as large as that of the present world, and only 30 optimized SNPs are in practice sufficient for labeling up to 100 thousand individuals. In the simulated populations of 100 thousand individuals, the average Hamming distances generated by the optimized set of 30 SNPs are larger than 18, and the duality frequency is lower than 1 in 10 thousand. This strategy of sample discrimination proves robust for large sample sizes and across different datasets. The optimized sets of SNPs are designed for Whole Exome Sequencing, and a program is provided for SNP selection, allowing for customized SNP numbers and genes of interest. The sample-tagging plan based on this framework will improve re-sequencing projects in terms of reliability and cost-effectiveness.
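
    A minimal sketch of the quantities discussed above: simulate genotype "tags" for an optimized 30-SNP panel, then estimate the pairwise Hamming distance distribution and the frequency of identical tags ("dualities"). The assumption of balanced allele frequencies (MAF = 0.5) and Hardy-Weinberg sampling is ours for illustration, not the paper's optimization procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

n_individuals, n_snps = 100_000, 30   # panel size from the abstract; MAF is an assumption
maf = 0.5

# Genotypes coded as 0/1/2 copies of the minor allele (Hardy-Weinberg sampling).
genotypes = rng.binomial(2, maf, size=(n_individuals, n_snps)).astype(np.int8)

# Hamming distance between two tags = number of SNPs at which the genotypes differ.
# Estimate the distance distribution from a random subset of pairs and count
# "dualities" (pairs with identical tags).
n_pairs = 200_000
i = rng.integers(0, n_individuals, n_pairs)
j = rng.integers(0, n_individuals, n_pairs)
keep = i != j
d = (genotypes[i[keep]] != genotypes[j[keep]]).sum(axis=1)

print("mean Hamming distance:", d.mean())      # ~18.75 for 30 balanced SNPs
print("duality frequency:   ", (d == 0).mean())
```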

  2. Low energy atmospheric muon neutrinos in MACRO

    NASA Astrophysics Data System (ADS)

    Ambrosio, M.; Antolini, R.; Auriemma, G.; Bakari, D.; Baldini, A.; Barbarino, G. C.; Barish, B. C.; Battistoni, G.; Bellotti, R.; Bemporad, C.; Bernardini, P.; Bilokon, H.; Bisi, V.; Bloise, C.; Bower, C.; Brigida, M.; Bussino, S.; Cafagna, F.; Calicchio, M.; Campana, D.; Carboni, M.; Cecchini, S.; Cei, F.; Chiarella, V.; Choudhary, B. C.; Coutu, S.; De Cataldo, G.; Dekhissi, H.; De Marzo, C.; De Mitri, I.; Derkaoui, J.; De Vincenzi, M.; Di Credico, A.; Erriquez, O.; Favuzzi, C.; Forti, C.; Fusco, P.; Giacomelli, G.; Giannini, G.; Giglietto, N.; Giorgini, M.; Grassi, M.; Gray, L.; Grillo, A.; Guarino, F.; Gustavino, C.; Habig, A.; Hanson, K.; Heinz, R.; Iarocci, E.; Katsavounidis, E.; Katsavounidis, I.; Kearns, E.; Kim, H.; Kyriazopoulou, S.; Lamanna, E.; Lane, C.; Levin, D. S.; Lipari, P.; Longley, N. P.; Longo, M. J.; Loparco, F.; Maaroufi, F.; Mancarella, G.; Mandrioli, G.; Margiotta, A.; Marini, A.; Martello, D.; Marzari-Chiesa, A.; Mazziotta, M. N.; Michael, D. G.; Mikheyev, S.; Miller, L.; Monacelli, P.; Montaruli, T.; Monteno, M.; Mufson, S.; Musser, J.; Nicolò, D.; Nolty, R.; Orth, C.; Osteria, G.; Ouchrif, M.; Palamara, O.; Patera, V.; Patrizii, L.; Pazzi, R.; Peck, C. W.; Perrone, L.; Petrera, S.; Pistilli, P.; Popa, V.; Rainò, A.; Reynoldson, J.; Ronga, F.; Satriano, C.; Satta, L.; Scapparone, E.; Scholberg, K.; Sciubba, A.; Serra, P.; Sioli, M.; Sirri, G.; Sitta, M.; Spinelli, P.; Spinetti, M.; Spurio, M.; Steinberg, R.; Stone, J. L.; Sulak, L. R.; Surdo, A.; Tarlè, G.; Togo, V.; Vakili, M.; Vilela, E.; Walter, C. W.; Webb, R.

    2000-04-01

    We present the measurement of two event samples induced by atmospheric νμ of average energy Ēν ~ 4 GeV. In the first sample, a neutrino interacts inside the MACRO detector producing an upward-going muon that leaves the apparatus. The ratio of the number of observed to expected events is 0.57 ± 0.05(stat) ± 0.06(syst) ± 0.14(theor), with an angular distribution similar to that expected from the Bartol atmospheric neutrino flux. The second is a mixed sample of internally produced downward-going muons and externally produced upward-going muons stopping inside the detector. These two subsamples are selected by topological criteria; the lack of timing information makes it impossible to distinguish stopping from downgoing muons. The ratio of the number of observed to expected events is 0.71 ± 0.05(stat) ± 0.07(syst) ± 0.18(theor). The observed deficit in each subsample is in agreement with neutrino oscillations, although the significance is reduced by the large theoretical errors. However, taking the ratio of the two samples causes a large cancellation of the theoretical and of some systematic errors. With the ratio, we rule out the no-oscillation hypothesis at 95% c.l. Furthermore, the ratio tests the pathlength dependence of possible oscillations. The data of both samples and their ratio favor maximal mixing and Δm² ~ 10⁻³-10⁻² eV². These parameters are in agreement with our results from upward throughgoing muons, induced by νμ of much higher energies.

  3. Comparison of Submental Blood Collection with the Retroorbital and Submandibular Methods in Mice (Mus musculus)

    PubMed Central

    Regan, Rainy D; Fenyk-Melody, Judy E; Tran, Sam M; Chen, Guang; Stocking, Kim L

    2016-01-01

    Nonterminal blood sample collection of sufficient volume and quality for research is complicated in mice due to their small size and anatomy. Large (>100 μL) nonterminal volumes of unhemolyzed or unclotted blood currently are typically collected from the retroorbital sinus or submandibular plexus. We developed a third method—submental blood collection—which is similar in execution to the submandibular method but with minor changes in animal restraint and collection location. Compared with other techniques, submental collection is easier to perform due to the direct visibility of the target vessels, which are located in a sparsely furred region. Compared with the submandibular method, the submental method did not differ regarding weight change and clotting score but significantly decreased hemolysis and increased the overall number of high-quality samples. The submental method was performed with smaller lancets for the majority of the bleeds, yet resulted in fewer repeat collection attempts, fewer insufficient samples, and less extraneous blood loss and was qualitatively less traumatic. Compared with the retroorbital technique, the submental method was similar regarding weight change but decreased hemolysis, clotting, and the number of overall high-quality samples; however the retroorbital method resulted in significantly fewer incidents of insufficient sample collection. Extraneous blood loss was roughly equivalent between the submental and retroorbital methods. We conclude that the submental method is an acceptable venipuncture technique for obtaining large, nonterminal volumes of blood from mice. PMID:27657712

  4. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem in the 'large p small n' setting is over-fitting: just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and have proved useful in empirical studies. Recently, Wu proposed the penalized t/F-statistics with shrinkage by formally using L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1-penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
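
    To make the shrinkage idea concrete, the sketch below fits an L1-penalized logistic regression to a toy "large p, small n" expression matrix; the penalty drives most gene coefficients to exactly zero, so sample classification and gene selection happen in one model. This uses scikit-learn as a stand-in and is not the authors' penalized t/F-statistic implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy "large p, small n" expression matrix: 40 samples x 2000 genes,
# with only the first 20 genes truly differential between the two classes.
n, p = 40, 2000
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :20] += 1.5

# L1-penalized logistic regression: the penalty sets most coefficients to
# exactly zero, so the genes with non-zero coefficients are the "selected" ones.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(clf.coef_[0])
print("genes retained:", selected.size, "of", p)
```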

  5. Scope of Various Random Number Generators in Ant System Approach for TSP

    NASA Technical Reports Server (NTRS)

    Sen, S. K.; Shaykhian, Gholam Ali

    2007-01-01

    Several quasi- and pseudo-random number generators are tested within a heuristic based on an ant system approach to the traveling salesman problem. The experiment explores whether any particular generator is most desirable. Such an experiment on large samples has the potential to rank the performance of the generators for the foregoing heuristic, and thus to address the controversial question of how the generators rank in a probabilistic/statistical sense.

  6. Soil sampling and analytical strategies for mapping fallout in nuclear emergencies based on the Fukushima Dai-ichi Nuclear Power Plant accident.

    PubMed

    Onda, Yuichi; Kato, Hiroaki; Hoshi, Masaharu; Takahashi, Yoshio; Nguyen, Minh-Long

    2015-01-01

    The Fukushima Dai-ichi Nuclear Power Plant (FDNPP) accident resulted in extensive radioactive contamination of the environment via deposited radionuclides such as radiocesium and (131)I. Evaluating the extent and level of environmental contamination is critical to protecting citizens in affected areas and to planning decontamination efforts. However, a standardized soil sampling protocol is needed in such emergencies to facilitate the collection of large, tractable samples for measuring gamma-emitting radionuclides. In this study, we developed an emergency soil sampling protocol based on preliminary sampling from the FDNPP accident-affected area. We also present the results of a preliminary experiment aimed to evaluate the influence of various procedures (e.g., mixing, number of samples) on measured radioactivity. Results show that sample mixing strongly affects measured radioactivity in soil samples. Furthermore, for homogenization, shaking the plastic sample container at least 150 times or disaggregating soil by hand-rolling in a disposable plastic bag is required. Finally, we determined that five soil samples within a 3 m × 3-m area are the minimum number required for reducing measurement uncertainty in the emergency soil sampling protocol proposed here. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Genetic Structures of Copy Number Variants Revealed by Genotyping Single Sperm

    PubMed Central

    Luo, Minjie; Cui, Xiangfeng; Fredman, David; Brookes, Anthony J.; Azaro, Marco A.; Greenawalt, Danielle M.; Hu, Guohong; Wang, Hui-Yun; Tereshchenko, Irina V.; Lin, Yong; Shentu, Yue; Gao, Richeng; Shen, Li; Li, Honghua

    2009-01-01

    Background Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis. PMID:19384415

  18. Assessment of fish assemblages and minimum sampling effort required to determine biotic integrity of large rivers in southern Idaho, 2002

    USGS Publications Warehouse

    Maret, Terry R.; Ott, D.S.

    2004-01-01

    width was determined to be sufficient for collecting an adequate number of fish to estimate species richness and evaluate biotic integrity. At most sites, about 250 fish were needed to effectively represent 95 percent of the species present. Fifty-three percent of the sites assessed, using an IBI developed specifically for large Idaho rivers, received scores of less than 50, indicating poor biotic integrity.

  9. National Databases for Neurosurgical Outcomes Research: Options, Strengths, and Limitations.

    PubMed

    Karhade, Aditya V; Larsen, Alexandra M G; Cote, David J; Dubois, Heloise M; Smith, Timothy R

    2017-08-05

    Quality improvement, value-based care delivery, and personalized patient care depend on robust clinical, financial, and demographic data streams of neurosurgical outcomes. The neurosurgical literature lacks a comprehensive review of large national databases. To assess the strengths and limitations of various resources for outcomes research in neurosurgery. A review of the literature was conducted to identify surgical outcomes studies using national data sets. The databases were assessed for the availability of patient demographics and clinical variables, longitudinal follow-up of patients, strengths, and limitations. The number of unique patients contained within each data set ranged from thousands (Quality Outcomes Database [QOD]) to hundreds of millions (MarketScan). Databases with both clinical and financial data included PearlDiver, Premier Healthcare Database, Vizient Clinical Data Base and Resource Manager, and the National Inpatient Sample. Outcomes collected by databases included patient-reported outcomes (QOD); 30-day morbidity, readmissions, and reoperations (National Surgical Quality Improvement Program); and disease incidence and disease-specific survival (Surveillance, Epidemiology, and End Results-Medicare). The strengths of large databases included large numbers of rare pathologies and multi-institutional nationally representative sampling; the limitations of these databases included variable data veracity, variable data completeness, and missing disease-specific variables. The improvement of existing large national databases and the establishment of new registries will be crucial to the future of neurosurgical outcomes research. Copyright © 2017 by the Congress of Neurological Surgeons

  10. Using Technology to Better Characterize the Apollo Sample Suite: A Retroactive PET Analysis and Potential Model for Future Sample Return Missions

    NASA Technical Reports Server (NTRS)

    Zeigler, R. A.

    2015-01-01

    From 1969-1972 the Apollo missions collected 382 kg of lunar samples from six distinct locations on the Moon. Studies of the Apollo sample suite have shaped our understanding of the formation and early evolution of the Earth-Moon system, and have had important implications for studies of the other terrestrial planets (e.g., through the calibration of the crater counting record) and even the outer planets (e.g., the Nice model of the dynamical evolution of the Solar System). Despite nearly 50 years of detailed research on Apollo samples, scientists are still developing new theories about the origin and evolution of the Moon. Three areas of active research are: (1) the abundance of water (and other volatiles) in the lunar mantle, (2) the timing of the formation of the Moon and the duration of lunar magma ocean crystallization, (3) the formation of evolved lunar lithologies (e.g., granites) and implications for tertiary crustal processes on the Moon. In order to fully understand these (and many other) theories about the Moon, scientists need access to "new" lunar samples, particularly new plutonic samples. Over 100 lunar meteorites have been identified over the past 30 years, and the study of these samples has greatly aided in our understanding of the Moon. However, terrestrial alteration and the lack of geologic context limit what can be learned from the lunar meteorites. Although no "new" large plutonic samples (i.e., hand-samples) remain to be discovered in the Apollo sample collection, there are many large polymict breccias in the Apollo collection containing relatively large (approximately 1 cm or larger) previously identified plutonic clasts, as well as a large number of unclassified lithic clasts. In addition, new, previously unidentified plutonic clasts are potentially discoverable within these breccias. The question becomes how to non-destructively locate and identify new lithic clasts of interest while minimizing the contamination and physical degradation of the samples.

  11. Quantifying in situ Zooplankton Movement and Trophic Impacts on Thin Layers in East Sound, Washington

    DTIC Science & Technology

    2006-09-30

    strength of the combination is that the tracking system quantifies swimming behaviors of protists in natural seawater samples with large numbers of motile...Sound was to link observations of thin layers to behavioral analysis of protists resident above, within, and below these features. Analysis of our...cells and diatom chains. We are not yet able to make statistical statements about swimming characteristics of the motile protists in our video samples

  12. The cosmological principle is not in the sky

    NASA Astrophysics Data System (ADS)

    Park, Chan-Gyung; Hyun, Hwasu; Noh, Hyerim; Hwang, Jai-chan

    2017-08-01

    The homogeneity of matter distribution at large scales, known as the cosmological principle, is a central assumption in the standard cosmological model. The assumption is testable, however, and thus no longer needs to be taken as a principle. Here we perform a test for spatial homogeneity using the Sloan Digital Sky Survey Luminous Red Galaxies (LRG) sample by counting galaxies within a specified volume with the radius scale varying up to 300 h-1 Mpc. We directly confront the large-scale structure data with the definition of spatial homogeneity by comparing the averages and dispersions of galaxy number counts with the ranges allowed for a homogeneous random distribution. The LRG sample shows significantly larger dispersions of number counts than the random catalogues up to the 300 h-1 Mpc scale, and even the average is located far outside the range allowed in the random distribution; the deviations are statistically impossible to realize in the random distribution. This implies that the cosmological principle does not hold even at such large scales. The same analysis of mock galaxies derived from the N-body simulation, however, suggests that the LRG sample is consistent with the current paradigm of cosmology, and thus the simulation is also not homogeneous at that scale. We conclude that the cosmological principle is neither in the observed sky nor demanded to be there by the standard cosmological world model. This reveals the nature of the cosmological principle adopted in the modern cosmology paradigm, and opens a new field of research in theoretical cosmology.

  13. Application of a Multivariant, Caucasian-Specific, Genotyped Donor Panel for Performance Validation of MDmulticard®, ID-System®, and Scangel® RhD/ABO Serotyping

    PubMed Central

    Gassner, Christoph; Rainer, Esther; Pircher, Elfriede; Markut, Lydia; Körmöczi, Günther F.; Jungbauer, Christof; Wessin, Dietmar; Klinghofer, Roswitha; Schennach, Harald; Schwind, Peter; Schönitzer, Diether

    2009-01-01

    Summary Background Validations of routinely used serological typing methods require intense performance evaluations, typically including large numbers of samples, before routine application. However, such evaluations could be improved by considering information about the frequency of standard blood groups and their variants. Methods Using RHD and ABO population genetic data, a Caucasian-specific donor panel was compiled for a performance comparison of the three RhD and ABO serological typing methods MDmulticard (Medion Diagnostics), ID-System (DiaMed) and ScanGel (Bio-Rad). The final test panel included standard and variant RHD and ABO genotypes, e.g. RhD categories, partial and weak RhDs, RhD DELs, and ABO samples, mainly to interpret weak serological reactivity for blood group A specificity. All samples were from individuals recorded in our local DNA blood group typing database. Results For ‘standard’ blood groups, results of performance were clearly interpretable for all three serological methods compared. However, when focusing on specific variant phenotypes, pronounced differences in reaction strengths and specificities were observed between them. Conclusions A genetically and ethnically predefined donor test panel consisting of only 93 individual samples delivered highly significant results for serological performance comparisons. Such small panels offer impressive representative power, greater than would be expected on the basis of statistical chance and large sample numbers alone. PMID:21113264

  14. Conformational sampling with stochastic proximity embedding and self-organizing superimposition: establishing reasonable parameters for their practical use.

    PubMed

    Tresadern, Gary; Agrafiotis, Dimitris K

    2009-12-01

    Stochastic proximity embedding (SPE) and self-organizing superimposition (SOS) are two recently introduced methods for conformational sampling that have shown great promise in several application domains. Our previous validation studies aimed at exploring the limits of these methods and have involved rather exhaustive conformational searches producing a large number of conformations. However, from a practical point of view, such searches have become the exception rather than the norm. The increasing popularity of virtual screening has created a need for 3D conformational search methods that produce meaningful answers in a relatively short period of time and work effectively on a large scale. In this work, we examine the performance of these algorithms and the effects of different parameter settings at varying levels of sampling. Our goal is to identify search protocols that can produce a diverse set of chemically sensible conformations and have a reasonable probability of sampling biologically active space within a small number of trials. Our results suggest that both SPE and SOS are extremely competitive in this regard and produce very satisfactory results with as few as 500 conformations per molecule. The results improve even further when the raw conformations are minimized with a molecular mechanics force field to remove minor imperfections and any residual strain. These findings provide additional evidence that these methods are suitable for many everyday modeling tasks, both high- and low-throughput.

  15. Rapid quantification of proanthocyanidins (condensed tannins) with a continuous flow analyzer

    Treesearch

    James K. Nitao; Bruce A. Birr; Muraleedharan G. Nair; Daniel A. Herms; William J. Mattson

    2001-01-01

    Proanthocyanidins (condensed tannins) frequently need to be quantified in large numbers of samples in food, plant, and environmental studies. An automated colorimetric method to quantify proanthocyanidins with sulfuric acid (H2SO4) was therefore developed for use in a continuous flow analyzer. Assay conditions were...

  16. Prison Volunteers: Profiles, Motivations, Satisfaction

    ERIC Educational Resources Information Center

    Tewksbury, Richard; Dabney, Dean

    2004-01-01

    Large numbers of correctional institutions rely on volunteers to assist staff in various programs and tasks. At present there exists a paucity of literature describing these programs and/or subjecting them to systematic evaluation. The present study uses self-report data from a sample of active volunteers at a medium-security Southern prison to…

  17. Uranium isotopes quantitatively determined by modified method of atomic absorption spectrophotometry

    NASA Technical Reports Server (NTRS)

    Lee, G. H.

    1967-01-01

    Hollow-cathode discharge tubes determine the quantities of uranium isotopes in a sample by using atomic absorption spectrophotometry. Dissociation of the uranium atoms allows a large number of ground state atoms to be produced, absorbing the incident radiation that is different for the two major isotopes.

  18. Evaluation of PLS, LS-SVM, and LWR for quantitative spectroscopic analysis of soils

    USDA-ARS?s Scientific Manuscript database

    Soil testing requires the analysis of large numbers of samples in laboratory that are often time consuming and expensive. Mid-infrared spectroscopy (mid-IR) and near-infrared spectroscopy (NIRS) are fast, non-destructive, and inexpensive analytical methods that have been used for soil analysis, in l...

  19. Impulse radar studfinder

    DOEpatents

    McEwan, Thomas E.

    1995-01-01

    An impulse radar studfinder propagates electromagnetic pulses and detects reflected pulses from a fixed range. Unmodulated pulses, about 200 ps wide, are emitted. A large number of reflected pulses are sampled and averaged. Background reflections are subtracted. Reflections from wall studs or other hidden objects are detected and displayed using light emitting diodes.
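
    The processing chain described in the record (average many reflected pulses, then subtract a stored background reflection) can be sketched numerically as follows. The pulse shapes, noise level, and detection threshold are invented for illustration and are not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)                      # one equivalent-time sweep

def pulse(center, width=0.02, amp=1.0):
    return amp * np.exp(-0.5 * ((t - center) / width) ** 2)

background = pulse(0.30, amp=0.8)                   # wall-surface reflection (always present)
stud_echo = pulse(0.55, amp=0.3)                    # extra echo when a stud is behind the wall

def measure(with_stud, n_pulses=1000):
    """Average many noisy reflected pulses, as the studfinder does."""
    clean = background + (stud_echo if with_stud else 0.0)
    shots = clean + rng.normal(scale=0.5, size=(n_pulses, t.size))
    return shots.mean(axis=0)

reference = measure(with_stud=False)                # calibration sweep with no stud present
trace = measure(with_stud=True)
residual = trace - reference                        # background subtraction

print("peak residual:", residual.max().round(3))    # ~0.3, well above the averaged noise
print("stud detected:", residual.max() > 0.15)
```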

  20. Pesticides in Urban Multiunit Dwellings: Hazard IdentificationUsing Classification and Regression Tree (CART) Analysis

    EPA Science Inventory

    Many units in public housing or other low-income urban dwellings may have elevated pesticide residues, given recurring infestation, but it would be logistically and economically infeasible to sample a large number of units to identify highly exposed households to design interven...

  1. Impulse radar studfinder

    DOEpatents

    McEwan, T.E.

    1995-10-10

    An impulse radar studfinder propagates electromagnetic pulses and detects reflected pulses from a fixed range. Unmodulated pulses, about 200 ps wide, are emitted. A large number of reflected pulses are sampled and averaged. Background reflections are subtracted. Reflections from wall studs or other hidden objects are detected and displayed using light emitting diodes. 9 figs.

  2. How Effective Is the Multidisciplinary Approach? A Follow-Up Study.

    ERIC Educational Resources Information Center

    Hochstadt, Neil J.; Harwicke, Neil J.

    1985-01-01

    The effectiveness of the multidisciplinary approach was assessed by examining the number of recommended services obtained by 180 children one year after multidisciplinary evaluation. Results indicated that a large percentage of services recommended were obtained, compared with the low probability reported in samples of abused and neglected…

  3. Complete Genome Sequence of a Porcine Polyomavirus from Nasal Swabs of Pigs with Respiratory Disease

    PubMed Central

    Smith, Catherine; Bishop, Brian; Stewart, Chelsea; Simonson, Randy

    2018-01-01

    ABSTRACT Metagenomic sequencing of pooled nasal swabs from pigs with unexplained respiratory disease identified a large number of reads mapping to a previously uncharacterized porcine polyomavirus. Sus scrofa polyomavirus 2 was most closely related to betapolyomaviruses frequently detected in mammalian respiratory samples. PMID:29700160

  4. MetaSRA: normalized human sample-specific metadata for the Sequence Read Archive.

    PubMed

    Bernstein, Matthew N; Doan, AnHai; Dewey, Colin N

    2017-09-15

    The NCBI's Sequence Read Archive (SRA) promises great biological insight if one could analyze the data in the aggregate; however, the data remain largely underutilized, in part, due to the poor structure of the metadata associated with each sample. The rules governing submissions to the SRA do not dictate a standardized set of terms that should be used to describe the biological samples from which the sequencing data are derived. As a result, the metadata include many synonyms, spelling variants and references to outside sources of information. Furthermore, manual annotation of the data remains intractable due to the large number of samples in the archive. For these reasons, it has been difficult to perform large-scale analyses that study the relationships between biomolecular processes and phenotype across diverse diseases, tissues and cell types present in the SRA. We present MetaSRA, a database of normalized SRA human sample-specific metadata following a schema inspired by the metadata organization of the ENCODE project. This schema involves mapping samples to terms in biomedical ontologies, labeling each sample with a sample-type category, and extracting real-valued properties. We automated these tasks via a novel computational pipeline. The MetaSRA is available at metasra.biostat.wisc.edu via both a searchable web interface and bulk downloads. Software implementing our computational pipeline is available at http://github.com/deweylab/metasra-pipeline. cdewey@biostat.wisc.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  5. Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs. A Powerful and Economical Tool.

    PubMed

    Ohneberg, K; Wolkewitz, M; Beyersmann, J; Palomar-Martinez, M; Olaechea-Astigarraga, P; Alvarez-Lerma, F; Schumacher, M

    2015-01-01

    Sampling from a large cohort in order to derive a subsample that would be sufficient for statistical analysis is a frequently used method for handling large data sets in epidemiological studies with limited resources for exposure measurement. For clinical studies, however, when interest is in the influence of a potential risk factor, cohort studies are often the first choice, with all individuals entering the analysis. Our aim is to close the gap between epidemiological and clinical studies with respect to design and power considerations. Schoenfeld's formula for the number of events required for a Cox proportional hazards model is fundamental. Our objective is to compare the power of analyzing the full cohort and the power of a nested case-control and a case-cohort design. We compare formulas for power for sampling designs and cohort studies. In our data example we simultaneously apply a nested case-control design with a varying number of controls matched to each case, a case-cohort design with varying subcohort size, a random subsample and a full cohort analysis. For each design we calculate the standard error for estimated regression coefficients and the mean number of distinct persons for whom covariate information is required. The formulas for the power of a nested case-control design and of a case-cohort design are directly connected to the power of a cohort study through the well-known Schoenfeld formula. The loss in precision of parameter estimates is relatively small compared to the saving in resources. Nested case-control and case-cohort studies, but not random subsamples, yield an attractive alternative for analyzing clinical studies in the situation of a low event rate. Power calculations can be conducted straightforwardly to quantify the loss of power compared to the savings in the number of patients when using a sampling design instead of analyzing the full cohort.
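
    The power considerations above build on Schoenfeld's formula for the number of events required in a Cox model. A minimal sketch of the standard form of that formula, for a binary exposure with allocation proportion p, is given below; the paper's extensions to nested case-control and case-cohort designs are not reproduced here.

```python
from math import ceil, log
from scipy.stats import norm

def events_required(hazard_ratio, p_exposed=0.5, alpha=0.05, power=0.8):
    """Schoenfeld's approximation for the number of events needed to detect
    a given hazard ratio for a binary covariate in a Cox model:

        D = (z_{1-alpha/2} + z_{power})^2 / (p (1 - p) (ln HR)^2)
    """
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return ceil((z_a + z_b) ** 2 /
                (p_exposed * (1 - p_exposed) * log(hazard_ratio) ** 2))

print(events_required(hazard_ratio=1.5))   # ~191 events for HR = 1.5 at 80% power
```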

  6. Intensity of Territorial Marking Predicts Wolf Reproduction: Implications for Wolf Monitoring

    PubMed Central

    García, Emilio J.

    2014-01-01

    Background The implementation of intensive and complex approaches to monitor large carnivores is resource demanding and restricted to endangered species, small populations, or small distribution ranges. Wolf monitoring over large spatial scales is difficult, but the management of such a contentious species requires regular estimations of abundance to guide decision-makers. The integration of wolf marking behaviour with simple sign counts may offer a cost-effective alternative to monitor the status of wolf populations over large spatial scales. Methodology/Principal Findings We used a multi-sampling approach, based on the collection of visual and scent wolf marks (faeces and ground scratching) and the assessment of wolf reproduction using howling and observation points, to test whether the intensity of marking behaviour around the pup-rearing period (summer-autumn) could reflect wolf reproduction. Between 1994 and 2007 we collected 1,964 wolf marks in a total of 1,877 km surveyed and we searched for the pups' presence (1,497 howling and 307 observation points) in 42 sampling sites with a regular presence of wolves (120 sampling sites/year). The number of wolf marks was ca. 3 times higher in sites with a confirmed presence of pups (20.3 vs. 7.2 marks). We found a significant relationship between the number of wolf marks (mean and maximum relative abundance index) and the probability of wolf reproduction. Conclusions/Significance This research establishes a real-time relationship between the intensity of wolf marking behaviour and wolf reproduction. We suggest a conservative cut-off point of 0.60 for the probability of wolf reproduction when monitoring wolves on a regional scale, combined with the use of the mean relative abundance index of wolf marks in a given area. We show how the integration of wolf behaviour with simple sampling procedures permits rapid, real-time, and cost-effective assessments of the breeding status of wolf packs, with substantial implications for monitoring wolves at large spatial scales. PMID:24663068
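
    A minimal sketch of the kind of analysis the abstract describes: a logistic regression of confirmed reproduction on a wolf-mark abundance index, with the suggested 0.60 probability cut-off applied to new sites. The data below are invented (the two group means simply echo the 7.2 and 20.3 marks reported above) and the fitted model is illustrative, not the authors' model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Invented survey data: mark counts per site-year and whether pups were confirmed.
n = 120
marks = rng.poisson(lam=np.where(rng.random(n) < 0.5, 7.2, 20.3))
pups = (rng.random(n) < 1 / (1 + np.exp(-(marks - 12) / 3))).astype(int)

model = LogisticRegression().fit(marks.reshape(-1, 1), pups)

# Apply the 0.60 probability cut-off suggested in the abstract to new sites.
new_sites = np.array([[5], [15], [25]])
p_repro = model.predict_proba(new_sites)[:, 1]
print(np.round(p_repro, 2), p_repro >= 0.60)
```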

  7. Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading

    PubMed Central

    Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf

    2008-01-01

    Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. PMID:18698346
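
    The first, non-chance-corrected method is straightforward to compute even with missing ratings: take the modal (consensus) grade per sample among the raters who graded it, then score each rater by the proportion of their gradings that match the consensus. The sketch below uses invented ratings in long format, which naturally accommodates missing rater-sample combinations.

```python
import pandas as pd

# Long-format ratings: one row per (rater, sample) pair actually graded;
# missing combinations are simply absent (values are illustrative).
ratings = pd.DataFrame({
    "rater":  ["r1", "r1", "r1", "r2", "r2", "r3", "r3", "r3"],
    "sample": ["s1", "s2", "s3", "s1", "s3", "s1", "s2", "s3"],
    "grade":  [  2,    1,    3,    2,    3,    3,    1,    3 ],
})

# Consensus grade per sample = the most frequently assigned grade.
consensus = ratings.groupby("sample")["grade"].agg(lambda g: g.mode().iloc[0])

# Per-rater agreement score = proportion of that rater's grades matching the
# consensus, computed only over the samples that rater actually graded.
ratings["agrees"] = ratings["grade"].values == consensus.loc[ratings["sample"]].values
print(ratings.groupby("rater")["agrees"].mean())
```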

  8. Forensic Tools to Track and Connect Physical Samples to Related Data

    NASA Astrophysics Data System (ADS)

    Molineux, A.; Thompson, A. C.; Baumgardner, R. W.

    2016-12-01

    Identifiers, such as local sample numbers, are critical to successfully connecting physical samples and related data. However, identifiers must be globally unique. The International Geo Sample Number (IGSN) generated when registering the sample in the System for Earth Sample Registration (SESAR) provides a globally unique alphanumeric code associated with basic metadata, related samples and their current physical storage location. When registered samples are published, users can link the figured samples to the basic metadata held at SESAR. The use cases we discuss include plant specimens from a Permian core, Holocene corals and derived powders, and thin sections with SEM stubs. Much of this material is now published. The plant taxonomic study from the core is a digital pdf and samples can be directly linked from the captions to the SESAR record. The study of stable isotopes from the corals is not yet digitally available, but individual samples are accessible. Full data and media records for both studies are located in our database where higher quality images, field notes, and section diagrams may exist. Georeferences permit mapping in current and deep time plate configurations. Several aspects emerged during this study. The first, ensure adequate and consistent details are registered with SESAR. Second, educate and encourage the researcher to obtain IGSNs. Third, publish the archive numbers, assigned prior to publication, alongside the IGSN. This provides access to further data through an Integrated Publishing Toolkit (IPT)/aggregators/or online repository databases, thus placing the initial sample in a much richer context for future studies. Fourth, encourage software developers to customize community software to extract data from a database and use it to register samples in bulk. This would improve workflow and provide a path for registration of large legacy collections.

  9. K-Nearest Neighbor Algorithm Optimization in Text Categorization

    NASA Astrophysics Data System (ADS)

    Chen, Shufeng

    2018-01-01

    The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings, such as a large amount of per-sample computation and a strong dependence on the capacity of the sample library. In this paper, a method of representative sample optimization based on the CURE clustering algorithm is proposed. On this basis, a quick algorithm, QKNN (quick k-nearest neighbor), is presented to find the k nearest neighbor samples, which greatly reduces the similarity computation. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples, improving the performance of the algorithm.
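
    The abstract does not give the details of the CURE-based representative selection or of QKNN, but the general idea of classifying against a reduced prototype set rather than the full sample library can be sketched as follows, using k-means centroids per class as stand-in representatives.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Full-library KNN: every training sample participates in the distance search.
full = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

# Reduced KNN: replace each class by a small set of representative points
# (k-means centroids here, as a stand-in for the CURE representatives).
protos, labels = [], []
for c in np.unique(y_tr):
    km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X_tr[y_tr == c])
    protos.append(km.cluster_centers_)
    labels.append(np.full(20, c))
reduced = KNeighborsClassifier(n_neighbors=5).fit(np.vstack(protos), np.concatenate(labels))

print("full-library accuracy:       ", round(full.score(X_te, y_te), 3))
print("reduced (40 prototypes) acc.:", round(reduced.score(X_te, y_te), 3))
```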

  10. Assessment of sampling stability in ecological applications of discriminant analysis

    USGS Publications Warehouse

    Williams, B.K.; Titus, K.

    1988-01-01

    A simulation study was undertaken to assess the sampling stability of the variable loadings in linear discriminant function analysis. A factorial design was used for the factors of multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. A review of 60 published studies and 142 individual analyses indicated that sample sizes in ecological studies often have met that requirement. However, individual group sample sizes frequently were very unequal, and checks of assumptions usually were not reported. The authors recommend that ecologists obtain group sample sizes that are at least three times as large as the number of variables measured.

  11. Automated flow cytometric analysis across large numbers of samples and cell types.

    PubMed

    Chen, Xiaoyi; Hasan, Milena; Libri, Valentina; Urrutia, Alejandra; Beitz, Benoît; Rouilly, Vincent; Duffy, Darragh; Patin, Étienne; Chalmond, Bernard; Rogge, Lars; Quintana-Murci, Lluis; Albert, Matthew L; Schwikowski, Benno

    2015-04-01

    Multi-parametric flow cytometry is a key technology for characterization of immune cell phenotypes. However, robust high-dimensional post-analytic strategies for automated data analysis in large numbers of donors are still lacking. Here, we report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was utilized for initial clustering, with the number of clusters determined using Bayesian Information Criterion. Meta-clustering in a reference donor permitted automated identification of 24 cell types across four panels. Cluster labels were integrated into FCS files, thus permitting comparisons to manual gating. Cell numbers and coefficient of variation (CV) were similar between FlowGM and conventional gating for lymphocyte populations, but notably FlowGM provided improved discrimination of "hard-to-gate" monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid high-dimensional analysis of cell phenotypes and is amenable to cohort studies. Copyright © 2015. Published by Elsevier Inc.
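
    A minimal sketch of the core clustering step as described: fit Gaussian mixture models over a range of component counts and keep the fit with the lowest Bayesian Information Criterion. Real flow-cytometry preprocessing (compensation, transformation, meta-clustering across donors) is omitted, and the synthetic data are only a stand-in for fluorescence intensities.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy stand-in for compensated, transformed flow-cytometry events:
# three synthetic "cell populations" in a 4-marker space.
X = np.vstack([
    rng.normal(loc=m, scale=0.4, size=(2000, 4))
    for m in ([0, 0, 0, 0], [3, 0, 2, 0], [0, 3, 0, 2])
])

# Choose the number of clusters by minimizing BIC, as FlowGM is described to do.
fits = {k: GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(1, 8)}
best_k = min(fits, key=lambda k: fits[k].bic(X))
labels = fits[best_k].predict(X)

print("selected components:", best_k)
print("events per cluster: ", np.bincount(labels))
```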

  12. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2017-08-31

    As a newly emerged research area, RNA epigenetics has drawn increasing attention recently because of the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high-throughput sequencing techniques such as MeRIP-Seq, transcriptome-wide RNA methylation profiles are now available in the form of count-based data, with which it is often of interest to study the dynamics at the epitranscriptomic layer. However, the sample size of an RNA methylation experiment is usually very small due to its cost; additionally, there usually exists a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as the DRME model, which is based on a statistical test covering only the IP samples with two negative binomial distributions, QNB is based on four independent negative binomial distributions with their variances and means linked by local regressions, and in this way the input control samples are also properly taken into account. In addition, unlike the DRME approach, which relies on the input control samples alone for estimating the background, QNB uses a more robust estimator of gene expression by combining information from both input and IP samples, which can largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. The QNB model is also applicable to other datasets related to RNA modifications, including but not limited to RNA bisulfite sequencing, m1A-Seq, Par-CLIP, RIP-Seq, etc.

  13. TomoMiner and TomoMinerCloud: A software platform for large-scale subtomogram structural analysis

    PubMed Central

    Frazier, Zachary; Xu, Min; Alber, Frank

    2017-01-01

    SUMMARY Cryo-electron tomography (cryoET) captures the 3D electron density distribution of macromolecular complexes in a close-to-native state. With the rapid advance of cryoET acquisition technologies, it is possible to generate large numbers (>100,000) of subtomograms, each containing a macromolecular complex. Often, these subtomograms represent a heterogeneous sample due to variations in the structure and composition of a complex in its in situ form, or because the particles are a mixture of different complexes. In this case subtomograms must be classified. However, classification of large numbers of subtomograms is a time-intensive task and often a limiting bottleneck. This paper introduces an open source software platform, TomoMiner, for large-scale subtomogram classification, template matching, subtomogram averaging, and alignment. Its scalable and robust parallel processing allows efficient classification of tens to hundreds of thousands of subtomograms. Additionally, TomoMiner provides a pre-configured TomoMinerCloud computing service permitting users without sufficient computing resources instant access to TomoMiner's high-performance features. PMID:28552576

  14. Scalable Performance Measurement and Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gamblin, Todd

    2009-01-01

    Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Modern machines may contain 100,000 or more microprocessor cores, and the largest of these, IBM's Blue Gene/L, contains over 200,000 cores. Future systems are expected to support millions of concurrent tasks. In this dissertation, we focus on efficient techniques for measuring and analyzing the performance of applications running on very large parallel machines. Tuning the performance of large-scale applications can be a subtle and time-consuming task because application developers must measure and interpret data from many independent processes. While the volume of the raw data scales linearly with the number of tasks in the running system, the number of tasks is growing exponentially, and data for even small systems quickly becomes unmanageable. Transporting performance data from so many processes over a network can perturb application performance and make measurements inaccurate, and storing such data would require a prohibitive amount of space. Moreover, even if it were stored, analyzing the data would be extremely time-consuming. In this dissertation, we present novel methods for reducing performance data volume. The first draws on multi-scale wavelet techniques from signal processing to compress systemwide, time-varying load-balance data. The second uses statistical sampling to select a small subset of running processes to generate low-volume traces. A third approach combines sampling and wavelet compression to stratify performance data adaptively at run-time and to reduce further the cost of sampled tracing. We have integrated these approaches into Libra, a toolset for scalable load-balance analysis. We present Libra and show how it can be used to analyze data from large scientific applications scalably.

  15. Automatic identification of variables in epidemiological datasets using logic regression.

    PubMed

    Lorenz, Matthias W; Abdi, Negin Ashtiani; Scheckenbach, Frank; Pflug, Anja; Bülbül, Alpaslan; Catapano, Alberico L; Agewall, Stefan; Ezhov, Marat; Bots, Michiel L; Kiechl, Stefan; Orth, Andreas

    2017-04-13

    For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed into a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve the data quality. For semi-automation, high sensitivity in the recognition of matching variables is particularly important, because it allows software to present, for each target variable, a choice of candidate source variables from which a user can choose the matching one, with only a low risk of a correct source variable having been missed. For each variable in a set of target variables, a number of simple rules were manually created. With logic regression, an optimal Boolean combination of these rules was searched for every target variable, using a random subset of a large database of epidemiological and clinical cohort data (construction subset). In a second subset of this database (validation subset), these optimal rule combinations were validated. In the construction sample, the 41 target variables were allocated with an average positive predictive value (PPV) of 34% and a negative predictive value (NPV) of 95%. In the validation sample, the PPV was 33%, whereas the NPV remained at 94%. In the construction sample, the PPV was 50% or less for 63% of all variables, and in the validation sample for 71% of all variables. We demonstrated that the application of logic regression to a complex data management task in large epidemiological IPD meta-analyses is feasible. However, the performance of the algorithm is poor, which may require backup strategies.
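
    The rule-based matching can be illustrated with a small sketch: a few hand-written string rules for one target variable are combined with a fixed Boolean expression and scored against hand-labelled candidates to give PPV and NPV. In the actual method, logic regression searches over such Boolean combinations automatically on the construction subset; the rules, variable names, and labels below are invented.

```python
# Target variable: "systolic blood pressure". A few hand-written rules and a fixed
# Boolean combination; logic regression would search combinations like this
# automatically over a construction subset.
rules = [
    lambda name: "sys" in name.lower(),
    lambda name: "bp" in name.lower() or "blood_pressure" in name.lower(),
    lambda name: "diastol" not in name.lower(),
]

def matches(name):
    # (rule1 OR rule2) AND rule3  -- one example Boolean combination
    return (rules[0](name) or rules[1](name)) and rules[2](name)

candidates = {            # invented source variables; True = correct match to the target
    "sbp_mmhg": True, "sys_bp": True, "dbp_mmhg": False,
    "bp_diastolic": False, "heart_rate": False, "syst_press": True,
}

pred = {v: matches(v) for v in candidates}
tp = sum(pred[v] and candidates[v] for v in candidates)
fp = sum(pred[v] and not candidates[v] for v in candidates)
tn = sum(not pred[v] and not candidates[v] for v in candidates)
fn = sum(not pred[v] and candidates[v] for v in candidates)
print("PPV:", tp / (tp + fp), "NPV:", tn / (tn + fn))
```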

  16. On the importance of incorporating sampling weights in ...

    EPA Pesticide Factsheets

    Occupancy models are used extensively to assess wildlife-habitat associations and to predict species distributions across large geographic regions. Occupancy models were developed as a tool to properly account for imperfect detection of a species. Current guidelines on survey design requirements for occupancy models focus on the number of sample units and the pattern of revisits to a sample unit within a season. We focus on the sampling design or how the sample units are selected in geographic space (e.g., stratified, simple random, unequal probability, etc). In a probability design, each sample unit has a sample weight which quantifies the number of sample units it represents in the finite (oftentimes areal) sampling frame. We demonstrate the importance of including sampling weights in occupancy model estimation when the design is not a simple random sample or equal probability design. We assume a finite areal sampling frame as proposed for a national bat monitoring program. We compare several unequal and equal probability designs and varying sampling intensity within a simulation study. We found the traditional single season occupancy model produced biased estimates of occupancy and lower confidence interval coverage rates compared to occupancy models that accounted for the sampling design. We also discuss how our findings inform the analyses proposed for the nascent North American Bat Monitoring Program and other collaborative synthesis efforts that propose h

  17. GMP Cryopreservation of Large Volumes of Cells for Regenerative Medicine: Active Control of the Freezing Process

    PubMed Central

    Massie, Isobel; Selden, Clare; Hodgson, Humphrey; Gibbons, Stephanie; Morris, G. John

    2014-01-01

    Cryopreservation protocols are increasingly required in regenerative medicine applications but must deliver functional products at clinical scale and comply with Good Manufacturing Practice (GMP). While GMP cryopreservation is achievable on a small scale using a Stirling cryocooler-based controlled rate freezer (CRF) (EF600), successful large-scale GMP cryopreservation is more challenging due to heat transfer issues and control of ice nucleation, both complex events that impact success. We have developed a large-scale cryocooler-based CRF (VIA Freeze) that can process larger volumes and have evaluated it using alginate-encapsulated liver cell (HepG2) spheroids (ELS). It is anticipated that ELS will comprise the cellular component of a bioartificial liver and will be required in volumes of ∼2 L for clinical use. Sample temperatures and Stirling cryocooler power consumption were recorded throughout cooling runs for both small (500 μL) and large (200 mL) volume samples. ELS recoveries were assessed using viability (FDA/PI staining with image analysis), cell number (nuclei count), and function (protein secretion), along with cryoscanning electron microscopy and freeze substitution techniques to identify possible injury mechanisms. Slow cooling profiles were successfully applied to samples in both the EF600 and the VIA Freeze, and a number of cooling and warming profiles were evaluated. An optimized cooling protocol with a nonlinear cooling profile from ice nucleation to −60°C was implemented in both the EF600 and VIA Freeze. In the VIA Freeze the nucleation of ice is detected by the control software, allowing both noninvasive detection of the nucleation event for quality control purposes and the potential to modify the cooling profile following ice nucleation in an active manner. When processing 200 mL of ELS in the VIA Freeze, viabilities of 93.4% ± 7.4%, viable cell numbers of 14.3 ± 1.7 million nuclei/mL alginate, and protein secretion of 10.5 ± 1.7 μg/mL/24 h were obtained, which compared well with control ELS (viability 98.1% ± 0.9%; viable cell numbers 18.3 ± 1.0 million nuclei/mL alginate; protein secretion 18.7 ± 1.8 μg/mL/24 h). Large volume GMP cryopreservation of ELS is possible with good functional recovery using the VIA Freeze and may also be applied to other regenerative medicine applications. PMID:24410575

  18. Factors affecting the sticking of insects on modified aircraft wings

    NASA Technical Reports Server (NTRS)

    Yi, O.; Chitsaz-Z, M. R.; Eiss, N. S.; Wightman, J. P.

    1988-01-01

    Previous work showed that the total number of insects sticking to an aluminum surface was reduced by coating the aluminum surface with elastomers. Due to a large number of possible experimental errors, no correlation between the modulus of elasticity of the elastomer and the total number of insects sticking to a given elastomer was obtained. One of the errors assumed to be introduced during the road test is a variable insect flux, so that the number of insects striking one sample might differ from that striking another. To eliminate this source of error, the road test used to collect insects was simulated in a laboratory by developing an insect-impacting technique using a pipe and high-pressure compressed air. The insects are accelerated by a compressed air gun to high velocities and are then impacted onto a stationary target on which the sample is mounted. The velocity of an object exiting the pipe was determined, and the technique was further refined to obtain a uniform air velocity distribution.

  19. Large tree diameter distribution modelling using sparse airborne laser scanning data in a subtropical forest in Nepal

    NASA Astrophysics Data System (ADS)

    Rana, Parvez; Vauhkonen, Jari; Junttila, Virpi; Hou, Zhengyang; Gautam, Basanta; Cawkwell, Fiona; Tokola, Timo

    2017-12-01

    Large-diameter trees (taking DBH > 30 cm to define large trees) dominate the dynamics, function and structure of a forest ecosystem. The aim here was to employ sparse airborne laser scanning (ALS) data with a mean point density of 0.8 points/m2 and the non-parametric k-most similar neighbour (k-MSN) method to predict tree diameter at breast height (DBH) distributions in a subtropical forest in southern Nepal. The specific objectives were: (1) to evaluate the accuracy of the large-tree fraction of the diameter distribution; and (2) to assess the effect of the number of training areas (sample size, n) on the accuracy of the predicted tree diameter distribution. Comparison of the predicted distributions with empirical ones indicated that the large tree diameter distribution can be derived in a mixed species forest with an RMSE% of 66% and a bias% of -1.33%. It was also feasible to downsize the sample size without losing the interpretability capacity of the model. For large-diameter trees, reducing the training plots by half (n = 250) gave only a marginal increase in the RMSE% (1.12-1.97%) compared with the original set of training plots (n = 500). For these outcomes to hold, the sample areas should capture the entire range of spatial and feature variability in order to reduce the occurrence of error.
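
    For reference, the relative accuracy measures quoted above (RMSE% and bias%) are conventionally expressed as a percentage of the observed mean; the short sketch below computes them for made-up predicted and observed plot values and is not tied to the study's data.

      import math

      # Hypothetical observed and predicted numbers of large trees per plot.
      observed  = [12, 8, 15, 20, 5, 9, 14, 11]
      predicted = [10, 9, 17, 18, 6, 8, 15, 12]

      n = len(observed)
      mean_obs = sum(observed) / n

      rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
      bias = sum(p - o for p, o in zip(predicted, observed)) / n

      rmse_pct = 100.0 * rmse / mean_obs   # RMSE%
      bias_pct = 100.0 * bias / mean_obs   # bias%

      print(f"RMSE% = {rmse_pct:.1f}%, bias% = {bias_pct:.2f}%")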

  20. The survival of large organic molecules during hypervelocity impacts with water ice: implications for sampling the icy surfaces of moons

    NASA Astrophysics Data System (ADS)

    Hurst, A.; Bowden, S. A.; Parnell, J.; Burchell, M. J.; Ball, A. J.

    2007-12-01

    There are a number of measurements relevant to planetary geology that can only be adequately performed by physically contacting a sample. This necessitates landing on the surface of a moon or planetary body or returning samples to earth. The need to physically contact a sample is particularly important in the case of measurements that could detect medium to low concentrations of large organic molecules present in surface materials. Large organic molecules, although a trace component of many meteoritic materials and rocks on the surface of earth, carry crucial information concerning the processing of meteoritic material in the surface and subsurface environments, and can be crucial indicators for the presence of life. Unfortunately, landing on the surface of a small planetary body or moon is complicated, particularly if surface topography is only poorly characterised and the atmosphere is thin, thus requiring a propulsion system for a soft landing. One alternative to a surface landing may be to use an impactor launched from an orbiting spacecraft to eject material from the planet's surface and shallow sub-surface into orbit. Ejected material could then be collected by a follow-up spacecraft and analyzed. The mission scenario considered in the Europa-Ice Clipper mission proposal included both sample return and the analysis of captured particles. Employing such a sampling procedure to analyse large organic molecules is only viable if large organic molecules present in ices survive hypervelocity impacts (HVIs). To investigate the survival of large organic molecules in HVIs with icy bodies, a two-stage light gas gun was used to fire steel projectiles (1-1.5 mm diameter) at samples of water ice containing large organic molecules (amino acids, anthracene, and beta-carotene, a biological pigment) at velocities > 4.8 km/s. UV-VIS spectroscopy of ejected material detected beta-carotene, indicating that large organic molecules can survive hypervelocity impacts. These preliminary results are yet to be scaled up to a point where they can be accurately interpreted in the context of a likely mission scenario. However, they strongly indicate that in a low mass payload mission scenario where a lander has been considered unfeasible, such a sampling strategy merits further consideration.

  1. Large area optical mapping of surface contact angle.

    PubMed

    Dutra, Guilherme; Canning, John; Padden, Whayne; Martelli, Cicero; Dligatch, Svetlana

    2017-09-04

    Top-down contact angle measurements have been validated and confirmed to be as good as, if not more reliable than, side-based measurements. A range of samples, including industrially relevant materials for roofing and printing, has been compared. Using the top-down approach, mapping in both 1-D and 2-D has been demonstrated. The method was applied to study the change in contact angle as a function of change in silver (Ag) nanoparticle size controlled by thermal evaporation. Large area mapping reveals good uniformity for commercial Aspen paper coated with black laser printer ink. A demonstration of the forensic and chemical analysis potential in 2-D is shown by uncovering the hidden CsF initials made with mineral oil on the coated Aspen paper. The method promises to revolutionize nanoscale characterization and industrial monitoring as well as chemical analyses by allowing rapid contact angle measurements over large areas or large numbers of samples in ways and times that have not been possible before.

  2. [The research protocol III. Study population].

    PubMed

    Arias-Gómez, Jesús; Villasís-Keever, Miguel Ángel; Miranda-Novales, María Guadalupe

    2016-01-01

    The study population is defined as a set of cases, determined, limited, and accessible, that will constitute the subjects for the selection of the sample, and it must fulfill several characteristics and distinct criteria. The objectives of this manuscript are focused on specifying each one of the elements required to make the selection of the participants of a research project, during the elaboration of the protocol, including the concepts of study population, sample, selection criteria and sampling methods. After delineating the study population, the researcher must specify the criteria with which each participant has to comply. The criteria that specify these characteristics are termed selection or eligibility criteria. These criteria are inclusion, exclusion and elimination, and will delineate the eligible population. The sampling methods are divided into two broad groups: 1) probabilistic or random sampling and 2) non-probabilistic sampling. The difference lies in the use of statistical methods to select the subjects. In every study, it is necessary to establish at the outset the specific number of participants to be included to achieve the objectives of the study. This number is the sample size, and it can be calculated or estimated with mathematical formulas and statistical software.
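
    As an illustration of the last point, one common closed-form calculation is the sample size needed to estimate a proportion with a given absolute precision; the sketch below uses the standard normal-approximation formula with hypothetical inputs and is not tied to any particular study design discussed in the article.

      import math

      def sample_size_proportion(p_expected, margin, z=1.96):
          """n = z^2 * p * (1 - p) / d^2 for estimating a proportion with
          absolute margin of error d at ~95% confidence (z = 1.96)."""
          n = (z ** 2) * p_expected * (1 - p_expected) / (margin ** 2)
          return math.ceil(n)

      # Example: expected prevalence 30%, desired margin of error +/- 5 points.
      print(sample_size_proportion(0.30, 0.05))   # -> 323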

  3. Aquifer environment selects for microbial species cohorts in sediment and groundwater

    PubMed Central

    Hug, Laura A; Thomas, Brian C; Brown, Christopher T; Frischkorn, Kyle R; Williams, Kenneth H; Tringe, Susannah G; Banfield, Jillian F

    2015-01-01

    Little is known about the biogeography or stability of sediment-associated microbial community membership because these environments are biologically complex and generally difficult to sample. High-throughput sequencing methods provide new opportunities to simultaneously genomically sample and track microbial community members across a large number of sampling sites or times, with higher taxonomic resolution than is associated with 16S ribosomal RNA gene surveys, and without the disadvantages of primer bias and gene copy number uncertainty. We characterized a sediment community at 5 m depth in an aquifer adjacent to the Colorado River and tracked its 133 most abundant organisms across 36 different sediment and groundwater samples. We sampled sites separated by centimeters, meters and tens of meters, collected on seven occasions over 6 years. Analysis of 1.4 terabase pairs of DNA sequence showed that these 133 organisms were more consistently detected in saturated sediments than in samples from the vadose zone, from distant locations or from groundwater filtrates. Abundance profiles across aquifer locations and from different sampling times identified organism cohorts, subsets of the 133 organisms that were consistently associated with one another. The data suggest that cohorts are partly selected for by shared environmental adaptation. PMID:25647349

  4. Genovo: De Novo Assembly for Metagenomes

    NASA Astrophysics Data System (ADS)

    Laserson, Jonathan; Jojic, Vladimir; Koller, Daphne

    Next-generation sequencing technologies produce a large number of noisy reads from the DNA in a sample. Metagenomics and population sequencing aim to recover the genomic sequences of the species in the sample, which could be of high diversity. Methods geared towards single sequence reconstruction are not sensitive enough when applied in this setting. We introduce a generative probabilistic model of read generation from environmental samples and present Genovo, a novel de novo sequence assembler that discovers likely sequence reconstructions under the model. A Chinese restaurant process prior accounts for the unknown number of genomes in the sample. Inference is made by applying a series of hill-climbing steps iteratively until convergence. We compare the performance of Genovo to three other short read assembly programs across one synthetic dataset and eight metagenomic datasets created using the 454 platform, the largest of which has 311k reads. Genovo's reconstructions cover more bases and recover more genes than the other methods, and yield a higher assembly score.
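
    The Chinese restaurant process prior mentioned above can be visualized with a short simulation of how items are assigned to an open-ended number of clusters; this is a generic CRP sketch with an assumed concentration parameter, not Genovo's actual inference code, and read-to-genome assignment in Genovo additionally depends on the sequence likelihood.

      import random

      def crp_assignments(n_items, alpha, seed=0):
          """Sequentially assign items to clusters under a Chinese restaurant
          process (CRP) with concentration parameter alpha."""
          rng = random.Random(seed)
          counts = []          # counts[k] = number of items already in cluster k
          labels = []
          for i in range(n_items):
              # Join existing cluster k with probability counts[k] / (i + alpha),
              # open a new cluster with probability alpha / (i + alpha).
              r = rng.uniform(0, i + alpha)
              cum = 0.0
              for k, c in enumerate(counts):
                  cum += c
                  if r < cum:
                      counts[k] += 1
                      labels.append(k)
                      break
              else:
                  counts.append(1)
                  labels.append(len(counts) - 1)
          return labels, counts

      labels, counts = crp_assignments(n_items=1000, alpha=2.0)
      print("clusters opened:", len(counts))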

  5. Implications for the origins of pure anorthosites found in the feldspathic lunar meteorites, Dhofar 489 group

    NASA Astrophysics Data System (ADS)

    Nagaoka, Hiroshi; Takeda, Hiroshi; Karouji, Yuzuru; Ohtake, Makiko; Yamaguchi, Akira; Yoneda, Shigekazu; Hasebe, Nobuyuki

    2014-12-01

    Remote observation by the reflectance spectrometers onboard the Japanese lunar explorer Kaguya (SELENE) showed the purest anorthosite (PAN) spots (>98% plagioclase) at some large craters. Mineralogical and petrologic investigations on the feldspathic lunar meteorites, Dhofar 489 and Dhofar 911, revealed the presence of several pure anorthosite clasts. A comparison with Apollo nearside samples of ferroan anorthosite (FAN) indicated that of the FAN samples returned by the Apollo missions, sample 60015 is the largest anorthosite with the highest plagioclase abundance and homogeneous mafic mineral compositions. These pure anorthosites (>98% plagioclase) have large chemical variations in Mg number (Mg# = molar 100 × Mg/(Mg + Fe)) of each coexisting mafic mineral. The variations imply that these pure anorthosites underwent complex formation processes and were not formed by simple flotation of plagioclase. The lunar highland samples with pure anorthosite and the PAN observed by Kaguya suggest that pure anorthosite is widely distributed as lunar crust lithology over the entire Moon.

  6. Integrated crystal mounting and alignment system for high-throughput biological crystallography

    DOEpatents

    Nordmeyer, Robert A.; Snell, Gyorgy P.; Cornell, Earl W.; Kolbe, William F.; Yegian, Derek T.; Earnest, Thomas N.; Jaklevich, Joseph M.; Cork, Carl W.; Santarsiero, Bernard D.; Stevens, Raymond C.

    2007-09-25

    A method and apparatus for the transportation, remote and unattended mounting, and visual alignment and monitoring of protein crystals for synchrotron generated x-ray diffraction analysis. The protein samples are maintained at liquid nitrogen temperatures at all times: during shipment, before mounting, mounting, alignment, data acquisition and following removal. The samples must additionally be stably aligned to within a few microns at a point in space. The ability to accurately perform these tasks remotely and automatically leads to a significant increase in sample throughput and reliability for high-volume protein characterization efforts. Since the protein samples are placed in a shipping-compatible layered stack of sample cassettes each holding many samples, a large number of samples can be shipped in a single cryogenic shipping container.

  7. Integrated crystal mounting and alignment system for high-throughput biological crystallography

    DOEpatents

    Nordmeyer, Robert A.; Snell, Gyorgy P.; Cornell, Earl W.; Kolbe, William; Yegian, Derek; Earnest, Thomas N.; Jaklevic, Joseph M.; Cork, Carl W.; Santarsiero, Bernard D.; Stevens, Raymond C.

    2005-07-19

    A method and apparatus for the transportation, remote and unattended mounting, and visual alignment and monitoring of protein crystals for synchrotron generated x-ray diffraction analysis. The protein samples are maintained at liquid nitrogen temperatures at all times: during shipment, before mounting, mounting, alignment, data acquisition and following removal. The samples must additionally be stably aligned to within a few microns at a point in space. The ability to accurately perform these tasks remotely and automatically leads to a significant increase in sample throughput and reliability for high-volume protein characterization efforts. Since the protein samples are placed in a shipping-compatible layered stack of sample cassettes each holding many samples, a large number of samples can be shipped in a single cryogenic shipping container.

  8. Utilizing the ultrasensitive Schistosoma up-converting phosphor lateral flow circulating anodic antigen (UCP-LF CAA) assay for sample pooling strategies.

    PubMed

    Corstjens, Paul L A M; Hoekstra, Pytsje T; de Dood, Claudia J; van Dam, Govert J

    2017-11-01

    Methodological applications of the high-sensitivity genus-specific Schistosoma CAA strip test, allowing detection of single-worm active infections (ultimate sensitivity), are discussed for efficient utilization in sample pooling strategies. Besides relevant cost reduction, pooling of samples rather than individual testing can provide valuable data for large scale mapping, surveillance, and monitoring. The laboratory-based CAA strip test utilizes luminescent quantitative up-converting phosphor (UCP) reporter particles and a rapid user-friendly lateral flow (LF) assay format. The test includes a sample preparation step that permits virtually unlimited sample concentration with urine, reaching ultimate sensitivity (single worm detection) at 100% specificity. This facilitates testing large urine pools from many individuals with minimal loss of sensitivity and specificity. The test determines the average CAA level of the individuals in the pool, thus indicating overall worm burden and prevalence. When test results are required at the individual level, smaller pools need to be analysed, with the pool size based on the expected prevalence or, when unknown, on the average CAA level of a larger group; CAA-negative pools do not require individual test results and thus reduce the number of tests. Straightforward pooling strategies indicate that at the sub-population level the CAA strip test is an efficient assay for general mapping, identification of hotspots, determination of stratified infection levels, and accurate monitoring of mass drug administrations (MDA). At the individual level, the number of tests can be reduced, for example in low-endemicity settings, because the pool size can be increased as prevalence decreases. At the sub-population level, average CAA concentrations determined in urine pools can be an appropriate measure indicating worm burden. Pooling strategies allowing this type of large scale testing are feasible with the various CAA strip test formats and do not affect sensitivity and specificity. This allows cost-efficient stratified testing and monitoring of worm burden at the sub-population level, ideally suited for large-scale surveillance generating hard data on the performance of MDA programs and for strategic planning when moving towards transmission-stop and elimination.
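
    The pool-size versus prevalence trade-off described above follows the classical two-stage (Dorfman) pooling argument: test each pool once and retest individuals only when the pool is positive. The sketch below computes the expected number of tests per person under that scheme, assuming an idealized test whose sensitivity does not degrade with pooling; the prevalence value is illustrative only.

      def expected_tests_per_person(prevalence, pool_size):
          """Two-stage (Dorfman) pooling with a perfect test:
          one test per pool plus individual retests of positive pools."""
          p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
          return 1.0 / pool_size + p_pool_positive

      prevalence = 0.02    # illustrative 2% prevalence (low-endemicity setting)
      for k in (2, 5, 10, 20, 50):
          e = expected_tests_per_person(prevalence, k)
          print(f"pool size {k:2d}: {e:.3f} tests/person "
                f"({100 * (1 - e):.0f}% fewer than individual testing)")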

  9. Optimizing liquid effluent monitoring at a large nuclear complex.

    PubMed

    Chou, Charissa J; Barnett, D Brent; Johnson, Vernon G; Olson, Phil M

    2003-12-01

    Effluent monitoring typically requires a large number of analytes and samples during the initial or startup phase of a facility. Once a baseline is established, the analyte list and sampling frequency may be reduced. Although there is a large body of literature relevant to the initial design, few, if any, published papers exist on updating established effluent monitoring programs. This paper statistically evaluates four years of baseline data to optimize the liquid effluent monitoring efficiency of a centralized waste treatment and disposal facility at a large defense nuclear complex. Specific objectives were to: (1) assess temporal variability in analyte concentrations, (2) determine operational factors contributing to waste stream variability, (3) assess the probability of exceeding permit limits, and (4) streamline the sampling and analysis regime. Results indicated that the probability of exceeding permit limits was one in a million under normal facility operating conditions, sampling frequency could be reduced, and several analytes could be eliminated. Furthermore, indicators such as gross alpha and gross beta measurements could be used in lieu of more expensive specific isotopic analyses (radium, cesium-137, and strontium-90) for routine monitoring. Study results were used by the state regulatory agency to modify monitoring requirements for a new discharge permit, resulting in an annual cost savings of US $223,000. This case study demonstrates that statistical evaluation of effluent contaminant variability coupled with process knowledge can help plant managers and regulators streamline analyte lists and sampling frequencies based on detection history and environmental risk.
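
    One way to obtain an exceedance probability such as the one-in-a-million figure above is to fit a distribution to the baseline concentrations and evaluate the tail area beyond the permit limit. The sketch below assumes a lognormal fit, a common but not universal choice for effluent data; the measurements, analyte, and limit are made up for illustration.

      import math
      import statistics

      def lognormal_exceedance_prob(concentrations, limit):
          """P(X > limit) assuming log-transformed concentrations are normal."""
          logs = [math.log(c) for c in concentrations]
          mu, sigma = statistics.mean(logs), statistics.stdev(logs)
          z = (math.log(limit) - mu) / sigma
          return 0.5 * math.erfc(z / math.sqrt(2))   # standard normal tail area

      # Hypothetical baseline gross-beta results (pCi/L) and a hypothetical limit.
      baseline = [3.1, 2.4, 4.0, 2.9, 3.5, 2.2, 3.8, 3.0, 2.7, 3.3]
      print(f"P(exceed 50 pCi/L) = {lognormal_exceedance_prob(baseline, 50.0):.1e}")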

  10. Exploitation of immunofluorescence for the quantification and characterization of small numbers of Pasteuria endospores.

    PubMed

    Costa, Sofia R; Kerry, Brian R; Bardgett, Richard D; Davies, Keith G

    2006-12-01

    The Pasteuria group of endospore-forming bacteria has been studied as a biocontrol agent of plant-parasitic nematodes. Techniques have been developed for its detection and quantification in soil samples, and these mainly focus on observations of endospore attachment to nematodes. Characterization of Pasteuria populations has recently been performed with DNA-based techniques, which usually require the extraction of large numbers of spores. We describe a simple immunological method for the quantification and characterization of Pasteuria populations. Bayesian statistics were used to determine an extraction efficiency of 43% and a detection threshold of 210 endospores per gram of sand. This provided a robust means of estimating numbers of endospores in small-volume samples from a natural system. Based on visual assessment of endospore fluorescence, a quantitative method was developed to characterize endospore populations, which were shown to vary according to their host.

  11. Large Area Crop Inventory Experiment (LACIE). Detecting and monitoring agricultural vegetative water stress over large areas using LANDSAT digital data. [Great Plains

    NASA Technical Reports Server (NTRS)

    Thompson, D. R.; Wehmanen, O. A. (Principal Investigator)

    1978-01-01

    The author has identified the following significant results. The Green Number Index technique, which uses LANDSAT digital data from 5 × 6 nautical mile sampling frames, was expanded to evaluate its usefulness in detecting and monitoring vegetative water stress over the Great Plains. At known growth stages for wheat, segments were classified as drought or non-drought. Good agreement was found between the 18-day remotely sensed data and a weekly ground-based crop moisture index. Operational monitoring of the 1977 U.S.S.R. and Australian wheat crops indicated drought conditions. Drought isoline maps produced by the Green Number Index technique were in good agreement with conventional sources.

  12. Determination of 241Am in soil using an automated nuclear radiation measurement laboratory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engstrom, D.E.; White, M.G.; Dunaway, P.B.

    The recent completion of REECo's Automated Laboratory and associated software systems has provided a significant increase in capability while reducing manpower requirements. The system is designed to perform gamma spectrum analyses on the large numbers of samples required by the current Nevada Applied Ecology Group (NAEG) and Plutonium Distribution Inventory Program (PDIP) soil sampling programs while maintaining sufficient sensitivities as defined by earlier investigations of the same type. The hardware and systems are generally described in this paper, with emphasis being placed on spectrum reduction and the calibration procedures used for soil samples. (auth)

  13. Beyond Linear Sequence Comparisons: The use of genome-level characters for phylogenetic reconstruction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boore, Jeffrey L.

    2004-11-27

    Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.

  14. The impact of health expenditure on the number of chronic diseases.

    PubMed

    Becchetti, Leonardo; Conzo, Pierluigi; Salustri, Francesco

    2017-09-01

    We investigate the impact of health expenditure on health outcomes on a large sample of Europeans aged above 50 using individual and regional level data. We find a negative and significant effect of lagged health expenditure on subsequent changes in the number of chronic diseases. This effect varies according to age, health behavior, gender, income, and education. Our empirical findings are confirmed also when health expenditure is instrumented with parliament political composition. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Detecting Small Amounts of Gene Flow from Phylogenies of Alleles

    PubMed Central

    Slatkin, M.

    1989-01-01

    The method of coalescents is used to find the probability that none of the ancestors of alleles sampled from a population are immigrants. If that is the case for samples from two or more populations, then there would be concordance between the phylogenies of those alleles and the geographic locations from which they are drawn. This type of concordance has been found in several studies of mitochondrial DNA from natural populations. It is shown that if the number of sequences sampled from each population is reasonably large (10 or more), then this type of concordance suggests that the average number of individuals migrating between populations is likely to be relatively small (Nm < 1) but the possibility of occasional migrants cannot be excluded. The method is applied to the data of E. Bermingham and J. C. Avise on mtDNA from the bowfin, Amia calva. PMID:2714639

  16. Prevalence and Level of Listeria monocytogenes in Ice Cream Linked to a Listeriosis Outbreak in the United States.

    PubMed

    Chen, Y I; Burall, Laurel S; Macarisin, Dumitru; Pouillot, Régis; Strain, Errol; DE Jesus, Antonio J; Laasri, Anna; Wang, Hua; Ali, Laila; Tatavarthy, Aparna; Zhang, Guodong; Hu, Lijun; Day, James; Kang, Jihun; Sahu, Surasri; Srinivasan, Devayani; Klontz, Karl; Parish, Mickey; Evans, Peter S; Brown, Eric W; Hammack, Thomas S; Zink, Donald L; Datta, Atin R

    2016-11-01

    A most-probable-number (MPN) method was used to enumerate Listeria monocytogenes in 2,320 commercial ice cream scoops manufactured on a production line that was implicated in a 2015 listeriosis outbreak in the United States. The analyzed samples were collected from seven lots produced in November 2014, December 2014, January 2015, and March 2015. L. monocytogenes was detected in 99% (2,307 of 2,320) of the tested samples (lower limit of detection, 0.03 MPN/g), 92% of which were contaminated at <20 MPN/g. The levels of L. monocytogenes in these samples had a geometric mean per lot of 0.15 to 7.1 MPN/g. The prevalence and enumeration data from an unprecedented large number of naturally contaminated ice cream products linked to a listeriosis outbreak provided a unique data set for further understanding the risk associated with L. monocytogenes contamination for highly susceptible populations.

  17. Determination of Aromatic Ring Number Using Multi-Channel Deep UV Native Fluorescence

    NASA Technical Reports Server (NTRS)

    Bhartia, R.; McDonald, G. D.; Salas, E.; Conrad, P.

    2004-01-01

    The in situ detection of organic material on an extraterrestrial surface requires both effective means of searching a relatively large surface area or volume for possible organic carbon, and a more specific means of identifying and quantifying compounds in indicated samples. Fluorescence spectroscopy fits the first requirement well, as it can be carried out rapidly, with minimal or no physical contact with the sample, and with sensitivity unmatched by any other organic analytical technique. Aromatic organic compounds with known fluorescence signatures have been identified in several extraterrestrial samples, including carbonaceous chondrites, interplanetary dust particles, and Martian meteorites. The compound distributions vary among these sources, however, with clear differences in relative abundances by number of aromatic rings and by degree of alkylation. This relative abundance information, therefore, can be used to infer the source of organic material detected on a planetary surface.

  18. Factorial Structure and Age-Related Psychometrics of the MIDUS Personality Adjective Items across the Lifespan

    PubMed Central

    Zimprich, Daniel; Allemand, Mathias; Lachman, Margie E.

    2014-01-01

    The present study addresses issues of measurement invariance and comparability of factor parameters of Big Five personality adjective items across age. Data from the Midlife in the United States (MIDUS) survey were used to investigate age-related developmental psychometrics of the MIDUS personality adjective items in two large cross-sectional samples (exploratory sample: N = 862; analysis sample: N = 3,000). After having established and replicated a comprehensive five-factor structure of the measure, increasing levels of measurement invariance were tested across ten age groups. Results indicate that the measure demonstrates strict measurement invariance in terms of number of factors and factor loadings. Also, we found that factor variances and covariances were equal across age groups. By contrast, a number of age-related factor mean differences emerged. The practical implications of these results are discussed and future research is suggested. PMID:21910548

  19. Pattern recognition methods and air pollution source identification. [based on wind direction

    NASA Technical Reports Server (NTRS)

    Leibecki, H. F.; King, R. B.

    1978-01-01

    Directional air samplers, which resolve suspended particulate matter on the basis of time and wind direction, were used to assess the feasibility of characterizing and identifying emission source types in urban multisource environments. Filters were evaluated for 16 elements, and X-ray fluorescence methods yielded elemental concentrations by direction, day, and the interaction of direction and day. Large numbers of samples are necessary to compensate for large day-to-day variations caused by wind perturbations and/or source changes.

  20. Analysis of the research sample collections of Uppsala biobank.

    PubMed

    Engelmark, Malin T; Beskow, Anna H

    2014-10-01

    Uppsala Biobank is the joint and only biobank organization of the two principals, Uppsala University and Uppsala University Hospital. Biobanks are required to have updated registries on sample collection composition and management in order to fulfill legal regulations. We report here the results from the first comprehensive and overall analysis of the 131 research sample collections organized in the biobank. The results show that the median of the number of samples in the collections was 700 and that the number of samples varied from less than 500 to over one million. Blood samples, such as whole blood, serum, and plasma, were included in the vast majority, 84.0%, of the research sample collections. Also, as much as 95.5% of the newly collected samples within healthcare included blood samples, which further supports the concept that blood samples have fundamental importance for medical research. Tissue samples were also commonly used and occurred in 39.7% of the research sample collections, often combined with other types of samples. In total, 96.9% of the 131 sample collections included samples collected for healthcare, showing the importance of healthcare as a research infrastructure. Of the collections that had accessed existing samples from healthcare, as much as 96.3% included tissue samples from the Department of Pathology, which shows the importance of pathology samples as a resource for medical research. Analysis of different research areas shows that the most common of known public health diseases are covered. Collections that had generated the most publications, up to over 300, contained a large number of samples collected systematically and repeatedly over many years. More knowledge about existing biobank materials, together with public registries on sample collections, will support research collaborations, improve transparency, and bring us closer to the goals of biobanks, which is to save and prolong human lives and improve health and quality of life.

  1. Morphological changes in polycrystalline Fe after compression and release

    NASA Astrophysics Data System (ADS)

    Gunkelmann, Nina; Tramontina, Diego R.; Bringa, Eduardo M.; Urbassek, Herbert M.

    2015-02-01

    Despite a number of large-scale molecular dynamics simulations of shock compressed iron, the morphological properties of simulated recovered samples are still unexplored. Key questions remain open in this area, including the role of dislocation motion and deformation twinning in shear stress release. In this study, we present simulations of homogeneous uniaxial compression and recovery of large polycrystalline iron samples. Our results reveal significant recovery of the body-centered cubic grains with some deformation twinning driven by shear stress, in agreement with experimental results by Wang et al. [Sci. Rep. 3, 1086 (2013)]. The twin fraction agrees reasonably well with a semi-analytical model which assumes a critical shear stress for twinning. On reloading, twins disappear and the material reaches a very low strength value.

  2. An Excel Workbook for Identifying Redox Processes in Ground Water

    USGS Publications Warehouse

    Jurgens, Bryant C.; McMahon, Peter B.; Chapelle, Francis H.; Eberts, Sandra M.

    2009-01-01

    The reduction/oxidation (redox) condition of ground water affects the concentration, transport, and fate of many anthropogenic and natural contaminants. The redox state of a ground-water sample is defined by the dominant type of reduction/oxidation reaction, or redox process, occurring in the sample, as inferred from water-quality data. However, because of the difficulty in defining and applying a systematic redox framework to samples from diverse hydrogeologic settings, many regional water-quality investigations do not attempt to determine the predominant redox process in ground water. Recently, McMahon and Chapelle (2008) devised a redox framework that was applied to a large number of samples from 15 principal aquifer systems in the United States to examine the effect of redox processes on water quality. This framework was expanded by Chapelle and others (in press) to use measured sulfide data to differentiate between iron(III)- and sulfate-reducing conditions. These investigations showed that a systematic approach to characterize redox conditions in ground water could be applied to datasets from diverse hydrogeologic settings using water-quality data routinely collected in regional water-quality investigations. This report describes the Microsoft Excel workbook, RedoxAssignment_McMahon&Chapelle.xls, that assigns the predominant redox process to samples using the framework created by McMahon and Chapelle (2008) and expanded by Chapelle and others (in press). Assignment of redox conditions is based on concentrations of dissolved oxygen (O2), nitrate (NO3-), manganese (Mn2+), iron (Fe2+), sulfate (SO42-), and sulfide (sum of dihydrogen sulfide [aqueous H2S], hydrogen sulfide [HS-], and sulfide [S2-]). The logical arguments for assigning the predominant redox process to each sample are performed by a program written in Microsoft Visual Basic for Applications (VBA). The program is called from buttons on the main worksheet. The number of samples that can be analyzed is only limited by the number of rows in Excel (65,536 for Excel 2003 and XP; and 1,048,576 for Excel 2007), and is therefore appropriate for large datasets.
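
    A simplified version of the assignment logic described above can be written as a small decision function. The concentration thresholds below are placeholders for illustration only, not the published values from McMahon and Chapelle (2008), and the actual workbook also handles mixed and indeterminate redox categories.

      def assign_redox(o2, no3, mn, fe, so4, sulfide):
          """Assign a dominant redox process from concentrations in mg/L.
          Thresholds are illustrative placeholders, not the published cutoffs."""
          OX, NIT, MN_CUT, FE_CUT, SULF, SO4_CUT = 0.5, 0.5, 0.05, 0.1, 0.5, 0.5
          if o2 >= OX:
              return "Oxic (O2 reduction)"
          if no3 >= NIT:
              return "Nitrate-reducing"
          if mn >= MN_CUT:
              return "Manganese-reducing"
          if fe >= FE_CUT:
              # Sulfide is used to separate iron(III)- from sulfate-reducing water.
              return "Sulfate-reducing" if sulfide >= SULF else "Iron(III)-reducing"
          if so4 >= SO4_CUT:
              return "Sulfate-reducing"
          return "Methanogenic or indeterminate"

      print(assign_redox(o2=0.2, no3=0.1, mn=0.02, fe=1.3, so4=12.0, sulfide=0.01))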

  3. A rat genetic map constructed by representational difference analysis markers with suitability for large-scale typing.

    PubMed Central

    Toyota, M; Canzian, F; Ushijima, T; Hosoya, Y; Kuramoto, T; Serikawa, T; Imai, K; Sugimura, T; Nagao, M

    1996-01-01

    Representational difference analysis (RDA) was applied to isolate chromosomal markers in the rat. Four series of RDA [restriction enzymes, BamHI and HindIII; subtraction of ACI/N (ACI) amplicon from BUF/Nac (BUF) amplicon and vice versa] yielded 131 polymorphic markers; 125 of these markers were mapped to all chromosomes except for chromosome X. This was done by using a mapping panel of 105 ACI x BUF F2 rats. To complement the relative paucity of chromosomal markers in the rat, genetically directed RDA, which allows isolation of polymorphic markers in the specific chromosomal region, was performed. By changing the F2 driver-DNA allele frequency around the region, four markers were isolated from the D1Ncc1 locus. Twenty-five of 27 RDA markers were informative regarding the dot blot analysis of amplicons, hybridizing only with tester amplicons. Dot blot analysis at a high density per unit of area made it possible to process a large number of samples. Quantitative trait loci can now be mapped in the rat genome by processing a large number of samples with RDA markers and then by isolating markers close to the loci of interest by genetically directed RDA. PMID:8632989

  4. Extreme Quantum Memory Advantage for Rare-Event Sampling

    NASA Astrophysics Data System (ADS)

    Aghamohammadi, Cina; Loomis, Samuel P.; Mahoney, John R.; Crutchfield, James P.

    2018-02-01

    We introduce a quantum algorithm for memory-efficient biased sampling of rare events generated by classical memoryful stochastic processes. Two efficiency metrics are used to compare quantum and classical resources for rare-event sampling. For a fixed stochastic process, the first is the classical-to-quantum ratio of required memory. We show for two example processes that there exists an infinite number of rare-event classes for which the memory ratio for sampling is larger than r, for any large real number r. Then, for a sequence of processes each labeled by an integer size N, we compare how the classical and quantum required memories scale with N. In this setting, since both memories can diverge as N → ∞, the efficiency metric tracks how fast they diverge. An extreme quantum memory advantage exists when the classical memory diverges in the limit N → ∞, but the quantum memory has a finite bound. We then show that finite-state Markov processes and spin chains exhibit memory advantage for sampling of almost all of their rare-event classes.

  5. A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

    PubMed

    Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

    2012-01-01

    High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer. Contact: chris.spencer@well.ox.ac.uk. Supplementary data are available at Bioinformatics online.

  6. The topology of large-scale structure. III - Analysis of observations

    NASA Astrophysics Data System (ADS)

    Gott, J. Richard, III; Miller, John; Thuan, Trinh X.; Schneider, Stephen E.; Weinberg, David H.; Gammie, Charles; Polk, Kevin; Vogeley, Michael; Jeffrey, Scott; Bhavsar, Suketu P.; Melott, Adrian L.; Giovanelli, Riccardo; Haynes, Martha P.; Tully, R. Brent; Hamilton, Andrew J. S.

    1989-05-01

    A recently developed algorithm for quantitatively measuring the topology of large-scale structures in the universe was applied to a number of important observational data sets. The data sets included an Abell (1958) cluster sample out to Vmax = 22,600 km/sec, the Giovanelli and Haynes (1985) sample out to Vmax = 11,800 km/sec, the CfA sample out to Vmax = 5000 km/sec, the Thuan and Schneider (1988) dwarf sample out to Vmax = 3000 km/sec, and the Tully (1987) sample out to Vmax = 3000 km/sec. It was found that, when the topology is studied on smoothing scales significantly larger than the correlation length (i.e., smoothing length, lambda, not below 1200 km/sec), the topology is spongelike and is consistent with the standard model in which the structure seen today has grown from small fluctuations caused by random noise in the early universe. When the topology is studied on the scale of lambda of about 600 km/sec, a small shift is observed in the genus curve in the direction of a 'meatball' topology.

  7. The topology of large-scale structure. III - Analysis of observations. [in universe

    NASA Technical Reports Server (NTRS)

    Gott, J. Richard, III; Weinberg, David H.; Miller, John; Thuan, Trinh X.; Schneider, Stephen E.

    1989-01-01

    A recently developed algorithm for quantitatively measuring the topology of large-scale structures in the universe was applied to a number of important observational data sets. The data sets included an Abell (1958) cluster sample out to Vmax = 22,600 km/sec, the Giovanelli and Haynes (1985) sample out to Vmax = 11,800 km/sec, the CfA sample out to Vmax = 5000 km/sec, the Thuan and Schneider (1988) dwarf sample out to Vmax = 3000 km/sec, and the Tully (1987) sample out to Vmax = 3000 km/sec. It was found that, when the topology is studied on smoothing scales significantly larger than the correlation length (i.e., smoothing length, lambda, not below 1200 km/sec), the topology is spongelike and is consistent with the standard model in which the structure seen today has grown from small fluctuations caused by random noise in the early universe. When the topology is studied on the scale of lambda of about 600 km/sec, a small shift is observed in the genus curve in the direction of a 'meatball' topology.

  8. Sizing for the apparel industry using statistical analysis - a Brazilian case study

    NASA Astrophysics Data System (ADS)

    Capelassi, C. H.; Carvalho, M. A.; El Kattel, C.; Xu, B.

    2017-10-01

    The study of the body measurements of Brazilian women used the Kinect Body Imaging system for 3D body scanning. The results of the study aim to meet the apparel industry's need for accurate measurements. Data were statistically treated using the IBM SPSS 23 system, at 95% confidence (P < 0.05) for the inferential analysis, with the purpose of grouping the measurements into sizes so that a smaller number of sizes can cover a greater number of people. The sample consisted of 101 volunteers aged between 19 and 62 years. A cluster analysis was performed to identify the main body shapes of the sample. The results were divided between the top and bottom body portions: for the top portion, the measurements of the abdomen, waist, and bust circumferences, as well as the height, were used; for the bottom portion, the measurements of the hip circumference and the height were used. Three sizing systems were developed for the researched sample from the Abdomen-to-Height Ratio - AHR (top portion): Small (AHR < 0.52), Medium (AHR 0.52-0.58), Large (AHR > 0.58), and from the Hip-to-Height Ratio - HHR (bottom portion): Small (HHR < 0.62), Medium (HHR 0.62-0.68), Large (HHR > 0.68).
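
    The size assignment described above reduces to two simple ratios; the sketch below applies the cutoffs reported in the abstract (0.52/0.58 for the abdomen-to-height ratio and 0.62/0.68 for the hip-to-height ratio), assuming both measurements are in the same unit. The example measurements are hypothetical.

      def top_size(abdomen, height):
          """Top-portion size from the Abdomen-to-Height Ratio (AHR)."""
          ahr = abdomen / height
          if ahr < 0.52:
              return "Small"
          return "Medium" if ahr <= 0.58 else "Large"

      def bottom_size(hip, height):
          """Bottom-portion size from the Hip-to-Height Ratio (HHR)."""
          hhr = hip / height
          if hhr < 0.62:
              return "Small"
          return "Medium" if hhr <= 0.68 else "Large"

      # Hypothetical measurements in centimetres.
      print(top_size(abdomen=88, height=165))     # AHR ~ 0.533 -> Medium
      print(bottom_size(hip=102, height=165))     # HHR ~ 0.618 -> Small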

  9. Bayesian pedigree inference with small numbers of single nucleotide polymorphisms via a factor-graph representation.

    PubMed

    Anderson, Eric C; Ng, Thomas C

    2016-02-01

    We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.

  10. Estimating Divergence Parameters With Small Samples From a Large Number of Loci

    PubMed Central

    Wang, Yong; Hey, Jody

    2010-01-01

    Most methods for studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be more suited to studying relatively ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data we developed a new maximum-likelihood method for genomic data under the isolation-with-migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster. PMID:19917765

  11. An atlas of L-T transition brown dwarfs with VLT/XShooter

    NASA Astrophysics Data System (ADS)

    Marocco, F.; Day-Jones, A. C.; Jones, H. R. A.; Pinfield, D. J.

    In this contribution we present the first results from a large observing campaign we are carrying out using VLT/Xshooter to obtain spectra of a large sample (˜250 objects) of L-T transition brown dwarfs. Here we report the results based on the first ˜120 spectra already obtained. The large sample, and the wide spectral coverage (300-2480 nm) given by Xshooter, will allow us to perform a new, powerful analysis at an unprecedented level. By fitting the absorption lines of a given element (e.g. Na) at different wavelengths we can test ultracool atmospheric models and draw for the first time a 3D picture of stellar atmospheres at temperatures down to 1000K. Determining the atmospheric parameters (e.g. temperature, surface gravity and metallicity) of a large sample of brown dwarfs will allow us to understand the role of these parameters in the formation of their spectra. The large number of objects in our sample will also allow us to perform a statistically significant test of the birth rate and initial mass function predictions for brown dwarfs. Determining the shape of the initial mass function for very low mass objects is a fundamental task to improve galaxy models, as recent studies (2010Natur.468..940V) have shown that low-mass objects dominate in massive elliptical galaxies.

  12. One Sample, One Shot - Evaluation of sample preparation protocols for the mass spectrometric proteome analysis of human bile fluid without extensive fractionation.

    PubMed

    Megger, Dominik A; Padden, Juliet; Rosowski, Kristin; Uszkoreit, Julian; Bracht, Thilo; Eisenacher, Martin; Gerges, Christian; Neuhaus, Horst; Schumacher, Brigitte; Schlaak, Jörg F; Sitek, Barbara

    2017-02-10

    The proteome analysis of bile fluid represents a promising strategy to identify biomarker candidates for various diseases of the hepatobiliary system. However, to obtain substantive results in biomarker discovery studies, large patient cohorts necessarily need to be analyzed. Consequently, this would lead to an unmanageable number of samples to be analyzed if sample preparation protocols with extensive fractionation methods are applied. Hence, the performance of simple workflows allowing for "one sample, one shot" experiments has been evaluated in this study. In detail, sixteen different protocols involving modifications at the stages of desalting, delipidation, deglycosylation and tryptic digestion have been examined. Each method has been individually evaluated regarding various performance criteria, and comparative analyses have been conducted to uncover possible complementarities. Here, the best performance in terms of proteome coverage was achieved by a combination of acetone precipitation with in-gel digestion. Finally, a mapping of all obtained protein identifications with putative biomarkers for hepatocellular carcinoma (HCC) and cholangiocellular carcinoma (CCC) revealed several proteins easily detectable in bile fluid. These results can build the basis for future studies with large and well-defined patient cohorts in a more disease-related context. Human bile fluid is a proximal body fluid and is supposed to be a potential source of disease markers. However, due to its biochemical composition, the proteome analysis of bile fluid still represents a challenging task and is therefore mostly conducted using extensive fractionation procedures. This in turn leads to a high number of mass spectrometric measurements for one biological sample. Considering the fact that a high number of biological samples needs to be analyzed in biomarker discovery studies in order to overcome biological variability, this leads to the dilemma of an unmanageable number of necessary MS-based analyses. Hence, easy sample preparation protocols, representing a compromise between proteome coverage and simplicity, are demanded. In the presented study, such protocols have been evaluated regarding various technical criteria (e.g. identification rates, missed cleavages, chromatographic separation) uncovering the strengths and weaknesses of various methods. Furthermore, a cumulative bile proteome list has been generated that extends the current bile proteome catalog by 248 proteins. Finally, a mapping with putative biomarkers for hepatocellular carcinoma (HCC) and cholangiocellular carcinoma (CCC) derived from tissue-based studies revealed several of these proteins being easily and reproducibly detectable in human bile. Therefore, the presented technical work represents a solid base for future disease-related studies. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Investigation of Large Capacity Optical Memories for Correlator Applications.

    DTIC Science & Technology

    1981-10-01

    refringence varies to a certain extent over the area of each sample film due to nonuniform stretching during manufacture. Such variations make the...made with the tank placed in a large number of positions in the scene -- clearly an excessive burden. Now when the targets are positioned in the scene

  14. Phased genotyping-by-sequencing enhances analysis of genetic diversity and reveals divergent copy number variants in maize

    USDA-ARS?s Scientific Manuscript database

    High-throughput sequencing of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken fr...

  15. THE EFFECT OF VARYING ELECTROFISHING DESIGN ON BIOASSESSMENT RESULTS OF FOUR LARGE RIVERS IN THE OHIO RIVER BASIN

    EPA Science Inventory

    In 1999, the effect of electrofishing design (single bank or paired banks) and sampling distance on bioassessment results was studied in four boatable rivers in the Ohio River basin. The relationship between the number of species collected and the total distance electrofished wa...

  16. An Assessment of Decision-Making Processes in Dual-Career Marriages.

    ERIC Educational Resources Information Center

    Kingsbury, Nancy M.

    As large numbers of women enter the labor force, decision making and power processes have assumed greater importance in marital relationships. A sample of 51 (N=101) dual-career couples were interviewed to assess independent variables predictive of process power, process outcome, and subjective outcomes of decision making in dual-career families.…

  17. A Size Exclusion Chromatography Laboratory with Unknowns for Introductory Students

    ERIC Educational Resources Information Center

    McIntee, Edward J.; Graham, Kate J.; Colosky, Edward C.; Jakubowski, Henry V.

    2015-01-01

    Size exclusion chromatography is an important technique in the separation of biological and polymeric samples by molecular weight. While a number of laboratory experiments have been published that use this technique for the purification of large molecules, this is the first report of an experiment that focuses on purifying an unknown small…

  18. Initial Development and Validation of the BullyHARM: The Bullying, Harassment, and Aggression Receipt Measure

    ERIC Educational Resources Information Center

    Hall, William J.

    2016-01-01

    This article describes the development and preliminary validation of the Bullying, Harassment, and Aggression Receipt Measure (BullyHARM). The development of the BullyHARM involved a number of steps and methods, including a literature review, expert review, cognitive testing, readability testing, data collection from a large sample, reliability…

  19. Transcriptome amplification coupled with nanopore sequencing as a surveillance tool for plant pathogens in plant and insect tissues

    USDA-ARS?s Scientific Manuscript database

    There are many plant pathogen-specific diagnostic assays, based on PCR and immune-detection. However, the ability to test for large numbers of pathogens simultaneously is lacking. Next generation sequencing (NGS) allows one to detect all organisms within a given sample, but has computational limitat...

  20. Developmental Stuttering in Children Who Are Hard of Hearing

    ERIC Educational Resources Information Center

    Arena, Richard M.; Walker, Elizabeth A.; Oleson, Jacob J.

    2017-01-01

    Purpose: A number of studies with large sample sizes have reported lower prevalence of stuttering in children with significant hearing loss compared to children without hearing loss. This study used a parent questionnaire to investigate the characteristics of stuttering (e.g., incidence, prevalence, and age of onset) in children who are hard of…

  1. Metapopulation models for historical inference.

    PubMed

    Wakeley, John

    2004-04-01

    The genealogical process for a sample from a metapopulation, in which local populations are connected by migration and can undergo extinction and subsequent recolonization, is shown to have a relatively simple structure in the limit as the number of populations in the metapopulation approaches infinity. The result, which is an approximation to the ancestral behaviour of samples from a metapopulation with a large number of populations, is the same as that previously described for other metapopulation models, namely that the genealogical process is closely related to Kingman's unstructured coalescent. The present work considers a more general class of models that includes two kinds of extinction and recolonization, and the possibility that gamete production precedes extinction. In addition, following other recent work, this result for a metapopulation divided into many populations is shown to hold both for finite population sizes and in the usual diffusion limit, which assumes that population sizes are large. Examples illustrate when the usual diffusion limit is appropriate and when it is not. Some shortcomings and extensions of the model are considered, and the relevance of such models to understanding human history is discussed.

  2. Glass sample preparation and performance investigations

    NASA Astrophysics Data System (ADS)

    Johnson, R. Barry

    1992-04-01

    This final report details the work performed under this delivery order from April 1991 through April 1992. The currently available capabilities for integrated optical performance modeling at MSFC for large and complex systems such as AXAF were investigated. The Integrated Structural Modeling (ISM) program developed by Boeing for the U.S. Air Force was obtained and installed on two DECstations 5000 at MSFC. The structural, thermal and optical analysis programs available in ISM were evaluated. As part of the optomechanical engineering activities, technical support was provided in the design of support structure, mirror assembly, filter wheel assembly and material selection for the Solar X-ray Imager (SXI) program. As part of the fabrication activities, a large number of zerodur glass samples were prepared in different sizes and shapes for acid etching, coating and polishing experiments to characterize the subsurface damage and stresses produced by the grinding and polishing operations. Various optical components for AXAF video microscope and the x-ray test facility were also fabricated. A number of glass fabrication and test instruments such as a scatter plate interferometer, a gravity feed saw and some phenolic cutting blades were fabricated, integrated and tested.

  3. Responses of arthropods to large-scale manipulations of dead wood in loblolly pine stands of the southeastern United States.

    PubMed

    Ulyshen, Michael D; Hanula, James L

    2009-08-01

    Large-scale experimental manipulations of dead wood are needed to better understand its importance to animal communities in managed forests. In this experiment, we compared the abundance, species richness, diversity, and composition of arthropods in 9.3-ha plots in which either (1) all coarse woody debris was removed, (2) a large number of logs were added, (3) a large number of snags were added, or (4) no coarse woody debris was added or removed. The target taxa were ground-dwelling arthropods, sampled by pitfall traps, and saproxylic beetles (i.e., dependent on dead wood), sampled by flight intercept traps and emergence traps. There were no differences in total ground-dwelling arthropod abundance, richness, diversity, or composition among treatments. Only the results for ground beetles (Carabidae), which were more species rich and diverse in log input plots, supported our prediction that ground-dwelling arthropods would benefit from additions of dead wood. There were also no differences in saproxylic beetle abundance, richness, diversity, or composition among treatments. The findings from this study are encouraging in that arthropods seem less sensitive than expected to manipulations of dead wood in managed pine forests of the southeastern United States. Based on our results, we cannot recommend inputting large amounts of dead wood for conservation purposes, given the expense of such measures. However, the persistence of saproxylic beetles requires that an adequate amount of dead wood is available in the landscape, and we recommend that dead wood be retained whenever possible in managed pine forests.

  4. Self-adaptive enhanced sampling in the energy and trajectory spaces: accelerated thermodynamics and kinetic calculations.

    PubMed

    Gao, Yi Qin

    2008-04-07

    Here, we introduce a simple self-adaptive computational method to enhance the sampling in energy, configuration, and trajectory spaces. The method makes use of two strategies. It first uses a non-Boltzmann distribution method to enhance the sampling in the phase space, in particular, in the configuration space. The application of this method leads to a broad energy distribution in a large energy range and a quickly converged sampling of molecular configurations. In the second stage of simulations, the configuration space of the system is divided into a number of small regions according to preselected collective coordinates. An enhanced sampling of reactive transition paths is then performed in a self-adaptive fashion to accelerate kinetics calculations.

  5. Investigation of real tissue water equivalent path lengths using an efficient dose extinction method

    NASA Astrophysics Data System (ADS)

    Zhang, Rongxiao; Baer, Esther; Jee, Kyung-Wook; Sharp, Gregory C.; Flanz, Jay; Lu, Hsiao-Ming

    2017-07-01

    For proton therapy, an accurate conversion of CT HU to relative stopping power (RSP) is essential. Validation of the conversion based on real tissue samples is more direct than the current practice solely based on tissue substitutes and can potentially address variations over the population. Based on a novel dose extinction method, we measured water equivalent path lengths (WEPL) on animal tissue samples to evaluate the accuracy of CT HU to RSP conversion and potential variations over a population. A broad proton beam delivered a spread out Bragg peak to the samples sandwiched between a water tank and a 2D ion-chamber detector. WEPLs of the samples were determined from the transmission dose profiles measured as a function of the water level in the tank. Tissue substitute inserts and Lucite blocks with known WEPLs were used to validate the accuracy. A large number of real tissue samples were measured. Variations of WEPL over different batches of tissue samples were also investigated. The measured WEPLs were compared with those computed from CT scans with the Stoichiometric calibration method. WEPLs were determined within  ±0.5% percentage deviation (% std/mean) and  ±0.5% error for most of the tissue surrogate inserts and the calibration blocks. For biological tissue samples, percentage deviations were within  ±0.3%. No considerable difference (<1%) in WEPL was observed for the same type of tissue from different sources. The differences between measured WEPLs and those calculated from CT were within 1%, except for some bony tissues. Depending on the sample size, each dose extinction measurement took around 5 min to produce ~1000 WEPL values to be compared with calculations. This dose extinction system measures WEPL efficiently and accurately, which allows the validation of CT HU to RSP conversions based on the WEPL measured for a large number of samples and real tissues.

  6. Quadruplex MAPH: improvement of throughput in high-resolution copy number screening.

    PubMed

    Tyson, Jess; Majerus, Tamsin Mo; Walker, Susan; Armour, John Al

    2009-09-28

    Copy number variation (CNV) in the human genome is recognised as a widespread and important source of human genetic variation. Now the challenge is to screen for these CNVs at high resolution in a reliable, accurate and cost-effective way. Multiplex Amplifiable Probe Hybridisation (MAPH) is a sensitive, high-resolution technology appropriate for screening for CNVs in a defined region, for a targeted population. We have developed MAPH to a highly multiplexed format ("QuadMAPH") that allows the user a four-fold increase in the number of loci tested simultaneously. We have used this method to analyse a genomic region of 210 kb, including the MSH2 gene and 120 kb of flanking DNA. We show that the QuadMAPH probes report copy number with equivalent accuracy to simplex MAPH, reliably demonstrating diploid copy number in control samples and accurately detecting deletions in Hereditary Non-Polyposis Colorectal Cancer (HNPCC) samples. QuadMAPH is an accurate, high-resolution method that allows targeted screening of large numbers of subjects without the expense of genome-wide approaches. Whilst we have applied this technique to a region of the human genome, it is equally applicable to the genomes of other organisms.

  7. Quadruplex MAPH: improvement of throughput in high-resolution copy number screening

    PubMed Central

    Tyson, Jess; Majerus, Tamsin MO; Walker, Susan; Armour, John AL

    2009-01-01

    Background Copy number variation (CNV) in the human genome is recognised as a widespread and important source of human genetic variation. Now the challenge is to screen for these CNVs at high resolution in a reliable, accurate and cost-effective way. Results Multiplex Amplifiable Probe Hybridisation (MAPH) is a sensitive, high-resolution technology appropriate for screening for CNVs in a defined region, for a targeted population. We have developed MAPH to a highly multiplexed format ("QuadMAPH") that allows the user a four-fold increase in the number of loci tested simultaneously. We have used this method to analyse a genomic region of 210 kb, including the MSH2 gene and 120 kb of flanking DNA. We show that the QuadMAPH probes report copy number with equivalent accuracy to simplex MAPH, reliably demonstrating diploid copy number in control samples and accurately detecting deletions in Hereditary Non-Polyposis Colorectal Cancer (HNPCC) samples. Conclusion QuadMAPH is an accurate, high-resolution method that allows targeted screening of large numbers of subjects without the expense of genome-wide approaches. Whilst we have applied this technique to a region of the human genome, it is equally applicable to the genomes of other organisms. PMID:19785739

  8. Generation Scotland: Donor DNA Databank; A control DNA resource.

    PubMed

    Kerr, Shona M; Liewald, David C M; Campbell, Archie; Taylor, Kerrie; Wild, Sarah H; Newby, David; Turner, Marc; Porteous, David J

    2010-11-23

    Many medical disorders of public health importance are complex diseases caused by multiple genetic, environmental and lifestyle factors. Recent technological advances have made it possible to analyse the genetic variants that predispose to complex diseases. Reliable detection of these variants requires genome-wide association studies in sufficiently large numbers of cases and controls. This approach is often hampered by difficulties in collecting appropriate control samples. The Generation Scotland: Donor DNA Databank (GS:3D) aims to help solve this problem by providing a resource of control DNA and plasma samples accessible for research. GS:3D participants were recruited from volunteer blood donors attending Scottish National Blood Transfusion Service (SNBTS) clinics across Scotland. All participants gave full written consent for GS:3D to take spare blood from their normal donation. Participants also supplied demographic data by completing a short questionnaire. Over five thousand complete sets of samples, data and consent forms were collected. DNA and plasma were extracted and stored. The data and samples were unlinked from their original SNBTS identifier number. The plasma, DNA and demographic data are available for research. New data obtained from analysis of the resource will be fed back to GS:3D and will be made available to other researchers as appropriate. Recruitment of blood donors is an efficient and cost-effective way of collecting thousands of control samples. Because the collection is large, subsets of controls can be selected, based on age range, gender, and ethnic or geographic origin. The GS:3D resource should reduce time and expense for investigators who would otherwise have had to recruit their own controls.

  9. An electrochemical immunoassay for the screening of celiac disease in saliva samples.

    PubMed

    Adornetto, Gianluca; Fabiani, Laura; Volpe, Giulia; De Stefano, Alessia; Martini, Sonia; Nenna, Raffaella; Lucantoni, Federica; Bonamico, Margherita; Tiberti, Claudio; Moscone, Danila

    2015-09-01

    A highly sensitive electrochemical immunoassay for the initial diagnosis of celiac disease (CD) in saliva samples that overcomes the problems related to its high viscosity and to the low concentration of anti-transglutaminase antigen (tTG) IgA in this medium has been developed for the first time. The system uses magnetic beads (MBs) covered with tTG, which reacts with the anti-tTG IgA antibodies present in positive saliva samples. An anti-human IgA, conjugated with alkaline phosphatase (AP) enzyme, was used as the label and a strip of eight magnetized screen-printed electrodes as the electrochemical transducer. In particular, two different immunoassay approaches were optimized and blindly compared to analyze a large number of saliva samples, whose anti-tTG IgA levels were independently determined by the radioimmunoassay (RIA) method. The obtained results, expressed as Ab index, were used to perform a diagnostic test evaluation through the construction of receiver operating characteristic (ROC) curves. The approach, involving a pre-incubation between the anti-human IgA-AP and saliva samples prior to the addition of MBs-tTG, showed a cutoff of 0.022 with 95% clinical sensitivity and 96% clinical specificity. The area under the ROC curve is equal to 1, a result that classifies our test as "perfect." This study demonstrates that it is possible to perform the screening of CD with a rapid, simple, inexpensive, and sensitive method able to detect anti-tTG antibodies in saliva samples, which are easily obtained by non-invasive techniques. This aspect is of fundamental importance to screen a large number of subjects, especially in the pediatric age.

  10. Scaling Impact-Melt and Crater Dimensions: Implications for the Lunar Cratering Record

    NASA Technical Reports Server (NTRS)

    Cintala, Mark J.; Grieve, Richard A. F.

    1997-01-01

    The consequences of impact on the solid bodies of the solar system are manifest and legion. Although the visible effects on planetary surfaces, such as the Moon's, are the most obvious testimony to the spatial and temporal importance of impacts, less dramatic chemical and petrographic characteristics of materials affected by shock abound. Both the morphologic and petrologic aspects of impact cratering are important in deciphering lunar history, and, ideally, each should complement the other. In practice, however, a gap has persisted in relating large-scale cratering processes to petrologic and geochemical data obtained from lunar samples. While this is due in no small part to the fact that no Apollo mission unambiguously sampled deposits of a large crater, it can also be attributed to the general state of our knowledge of cratering phenomena, particularly those accompanying large events. The most common shock-metamorphosed lunar samples are breccias, but a substantial number are impact-melt rocks. Indeed, numerous workers have called attention to the importance of impact-melt rocks spanning a wide range of ages in the lunar sample collection. Photogeologic studies also have demonstrated the widespread occurrence of impact-melt lithologies in and around lunar craters. Thus, it is clear that impact melting has been a fundamental process operating throughout lunar history, at scales ranging from pits formed on individual regolith grains to the largest impact basins. This contribution examines the potential relationship between impact melting on the Moon and the interior morphologies of large craters and peak-ring basins. It then examines some of the implications of impact melting at such large scales for lunar-sample provenance and evolution of the lunar crust.

  11. Development and validation of InnoQuant™, a sensitive human DNA quantitation and degradation assessment method for forensic samples using high copy number mobile elements Alu and SVA.

    PubMed

    Pineda, Gina M; Montgomery, Anne H; Thompson, Robyn; Indest, Brooke; Carroll, Marion; Sinha, Sudhir K

    2014-11-01

    There is a constant need in forensic casework laboratories for an improved way to increase the first-pass success rate of forensic samples. The recent advances in mini STR analysis, SNP, and Alu marker systems have now made it possible to analyze highly compromised samples, yet few tools are available that can simultaneously provide an assessment of quantity, inhibition, and degradation in a sample prior to genotyping. Currently there are several different approaches used for fluorescence-based quantification assays which provide a measure of quantity and inhibition. However, a system which can also assess the extent of degradation in a forensic sample will be a useful tool for DNA analysts. Possessing this information prior to genotyping will allow an analyst to more informatively make downstream decisions for the successful typing of a forensic sample without unnecessarily consuming DNA extract. Real-time PCR provides a reliable method for determining the amount and quality of amplifiable DNA in a biological sample. Alu elements are Short Interspersed Elements (SINEs), approximately 300 bp insertions which are distributed throughout the human genome in large copy number. The use of an internal primer to amplify a segment of an Alu element allows for human specificity as well as high sensitivity when compared to a single copy target. The advantage of an Alu system is the presence of a large number (>1000) of fixed insertions in every human genome, which minimizes the individual specific variation possible when using a multi-copy target quantification system. This study utilizes two independent retrotransposon genomic targets to obtain quantification of an 80 bp "short" DNA fragment and a 207 bp "long" DNA fragment in a degraded DNA sample in the multiplex system InnoQuant™. The ratio of the two quantitation values provides a "Degradation Index", or a qualitative measure of a sample's extent of degradation. The Degradation Index was found to be predictive of the observed loss of STR markers and alleles as degradation increases. Use of a synthetic target as an internal positive control (IPC) provides an additional assessment for the presence of PCR inhibitors in the test sample. In conclusion, a DNA-based qualitative/quantitative/inhibition assessment system that accurately predicts the status of a biological sample will be a valuable tool for deciding which DNA test kit to utilize and how much target DNA to use, when processing compromised forensic samples for DNA testing. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
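
    The degradation assessment itself is simple arithmetic: the assay quantifies an 80 bp "short" target and a 207 bp "long" target, and their ratio is reported as the Degradation Index. A minimal sketch of that calculation in Python (the example values and interpretation threshold are illustrative, not figures from the validation study):

        def degradation_index(short_ng_per_ul, long_ng_per_ul):
            """Ratio of the short-target to the long-target quantity.

            Values near 1 suggest intact template; larger values indicate
            progressive loss of longer fragments (degradation).
            """
            if long_ng_per_ul <= 0:
                raise ValueError("long-target quantity must be positive")
            return short_ng_per_ul / long_ng_per_ul

        # Example: 0.50 ng/uL on the 80 bp target but only 0.05 ng/uL on the
        # 207 bp target gives a Degradation Index of 10, flagging a heavily
        # degraded sample before any STR kit or template amount is chosen.
        print(degradation_index(0.50, 0.05))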

  12. Performance and precision of double digestion RAD (ddRAD) genotyping in large multiplexed datasets of marine fish species.

    PubMed

    Maroso, F; Hillen, J E J; Pardo, B G; Gkagkavouzis, K; Coscia, I; Hermida, M; Franch, R; Hellemans, B; Van Houdt, J; Simionati, B; Taggart, J B; Nielsen, E E; Maes, G; Ciavaglia, S A; Webster, L M I; Volckaert, F A M; Martinez, P; Bargelloni, L; Ogden, R

    2018-06-01

    The development of Genotyping-By-Sequencing (GBS) technologies enables cost-effective analysis of large numbers of Single Nucleotide Polymorphisms (SNPs), especially in "non-model" species. Nevertheless, as such technologies enter a mature phase, biases and errors inherent to GBS are becoming evident. Here, we evaluated the performance of double digest Restriction enzyme Associated DNA (ddRAD) sequencing in SNP genotyping studies including a high number of samples. Datasets of sequence data were generated from three marine teleost species (>5500 samples, >2.5 × 10¹² bases in total), using a standardized protocol. A common bioinformatics pipeline based on STACKS was established, with and without the use of a reference genome. We performed analyses throughout the production and analysis of ddRAD data in order to explore (i) the loss of information due to heterogeneous raw read number across samples; (ii) the discrepancy between expected and observed tag length and coverage; (iii) the performances of reference-based vs. de novo approaches; (iv) the sources of potential genotyping errors of the library preparation/bioinformatics protocol, by comparing technical replicates. Our results showed use of a reference genome and a posteriori genotype correction improved genotyping precision. Individual read coverage was a key variable for reproducibility; variance in sequencing depth between loci in the same individual was also identified as an important factor and found to correlate to tag length. A comparison of downstream analysis carried out with ddRAD vs. single SNP allele-specific assay genotypes provided information about the levels of genotyping imprecision that can have a significant impact on allele frequency estimations and population assignment. The results and insights presented here will help to select and improve approaches to the analysis of large datasets based on RAD-like methodologies. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
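
    One of the checks described above, estimating genotyping error by comparing technical replicates, reduces to a per-locus concordance calculation. A minimal Python sketch, assuming genotypes coded 0/1/2 with -1 for missing calls (the coding is an assumption for illustration, not the STACKS output format):

        import numpy as np

        def replicate_discordance(geno_a, geno_b):
            """Fraction of loci with conflicting calls between two technical
            replicates; loci missing in either replicate are ignored."""
            geno_a, geno_b = np.asarray(geno_a), np.asarray(geno_b)
            called = (geno_a >= 0) & (geno_b >= 0)
            if called.sum() == 0:
                return float("nan")
            return float(np.mean(geno_a[called] != geno_b[called]))

        # Genotypes coded as copies of the alternate allele; -1 = missing.
        rep1 = [0, 1, 2, 1, -1, 0, 2]
        rep2 = [0, 1, 1, 1,  2, 0, 2]
        print(replicate_discordance(rep1, rep2))  # 1 mismatch over 6 comparable loci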

  13. A high-throughput robotic sample preparation system and HPLC-MS/MS for measuring urinary anatabine, anabasine, nicotine and major nicotine metabolites.

    PubMed

    Wei, Binnian; Feng, June; Rehmani, Imran J; Miller, Sharyn; McGuffey, James E; Blount, Benjamin C; Wang, Lanqing

    2014-09-25

    Most sample preparation methods characteristically involve intensive and repetitive labor, which is inefficient when preparing large numbers of samples from population-scale studies. This study presents a robotic system designed to meet the sampling requirements for large population-scale studies. Using this robotic system, we developed and validated a method to simultaneously measure urinary anatabine, anabasine, nicotine and seven major nicotine metabolites: 4-Hydroxy-4-(3-pyridyl)butanoic acid, cotinine-N-oxide, nicotine-N-oxide, trans-3'-hydroxycotinine, norcotinine, cotinine and nornicotine. We analyzed robotically prepared samples using high-performance liquid chromatography (HPLC) coupled with triple quadrupole mass spectrometry in positive electrospray ionization mode using scheduled multiple reaction monitoring (sMRM) with a total runtime of 8.5 min. The optimized procedure was able to deliver linear analyte responses over a broad range of concentrations. Responses of urine-based calibrators delivered coefficients of determination (R²) of >0.995. Sample preparation recovery was generally higher than 80%. The robotic system was able to prepare four 96-well plates (384 urine samples) per day, and the overall method afforded an accuracy range of 92-115%, and an imprecision of <15.0% on average. The validation results demonstrate that the method is accurate, precise, sensitive, robust, and most significantly labor-saving for sample preparation, making it efficient and practical for routine measurements in large population-scale studies such as the National Health and Nutrition Examination Survey (NHANES) and the Population Assessment of Tobacco and Health (PATH) study. Published by Elsevier B.V.

  14. Productivity trends and collaboration patterns: A diachronic study in the eating disorders field

    PubMed Central

    Valderrama-Zurián, Juan-Carlos; Aguilar-Moya, Remedios; Cepeda-Benito, Antonio; Navarro-Moreno, María-Ángeles; Gandía-Balaguer, Asunción; Aleixandre-Benavent, Rafael

    2017-01-01

    Objective The present study seeks to extend previous bibliometric studies on eating disorders (EDs) by including a time-dependent analysis of the growth and evolution of multi-author collaborations and their correlation with ED publication trends from 1980 to 2014 (35 years). Methods Using standardized practices, we searched Web of Science (WoS) Core Collection (WoSCC) (indexes: Science Citation Index-Expanded [SCIE], & Social Science Citation Index [SSCI]) and Scopus (areas: Health Sciences, Life Sciences, & Social Sciences and Humanities) to identify a large sample of articles related to EDs. We then submitted our sample of articles to bibliometric and graph theory analyses to identify co-authorship and social network patterns. Results We present a large number of detailed findings, including a clear pattern of scientific growth measured as number of publications per five-year period or quinquennium (Q), a tremendous increase in the number of authors attracted by the ED subject, and a very high and steady growth in collaborative work. Conclusions We inferred that the noted publication growth was likely driven by the noted increase in the number of new authors per Q. Social network analyses suggested that collaborations within ED follow patterns of interaction that are similar to those of well-established and recognized disciplines, as indicated by the presence of a “giant cluster”, high cluster density, and the replication of the “small world” phenomenon—the principle that we are all linked by short chains of acquaintances. PMID:28850569

  15. Evaluating noninvasive genetic sampling techniques to estimate large carnivore abundance.

    PubMed

    Mumma, Matthew A; Zieminski, Chris; Fuller, Todd K; Mahoney, Shane P; Waits, Lisette P

    2015-09-01

    Monitoring large carnivores is difficult because of intrinsically low densities and can be dangerous if physical capture is required. Noninvasive genetic sampling (NGS) is a safe and cost-effective alternative to physical capture. We evaluated the utility of two NGS methods (scat detection dogs and hair sampling) to obtain genetic samples for abundance estimation of coyotes, black bears and Canada lynx in three areas of Newfoundland, Canada. We calculated abundance estimates using program capwire, compared sampling costs and cost per sample for each method relative to species and study site, and performed simulations to determine the sampling intensity necessary to achieve abundance estimates with coefficients of variation (CV) of <10%. Scat sampling was effective for both coyotes and bears and hair snags effectively sampled bears in two of three study sites. Rub pads were ineffective in sampling coyotes and lynx. The precision of abundance estimates was dependent upon the number of captures/individual. Our simulations suggested that ~3.4 captures/individual will result in a <10% CV for abundance estimates when populations are small (23-39), but fewer captures/individual may be sufficient for larger populations. We found scat sampling was more cost-effective for sampling multiple species, but suggest that hair sampling may be less expensive at study sites with limited road access for bears. Given the dependence of sampling scheme on species and study site, the optimal sampling scheme is likely to be study-specific, warranting pilot studies in most circumstances. © 2015 John Wiley & Sons Ltd.

  16. A comparison of confidence interval methods for the intraclass correlation coefficient in community-based cluster randomization trials with a binary outcome.

    PubMed

    Braschel, Melissa C; Svec, Ivana; Darlington, Gerarda A; Donner, Allan

    2016-04-01

    Many investigators rely on previously published point estimates of the intraclass correlation coefficient rather than on their associated confidence intervals to determine the required size of a newly planned cluster randomized trial. Although confidence interval methods for the intraclass correlation coefficient that can be applied to community-based trials have been developed for a continuous outcome variable, fewer methods exist for a binary outcome variable. The aim of this study is to evaluate confidence interval methods for the intraclass correlation coefficient applied to binary outcomes in community intervention trials enrolling a small number of large clusters. Existing methods for confidence interval construction are examined and compared to a new ad hoc approach based on dividing clusters into a large number of smaller sub-clusters and subsequently applying existing methods to the resulting data. Monte Carlo simulation is used to assess the width and coverage of confidence intervals for the intraclass correlation coefficient based on Smith's large sample approximation of the standard error of the one-way analysis of variance estimator, an inverted modified Wald test for the Fleiss-Cuzick estimator, and intervals constructed using a bootstrap-t applied to a variance-stabilizing transformation of the intraclass correlation coefficient estimate. In addition, a new approach is applied in which clusters are randomly divided into a large number of smaller sub-clusters with the same methods applied to these data (with the exception of the bootstrap-t interval, which assumes large cluster sizes). These methods are also applied to a cluster randomized trial on adolescent tobacco use for illustration. When applied to a binary outcome variable in a small number of large clusters, existing confidence interval methods for the intraclass correlation coefficient provide poor coverage. However, confidence intervals constructed using the new approach combined with Smith's method provide nominal or close to nominal coverage when the intraclass correlation coefficient is small (<0.05), as is the case in most community intervention trials. This study concludes that when a binary outcome variable is measured in a small number of large clusters, confidence intervals for the intraclass correlation coefficient may be constructed by dividing existing clusters into sub-clusters (e.g. groups of 5) and using Smith's method. The resulting confidence intervals provide nominal or close to nominal coverage across a wide range of parameters when the intraclass correlation coefficient is small (<0.05). Application of this method should provide investigators with a better understanding of the uncertainty associated with a point estimator of the intraclass correlation coefficient used for determining the sample size needed for a newly designed community-based trial. © The Author(s) 2015.
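
    The approach that performed best, dividing each large cluster into small sub-clusters and then applying an existing estimator and interval, is easy to prototype. A minimal Python sketch; the one-way ANOVA estimator below is standard, but the variance expression is the commonly cited large-sample form attributed to Smith and should be checked against the formula actually used in the paper:

        import numpy as np

        def anova_icc(groups):
            """One-way ANOVA estimator of the intraclass correlation for
            equal-sized clusters of binary (0/1) outcomes."""
            data = np.asarray(groups, dtype=float)
            k, m = data.shape                      # clusters, cluster size
            grand = data.mean()
            msb = m * np.sum((data.mean(axis=1) - grand) ** 2) / (k - 1)
            msw = np.sum((data - data.mean(axis=1, keepdims=True)) ** 2) / (k * (m - 1))
            return (msb - msw) / (msb + (m - 1) * msw)

        def smith_ci(rho, k, m, z=1.96):
            """Wald interval from a large-sample variance approximation
            (assumed Smith-type form; verify against the original source)."""
            var = 2 * (1 - rho) ** 2 * (1 + (m - 1) * rho) ** 2 / (m * (m - 1) * (k - 1))
            half = z * np.sqrt(var)
            return rho - half, rho + half

        # Toy data: 6 large community clusters of 200 binary outcomes each,
        # split at random into sub-clusters of 5 before estimation.
        rng = np.random.default_rng(1)
        clusters = [rng.binomial(1, rng.beta(8, 72), size=200) for _ in range(6)]
        sub = [c[i:i + 5] for c in (rng.permutation(x) for x in clusters)
               for i in range(0, 200, 5)]
        rho_hat = anova_icc(sub)
        print(rho_hat, smith_ci(rho_hat, k=len(sub), m=5))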

  17. Evidence of Lake Trout reproduction at Lake Michigan's mid-lake reef complex

    USGS Publications Warehouse

    Janssen, J.; Jude, D.J.; Edsall, T.A.; Paddock, R.W.; Wattrus, N.; Toneys, M.; McKee, P.

    2006-01-01

    The Mid-Lake Reef Complex (MLRC), a large area of deep (> 40 m) reefs, was a major site where indigenous lake trout (Salvelinus namaycush) in Lake Michigan aggregated during spawning. As part of an effort to restore Lake Michigan's lake trout, which were extirpated in the 1950s, yearling lake trout have been released over the MLRC since the mid-1980s and fall gill net censuses began to show large numbers of lake trout in spawning condition beginning about 1999. We report the first evidence of viable egg deposition and successful lake trout fry production at these deep reefs. Because the area's existing bathymetry and habitat were too poorly known for a priori selection of sampling sites, we used hydroacoustics to locate concentrations of large fish in the fall; fish were congregating around slopes and ridges. Subsequent observations via unmanned submersible confirmed the large fish to be lake trout. Our technological objectives were driven by biological objectives of locating where lake trout spawn, where lake trout fry were produced, and what fishes ate lake trout eggs and fry. The unmanned submersibles were equipped with a suction sampler and electroshocker to sample eggs deposited on the reef, draw out and occasionally catch emergent fry, and collect egg predators (slimy sculpin Cottus cognatus). We observed slimy sculpin to eat unusually high numbers of lake trout eggs. Our qualitative approaches are a first step toward quantitative assessments of the importance of lake trout spawning on the MLRC.

  18. Searching mixed DNA profiles directly against profile databases.

    PubMed

    Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John

    2014-03-01

    DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  19. Design of a practical model-observer-based image quality assessment method for x-ray computed tomography imaging systems

    PubMed Central

    Tseng, Hsin-Wu; Fan, Jiahua; Kupinski, Matthew A.

    2016-01-01

    The use of a channelization mechanism on model observers not only makes mimicking human visual behavior possible, but also reduces the amount of image data needed to estimate the model observer parameters. The channelized Hotelling observer (CHO) and channelized scanning linear observer (CSLO) have recently been used to assess CT image quality for detection tasks and combined detection/estimation tasks, respectively. Although the use of channels substantially reduces the amount of data required to compute image quality, the number of scans required for CT imaging is still not practical for routine use. It is our desire to further reduce the number of scans required to make CHO or CSLO an image quality tool for routine and frequent system validations and evaluations. This work explores different data-reduction schemes and designs an approach that requires only a few CT scans. Three different kinds of approaches are included in this study: a conventional CHO/CSLO technique with a large sample size, a conventional CHO/CSLO technique with fewer samples, and an approach that we will show requires fewer samples to mimic conventional performance with a large sample size. The mean value and standard deviation of the areas under the ROC/EROC curves were estimated using the well-validated shuffle approach. The results indicate that an 80% data reduction can be achieved without loss of accuracy. This substantial data reduction is a step toward a practical tool for routine-task-based QA/QC CT system assessment. PMID:27493982

  20. The relation between statistical power and inference in fMRI

    PubMed Central

    Wager, Tor D.; Yarkoni, Tal

    2017-01-01

    Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial—especially in regards to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20–30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resembles the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches. PMID:29155843
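
    The core point, that a weak diffuse brain-behavior correlation combined with small samples and stringent multiple-comparison thresholds yields very low power, can be reproduced with a few lines of simulation. A minimal Python sketch (the effect size, sample sizes, and threshold below are illustrative, not the paper's exact simulation settings):

        import numpy as np
        from scipy import stats

        def correlation_power(true_r, n, alpha, n_sim=5000, seed=0):
            """Monte Carlo power for detecting a Pearson brain-behavior
            correlation of size true_r with n subjects at threshold alpha."""
            rng = np.random.default_rng(seed)
            cov = [[1.0, true_r], [true_r, 1.0]]
            hits = 0
            for _ in range(n_sim):
                x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
                hits += stats.pearsonr(x, y)[1] < alpha
            return hits / n_sim

        # Weak diffuse effect (r = 0.2) at a corrected threshold:
        print(correlation_power(0.2, n=25, alpha=0.001))    # roughly 1% power
        print(correlation_power(0.2, n=500, alpha=0.001))   # rises to roughly 90%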

  1. GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets.

    PubMed

    Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin

    2017-01-01

    Selecting core subsets from plant genotype datasets is important for enhancing cost-effectiveness and shortening the time required for analyses of genome-wide association studies (GWAS), genomics-assisted breeding of crop species, etc. Recently, a large number of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, there is no software available for picking out the efficient and consistent core subset from such a huge dataset. It is necessary to develop software that can extract genetically important samples in a population with coherence. Here we present a new program, GenoCore, which can quickly and efficiently find the core subset representing the entire population. We introduce simple measures of coverage and diversity scores, which reflect genotype errors and genetic variations, and can help to select samples rapidly and accurately for a crop genotype dataset. Comparisons of our method to other core collection software using example datasets were performed to validate the performance according to genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, using less memory with more efficient scores, and shows greater genetic coverage compared to the other software tested. GenoCore was written in R language, and can be accessed online with an example dataset and test results at https://github.com/lovemun/Genocore.
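
    The general idea, scoring how well a growing subset covers the allelic variation of the full panel and greedily adding the most informative sample, can be sketched in a few lines. The coverage criterion below is a simplified stand-in for illustration only; GenoCore's actual coverage and diversity scores are defined in the paper and implemented in R:

        import numpy as np

        def greedy_core_subset(genotypes, target_coverage=0.99):
            """Greedily pick samples (rows) until the selected subset contains
            target_coverage of all (marker, genotype class) pairs seen in the
            full samples-by-markers matrix."""
            n_samples, n_markers = genotypes.shape
            universe = {(j, genotypes[i, j]) for i in range(n_samples)
                        for j in range(n_markers)}
            covered, selected = set(), []
            remaining = set(range(n_samples))
            while remaining and len(covered) < target_coverage * len(universe):
                gains = {i: sum((j, genotypes[i, j]) not in covered
                                for j in range(n_markers)) for i in remaining}
                best = max(gains, key=gains.get)
                selected.append(best)
                covered.update((j, genotypes[best, j]) for j in range(n_markers))
                remaining.discard(best)
            return selected

        rng = np.random.default_rng(0)
        geno = rng.integers(0, 3, size=(120, 500))   # 120 samples, 500 SNPs coded 0/1/2
        core = greedy_core_subset(geno)
        print(len(core), "samples cover 99% of all marker-genotype classes")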

  2. Testing AGN unification via inference from large catalogs

    NASA Astrophysics Data System (ADS)

    Nikutta, Robert; Ivezic, Zeljko; Elitzur, Moshe; Nenkova, Maia

    2018-01-01

    Source orientation and clumpiness of the central dust are the main factors in AGN classification. Type-1 QSOs are easy to observe and large samples are available (e.g. in SDSS), but obscured type-2 AGN are dimmer and redder as our line of sight is more obscured, making it difficult to obtain a complete sample. WISE has found up to a million QSOs. With only 4 bands and a relatively small aperture the analysis of individual sources is challenging, but the large sample allows inference of bulk properties at a very significant level.CLUMPY (www.clumpy.org) is arguably the most popular database of AGN torus SEDs. We model the ensemble properties of the entire WISE AGN content using regularized linear regression, with orientation-dependent CLUMPY color-color-magnitude (CCM) tracks as basis functions. We can reproduce the observed number counts per CCM bin with percent-level accuracy, and simultaneously infer the probability distributions of all torus parameters, redshifts, additional SED components, and identify type-1/2 AGN populations through their IR properties alone. We increase the statistical power of our AGN unification tests even further, by adding other datasets as axes in the regression problem. To this end, we make use of the NOAO Data Lab (datalab.noao.edu), which hosts several high-level large datasets and provides very powerful tools for handling large data, e.g. cross-matched catalogs, fast remote queries, etc.
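
    The regression step, reproducing the observed counts per color-color-magnitude bin as a non-negative weighted sum of model tracks, can be illustrated with a toy analogue. This is not the CLUMPY/WISE pipeline; the templates, bin counts, and ridge penalty below are synthetic placeholders:

        import numpy as np

        def fit_track_weights(templates, counts, ridge=1.0, n_iter=5000):
            """Non-negative ridge regression by projected gradient descent:
            counts (per CCM bin) ~ templates @ weights, with weights >= 0."""
            n_bins, n_tracks = templates.shape
            A = np.vstack([templates, np.sqrt(ridge) * np.eye(n_tracks)])
            b = np.concatenate([counts, np.zeros(n_tracks)])
            w = np.zeros(n_tracks)
            step = 1.0 / np.linalg.norm(A, 2) ** 2
            for _ in range(n_iter):
                w = np.clip(w - step * A.T @ (A @ w - b), 0.0, None)
            return w

        rng = np.random.default_rng(3)
        T = rng.random((200, 30))                 # 200 CCM bins, 30 orientation tracks
        true_w = np.zeros(30); true_w[[2, 7, 19]] = [500.0, 300.0, 120.0]
        observed = rng.poisson(T @ true_w).astype(float)
        print(np.round(fit_track_weights(T, observed, ridge=0.1), 1))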

  3. Vitamin D receptor gene and osteoporosis - author's response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Looney, J.E.; Yoon, Hyun Koo; Fischer, M.

    1996-04-01

    We appreciate the comments of Dr. Nguyen et al. about our recent study, but we disagree with their suggestion that the lack of an association between low bone density and the BB VDR genotype, which we reported, is an artifact generated by the small sample size. Furthermore, our results are consistent with similar conclusions reached by a number of other investigators, as recently reported by Peacock. Peacock states “Taken as a whole, the results of studies outlined ... indicate that VDR alleles cannot account for the major part of the heritable component of bone density as indicated by Morrison et al.” The majority of the 17 studies cited in this editorial could not confirm an association between the VDR genotype and the bone phenotype. Surely one cannot criticize this combined work as representing an artifact because of too small a sample size. We do not dispute the suggestion by Nguyen et al. that large sample sizes are required to analyze small biological effects. This is evident in both Peacock's summary and in their own bone density studies. We did not design our study with a larger sample size because, based on the work of Morrison et al., we had hypothesized a large biological effect; large sample sizes are only needed for small biological effects. 4 refs.

  4. The VLT-FLAMES Tarantula Survey

    NASA Astrophysics Data System (ADS)

    Vink, Jorick S.; Evans, C. J.; Bestenlehner, J.; McEvoy, C.; Ramírez-Agudelo, O.; Sana, H.; Schneider, F.; VFTS Collaboration

    2017-11-01

    We present a number of notable results from the VLT-FLAMES Tarantula Survey (VFTS), an ESO Large Program during which we obtained multi-epoch medium-resolution optical spectroscopy of a very large sample of over 800 massive stars in the 30 Doradus region of the Large Magellanic Cloud (LMC). This unprecedented data-set has enabled us to address some key questions regarding atmospheres and winds, as well as the evolution of (very) massive stars. Here we focus on O-type runaways, the width of the main sequence, and the mass-loss rates for (very) massive stars. We also provide indications for the presence of a top-heavy initial mass function (IMF) in 30 Dor.

  5. The enigmatic molar from Gondolin, South Africa: implications for Paranthropus paleobiology.

    PubMed

    Grine, Frederick E; Jacobs, Rachel L; Reed, Kaye E; Plavcan, J Michael

    2012-10-01

    The specific attribution of the large hominin M(2) (GDA-2) from Gondolin has significant implications for the paleobiology of Paranthropus. If it is a specimen of Paranthropus robustus it impacts that species' size range, and if it belongs to Paranthropus boisei it has important biogeographic implications. We evaluate crown size, cusp proportions and the likelihood of encountering a large-bodied mammal species in both East and South Africa in the Early Pleistocene. The tooth falls well outside the P. robustus sample range, and comfortably within that for penecontemporaneous P. boisei. Analyses of sample range, distribution and variability suggest that it is possible, albeit unlikely to find a M(2) of this size in the current P. robustus sample. However, taphonomic agents - carnivore (particularly leopard) feeding behaviors - have likely skewed the size distribution of the Swartkrans and Drimolen P. robustus assemblage. In particular, assemblages of large-bodied mammals accumulated by leopards typically display high proportions of juveniles and smaller adults. The skew in the P. robustus sample is consistent with this type of assemblage. Morphological evidence in the form of cusp proportions is congruent with GDA-2 representing P. robustus rather than P. boisei. The comparatively small number of large-bodied mammal species common to both South and East Africa in the Early Pleistocene suggests a low probability of encountering an herbivorous australopith in both. Our results are most consistent with the interpretation of the Gondolin molar as a very large specimen of P. robustus. This, in turn, suggests that large, presumptive male, specimens are rare, and that the levels of size variation (sexual dimorphism) previously ascribed to this species are likely to be gross underestimates. Copyright © 2012 Elsevier Ltd. All rights reserved.

  6. Influence of the type of oxidant on anion exchange properties of fibrous Cladophora cellulose/polypyrrole composites.

    PubMed

    Razaq, Aamir; Mihranyan, Albert; Welch, Ken; Nyholm, Leif; Strømme, Maria

    2009-01-15

    The electrochemically controlled anion absorption properties of a novel large surface area composite paper material composed of polypyrrole (PPy) and cellulose derived from Cladophora sp. algae, synthesized with two oxidizing agents, iron(III) chloride and phosphomolybdic acid (PMo), were analyzed in four different electrolytes containing anions (i.e., chloride, aspartate, glutamate, and p-toluenesulfonate) of varying size. The composites were characterized with scanning and transmission electron microscopy, N₂ gas adsorption, and conductivity measurements. The potential-controlled ion exchange properties of the materials were studied by cyclic voltammetry and chronoamperometry at varying potentials. The surface area and conductivity of the iron(III) chloride synthesized sample were 58.8 m²/g and 0.65 S/cm, respectively, while the corresponding values for the PMo synthesized sample were 31.3 m²/g and 0.12 S/cm. The number of absorbed ions per sample mass was found to be larger for the iron(III) chloride synthesized sample than for the PMo synthesized one in all four electrolytes. Although the largest extraction yields were obtained in the presence of the smallest anion (i.e., chloride) for both samples, the relative degree of extraction for the largest ions (i.e., glutamate and p-toluenesulfonate) was higher for the PMo sample. This clearly shows that it is possible to increase the extraction yield of large anions by carrying out the PPy polymerization in the presence of large anions. The results likewise show that high ion exchange capacities, as well as extraction and desorption rates, can be obtained for large anions with high surface area composites coated with relatively thin layers of PPy.

  7. Random sampling technique for ultra-fast computations of molecular opacities for exoplanet atmospheres

    NASA Astrophysics Data System (ADS)

    Min, M.

    2017-10-01

    Context. Opacities of molecules in exoplanet atmospheres rely on increasingly detailed line lists for these molecules. The line lists available today contain up to several billion lines for many species. Computation of the spectral line profile created by pressure and temperature broadening, the Voigt profile, of all of these lines is becoming a computational challenge. Aims: We aim to create a method to compute the Voigt profile in a way that automatically focusses the computation time into the strongest lines, while still maintaining the continuum contribution of the high number of weaker lines. Methods: Here, we outline a statistical line sampling technique that samples the Voigt profile quickly and with high accuracy. The number of samples is adjusted to the strength of the line and the local spectral line density. This automatically provides high accuracy line shapes for strong lines or lines that are spectrally isolated. The line sampling technique automatically preserves the integrated line opacity for all lines, thereby also providing the continuum opacity created by the large number of weak lines at very low computational cost. Results: The line sampling technique is tested for accuracy when computing line spectra and correlated-k tables. Extremely fast computations (~3.5 × 10⁵ lines per second per core on a standard current-day desktop computer) with high accuracy (≤1% almost everywhere) are obtained. A detailed recipe on how to perform the computations is given.
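
    A toy one-dimensional analogue of the sampling idea is sketched below: each line receives a number of Monte Carlo samples that grows with its strength, every sample deposits an equal share of the line's integrated opacity, and the total opacity is therefore conserved no matter how few samples a weak line gets. Lorentzian profiles stand in for the full pressure-and-temperature-broadened Voigt profiles, and the per-line sample counts are illustrative:

        import numpy as np

        def sample_line_opacity(centers, strengths, gammas, grid, base=10, extra=100):
            """Deposit Lorentzian line profiles onto a wavenumber grid by
            Monte Carlo sampling; the sample count scales with line strength
            and each line's integrated opacity is preserved exactly."""
            rng = np.random.default_rng(0)
            dnu = grid[1] - grid[0]
            opacity = np.zeros_like(grid)
            s_max = strengths.max()
            for nu0, s, gam in zip(centers, strengths, gammas):
                n = base + int(extra * s / s_max)               # strong lines: many samples
                u = rng.random(n)
                draws = nu0 + gam * np.tan(np.pi * (u - 0.5))   # Lorentzian deviates
                idx = np.clip(np.round((draws - grid[0]) / dnu).astype(int),
                              0, len(grid) - 1)
                np.add.at(opacity, idx, s / n / dnu)            # equal share per sample
            return opacity

        grid = np.linspace(0.0, 100.0, 2001)
        rng = np.random.default_rng(1)
        centers = rng.uniform(0.0, 100.0, 10_000)
        strengths = 10.0 ** rng.uniform(-6.0, 0.0, 10_000)
        gammas = np.full(10_000, 0.05)
        kappa = sample_line_opacity(centers, strengths, gammas, grid)
        print(kappa.sum() * (grid[1] - grid[0]), strengths.sum())   # integrals agree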

  8. Normal incidence X-ray mirror for chemical microanalysis

    DOEpatents

    Carr, Martin J.; Romig, Jr., Alton D.

    1990-01-01

    A non-planar, focusing mirror, to be utilized in both electron column instruments and micro-x-ray fluorescence instruments for performing chemical microanalysis on a sample, comprises a concave, generally spherical base substrate and a predetermined number of alternating layers of high atomic number material and low atomic number material contiguously formed on the base substrate. The thickness of each layer is an integral multiple of the wavelength being reflected and may vary non-uniformly according to a predetermined design. The chemical analytical instruments in which the mirror is used also include a predetermined energy source for directing energy onto the sample and a detector for receiving and detecting the x-rays emitted from the sample; the non-planar mirror is located between the sample and detector and collects the x-rays emitted from the sample at a large solid angle and focuses the collected x-rays to the sample. For electron column instruments, the wavelengths of interest lie above 1.5 nm, while for x-ray fluorescence instruments, the range of interest is below 0.2 nm. Also, x-ray fluorescence instruments include an additional non-planar focusing mirror, formed in the same manner as the previously described mirror. The invention described herein was made in the performance of work under contract with the Department of Energy, Contract No. DE-AC04-76DP00789, and the United States Government has rights in the invention pursuant to this contract.

  9. A Simple Sampling Method for Estimating the Accuracy of Large Scale Record Linkage Projects.

    PubMed

    Boyd, James H; Guiver, Tenniel; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Anderson, Phil; Dickinson, Teresa

    2016-05-17

    Record linkage techniques allow different data collections to be brought together to provide a wider picture of the health status of individuals. Ensuring high linkage quality is important to guarantee the quality and integrity of research. Current methods for measuring linkage quality typically focus on precision (the proportion of incorrect links), given the difficulty of measuring the proportion of false negatives. The aim of this work is to introduce and evaluate a sampling based method to estimate both precision and recall following record linkage. In the sampling based method, record-pairs from each threshold (including those below the identified cut-off for acceptance) are sampled and clerically reviewed. These results are then applied to the entire set of record-pairs, providing estimates of false positives and false negatives. This method was evaluated on a synthetically generated dataset, where the true match status (which records belonged to the same person) was known. The sampled estimates of linkage quality were relatively close to actual linkage quality metrics calculated for the whole synthetic dataset. The precision and recall measures for seven reviewers were very consistent with little variation in the clerical assessment results (overall agreement using the Fleiss Kappa statistics was 0.601). This method presents as a possible means of accurately estimating matching quality and refining linkages in population level linkage studies. The sampling approach is especially important for large project linkages where the number of record pairs produced may be very large often running into millions.
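
    The estimator described above amounts to clerically reviewing a random sample of record-pairs in every score bin, including bins below the acceptance cut-off, and scaling the reviewed match rates up to the full bin counts. A minimal Python sketch under assumed inputs (the bin structure and the review callback are placeholders, not the paper's implementation):

        import random

        def estimate_precision_recall(bins, cutoff, review, sample_size=100):
            """bins   : dict mapping score bin -> list of record-pair ids
               cutoff : bins at or above this score are accepted as links
               review : callable(pair_id) -> True if clerical review says match"""
            tp = fp = fn = 0.0
            for score, pairs in bins.items():
                sampled = random.sample(pairs, min(sample_size, len(pairs)))
                match_rate = sum(review(p) for p in sampled) / len(sampled)
                est_matches = match_rate * len(pairs)        # scale to whole bin
                if score >= cutoff:
                    tp += est_matches                        # links made, correct
                    fp += len(pairs) - est_matches           # links made, incorrect
                else:
                    fn += est_matches                        # true matches missed
            return tp / (tp + fp), tp / (tp + fn)            # precision, recall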

  10. Detection of 12 respiratory viruses by duplex real time PCR assays in respiratory samples.

    PubMed

    Arvia, Rosaria; Corcioli, Fabiana; Ciccone, Nunziata; Della Malva, Nunzia; Azzi, Alberta

    2015-12-01

    Different viruses can be responsible for similar clinical manifestations of respiratory infections. Thus, the etiological diagnosis of respiratory viral diseases requires the detection of a large number of viruses. In this study, 6 duplex real-time PCR assays, using EvaGreen intercalating dye, were developed to detect 12 major viruses responsible for respiratory diseases: influenza A and B viruses, enteroviruses (including enterovirus spp, and rhinovirus spp), respiratory syncytial virus, human metapneumovirus, coronaviruses group I (of which CoV 229E and CoV NL63 are part) and II (including CoV OC43 and CoV HKU1), parainfluenza viruses type 1, 2, 3 and 4, human adenoviruses and human bocaviruses. The 2 target viruses of each duplex reaction were distinguishable by the melting temperatures of their amplicons. The 6 duplex real time PCR assays were applied for diagnostic purpose on 202 respiratory samples from 157 patients. One hundred fifty-seven samples were throat swabs and 45 were bronchoalveolar lavages. The results of the duplex PCR assays were confirmed by comparison with a commercial, validated, assay; in addition, the positive results were confirmed by sequencing. The analytical sensitivity of the duplex PCR assays varied from 10³ copies/ml to 10⁴ copies/ml. For parainfluenza virus 2 only it was 10⁵ copies/ml. Seventy clinical samples (35%) from 55 patients (30 children and 25 adults) were positive for 1 or more viruses. In adult patients, influenza A virus was the most frequently detected respiratory virus followed by rhinoviruses. In contrast, respiratory syncytial virus was the most common virus in children, followed by enteroviruses, influenza A virus and coronavirus NL63. The small number of samples/patients does not allow us to draw any epidemiological conclusion. Altogether, the results of this study indicate that the 6 duplex PCR assays described in this study are sensitive, specific and cost-effective. Thus, this assay could be particularly useful to identify the main respiratory viruses directly from clinical samples, after nucleic acid extraction, and, also, to screen a large number of patients for epidemiological studies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Extending the Reach of IGSN Beyond Earth: Implementing IGSN Registration to Link Nasa's Apollo Lunar Samples and Their Data

    NASA Technical Reports Server (NTRS)

    Todd, Nancy S.

    2016-01-01

    The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB (Petrological Database of the Ocean Floor). In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.

  12. Extending the Reach of IGSN Beyond Earth: Implementing IGSN Registration to Link NASA's Apollo Lunar Samples and their Data

    NASA Astrophysics Data System (ADS)

    Todd, N. S.

    2016-12-01

    The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB. In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.

  13. Linear models for airborne-laser-scanning-based operational forest inventory with small field sample size and highly correlated LiDAR data

    USGS Publications Warehouse

    Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.

    2015-01-01

    Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
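
    As a rough illustration of why combining the singular value decomposition of the predictors with regularization helps when LiDAR metrics are many and strongly correlated, the sketch below fits a ridge-type linear model through the SVD; it is a generic numerical example with simulated data, not the authors' Bayesian formulation.

        import numpy as np

        def ridge_via_svd(X, y, alpha=1.0):
            """Ridge regression coefficients computed from the SVD of the centered predictors."""
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            d = s / (s**2 + alpha)          # small singular values (noisy directions) are damped
            return Vt.T @ (d * (U.T @ y))

        rng = np.random.default_rng(0)
        n_plots, n_metrics = 50, 80         # few field plots, many correlated LiDAR metrics
        latent = rng.normal(size=(n_plots, 3))
        X = latent @ rng.normal(size=(3, n_metrics)) + 0.05 * rng.normal(size=(n_plots, n_metrics))
        y = X @ rng.normal(size=n_metrics) + rng.normal(scale=0.5, size=n_plots)

        Xc, yc = X - X.mean(axis=0), y - y.mean()
        beta = ridge_via_svd(Xc, yc, alpha=10.0)
        rmse = np.sqrt(np.mean((yc - Xc @ beta) ** 2))
        print(f"in-sample RMSE with {n_plots} plots and {n_metrics} metrics: {rmse:.3f}")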

  14. Analysis of volatile organic compounds. [trace amounts of organic volatiles in gas samples

    NASA Technical Reports Server (NTRS)

    Zlatkis, A. (Inventor)

    1977-01-01

    An apparatus and method are described for reproducibly analyzing trace amounts of a large number of organic volatiles existing in a gas sample. Direct injection of the trapped volatiles into a cryogenic precolumn provides a sharply defined plug. Applications of the method include: (1) analyzing the headspace gas of body fluids and comparing a profile of the organic volatiles with standard profiles for the detection and monitoring of disease; (2) analyzing the headspace gas of foods and beverages and comparing the profile with standard profiles to monitor and control flavor and aroma; and (3) analyses for determining the organic pollutants in air or water samples.

  15. A guide to large-scale RNA sample preparation.

    PubMed

    Baronti, Lorenzo; Karlsson, Hampus; Marušič, Maja; Petzold, Katja

    2018-05-01

    RNA is becoming more important as an increasing number of functions, both regulatory and enzymatic, are being discovered on a daily basis. As the RNA boom has just begun, most techniques are still in development and changes occur frequently. To understand RNA functions, revealing the structure of RNA is of utmost importance, which requires sample preparation. We review the latest methods to produce and purify a variety of RNA molecules for different purposes, with the main focus on structural biology and biophysics. We present a guide aimed at identifying the most suitable method for your RNA and your biological question and highlighting the advantages of different methods. Graphical abstract: In this review we present different methods for large-scale production and purification of RNAs for structural and biophysical studies.

  16. Intelligent Detection of Structure from Remote Sensing Images Based on Deep Learning Method

    NASA Astrophysics Data System (ADS)

    Xin, L.

    2018-04-01

    Utilizing high-resolution remote sensing images for Earth observation has become a common method of land-use monitoring. Traditional image interpretation requires substantial human participation, which is inefficient and makes accuracy difficult to guarantee. At present, artificial-intelligence methods such as deep learning have clear advantages for image recognition. With a large number of remote sensing image samples and deep neural network models, objects of interest such as buildings can be deciphered rapidly. In terms of both efficiency and accuracy, deep learning methods are superior. This paper describes research on deep learning methods using a large set of remote sensing image samples and verifies the feasibility of building extraction through experiments.
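
    As a purely illustrative sketch (not the network used in the paper), a small convolutional classifier that labels remote sensing image patches as building or non-building could look as follows, assuming TensorFlow/Keras is available and labelled patches have already been extracted.

        import tensorflow as tf

        # Minimal patch classifier: 64x64 RGB patches -> building / non-building.
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(64, 64, 3)),
            tf.keras.layers.Conv2D(16, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        # model.fit(train_patches, train_labels, validation_split=0.2, epochs=10)  # placeholder data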

  17. Estimation of reference intervals from small samples: an example using canine plasma creatinine.

    PubMed

    Geffré, A; Braun, J P; Trumel, C; Concordet, D

    2009-12-01

    According to international recommendations, reference intervals should be determined from at least 120 reference individuals, which often are impossible to achieve in veterinary clinical pathology, especially for wild animals. When only a small number of reference subjects is available, the possible bias cannot be known and the normality of the distribution cannot be evaluated. A comparison of reference intervals estimated by different methods could be helpful. The purpose of this study was to compare reference limits determined from a large set of canine plasma creatinine reference values, and large subsets of this data, with estimates obtained from small samples selected randomly. Twenty sets each of 120 and 27 samples were randomly selected from a set of 1439 plasma creatinine results obtained from healthy dogs in another study. Reference intervals for the whole sample and for the large samples were determined by a nonparametric method. The estimated reference limits for the small samples were minimum and maximum, mean +/- 2 SD of native and Box-Cox-transformed values, 2.5th and 97.5th percentiles by a robust method on native and Box-Cox-transformed values, and estimates from diagrams of cumulative distribution functions. The whole sample had a heavily skewed distribution, which approached Gaussian after Box-Cox transformation. The reference limits estimated from small samples were highly variable. The closest estimates to the 1439-result reference interval for 27-result subsamples were obtained by both parametric and robust methods after Box-Cox transformation but were grossly erroneous in some cases. For small samples, it is recommended that all values be reported graphically in a dot plot or histogram and that estimates of the reference limits be compared using different methods.
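
    The comparison of estimation methods described above can be reproduced generically. The sketch below computes a nonparametric reference interval from a large skewed sample and parametric (mean +/- 2 SD) estimates from a small subsample, before and after a Box-Cox transformation; it uses simulated data rather than the canine creatinine values.

        import numpy as np
        from scipy import stats
        from scipy.special import inv_boxcox

        rng = np.random.default_rng(1)
        reference_values = rng.lognormal(mean=4.6, sigma=0.25, size=1439)   # skewed "creatinine-like" data

        # Nonparametric limits from the full sample: 2.5th and 97.5th percentiles.
        lo, hi = np.percentile(reference_values, [2.5, 97.5])
        print(f"nonparametric interval (n=1439): {lo:.1f} - {hi:.1f}")

        # Small random subsample, as when only 27 reference individuals are available.
        small = rng.choice(reference_values, size=27, replace=False)

        # Parametric estimate on native values (assumes normality, often inappropriate for skewed data).
        print("mean +/- 2 SD (native):", small.mean() - 2 * small.std(ddof=1),
              small.mean() + 2 * small.std(ddof=1))

        # Parametric estimate after Box-Cox transformation, back-transformed to original units.
        transformed, lmbda = stats.boxcox(small)
        m, s = transformed.mean(), transformed.std(ddof=1)
        print("mean +/- 2 SD (Box-Cox):", inv_boxcox(m - 2 * s, lmbda), inv_boxcox(m + 2 * s, lmbda))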

  18. Advanced proteomic liquid chromatography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Fang; Smith, Richard D.; Shen, Yufeng

    2012-10-26

    Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput.

  19. The Use and Validation of Qualitative Methods Used in Program Evaluation.

    ERIC Educational Resources Information Center

    Plucker, Frank E.

    When conducting a two-year college program review, there are several advantages to supplementing the standard quantitative research approach with qualitative measures. Qualitative research does not depend on a large number of random samples, it uses a flexible design which can be refined as the research is executed, and it generates findings in a…

  20. Basic Numerical Capacities and Prevalence of Developmental Dyscalculia: The Havana Survey

    ERIC Educational Resources Information Center

    Reigosa-Crespo, Vivian; Valdes-Sosa, Mitchell; Butterworth, Brian; Estevez, Nancy; Rodriguez, Marisol; Santos, Elsa; Torres, Paul; Suarez, Ramon; Lage, Agustin

    2012-01-01

    The association of enumeration and number comparison capacities with arithmetical competence was examined in a large sample of children from 2nd to 9th grades. It was found that efficiency on numerical capacities predicted separately more than 25% of the variance in the individual differences on a timed arithmetical test, and this occurred for…

  1. Limits on the Accuracy of Linking. Research Report. ETS RR-10-22

    ERIC Educational Resources Information Center

    Haberman, Shelby J.

    2010-01-01

    Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…

  2. Gender in Adolescent Autonomy: Distinction between Boys and Girls Accelerates at 16 Years of Age

    ERIC Educational Resources Information Center

    Fleming, Manuela

    2005-01-01

    Introduction: Autonomy is a major developmental feature of adolescents. Its success mediates transition into adulthood. It involves a number of psychological parameters, including desire, conflict with parents and actual achievement. Method: How male and female adolescents view autonomy was investigated in a large sample of 12-17 year-old…

  3. Comments on Correlations of IQ with Skin Color and Geographic-Demographic Variables

    ERIC Educational Resources Information Center

    Jensen, Arthur R.

    2006-01-01

    A large number of national and geographic population samples were used to test the hypothesis that the variation in mean values of skin color in the diverse populations are consistently correlated with the mean measured or estimated IQs of the various groups, as are some other physical variables, known as an ecological correlation. Straightforward…

  4. Family Meals and Child Academic and Behavioral Outcomes

    ERIC Educational Resources Information Center

    Miller, Daniel P.; Waldfogel, Jane; Han, Wen-Jui

    2012-01-01

    This study investigates the link between the frequency of family breakfasts and dinners and child academic and behavioral outcomes in a panel sample of 21,400 children aged 5-15. It complements previous work by examining younger and older children separately and by using information on a large number of controls and rigorous analytic methods to…

  5. The Effect of a Teacher's Sex on Career Development.

    ERIC Educational Resources Information Center

    Reich, Carol

    A survey was conducted of a sample of 1,163 male and female teachers, consultants, and administrators of a large, urban school system. Data was collected about their formal qualifications, job performance, the extent to which they had been encouraged to apply for promotion, the number of applications they had made, and the positions they had held.…

  6. An assessment of re-randomization methods in bark beetle (Scolytidae) trapping bioassays

    Treesearch

    Christopher J. Fettig; Christopher P. Dabney; Stepehen R. McKelvey; Robert R. Borys

    2006-01-01

    Numerous studies have explored the role of semiochemicals in the behavior of bark beetles (Scolytidae). Multiple funnel traps are often used to elucidate these behavioral responses. Sufficient sample sizes are obtained by using large numbers of traps to which treatments are randomly assigned once, or by frequent collection of trap catches and subsequent re-...

  7. Variations in petrophysical properties of shales along a stratigraphic section in the Whitby mudstone (UK)

    NASA Astrophysics Data System (ADS)

    Barnhoorn, Auke; Houben, Maartje; Lie-A-Fat, Joella; Ravestein, Thomas; Drury, Martyn

    2015-04-01

    In unconventional tough gas reservoirs (e.g. tight sandstones or shales) the presence of fractures, either naturally formed or hydraulically induced, is almost always a prerequisite for hydrocarbon productivity to be economically viable. One of the formations classified so far as potentially interesting for shale gas exploration in the Netherlands is the Lower Jurassic Posidonia Shale Formation (PSF). However, data on the Posidonia Shale Formation are scarce and samples are hard to come by; in particular, little is known about the variability and heterogeneity of the petrophysical parameters of this shale. Therefore, research and sample collection were conducted on a time and depositional analogue of the PSF: the Whitby Mudstone Formation (WMF) in the United Kingdom. A large number of samples along a ~7 m stratigraphic section of the Whitby Mudstone Formation have been collected and analysed. Standard petrophysical properties such as porosity and matrix density are quantified for a number of samples throughout the section, as well as mineral composition based on XRD/XRF and SEM analyses. Seismic velocity measurements are also conducted at multiple heights in the section and in multiple directions to elaborate on the anisotropy of the material. Attenuation anisotropy is incorporated, as well as Thomsen's parameters combined with elastic parameters, e.g. Young's modulus and Poisson's ratio, to quantify the elastic anisotropy. Furthermore, rock mechanical experiments are conducted to determine the elastic constants, rock strength, fracture characteristics, brittleness index, fraccability and rock mechanical anisotropy across the stratigraphic section of the Whitby Mudstone Formation. Results show that the WMF is highly anisotropic, with anisotropy at the upper limit of values reported for US gas shales. The high anisotropy of the Whitby shales exerts an even larger control on the formation of the fracture network. Furthermore, most petrophysical properties are highly variable: they vary from sample to sample, and even within a sample large variations in e.g. the porosity occur at the mm scale. These relatively large variations influence the potential for future shale gas exploration of these Lower Jurassic shales in northern Europe and need to be quantified in detail beforehand. Compositional analyses and rock deformation experiments on the first samples indicate relatively low brittleness indices for the Whitby shale, but variations of these parameters within the stratigraphy are present. All petrophysical analyses combined will provide a complete assessment of the potential for shale gas exploration of these Lower Jurassic shales.

  8. Total Extracellular Small RNA Profiles from Plasma, Saliva, and Urine of Healthy Subjects

    PubMed Central

    Yeri, Ashish; Courtright, Amanda; Reiman, Rebecca; Carlson, Elizabeth; Beecroft, Taylor; Janss, Alex; Siniard, Ashley; Richholt, Ryan; Balak, Chris; Rozowsky, Joel; Kitchen, Robert; Hutchins, Elizabeth; Winarta, Joseph; McCoy, Roger; Anastasi, Matthew; Kim, Seungchan; Huentelman, Matthew; Van Keuren-Jensen, Kendall

    2017-01-01

    Interest in circulating RNAs for monitoring and diagnosing human health has grown significantly. There are few datasets describing baseline expression levels for total cell-free circulating RNA from healthy control subjects. In this study, total extracellular RNA (exRNA) was isolated and sequenced from 183 plasma samples, 204 urine samples and 46 saliva samples from 55 male college athletes ages 18–25 years. Many participants provided more than one sample, allowing us to investigate variability in an individual’s exRNA expression levels over time. Here we provide a systematic analysis of small exRNAs present in each biofluid, as well as an analysis of exogenous RNAs. The small RNA profile of each biofluid is distinct. We find that a large number of RNA fragments in plasma (63%) and urine (54%) have sequences that are assigned to YRNA and tRNA fragments respectively. Surprisingly, while many miRNAs can be detected, there are few miRNAs that are consistently detected in all samples from a single biofluid, and profiles of miRNA are different for each biofluid. Not unexpectedly, saliva samples have high levels of exogenous sequence that can be traced to bacteria. These data significantly contribute to the current number of sequenced exRNA samples from normal healthy individuals. PMID:28303895

  9. Bacterial diversity of surface sand samples from the Gobi and Taklamaken deserts.

    PubMed

    An, Shu; Couteau, Cécile; Luo, Fan; Neveu, Julie; DuBow, Michael S

    2013-11-01

    Arid regions represent nearly 30 % of the Earth's terrestrial surface, but their microbial biodiversity is not yet well characterized. The surface sands of deserts, a subset of arid regions, are generally subjected to large temperature fluctuations plus high UV light exposure and are low in organic matter. We examined surface sand samples from the Taklamaken (China, three samples) and Gobi (Mongolia, two samples) deserts, using pyrosequencing of PCR-amplified 16S V1/V2 rDNA sequences from total extracted DNA in order to gain an assessment of the bacterial population diversity. In total, 4,088 OTUs (using ≥97 % sequence similarity levels), with Chao1 estimates varying from 1,172 to 2,425 OTUs per sample, were discernable. These could be grouped into 102 families belonging to 15 phyla, with OTUs belonging to the Firmicutes, Proteobacteria, Bacteroidetes, and Actinobacteria phyla being the most abundant. The bacterial population composition was statistically different among the samples, though members from 30 genera were found to be common among the five samples. An increase in phylotype numbers with increasing C/N ratio was noted, suggesting a possible role in the bacterial richness of these desert sand environments. Our results imply an unexpectedly large bacterial diversity residing in the harsh environment of these two Asian deserts, worthy of further investigation.
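
    The Chao1 estimates quoted above are driven by the counts of OTUs observed exactly once and twice in a sample. A minimal sketch of the bias-corrected Chao1 estimator, applied to a made-up abundance vector, is shown below.

        import numpy as np

        def chao1(otu_counts):
            """Bias-corrected Chao1 richness estimate from a vector of OTU abundances."""
            counts = np.asarray(otu_counts)
            s_obs = np.count_nonzero(counts)          # observed OTUs
            f1 = np.sum(counts == 1)                  # singletons
            f2 = np.sum(counts == 2)                  # doubletons
            return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))

        # Toy abundance vector standing in for one sequenced sand sample.
        example = [120, 40, 7, 3, 2, 2, 1, 1, 1, 1, 1]
        print("observed OTUs:", np.count_nonzero(example), "| Chao1 estimate:", chao1(example))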

  10. Probabilistic generation of random networks taking into account information on motifs occurrence.

    PubMed

    Bois, Frederic Y; Gayraud, Ghislaine

    2015-01-01

    Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli.
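
    As a highly simplified illustration of sampling directed graphs while controlling a motif count (not the authors' formal probabilistic representation), the sketch below runs a Metropolis sampler over adjacency matrices whose target density penalizes deviation from a prescribed number of feed-forward loops.

        import numpy as np

        def count_ffl(A):
            """Feed-forward loops (i->j, j->k, i->k) in a 0/1 adjacency matrix with zero diagonal."""
            return int(np.einsum("ij,jk,ik->", A, A, A))

        def sample_graph(n_nodes=30, target_ffl=40, beta=0.5, n_steps=20000, seed=0):
            rng = np.random.default_rng(seed)
            A = np.zeros((n_nodes, n_nodes), dtype=int)

            def energy(a):
                return beta * (count_ffl(a) - target_ffl) ** 2

            e = energy(A)
            for _ in range(n_steps):
                i, j = rng.integers(n_nodes, size=2)
                if i == j:
                    continue
                A[i, j] ^= 1                      # propose toggling one directed edge
                e_new = energy(A)
                if rng.random() < np.exp(min(0.0, e - e_new)):
                    e = e_new                     # accept the move
                else:
                    A[i, j] ^= 1                  # reject: undo the toggle
            return A

        G = sample_graph()
        print("feed-forward loops in sampled graph:", count_ffl(G))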

  11. Placing and preserving priorities: projects, productivity, progress and people

    PubMed Central

    Babiak, John

    1998-01-01

    High throughput screening (HTS) involves using automated equipment to test a large number of samples against a defined molecular target to identify a reasonable number of active molecules in a timely fashion. Major factors which can influence priorities for the limited resources of the HTS group are projects, productivity, progress and people. The challenge to the HTS group is to provide excellent and timely screening services, but still devote efforts to new technologies and personnel development. This article explains why these factors are so important. PMID:18924829

  12. Probabilistic Generation of Random Networks Taking into Account Information on Motifs Occurrence

    PubMed Central

    Bois, Frederic Y.

    2015-01-01

    Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli. PMID:25493547

  13. Set size and culture influence children's attention to number.

    PubMed

    Cantrell, Lisa; Kuwabara, Megumi; Smith, Linda B

    2015-03-01

    Much research evidences a system in adults and young children for approximately representing quantity. Here we provide evidence that the bias to attend to discrete quantity versus other dimensions may be mediated by set size and culture. Preschool-age English-speaking children in the United States and Japanese-speaking children in Japan were tested in a match-to-sample task where number was pitted against cumulative surface area in both large and small numerical set comparisons. Results showed that children from both cultures were biased to attend to the number of items for small sets. Large set responses also showed a general attention to number when ratio difficulty was easy. However, relative to the responses for small sets, attention to number decreased for both groups; moreover, both U.S. and Japanese children showed a significant bias to attend to total amount for difficult numerical ratio distances, although Japanese children shifted attention to total area at relatively smaller set sizes than U.S. children. These results add to our growing understanding of how quantity is represented and how such representation is influenced by context--both cultural and perceptual. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. The use of laser-induced fluorescence or ultraviolet detectors for sensitive and selective analysis of tobramycin or erythropoietin in complex samples

    NASA Astrophysics Data System (ADS)

    Ahmed, Hytham M.; Ebeid, Wael B.

    2015-05-01

    Complex sample analysis is a challenge in pharmaceutical and biopharmaceutical analysis. In this work, tobramycin (TOB) analysis in human urine samples and recombinant human erythropoietin (rhEPO) analysis in the presence of a similar protein were selected as representative examples of such analyses. Assays of TOB in urine samples are difficult because of poor detectability. Therefore, a laser-induced fluorescence (LIF) detector was combined with a separation technique, micellar electrokinetic chromatography (MEKC), to determine TOB through derivatization with fluorescein isothiocyanate (FITC). Borate was used as background electrolyte (BGE) with negatively charged mixed micelles as additive. The method was successfully applied to urine samples. The LOD and LOQ for tobramycin in urine were 90 and 200 ng/ml, respectively, and recovery was >98% (n = 5). All urine samples were analyzed by direct injection without sample pre-treatment. Another hyphenated analytical technique, capillary zone electrophoresis (CZE) coupled to an ultraviolet (UV) detector, was also used for sensitive analysis of rhEPO at low levels (2000 IU) in the presence of a large amount of human serum albumin (HSA). Analysis of rhEPO was achieved by the use of electrokinetic injection (EI) with discontinuous buffers. Phosphate buffer was used as BGE with metal ions as additive. The proposed method can be used for the estimation of a large number of quality-control rhEPO samples in a short period.

  15. Micro injector sample delivery system for charged molecules

    DOEpatents

    Davidson, James C.; Balch, Joseph W.

    1999-11-09

    A micro injector sample delivery system for charged molecules. The injector is used for collecting and delivering controlled amounts of charged molecule samples for subsequent analysis. The injector delivery system can be scaled to large numbers (>96) for sample delivery to massively parallel high throughput analysis systems. The essence of the injector system is an electric field controllable loading tip including a section of porous material. By applying the appropriate polarity bias potential to the injector tip, charged molecules will migrate into porous material, and by reversing the polarity bias potential the molecules are ejected or forced away from the tip. The invention has application for uptake of charged biological molecules (e.g. proteins, nucleic acids, polymers, etc.) for delivery to analytical systems, and can be used in automated sample delivery systems.

  16. SignalPlant: an open signal processing software platform.

    PubMed

    Plesinger, F; Jurco, J; Halamek, J; Jurak, P

    2016-07-01

    The growing technical standard of acquisition systems allows the acquisition of large records, often reaching gigabytes or more in size, as is the case with whole-day electroencephalograph (EEG) recordings, for example. Although current 64-bit software for signal processing is able to process (e.g. filter, analyze) such data, visual inspection and labeling will probably suffer from rather long latency during the rendering of large portions of recorded signals. For this reason, we have developed SignalPlant, a stand-alone application for signal inspection, labeling and processing. The main motivation was to supply investigators with a tool allowing fast and interactive work with large multichannel records produced by EEG, electrocardiograph and similar devices. The rendering latency was compared with EEGLAB and proves significantly faster when displaying an image from a large number of samples (e.g. 163 times faster for 75 × 10^6 samples). The presented SignalPlant software is available free and does not depend on any other computation software. Furthermore, it can be extended with plugins by third parties, ensuring its adaptability to future research tasks and new data formats.
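
    A common way to keep rendering latency low for very long signals (a generic technique, not necessarily the one SignalPlant uses internally) is min-max downsampling: each screen column displays only the extremes of the samples it covers, so the plotted array shrinks from millions of points to a few thousand.

        import numpy as np

        def minmax_downsample(signal, n_bins):
            """Reduce a long 1-D signal to interleaved per-bin (min, max) pairs for fast plotting."""
            usable = len(signal) - len(signal) % n_bins       # drop the ragged tail for simplicity
            bins = signal[:usable].reshape(n_bins, -1)
            # Interleaving min and max preserves the signal envelope in the rendered polyline.
            return np.column_stack((bins.min(axis=1), bins.max(axis=1))).ravel()

        # Synthetic stand-in for a long recording (7.5 million samples to keep the demo light).
        raw = np.random.default_rng(2).normal(size=7_500_000).astype(np.float32)
        envelope = minmax_downsample(raw, n_bins=2000)        # 4000 plotted points instead of 7.5 million
        print(envelope.shape)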

  17. Correlation spectrometer

    DOEpatents

    Sinclair, Michael B [Albuquerque, NM; Pfeifer, Kent B [Los Lunas, NM; Flemming, Jeb H [Albuquerque, NM; Jones, Gary D [Tijeras, NM; Tigges, Chris P [Albuquerque, NM

    2010-04-13

    A correlation spectrometer can detect a large number of gaseous compounds, or chemical species, with a species-specific mask wheel. In this mode, the spectrometer is optimized for the direct measurement of individual target compounds. Additionally, the spectrometer can measure the transmission spectrum from a given sample of gas. In this mode, infrared light is passed through a gas sample and the infrared transmission signature of the gasses present is recorded and measured using Hadamard encoding techniques. The spectrometer can detect the transmission or emission spectra in any system where multiple species are present in a generally known volume.

  18. Nowcasting and Forecasting the Monthly Food Stamps Data in the US Using Online Search Data

    PubMed Central

    Fantazzini, Dean

    2014-01-01

    We propose the use of Google online search data for nowcasting and forecasting the number of food stamps recipients. We perform a large out-of-sample forecasting exercise with almost 3000 competing models with forecast horizons up to 2 years ahead, and we show that models including Google search data statistically outperform the competing models at all considered horizons. These results hold also with several robustness checks, considering alternative keywords, a falsification test, different out-of-samples, directional accuracy and forecasts at the state-level. PMID:25369315

  19. Identification of stochastic interactions in nonlinear models of structural mechanics

    NASA Astrophysics Data System (ADS)

    Kala, Zdeněk

    2017-07-01

    In this paper, a polynomial approximation is presented by which Sobol sensitivity analysis can be evaluated with all sensitivity indices. The nonlinear FEM model is approximated, and the input space is mapped using simulation runs of the Latin Hypercube Sampling method. The domain of the approximation polynomial is chosen so that a large number of Latin Hypercube Sampling simulation runs can be applied. The presented method also makes it possible to evaluate higher-order sensitivity indices, which could not be identified with the nonlinear FEM model directly.
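
    For orientation, the sketch below estimates first-order Sobol indices by brute-force Monte Carlo for a cheap analytic test function; the paper's polynomial approximation of a nonlinear FEM model fed by Latin Hypercube samples is replaced here by direct evaluation of the test function.

        import numpy as np

        def f(x):
            """Cheap stand-in for an expensive FEM response (Ishigami test function)."""
            return np.sin(x[:, 0]) + 7.0 * np.sin(x[:, 1]) ** 2 + 0.1 * x[:, 2] ** 4 * np.sin(x[:, 0])

        rng = np.random.default_rng(3)
        n, d = 100_000, 3
        A = rng.uniform(-np.pi, np.pi, size=(n, d))
        B = rng.uniform(-np.pi, np.pi, size=(n, d))
        yA, yB = f(A), f(B)
        var_y = np.var(np.concatenate([yA, yB]))

        for i in range(d):
            AB_i = A.copy()
            AB_i[:, i] = B[:, i]                              # "pick-freeze": replace only column i
            S_i = np.mean(yB * (f(AB_i) - yA)) / var_y        # Saltelli-style first-order estimator
            print(f"first-order Sobol index S_{i + 1} ~= {S_i:.3f}")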

  20. Transport of fluorobenzoate tracers in a vegetated hydrologic control volume: 1. Experimental results

    NASA Astrophysics Data System (ADS)

    Queloz, Pierre; Bertuzzo, Enrico; Carraro, Luca; Botter, Gianluca; Miglietta, Franco; Rao, P. S. C.; Rinaldo, Andrea

    2015-04-01

    This paper reports about the experimental evidence collected on the transport of five fluorobenzoate tracers injected under controlled conditions in a vegetated hydrologic volume, a large lysimeter (fitted with load cells, sampling ports, and an underground chamber) where two willows prompting large evapotranspiration fluxes had been grown. The relevance of the study lies in the direct and indirect measures of the ways in which hydrologic fluxes, in this case, evapotranspiration from the upper surface and discharge from the bottom drainage, sample water and solutes in storage at different times under variable hydrologic forcings. Methods involve the accurate control of hydrologic inputs and outputs and a large number of suitable chemical analyses of water samples in discharge waters. Mass extraction from biomass has also been performed ex post. The results of the 2 year long experiment established that our initial premises on the tracers' behavior, known to be sorption-free under saturated conditions which we verified in column leaching tests, were unsuitable as large differences in mass recovery appeared. Issues on reactivity thus arose and were addressed in the paper, in this case attributed to microbial degradation and solute plant uptake. Our results suggest previously unknown features of fluorobenzoate compounds as hydrologic tracers, potentially interesting for catchment studies owing to their suitability for distinguishable multiple injections, and an outlook on direct experimental closures of mass balance in hydrologic transport volumes involving fluxes that are likely to sample differently stored water and solutes.

  1. Methodological Considerations in Estimation of Phenotype Heritability Using Genome-Wide SNP Data, Illustrated by an Analysis of the Heritability of Height in a Large Sample of African Ancestry Adults

    PubMed Central

    Chen, Fang; He, Jing; Zhang, Jianqi; Chen, Gary K.; Thomas, Venetta; Ambrosone, Christine B.; Bandera, Elisa V.; Berndt, Sonja I.; Bernstein, Leslie; Blot, William J.; Cai, Qiuyin; Carpten, John; Casey, Graham; Chanock, Stephen J.; Cheng, Iona; Chu, Lisa; Deming, Sandra L.; Driver, W. Ryan; Goodman, Phyllis; Hayes, Richard B.; Hennis, Anselm J. M.; Hsing, Ann W.; Hu, Jennifer J.; Ingles, Sue A.; John, Esther M.; Kittles, Rick A.; Kolb, Suzanne; Leske, M. Cristina; Monroe, Kristine R.; Murphy, Adam; Nemesure, Barbara; Neslund-Dudas, Christine; Nyante, Sarah; Ostrander, Elaine A; Press, Michael F.; Rodriguez-Gil, Jorge L.; Rybicki, Ben A.; Schumacher, Fredrick; Stanford, Janet L.; Signorello, Lisa B.; Strom, Sara S.; Stevens, Victoria; Van Den Berg, David; Wang, Zhaoming; Witte, John S.; Wu, Suh-Yuh; Yamamura, Yuko; Zheng, Wei; Ziegler, Regina G.; Stram, Alexander H.; Kolonel, Laurence N.; Marchand, Loïc Le; Henderson, Brian E.; Haiman, Christopher A.; Stram, Daniel O.

    2015-01-01

    Height has an extremely polygenic pattern of inheritance. Genome-wide association studies (GWAS) have revealed hundreds of common variants that are associated with human height at genome-wide levels of significance. However, only a small fraction of phenotypic variation can be explained by the aggregate of these common variants. In a large study of African-American men and women (n = 14,419), we genotyped and analyzed 966,578 autosomal SNPs across the entire genome using a linear mixed model variance components approach implemented in the program GCTA (Yang et al Nat Genet 2010), and estimated an additive heritability of 44.7% (se: 3.7%) for this phenotype in a sample of evidently unrelated individuals. While this estimated value is similar to that given by Yang et al in their analyses, we remain concerned about two related issues: (1) whether in the complete absence of hidden relatedness, variance components methods have adequate power to estimate heritability when a very large number of SNPs are used in the analysis; and (2) whether estimation of heritability may be biased, in real studies, by low levels of residual hidden relatedness. We addressed the first question in a semi-analytic fashion by directly simulating the distribution of the score statistic for a test of zero heritability with and without low levels of relatedness. The second question was addressed by a very careful comparison of the behavior of estimated heritability for both observed (self-reported) height and simulated phenotypes compared to imputation R2 as a function of the number of SNPs used in the analysis. These simulations help to address the important question about whether today's GWAS SNPs will remain useful for imputing causal variants that are discovered using very large sample sizes in future studies of height, or whether the causal variants themselves will need to be genotyped de novo in order to build a prediction model that ultimately captures a large fraction of the variability of height, and by implication other complex phenotypes. Our overall conclusions are that when study sizes are quite large (5,000 or so) the additive heritability estimate for height is not apparently biased upwards using the linear mixed model; however there is evidence in our simulation that a very large number of causal variants (many thousands) each with very small effect on phenotypic variance will need to be discovered to fill the gap between the heritability explained by known versus unknown causal variants. We conclude that today's GWAS data will remain useful in the future for causal variant prediction, but that finding the causal variants that need to be predicted may be extremely laborious. PMID:26125186

  2. Methodological Considerations in Estimation of Phenotype Heritability Using Genome-Wide SNP Data, Illustrated by an Analysis of the Heritability of Height in a Large Sample of African Ancestry Adults.

    PubMed

    Chen, Fang; He, Jing; Zhang, Jianqi; Chen, Gary K; Thomas, Venetta; Ambrosone, Christine B; Bandera, Elisa V; Berndt, Sonja I; Bernstein, Leslie; Blot, William J; Cai, Qiuyin; Carpten, John; Casey, Graham; Chanock, Stephen J; Cheng, Iona; Chu, Lisa; Deming, Sandra L; Driver, W Ryan; Goodman, Phyllis; Hayes, Richard B; Hennis, Anselm J M; Hsing, Ann W; Hu, Jennifer J; Ingles, Sue A; John, Esther M; Kittles, Rick A; Kolb, Suzanne; Leske, M Cristina; Millikan, Robert C; Monroe, Kristine R; Murphy, Adam; Nemesure, Barbara; Neslund-Dudas, Christine; Nyante, Sarah; Ostrander, Elaine A; Press, Michael F; Rodriguez-Gil, Jorge L; Rybicki, Ben A; Schumacher, Fredrick; Stanford, Janet L; Signorello, Lisa B; Strom, Sara S; Stevens, Victoria; Van Den Berg, David; Wang, Zhaoming; Witte, John S; Wu, Suh-Yuh; Yamamura, Yuko; Zheng, Wei; Ziegler, Regina G; Stram, Alexander H; Kolonel, Laurence N; Le Marchand, Loïc; Henderson, Brian E; Haiman, Christopher A; Stram, Daniel O

    2015-01-01

    Height has an extremely polygenic pattern of inheritance. Genome-wide association studies (GWAS) have revealed hundreds of common variants that are associated with human height at genome-wide levels of significance. However, only a small fraction of phenotypic variation can be explained by the aggregate of these common variants. In a large study of African-American men and women (n = 14,419), we genotyped and analyzed 966,578 autosomal SNPs across the entire genome using a linear mixed model variance components approach implemented in the program GCTA (Yang et al Nat Genet 2010), and estimated an additive heritability of 44.7% (se: 3.7%) for this phenotype in a sample of evidently unrelated individuals. While this estimated value is similar to that given by Yang et al in their analyses, we remain concerned about two related issues: (1) whether in the complete absence of hidden relatedness, variance components methods have adequate power to estimate heritability when a very large number of SNPs are used in the analysis; and (2) whether estimation of heritability may be biased, in real studies, by low levels of residual hidden relatedness. We addressed the first question in a semi-analytic fashion by directly simulating the distribution of the score statistic for a test of zero heritability with and without low levels of relatedness. The second question was addressed by a very careful comparison of the behavior of estimated heritability for both observed (self-reported) height and simulated phenotypes compared to imputation R2 as a function of the number of SNPs used in the analysis. These simulations help to address the important question about whether today's GWAS SNPs will remain useful for imputing causal variants that are discovered using very large sample sizes in future studies of height, or whether the causal variants themselves will need to be genotyped de novo in order to build a prediction model that ultimately captures a large fraction of the variability of height, and by implication other complex phenotypes. Our overall conclusions are that when study sizes are quite large (5,000 or so) the additive heritability estimate for height is not apparently biased upwards using the linear mixed model; however there is evidence in our simulation that a very large number of causal variants (many thousands) each with very small effect on phenotypic variance will need to be discovered to fill the gap between the heritability explained by known versus unknown causal variants. We conclude that today's GWAS data will remain useful in the future for causal variant prediction, but that finding the causal variants that need to be predicted may be extremely laborious.

  3. Reducing the layer number of AB stacked multilayer graphene grown on nickel by annealing at low temperature.

    PubMed

    Velasco, J Marquez; Giamini, S A; Kelaidis, N; Tsipas, P; Tsoutsou, D; Kordas, G; Raptis, Y S; Boukos, N; Dimoulas, A

    2015-10-09

    Controlling the number of layers of graphene grown by chemical vapor deposition is crucial for large scale graphene application. We propose here an etching process of graphene which can be applied immediately after growth to control the number of layers. We use nickel (Ni) foil at high temperature (T = 900 °C) to produce multilayer-AB-stacked-graphene (MLG). The etching process is based on annealing the samples in a hydrogen/argon atmosphere at a relatively low temperature (T = 450 °C) inside the growth chamber. The extent of etching is mainly controlled by the annealing process duration. Using Raman spectroscopy we demonstrate that the number of layers was reduced, changing from MLG to few-layer-AB-stacked-graphene and in some cases to randomly oriented few layer graphene near the substrate. Furthermore, our method offers the significant advantage that it does not introduce defects in the samples, maintaining their original high quality. This fact and the low temperature our method uses make it a good candidate for controlling the layer number of already grown graphene in processes with a low thermal budget.

  4. Assessment of organic contaminants in emissions from refuse-derived fuel combustion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chrostowski, J.; Wait, D.; Kwong, E.

    1985-09-01

    Organic contaminants in emissions from refuse-derived fuel combustion were investigated in a 20-inch-diameter atmospheric fluidized-bed combustor. Combinations of coal/EcoFuel/MSW/toluene were burned in the combustor with temperatures ranging from 1250 to 1550 degrees F. A Source Assessment Sampling System (SASS) was used to sample the stack gas; Level 1 methodology was used to analyze the organic-contaminant levels. Combustion efficiencies of 93 to 98 percent were achieved in the test burns. Combustion of the EcoFuel generated fewer organic emissions than combustion of coal at similar combustion temperatures. The fine particulate collected by the SASS train filter contained higher concentrations of extractable organics than the reactor fly ash and the SASS cyclone samples. Combustion of a toluene/EcoFuel mix generated a large number of benzene derivatives not seen in the combustion of pure EcoFuel. Polycyclic aromatic hydrocarbons were the dominant organic compounds contained in the XAD-2 resin extract from coal combustion. A number of different priority pollutants were identified in the samples collected.

  5. The 'Natural Laboratory', a tool for deciphering growth, lifetime and population dynamics in larger benthic foraminifera

    NASA Astrophysics Data System (ADS)

    Hohenegger, Johann

    2015-04-01

    The shells of symbiont-bearing larger benthic Foraminifera (LBF) represent the response to physiological requirements as a function of environmental conditions. All compartments of the shell, such as chambers and chamberlets, accommodate the growth of the cell protoplasm and are adaptations for housing photosymbiotic algae. Investigations on the biology of LBF have predominantly been based on laboratory studies, and the lifetime of LBF under natural conditions is still unclear. LBF, which can build >100 chambers during their lifetime, are thought to live at least one year under natural conditions. This is supported by studies on the population dynamics of eulittoral foraminifera. In species characterized by a single, time-restricted reproduction period, the mean size of specimens increases during the year while the number of individuals decreases. This becomes more complex when two or more reproduction times are present within a one-year cycle, leading to a mixture of abundant small individuals with few large specimens during the year, while the mean size remains more or less constant. This mixture is typical for most sublittoral megalospheric (gamonts or schizonts) LBF. Nothing is known about the lifetime of agamonts, the diploid asexually reproducing generation. In all hyaline LBF it is thought to be significantly longer than 1 year, based on the large size and considering the mean chamber-building rate of the gamonts/schizonts. Observations on LBF under natural conditions have not yet been performed in the deeper sublittoral. This reflects the difficulties caused by intense hydrodynamics, which hinder the deployment of technical equipment in the natural environment. Growth, lifetime and reproduction of sublittoral LBF can therefore be studied under natural conditions using the so-called 'natural laboratory' in comparison with laboratory investigations. The best sampling method in the upper sublittoral, from 5 to 70 m depth, is by SCUBA diving. Irregular sampling intervals caused by differing weather conditions may range from weeks to one month, whereby the latter represents the upper limit: larger intervals could render the data set worthless. The number of sampling points at the location must be more than 4, randomly distributed and approximately 5 m apart, to smooth the effects of the patchy distributions that are typical for most LBF. Only three simple measurements are necessary to determine the chamber-building rate and population dynamics under natural conditions: the number of individuals, the number of chambers and the largest diameter of each individual. The determination of a standardized sample surface area, which is necessary for population dynamic investigations, depends on the sampling method. Reproduction and longevity can be estimated from shell size, using the date at which specimens of minimum size (the size expected after one month's growth) are most abundant to characterize the reproduction period. The difference between this date and the date at which large specimens, indicating readiness for reproduction, are most abundant then marks the lifetime. Calculation of the chamber-building rate based on chamber number is more complex and depends on the reproduction period and longevity; it can be fitted with theoretical growth functions (e.g. the Michaelis-Menten function). Using the methods described above, chamber-building rates, longevity and population dynamics can be obtained for shallow sublittoral symbiont-bearing LBF with the 'natural laboratory'.
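
    Fitting the chamber-building rate with a saturating growth function, as suggested above, is a standard curve-fitting task. The sketch below fits a Michaelis-Menten-type curve (chamber number versus age) to made-up monthly observations using SciPy; the numbers are illustrative only.

        import numpy as np
        from scipy.optimize import curve_fit

        def michaelis_menten(t, n_max, k):
            """Chamber number as a saturating function of age t (months)."""
            return n_max * t / (k + t)

        # Made-up monthly observations: mean chamber number per age class.
        age_months = np.array([1, 2, 3, 4, 6, 8, 10, 12], dtype=float)
        chambers = np.array([18, 35, 48, 58, 72, 82, 88, 93], dtype=float)

        (n_max, k), _ = curve_fit(michaelis_menten, age_months, chambers, p0=(120.0, 5.0))
        print(f"fitted asymptotic chamber number ~= {n_max:.0f}, half-saturation age ~= {k:.1f} months")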

  6. Does size matter? Statistical limits of paleomagnetic field reconstruction from small rock specimens

    NASA Astrophysics Data System (ADS)

    Berndt, Thomas; Muxworthy, Adrian R.; Fabian, Karl

    2016-01-01

    As samples of ever decreasing sizes are being studied paleomagnetically, care has to be taken that the underlying assumptions of statistical thermodynamics (Maxwell-Boltzmann statistics) are being met. Here we determine how many grains and how large a magnetic moment a sample needs to have to be able to accurately record an ambient field. It is found that for samples with a thermoremanent magnetic moment larger than 10^-11 Am^2 the assumption of a sufficiently large number of grains is usually met. Standard 25 mm diameter paleomagnetic samples usually contain enough magnetic grains such that statistical errors are negligible, but "single silicate crystal" works on, for example, zircon, plagioclase, and olivine crystals are approaching the limits of what is physically possible, leading to statistical errors in both the angular deviation and paleointensity that are comparable to other sources of error. The reliability of nanopaleomagnetic imaging techniques capable of resolving individual grains (used, for example, to study the cloudy zone in meteorites), however, is questionable due to the limited area of the material covered.

  7. Biodiversity in canopy-forming algae: Structure and spatial variability of the Mediterranean Cystoseira assemblages

    NASA Astrophysics Data System (ADS)

    Piazzi, L.; Bonaviri, C.; Castelli, A.; Ceccherelli, G.; Costa, G.; Curini-Galletti, M.; Langeneck, J.; Manconi, R.; Montefalcone, M.; Pipitone, C.; Rosso, A.; Pinna, S.

    2018-07-01

    In the Mediterranean Sea, Cystoseira species are the most important canopy-forming algae on shallow rocky bottoms, hosting highly biodiverse sessile and mobile communities. A large-scale study was carried out to investigate the structure of the Cystoseira-dominated assemblages at different spatial scales and to test the hypotheses that the alpha and beta diversity of the assemblages and the abundance and structure of the epiphytic macroalgae, epilithic macroalgae, sessile macroinvertebrates and mobile macroinvertebrates associated with Cystoseira beds changed among scales. A hierarchical sampling design in a total of five sites across the Mediterranean Sea (Croatia, Montenegro, Sardinia, Tuscany and the Balearic Islands) was used. A total of 597 taxa associated with Cystoseira beds were identified, with a mean number per sample ranging between 141.1 ± 6.6 (Tuscany) and 173.9 ± 8.5 (Sardinia). High variability at small (among samples) and large (among sites) scales was generally highlighted, but the studied assemblages showed different patterns of spatial variability. The relative importance of the different scales of spatial variability should be considered to optimize sampling designs and propose monitoring plans for this habitat.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hayashida, Misa; Malac, Marek; Egerton, Ray F.

    Electron tomography is a method whereby a three-dimensional reconstruction of a nanoscale object is obtained from a series of projected images measured in a transmission electron microscope. We developed an electron-diffraction method to measure the tilt and azimuth angles, with Kikuchi lines used to align a series of diffraction patterns obtained with each image of the tilt series. Since it is based on electron diffraction, the method is not affected by sample drift and is not sensitive to sample thickness, whereas tilt angle measurement and alignment using fiducial-marker methods are affected by both sample drift and thickness. The accuracy of the diffraction method benefits reconstructions with a large number of voxels, where both high spatial resolution and a large field of view are desired. The diffraction method allows both the tilt and azimuth angle to be measured, while fiducial marker methods typically treat the tilt and azimuth angle as an unknown parameter. The diffraction method can also be used to estimate the accuracy of the fiducial marker method, and the sample-stage accuracy. A nano-dot fiducial marker measurement differs from a diffraction measurement by no more than ±1°.

  9. Ingestion of plastic marine debris by Common and Thick-billed Murres in the northwestern Atlantic from 1985 to 2012.

    PubMed

    Bond, Alexander L; Provencher, Jennifer F; Elliot, Richard D; Ryan, Pierre C; Rowe, Sherrylynn; Jones, Ian L; Robertson, Gregory J; Wilhelm, Sabina I

    2013-12-15

    Plastic ingestion by seabirds is a growing conservation issue, but there are few time series of plastic ingestion with large sample sizes for which one can assess temporal trends. Common and Thick-billed Murres (Uria aalge and U. lomvia) are pursuit-diving auks that are legally harvested in Newfoundland and Labrador, Canada. Here, we combined previously unpublished data on plastic ingestion (from the 1980s to the 1990s) with contemporary samples (2011-2012) to evaluate changes in murres' plastic ingestion. Approximately 7% of murres had ingested plastic, with no significant change in the frequency of ingestion among species or periods. The number of pieces of plastic/bird, and mass of plastic/bird were highest in the 1980s, lowest in the late 1990s, and intermediate in contemporary samples. Studying plastic ingestion in harvested seabird populations links harvesters to conservation and health-related issues and is a useful source of large samples for diet and plastic ingestion studies. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. Non-destructive geochemical analysis and element mapping using bench-top μ-XRF: applications and uses for geoscience problems

    NASA Astrophysics Data System (ADS)

    Flude, Stephanie; Haschke, Michael; Tagle, Roald; Storey, Michael

    2013-04-01

    X-ray Fluorescence (XRF) has long been used to provide valuable geochemical analysis of bulk rock samples in geological studies. However, it is a destructive technique: samples must be homogenised by grinding to a fine powder and formed into a compacted pellet or fused glass disk, and the resulting sample has to be completely flat for reliable analysis. Until recently, non-destructive, high-spatial-resolution µ-XRF analysis was possible only at specialised synchrotron radiation facilities, where high excitation beam energies are possible and specialised X-ray focussing optical systems are available. Recently, a number of bench-top µ-XRF systems have become available, allowing easy, rapid and non-destructive geochemical analysis of various materials. We present a number of examples of how the new bench-top M4 Tornado µ-XRF system, developed by Bruker Nano, can be used to provide valuable geochemical information on geological samples. Both quantitative and qualitative (in the form of X-ray area maps) data can be quickly and easily acquired for a wide range of elements (as light as Na, using a vacuum), with minimal sample preparation, using an X-ray spot size as low as 25 µm. Large specimens up to 30 cm and 5 kg in weight can be analysed due to the large sample chamber, allowing non-destructive characterisation of rare or valuable materials. This technique is particularly useful in characterising heterogeneous samples, such as drill cores, sedimentary and pyroclastic rocks containing a variety of clasts, lavas sourced from mixed and mingled magmas, mineralised samples and fossils. An obvious application is the ability to produce element maps or line scans of minerals, allowing zoning of major and trace elements to be identified and thus informing on crystallisation histories. An application of particular interest to 40Ar/39Ar geochronologists is the ability to screen and assess the purity of mineral separates, or to characterise polished slabs for subsequent in-situ 40Ar/39Ar laser probe analysis; in the past such samples may have been characterised using SEM, but recent work [1] suggests that charging of a sample during electron-beam excitation can cause redistribution of K, thus disturbing the 40Ar/39Ar system. Finally, we assess data accuracy and precision by presenting quantitative analyses of a number of standards. [1] Flude et al., The effect of SEM imaging on the Ar/Ar system in feldspars, V51C-2215 Poster, AGU Fall Meeting 2010

  11. Kinetic Boltzmann approach adapted for modeling highly ionized matter created by x-ray irradiation of a solid.

    PubMed

    Ziaja, Beata; Saxena, Vikrant; Son, Sang-Kil; Medvedev, Nikita; Barbrel, Benjamin; Woloncewicz, Bianca; Stransky, Michal

    2016-05-01

    We report on the kinetic Boltzmann approach adapted for simulations of highly ionized matter created from a solid by its x-ray irradiation. X rays can excite inner-shell electrons, which leads to the creation of deeply lying core holes. Their relaxation, especially in heavier elements, can take complicated paths, leading to a large number of active configurations. Their number can be so large that solving the set of respective evolution equations becomes computationally inefficient and another modeling approach should be used instead. To circumvent this complexity, the commonly used continuum models employ a superconfiguration scheme. Here, we propose an alternative approach which still uses "true" atomic configurations but limits their number by restricting the sample relaxation to the predominant relaxation paths. We test its reliability, performing respective calculations for a bulk material consisting of light atoms and comparing the results with a full calculation including all relaxation paths. Prospective application for heavy elements is discussed.

  12. Low-Cost Nested-MIMO Array for Large-Scale Wireless Sensor Applications.

    PubMed

    Zhang, Duo; Wu, Wen; Fang, Dagang; Wang, Wenqin; Cui, Can

    2017-05-12

    In modern communication and radar applications, large-scale sensor arrays have increasingly been used to improve the performance of a system. However, the hardware cost and circuit power consumption scale linearly with the number of sensors, which makes the whole system expensive and power-hungry. This paper presents a low-cost nested multiple-input multiple-output (MIMO) array, which is capable of providing O(2N^2) degrees of freedom (DOF) with O(N) physical sensors. The sensor locations of the proposed array have closed-form expressions. Thus, the aperture size and number of DOF can be predicted as a function of the total number of sensors. Additionally, with the help of time-sequence-phase-weighting (TSPW) technology, only one receiver channel is required for sampling the signals received by all of the sensors, which is conducive to reducing the hardware cost and power consumption. Numerical simulation results demonstrate the effectiveness and superiority of the proposed array.
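
    For context, the standard two-level nested array on which such designs build has closed-form sensor positions. The sketch below generates those positions and counts the distinct lags of the difference co-array, which is what yields O(N^2) degrees of freedom from O(N) sensors; it illustrates the generic nested-array idea, not this paper's specific MIMO design.

        import numpy as np

        def nested_array_positions(n1, n2):
            """Two-level nested array: dense inner ULA plus sparse outer ULA (unit spacing)."""
            inner = np.arange(1, n1 + 1)
            outer = (n1 + 1) * np.arange(1, n2 + 1)
            return np.concatenate([inner, outer])

        def coarray_dof(positions):
            """Number of distinct lags in the difference co-array."""
            diffs = positions[:, None] - positions[None, :]
            return len(np.unique(diffs))

        pos = nested_array_positions(n1=4, n2=4)          # 8 physical sensors
        print("sensor positions:", pos)
        print("physical sensors:", len(pos), "-> co-array DOF:", coarray_dof(pos))   # 2*n2*(n1+1) - 1 = 39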

  13. Low-Cost Nested-MIMO Array for Large-Scale Wireless Sensor Applications

    PubMed Central

    Zhang, Duo; Wu, Wen; Fang, Dagang; Wang, Wenqin; Cui, Can

    2017-01-01

    In modern communication and radar applications, large-scale sensor arrays have increasingly been used to improve the performance of a system. However, the hardware cost and circuit power consumption scale linearly with the number of sensors, which makes the whole system expensive and power-hungry. This paper presents a low-cost nested multiple-input multiple-output (MIMO) array, which is capable of providing O(2N^2) degrees of freedom (DOF) with O(N) physical sensors. The sensor locations of the proposed array have closed-form expressions. Thus, the aperture size and number of DOF can be predicted as a function of the total number of sensors. Additionally, with the help of time-sequence-phase-weighting (TSPW) technology, only one receiver channel is required for sampling the signals received by all of the sensors, which is conducive to reducing the hardware cost and power consumption. Numerical simulation results demonstrate the effectiveness and superiority of the proposed array. PMID:28498329

  14. Statistical detection of systematic election irregularities

    PubMed Central

    Klimek, Peter; Yegorov, Yuri; Hanel, Rudolf; Thurner, Stefan

    2012-01-01

    Democratic societies are built around the principle of free and fair elections, in which each citizen’s vote should count equally. National elections can be regarded as large-scale social experiments, where people are grouped into usually large numbers of electoral districts and vote according to their preferences. The large number of samples implies statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data representation, we find that vote distributions of elections with alleged fraud show a kurtosis substantially exceeding the kurtosis of normal elections, depending on the level of data aggregation. As an example, we show that reported irregularities in recent Russian elections are, indeed, well-explained by systematic ballot stuffing. We develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We formulate a parametric test detecting these statistical properties in election results. Remarkably, this technique produces robust outcomes with respect to the resolution of the data and therefore allows for cross-country comparisons. PMID:23010929
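
    The fraud signature described above is an excess in the fourth standardized moment of district-level results. A minimal sketch of computing the kurtosis of winner vote shares across districts, on hypothetical data and without the paper's specific rescaled representation:

    ```python
    import numpy as np
    from scipy.stats import kurtosis

    # Hypothetical district-level data: ballots cast and votes for the winner.
    rng = np.random.default_rng(0)
    ballots = rng.integers(500, 3000, size=1000)
    winner_share = np.clip(rng.normal(0.45, 0.08, size=1000), 0, 1)
    winner_votes = (winner_share * ballots).astype(int)

    shares = winner_votes / ballots
    # fisher=False gives the raw fourth standardized moment (3 for a normal distribution);
    # values well above this would point at a fat-tailed, possibly manipulated distribution.
    print("kurtosis of district vote shares:", kurtosis(shares, fisher=False))
    ```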

  15. Topology of Large-Scale Structure by Galaxy Type: Hydrodynamic Simulations

    NASA Astrophysics Data System (ADS)

    Gott, J. Richard, III; Cen, Renyue; Ostriker, Jeremiah P.

    1996-07-01

    The topology of large-scale structure is studied as a function of galaxy type using the genus statistic. In hydrodynamical cosmological cold dark matter simulations, galaxies form on caustic surfaces (Zeldovich pancakes) and then slowly drain onto filaments and clusters. The earliest forming galaxies in the simulations (defined as "ellipticals") are thus seen at the present epoch preferentially in clusters (tending toward a meatball topology), while the latest forming galaxies (defined as "spirals") are seen currently in a spongelike topology. The topology is measured by the genus (number of "doughnut" holes minus number of isolated regions) of the smoothed density-contour surfaces. The measured genus curve for all galaxies as a function of density obeys approximately the theoretical curve expected for random-phase initial conditions, but the early-forming elliptical galaxies show a shift toward a meatball topology relative to the late-forming spirals. Simulations using standard biasing schemes fail to show such an effect. Large observational samples separated by galaxy type could be used to test for this effect.
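
    For context, the theoretical genus curve for Gaussian random-phase initial conditions mentioned above has the closed form g(ν) ∝ (1 − ν²)exp(−ν²/2), where ν is the density threshold in units of the field's standard deviation. A minimal sketch evaluating this curve, with the amplitude left as a free normalization (it depends on the power spectrum and smoothing length):

    ```python
    import numpy as np

    def random_phase_genus(nu, amplitude=1.0):
        """Theoretical genus-per-unit-volume curve for a Gaussian random field,
        g(nu) = A * (1 - nu**2) * exp(-nu**2 / 2); the amplitude A depends on the
        power spectrum and smoothing length and is treated as free here."""
        nu = np.asarray(nu, dtype=float)
        return amplitude * (1.0 - nu**2) * np.exp(-nu**2 / 2.0)

    nu = np.linspace(-3, 3, 7)
    print(random_phase_genus(nu))  # positive (sponge-like) near nu = 0, negative in the tails
    ```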

  16. Statistical detection of systematic election irregularities.

    PubMed

    Klimek, Peter; Yegorov, Yuri; Hanel, Rudolf; Thurner, Stefan

    2012-10-09

    Democratic societies are built around the principle of free and fair elections, in which each citizen's vote should count equally. National elections can be regarded as large-scale social experiments, where people are grouped into usually large numbers of electoral districts and vote according to their preferences. The large number of samples implies statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data representation, we find that vote distributions of elections with alleged fraud show a kurtosis substantially exceeding the kurtosis of normal elections, depending on the level of data aggregation. As an example, we show that reported irregularities in recent Russian elections are, indeed, well-explained by systematic ballot stuffing. We develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We formulate a parametric test detecting these statistical properties in election results. Remarkably, this technique produces robust outcomes with respect to the resolution of the data and therefore allows for cross-country comparisons.

  17. Extended Twin Study of Alcohol Use in Virginia and Australia.

    PubMed

    Verhulst, Brad; Neale, Michael C; Eaves, Lindon J; Medland, Sarah E; Heath, Andrew C; Martin, Nicholas G; Maes, Hermine H

    2018-06-01

    Drinking alcohol is a normal behavior in many societies, and prior studies have demonstrated it has both genetic and environmental sources of variation. Using two very large samples of twins and their first-degree relatives (Australia ≈ 20,000 individuals from 8,019 families; Virginia ≈ 23,000 from 6,042 families), we examine whether there are differences: (1) in the genetic and environmental factors that influence four interrelated drinking behaviors (quantity, frequency, age of initiation, and number of drinks in the last week), (2) between the twin-only design and the extended twin design, and (3) between the Australian and Virginia samples. We find that while drinking behaviors are interrelated, there are substantial differences in the genetic and environmental architectures across phenotypes. Specifically, drinking quantity, frequency, and number of drinks in the past week have large broad genetic variance components, and smaller but significant environmental variance components, while age of onset is driven exclusively by environmental factors. Further, the twin-only design and the extended twin design come to similar conclusions regarding broad-sense heritability and environmental transmission, but the extended twin models provide a more nuanced perspective. Finally, we find a high level of similarity between the Australian and Virginian samples, especially for the genetic factors. The observed differences, when present, tend to be at the environmental level. Implications for the extended twin model and future directions are discussed.

  18. Identification of proteins from 4200-year-old skin and muscle tissue biopsies from ancient Egyptian mummies of the first intermediate period shows evidence of acute inflammation and severe immune response.

    PubMed

    Jones, Jana; Mirzaei, Mehdi; Ravishankar, Prathiba; Xavier, Dylan; Lim, Do Seon; Shin, Dong Hoon; Bianucci, Raffaella; Haynes, Paul A

    2016-10-28

    We performed proteomics analysis on four skin samples and one muscle tissue sample taken from three ancient Egyptian mummies of the first intermediate period, approximately 4200 years old. The mummies were first dated by radiocarbon dating of the accompanying textiles, and morphologically examined by scanning electron microscopy of additional skin samples. Proteins were extracted, separated on SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) gels, and in-gel digested with trypsin. The resulting peptides were analysed using nanoflow high-performance liquid chromatography-mass spectrometry. We identified a total of 230 proteins from the five samples, comprising 132 unique protein identifications. We found a large number of collagens, which was confirmed by our microscopy data, and is in agreement with previous studies showing that collagens are very long-lived. As expected, we also found a large number of keratins. We identified numerous proteins that provide evidence of activation of the innate immunity system in two of the mummies, one of which also contained proteins indicating severe tissue inflammation, possibly indicative of an infection that we can speculate may have been related to the cause of death. This article is part of the themed issue 'Quantitative mass spectrometry'. © 2016 The Author(s).

  19. Identification of proteins from 4200-year-old skin and muscle tissue biopsies from ancient Egyptian mummies of the first intermediate period shows evidence of acute inflammation and severe immune response

    PubMed Central

    Jones, Jana; Mirzaei, Mehdi; Ravishankar, Prathiba; Xavier, Dylan; Lim, Do Seon; Shin, Dong Hoon; Bianucci, Raffaella

    2016-01-01

    We performed proteomics analysis on four skin samples and one muscle tissue sample taken from three ancient Egyptian mummies of the first intermediate period, approximately 4200 years old. The mummies were first dated by radiocarbon dating of the accompanying textiles, and morphologically examined by scanning electron microscopy of additional skin samples. Proteins were extracted, separated on SDS–PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) gels, and in-gel digested with trypsin. The resulting peptides were analysed using nanoflow high-performance liquid chromatography–mass spectrometry. We identified a total of 230 proteins from the five samples, comprising 132 unique protein identifications. We found a large number of collagens, which was confirmed by our microscopy data, and is in agreement with previous studies showing that collagens are very long-lived. As expected, we also found a large number of keratins. We identified numerous proteins that provide evidence of activation of the innate immunity system in two of the mummies, one of which also contained proteins indicating severe tissue inflammation, possibly indicative of an infection that we can speculate may have been related to the cause of death. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644972

  20. Amelogenin test: From forensics to quality control in clinical and biochemical genomics.

    PubMed

    Francès, F; Portolés, O; González, J I; Coltell, O; Verdú, F; Castelló, A; Corella, D

    2007-01-01

    The increasing number of samples in biomedical genetic studies, and the number of centers participating in them, involves an increasing risk of mistakes at the different sample-handling stages. We have evaluated the usefulness of the amelogenin test for quality control in sample identification. The amelogenin test (frequently used in forensics) was performed on 1224 individuals participating in a biomedical study. Concordance between the sex recorded in the database and the amelogenin test result was estimated. Additional genetic systems for detecting sex errors were developed. The overall concordance rate was 99.84% (1222/1224). Two samples showed a female amelogenin test outcome despite being coded as males in the database. The first, after checking sex-specific biochemical and clinical profile data, was found to be due to a codification error in the database. In the second, no apparent error was discovered after checking the database, because a correct male profile was found. False negatives in amelogenin male sex determination were ruled out by additional tests, and female sex was confirmed. A sample labeling error was revealed after a new DNA extraction. The amelogenin test is a useful quality control tool for detecting sex-identification errors in large genomic studies and can contribute to increasing their validity.

  1. Fast Bayesian experimental design: Laplace-based importance sampling for the expected information gain

    NASA Astrophysics Data System (ADS)

    Beck, Joakim; Dia, Ben Mansour; Espath, Luis F. R.; Long, Quan; Tempone, Raúl

    2018-06-01

    In calculating expected information gain in optimal Bayesian experimental design, the computation of the inner loop in the classical double-loop Monte Carlo requires a large number of samples and suffers from underflow if the number of samples is small. These drawbacks can be avoided by using an importance sampling approach. We present a computationally efficient method for optimal Bayesian experimental design that introduces importance sampling based on the Laplace method to the inner loop. We derive the optimal values for the method parameters in which the average computational cost is minimized according to the desired error tolerance. We use three numerical examples to demonstrate the computational efficiency of our method compared with the classical double-loop Monte Carlo, and a more recent single-loop Monte Carlo method that uses the Laplace method as an approximation of the return value of the inner loop. The first example is a scalar problem that is linear in the uncertain parameter. The second example is a nonlinear scalar problem. The third example deals with the optimal sensor placement for an electrical impedance tomography experiment to recover the fiber orientation in laminate composites.
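
    A minimal sketch of the classical double-loop Monte Carlo estimator that the paper improves upon, applied to a toy linear-Gaussian design problem; the log-sum-exp trick guards the inner evidence estimate against underflow, and the Laplace-based importance sampling itself is not implemented here. All model parameters are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Toy model: theta ~ N(0, 1), y = design * theta + noise, noise ~ N(0, sigma^2).
    sigma = 0.5

    def log_likelihood(y, theta, design):
        return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (y - design * theta) ** 2 / sigma**2

    def eig_double_loop(design, n_outer=2000, n_inner=2000):
        """Double-loop Monte Carlo estimate of the expected information gain.
        The inner loop estimates the evidence p(y) from fresh prior samples."""
        theta_outer = rng.standard_normal(n_outer)
        y = design * theta_outer + sigma * rng.standard_normal(n_outer)
        total = 0.0
        for yi, ti in zip(y, theta_outer):
            theta_inner = rng.standard_normal(n_inner)
            ll_inner = log_likelihood(yi, theta_inner, design)
            log_evidence = np.logaddexp.reduce(ll_inner) - np.log(n_inner)
            total += log_likelihood(yi, ti, design) - log_evidence
        return total / n_outer

    print(eig_double_loop(design=2.0))  # larger |design| -> more informative experiment
    ```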

  2. Comparison of rangeland vegetation sampling techniques in the Central Grasslands

    USGS Publications Warehouse

    Stohlgren, T.J.; Bull, K.A.; Otsuki, Yuka

    1998-01-01

    Maintaining native plant diversity, detecting exotic species, and monitoring rare species are becoming important objectives in rangeland conservation. Four rangeland vegetation sampling techniques were compared to see how well they captured local plant diversity. The methods tested included the commonly used Parker transects, Daubenmire transects as modified by the USDA Forest Service, a new transect and 'large quadrat' design proposed by the USDA Agricultural Research Service, and the Modified-Whittaker multi-scale vegetation plot. The 4 methods were superimposed in shortgrass steppe, mixed grass prairie, northern mixed prairie, and tallgrass prairie in the Central Grasslands of the United States with 4 replicates in each prairie type. Analysis of variance tests showed significant method effects and prairie type effects, but no significant method × type interactions for total species richness, the number of native species, the number of species with less than 1% cover, and the time required for sampling. The methods behaved similarly in each prairie type under a wide variety of grazing regimens. The Parker, large quadrat, and Daubenmire transects significantly underestimated the total species richness and the number of native species in each prairie type, and the number of species with less than 1% cover in all but the tallgrass prairie type. The transect techniques also consistently missed half the exotic species, including noxious weeds, in each prairie type. The Modified-Whittaker method, which included an exhaustive search for plant species in a 20 × 50 m plot, served as the baseline for species richness comparisons. For all prairie types, the Modified-Whittaker plot captured an average of 42 (± 2.4; 1 S.E.) plant species per site compared to 15.9 (± 1.3), 18.9 (± 1.2), and 22.8 (± 1.6) plant species per site using the Parker, large quadrat, and Daubenmire transect methods, respectively. The 4 methods captured most of the dominant species at each site and thus produced similar results for total foliar cover and soil cover. The detection and measurement of exotic plant species were greatly enhanced by using ten 1 m² subplots in a multi-scale sampling design and searching a larger area (1,000 m²) at each site. Even with 4 replicate sites, the transect methods usually captured, and thus would monitor, 36 to 66% of the plant species at each site. To evaluate the status and trends of common, rare, and exotic plant species at local, regional, and national scales, innovative, multi-scale methods must replace the commonly used transect methods of the past.
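
    Minimum-sample-area arguments such as the one in the preceding record's companion study rest on species-area power laws of the form S = cA^z. A minimal sketch, on hypothetical nested-area counts, of fitting the power law in log-log space and inverting it for the area needed to reach a target richness:

    ```python
    import numpy as np

    # Hypothetical cumulative species counts S observed in nested areas A (m^2).
    A = np.array([1, 10, 100, 1000])
    S = np.array([8, 15, 27, 46])

    # Fit S = c * A**z by ordinary least squares on log-transformed values.
    z, log_c = np.polyfit(np.log(A), np.log(S), 1)
    c = np.exp(log_c)
    print(f"S ≈ {c:.1f} * A^{z:.2f}")

    # Area required to capture a target richness: A = (S_target / c)**(1/z).
    S_target = 40
    print("area needed:", (S_target / c) ** (1 / z), "m^2")
    ```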

  3. Sound-field measurement with moving microphones

    PubMed Central

    Katzberg, Fabrice; Mazur, Radoslaw; Maass, Marco; Koch, Philipp; Mertins, Alfred

    2017-01-01

    Closed-room scenarios are characterized by reverberation, which decreases the performance of applications such as hands-free teleconferencing and multichannel sound reproduction. However, exact knowledge of the sound field inside a volume of interest enables the compensation of room effects and allows for a performance improvement within a wide range of applications. The sampling of sound fields involves the measurement of spatially dependent room impulse responses, where the Nyquist-Shannon sampling theorem applies in the temporal and spatial domains. The spatial measurement often requires a huge number of sampling points and entails other difficulties, such as the need for exact calibration of a large number of microphones. In this paper, a method for measuring sound fields using moving microphones is presented. The number of microphones is customizable, allowing for a tradeoff between hardware effort and measurement time. The goal is to reconstruct room impulse responses on a regular grid from data acquired with microphones between grid positions, in general. For this, the sound field at equidistant positions is related to the measurements taken along the microphone trajectories via spatial interpolation. The benefits of using perfect sequences for excitation, a multigrid recovery, and the prospects for reconstruction by compressed sensing are presented. PMID:28599533

  4. Food habits of the southwestern willow flycatcher during the nesting season

    USGS Publications Warehouse

    Drost, Charles A.; Paxton, Eben H.; Sogge, Mark K.; Whitfield, Mary J.

    2003-01-01

    The food habits and prey base of the endangered Southwestern Willow Flycatcher (Empidonax traillii extimus) are not well known. We analyzed prey remains in 59 fecal samples from an intensively-studied population of this flycatcher at the Kern River Preserve in southern California. These samples were collected during the nesting season in 1996 and 1997 from adults caught in mist nets, and from nestlings temporarily removed from the nest for banding. A total of 379 prey individuals were identified in the samples. Dominant prey taxa, both in total numbers and in percent occurrence, were true bugs (Hemiptera), flies (Diptera), and beetles (Coleoptera). Leafhoppers (Homoptera: Cicadellidae), spiders (Araneae), bees and wasps (Hymenoptera), and dragonflies and damselflies (Odonata) were also common items. Diet composition was significantly different between years, due to a large difference in the numbers of spiders between 1996 and 1997. There was also a significant difference between the diet of young and adults, with the diet of young birds having significantly higher numbers of odonates and beetles. There was a trend toward diet differences between males and females, but this was not significant at the P = 0.05 level.

  5. A large-scale dataset of single and mixed-source short tandem repeat profiles to inform human identification strategies: PROVEDIt.

    PubMed

    Alfonse, Lauren E; Garrett, Amanda D; Lun, Desmond S; Duffy, Ken R; Grgicak, Catherine M

    2018-01-01

    DNA-based human identity testing is conducted by comparison of PCR-amplified polymorphic Short Tandem Repeat (STR) motifs from a known source with the STR profiles obtained from uncertain sources. Samples such as those found at crime scenes often result in signal that is a composite of incomplete STR profiles from an unknown number of unknown contributors, making interpretation an arduous task. To facilitate progress on STR interpretation challenges, we provide over 25,000 multiplex STR profiles produced from one to five known individuals at target levels ranging from one to 160 copies of DNA. The data, generated under 144 laboratory conditions, are classified by total copy number and contributor proportions. For the 70% of samples that were synthetically compromised, we report the level of DNA damage using quantitative and end-point PCR. In addition, we characterize the complexity of the signal by exploring the number of detected alleles in each profile. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. On the theory and simulation of multiple Coulomb scattering of heavy-charged particles.

    PubMed

    Striganov, S I

    2005-01-01

    The Moliere theory of multiple Coulomb scattering is modified to take into account the difference between processes of scattering off atomic nuclei and electrons. A simple analytical expression for the angular distribution of charged particles passing through a thick absorber is found. It does not assume any special form for the differential scattering cross section and has a wider range of applicability than a Gaussian approximation. A well-known method to simulate multiple Coulomb scattering is based on treating 'soft' and 'hard' collisions differently. The angular deflection from a large number of 'soft' collisions is sampled using the proposed distribution function, while a small number of 'hard' collisions are simulated directly. A boundary between 'hard' and 'soft' collisions is defined, providing precise sampling of the scattering angle (at the 1% level) with a small number of 'hard' collisions. A corresponding simulation module takes into account projectile and nucleus charge distributions and the exact kinematics of the projectile-electron interaction.
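
    As textbook context (not the modified Molière distribution derived in the paper), the net deflection from many 'soft' collisions is often approximated as Gaussian with a width given by the Highland parameterization; a minimal sketch with illustrative beam parameters:

    ```python
    import numpy as np

    def highland_theta0(p_mev, beta, z, x_over_X0):
        """Classic Highland parameterization of the Gaussian multiple-scattering width:
        theta0 in radians for momentum p [MeV/c], velocity beta, charge z, and
        absorber thickness x/X0 in radiation lengths."""
        return (13.6 / (beta * p_mev)) * abs(z) * np.sqrt(x_over_X0) * (1 + 0.038 * np.log(x_over_X0))

    rng = np.random.default_rng(2)
    theta0 = highland_theta0(p_mev=1000.0, beta=0.99, z=1, x_over_X0=0.1)

    # Sample projected deflections for the "soft" component; "hard" single scatters
    # off nuclei would be sampled separately from the Rutherford tail.
    soft_angles = rng.normal(0.0, theta0, size=5)
    print(theta0, soft_angles)
    ```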

  7. Missing Data and Influential Sites: Choice of Sites for Phylogenetic Analysis Can Be As Important As Taxon Sampling and Model Choice

    PubMed Central

    Shavit Grievink, Liat; Penny, David; Holland, Barbara R.

    2013-01-01

    Phylogenetic studies based on molecular sequence alignments are expected to become more accurate as the number of sites in the alignments increases. With the advent of genomic-scale data, where alignments have very large numbers of sites, bootstrap values close to 100% and posterior probabilities close to 1 are the norm, suggesting that the number of sites is now seldom a limiting factor on phylogenetic accuracy. This provokes the question, should we be fussy about the sites we choose to include in a genomic-scale phylogenetic analysis? If some sites contain missing data, ambiguous character states, or gaps, then why not just throw them away before conducting the phylogenetic analysis? Indeed, this is exactly the approach taken in many phylogenetic studies. Here, we present an example where the decision on how to treat sites with missing data is of equal importance to decisions on taxon sampling and model choice, and we introduce a graphical method for illustrating this. PMID:23471508

  8. A comment on "bats killed in large numbers at United States wind energy facilities"

    USGS Publications Warehouse

    Huso, Manuela M.P.; Dalthorp, Dan

    2014-01-01

    Widespread reports of bat fatalities caused by wind turbines have raised concerns about the impacts of wind power development. Reliable estimates of the total number killed and the potential effects on populations are needed, but it is crucial that they be based on sound data. In a recent BioScience article, Hayes (2013) estimated that over 600,000 bats were killed at wind turbines in the United States in 2012. The scientific errors in the analysis are numerous, with the two most serious being that the included sites constituted a convenience sample, not a representative sample, and that the individual site estimates are derived from such different methodologies that they are inherently not comparable. This estimate is almost certainly inaccurate, but whether the actual number is much smaller, much larger, or about the same is uncertain. An accurate estimate of total bat fatality is not currently possible, given the shortcomings of the available data.

  9. Sampling procedures for throughfall monitoring: A simulation study

    NASA Astrophysics Data System (ADS)

    Zimmermann, Beate; Zimmermann, Alexander; Lark, Richard Murray; Elsenbeer, Helmut

    2010-01-01

    What is the most appropriate sampling scheme to estimate event-based average throughfall? A satisfactory answer to this seemingly simple question has yet to be found, a failure which we attribute to previous efforts' dependence on empirical studies. Here we try to answer this question by simulating stochastic throughfall fields based on parameters for statistical models of large monitoring data sets. We subsequently sampled these fields with different sampling designs and variable sample supports. We evaluated the performance of a particular sampling scheme with respect to the uncertainty of possible estimated means of throughfall volumes. Even for a relative error limit of 20%, an impractically large number of small, funnel-type collectors would be required to estimate mean throughfall, particularly for small events. While stratification of the target area is not superior to simple random sampling, cluster random sampling involves the risk of being less efficient. A larger sample support, e.g., the use of trough-type collectors, considerably reduces the necessary sample sizes and eliminates the sensitivity of the mean to outliers. Since the gain in time associated with the manual handling of troughs versus funnels depends on the local precipitation regime, the employment of automatically recording clusters of long troughs emerges as the most promising sampling scheme. Even so, a relative error of less than 5% appears out of reach for throughfall under heterogeneous canopies. We therefore suspect a considerable uncertainty of input parameters for interception models derived from measured throughfall, in particular, for those requiring data of small throughfall events.
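
    The qualitative conclusion that small funnels require impractically many collectors follows from the standard sample-size relation n ≈ (z·CV/E)², where CV is the between-collector coefficient of variation and E the acceptable relative error of the mean. A minimal sketch with assumed CV values (the paper's values come from its fitted stochastic throughfall fields):

    ```python
    import numpy as np
    from scipy.stats import norm

    def collectors_needed(cv, rel_error, confidence=0.95):
        """Approximate number of collectors so that the sample mean is within
        rel_error of the true mean at the given confidence: n ~ (z * CV / E)^2."""
        z = norm.ppf(0.5 + confidence / 2)
        return int(np.ceil((z * cv / rel_error) ** 2))

    # Illustrative coefficients of variation (larger for small events under heterogeneous canopies).
    for cv in (0.3, 0.6, 1.0):
        print(cv, collectors_needed(cv, rel_error=0.20), collectors_needed(cv, rel_error=0.05))
    ```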

  10. A Hybrid Algorithm for Period Analysis from Multiband Data with Sparse and Irregular Sampling for Arbitrary Light-curve Shapes

    NASA Astrophysics Data System (ADS)

    Saha, Abhijit; Vivas, A. Katherina

    2017-12-01

    Ongoing and future surveys with repeat imaging in multiple bands are producing (or will produce) time-spaced measurements of brightness, resulting in the identification of large numbers of variable sources in the sky. A large fraction of these are periodic variables: compilations of these are of scientific interest for a variety of purposes. Unavoidably, the data sets from many such surveys not only have sparse sampling, but also have embedded frequencies in the observing cadence that beat against the natural periodicities of any object under investigation. Such limitations can make period determination ambiguous and uncertain. For multiband data sets with asynchronous measurements in multiple passbands, we wish to maximally use the information on periodicity in a manner that is agnostic of differences in the light-curve shapes across the different channels. Given large volumes of data, computational efficiency is also at a premium. This paper develops and presents a computationally economic method for determining periodicity that combines the results from two different classes of period-determination algorithms. The underlying principles are illustrated through examples. The effectiveness of this approach for combining asynchronously sampled measurements in multiple observables that share an underlying fundamental frequency is also demonstrated.

  11. Analysis and imaging of biocidal agrochemicals using ToF-SIMS.

    PubMed

    Converso, Valerio; Fearn, Sarah; Ware, Ecaterina; McPhail, David S; Flemming, Anthony J; Bundy, Jacob G

    2017-09-06

    ToF-SIMS has been increasingly widely used in recent years to look at biological matrices, in particular for biomedical research, although there is still a lot of development needed to maximise the value of this technique in the life sciences. The main issue for biological matrices is the complexity of the mass spectra and therefore the difficulty of detecting analytes specifically and precisely in the biological sample. Here we evaluated the use of ToF-SIMS in the agrochemical field, which remains a largely unexplored area for this technique. We profiled a large number of biocidal active ingredients (herbicides, fungicides, and insecticides); we then selected fludioxonil, a halogenated fungicide, as a model compound for more detailed study, including the effect of co-occurring biomolecules on detection limits. There was a wide range of sensitivity of the ToF-SIMS for the different active ingredient compounds, but fludioxonil was readily detected in real-world samples (wheat seeds coated with a commercial formulation). Fludioxonil did not penetrate the seed to any great depth, but was largely restricted to a layer coating the seed surface. ToF-SIMS has clear potential as a tool for not only detecting biocides in biological samples, but also mapping their distribution.

  12. Non-linear matter power spectrum covariance matrix errors and cosmological parameter uncertainties

    NASA Astrophysics Data System (ADS)

    Blot, L.; Corasaniti, P. S.; Amendola, L.; Kitching, T. D.

    2016-06-01

    The covariance of the matter power spectrum is a key element of the analysis of galaxy clustering data. Independent realizations of observational measurements can be used to sample the covariance; nevertheless, statistical sampling errors will propagate into the cosmological parameter inference, potentially limiting the capabilities of the upcoming generation of galaxy surveys. The impact of these errors as a function of the number of realizations has been previously evaluated for Gaussian distributed data. However, non-linearities in the late-time clustering of matter cause departures from Gaussian statistics. Here, we address the impact of non-Gaussian errors on the sample covariance and precision matrix errors using a large ensemble of N-body simulations. In the range of modes where finite volume effects are negligible (0.1 ≲ k [h Mpc⁻¹] ≲ 1.2), we find deviations of the variance of the sample covariance with respect to Gaussian predictions above ~10 per cent at k > 0.3 h Mpc⁻¹. Over the entire range these reduce to about ~5 per cent for the precision matrix. Finally, we perform a Fisher analysis to estimate the effect of covariance errors on the cosmological parameter constraints. In particular, assuming Euclid-like survey characteristics we find that a number of independent realizations larger than 5000 is necessary to reduce the contribution of sampling errors to the cosmological parameter uncertainties to the subpercent level. We also show that restricting the analysis to large scales k ≲ 0.2 h Mpc⁻¹ results in a considerable loss in constraining power, while using the linear covariance to include smaller scales leads to an underestimation of the errors on the cosmological parameters.
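
    For Gaussian-distributed data, the baseline against which the paper measures non-Gaussian deviations, the standard debiasing of the inverse sample covariance is the Hartlap factor (N_s − N_d − 2)/(N_s − 1). A minimal sketch estimating a covariance from simulated realizations and applying that correction; the dimensions and covariance are illustrative:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n_bins, n_sims = 20, 200  # power-spectrum bins and independent realizations

    true_cov = np.diag(np.linspace(1.0, 2.0, n_bins))
    draws = rng.multivariate_normal(np.zeros(n_bins), true_cov, size=n_sims)

    sample_cov = np.cov(draws, rowvar=False)
    hartlap = (n_sims - n_bins - 2) / (n_sims - 1)   # debiasing factor, valid for Gaussian errors
    precision = hartlap * np.linalg.inv(sample_cov)

    print("Hartlap factor:", round(hartlap, 3))
    ```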

  13. Alpine Grassland Soil Organic Carbon Stock and Its Uncertainty in the Three Rivers Source Region of the Tibetan Plateau

    PubMed Central

    Chang, Xiaofeng; Wang, Shiping; Cui, Shujuan; Zhu, Xiaoxue; Luo, Caiyun; Zhang, Zhenhua; Wilkes, Andreas

    2014-01-01

    Alpine grassland of the Tibetan Plateau is an important component of global soil organic carbon (SOC) stocks, but insufficient field observations and large spatial heterogeneity lead to great uncertainty in their estimation. In the Three Rivers Source Region (TRSR), alpine grasslands account for more than 75% of the total area. However, the regional carbon (C) stock estimate and its uncertainty have seldom been tested. Here we quantified the regional SOC stock and its uncertainty using 298 soil profiles surveyed from 35 sites across the TRSR during 2006–2008. We showed that the upper soil (0–30 cm depth) in alpine grasslands of the TRSR stores 2.03 Pg C, with a 95% confidence interval ranging from 1.25 to 2.81 Pg C. Alpine meadow soils comprised 73% (i.e. 1.48 Pg C) of the regional SOC estimate, but had the greatest uncertainty at 51%. The statistical power to detect a 10% deviation in the grassland C stock was less than 0.50. The required sample size to detect this deviation at a power of 90% was about 6–7 times more than the number of sample sites surveyed. Comparison of our observed SOC density with the corresponding values from the dataset of Yang et al. indicates that these two datasets are comparable. The combined dataset did not reduce the uncertainty in the estimate of the regional grassland soil C stock. This result could be mainly explained by the underrepresentation of sampling sites in large areas with poor accessibility. Further research to improve the regional SOC stock estimate should optimize sampling strategy by considering the number of samples and their spatial distribution. PMID:24819054
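
    The "required sample size to detect a 10% deviation at 90% power" statement follows a standard one-sample power calculation, n ≈ ((z_{1−α/2} + z_{power})·CV/δ)². A minimal sketch with an assumed between-site coefficient of variation (not the value estimated in the paper):

    ```python
    from math import ceil
    from scipy.stats import norm

    def sites_needed(cv, rel_deviation, power=0.90, alpha=0.05):
        """One-sample approximation to the number of sites needed to detect a relative
        deviation of the mean SOC stock: n ~ ((z_{1-a/2} + z_power) * CV / delta)^2."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return ceil((z * cv / rel_deviation) ** 2)

    # Assumed between-site coefficient of variation for SOC density (illustrative only).
    print(sites_needed(cv=0.75, rel_deviation=0.10))
    ```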

  14. Efficient Bayesian mixed model analysis increases association power in large cohorts

    PubMed Central

    Loh, Po-Ru; Tucker, George; Bulik-Sullivan, Brendan K; Vilhjálmsson, Bjarni J; Finucane, Hilary K; Salem, Rany M; Chasman, Daniel I; Ridker, Paul M; Neale, Benjamin M; Berger, Bonnie; Patterson, Nick; Price, Alkes L

    2014-01-01

    Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require time cost O(MN²) (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts. PMID:25642633

  15. TemperSAT: A new efficient fair-sampling random k-SAT solver

    NASA Astrophysics Data System (ADS)

    Fang, Chao; Zhu, Zheng; Katzgraber, Helmut G.

    The set membership problem is of great importance to many applications and, in particular, database searches for target groups. Recently, an approach to speed up set membership searches based on the NP-hard constraint-satisfaction problem (random k-SAT) has been developed. However, the bottleneck of the approach lies in finding the solution to a large SAT formula efficiently and, in particular, a large number of independent solutions is needed to reduce the probability of false positives. Unfortunately, traditional random k-SAT solvers such as WalkSAT are biased when seeking solutions to the Boolean formulas. By porting parallel tempering Monte Carlo to the sampling of binary optimization problems, we introduce a new algorithm (TemperSAT) whose performance is comparable to current state-of-the-art SAT solvers for large k with the added benefit that theoretically it can find many independent solutions quickly. We illustrate our results by comparing to the currently fastest implementation of WalkSAT, WalkSATlm.

  16. Evaluation of two outlier-detection-based methods for detecting tissue-selective genes from microarray data.

    PubMed

    Kadota, Koji; Konishi, Tomokazu; Shimizu, Kentaro

    2007-05-01

    Large-scale expression profiling using DNA microarrays enables identification of tissue-selective genes for which expression is considerably higher and/or lower in some tissues than in others. Among numerous possible methods, only two outlier-detection-based methods (an AIC-based method and Sprent's non-parametric method) can handle the various types of selective patterns equally well, but they produce substantially different results. We investigated the performance of these two methods for different parameter settings and for a reduced number of samples. We focused on their ability to detect selective expression patterns robustly. We applied them to public microarray data collected from 36 normal human tissue samples and analyzed the effects of both changing the parameter settings and reducing the number of samples. The AIC-based method was more robust in both cases. The findings confirm that the use of the AIC-based method in the recently proposed ROKU method for detecting tissue-selective expression patterns is correct and that Sprent's method is not suitable for ROKU.

  17. Adding-point strategy for reduced-order hypersonic aerothermodynamics modeling based on fuzzy clustering

    NASA Astrophysics Data System (ADS)

    Chen, Xin; Liu, Li; Zhou, Sida; Yue, Zhenjiang

    2016-09-01

    Reduced-order models (ROMs) based on snapshots from high-fidelity CFD simulations have received great attention recently due to their capability of capturing the features of complex geometries and flow configurations. To improve the efficiency and precision of ROMs, it is indispensable to add extra sampling points to the initial snapshots, since the number of sampling points needed to achieve an adequately accurate ROM is generally unknown a priori, while a large number of initial sampling points reduces the parsimony of the ROMs. A fuzzy-clustering-based adding-point strategy is proposed, in which the fuzzy clustering acts as an indicator of the regions where the precision of the ROM is relatively low. The proposed method is applied to construct ROMs for benchmark mathematical examples and a numerical example of hypersonic aerothermodynamics prediction for a typical control surface. The proposed method achieves a 34.5% improvement in efficiency over the estimated mean-squared-error prediction algorithm while showing the same level of prediction accuracy.

  18. DNA banking and DNA databanking by academic and commercial laboratories

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McEwen, J.E.; Reilly, P.R.

    The advent of DNA-based testing is giving rise to DNA banking (the long-term storage of cells, transformed cell lines, or extracted DNA for subsequent retrieval and analysis) and DNA data banking (the indefinite storage of information derived from DNA analysis). Large scale acquisition and storage of DNA and DNA data has important implications for the privacy rights of individuals. A survey of 148 academically based and commercial DNA diagnostic laboratories was conducted to determine: (1) the extent of their DNA banking activities; (2) their policies and experiences regarding access to DNA samples and data; (3) the quality assurance measures they employ; and (4) whether they have written policies and/or depositor's agreements addressing specific issues. These issues include: (1) who may have access to DNA samples and data; (2) whether scientists may have access to anonymous samples or data for research use; (3) whether they have plans to contact depositors or retest samples if improved tests for a disorder become available; (4) disposition of samples at the end of the contract period if the laboratory ceases operations, if storage fees are unpaid, or after a death or divorce; (5) the consequence of unauthorized release, loss, or accidental destruction of samples; and (6) whether depositors may share in profits from the commercialization of tests or treatments developed in part from studies of stored DNA. The results suggest that many laboratories are banking DNA, that many have already amassed a large number of samples, and that a significant number plan to further develop DNA banking as a laboratory service over the next two years. Few laboratories have developed written policies governing DNA banking, and fewer still have drafted documents that define the rights and obligations of the parties. There may be a need for increased regulation of DNA banking and DNA data banking and for better defined policies with respect to protecting individual privacy.

  19. Quantitative assessment of anthrax vaccine immunogenicity using the dried blood spot matrix.

    PubMed

    Schiffer, Jarad M; Maniatis, Panagiotis; Garza, Ilana; Steward-Clark, Evelene; Korman, Lawrence T; Pittman, Phillip R; Mei, Joanne V; Quinn, Conrad P

    2013-03-01

    The collection, processing and transportation to a testing laboratory of large numbers of clinical samples during an emergency response situation present significant cost and logistical issues. Blood and serum are common clinical samples for diagnosis of disease. Serum preparation requires significant on-site equipment and facilities for immediate processing and cold storage, and significant costs for cold-chain transport to testing facilities. The dried blood spot (DBS) matrix offers an alternative to serum for rapid and efficient sample collection with fewer on-site equipment requirements and considerably lower storage and transport costs. We have developed and validated assay methods for using DBS in the quantitative anti-protective antigen IgG enzyme-linked immunosorbent assay (ELISA), one of the primary assays for assessing immunogenicity of anthrax vaccine and for confirmatory diagnosis of Bacillus anthracis infection in humans. We have also developed and validated high-throughput data analysis software to facilitate data handling for large clinical trials and emergency response. Published by Elsevier Ltd.

  20. Deficiency of "Thin" Stellar Bars in Seyfert Host Galaxies

    NASA Technical Reports Server (NTRS)

    Shlosman, Isaac; Peletier, Reynier F.; Knapen, Johan

    1999-01-01

    Using all available major samples of Seyfert galaxies and their corresponding control samples of closely matched non-active galaxies, we find that the bar ellipticities (or axial ratios) in Seyfert galaxies are systematically different from those in non-active galaxies. Overall, there is a deficiency of bars with large ellipticities (i.e., 'thin' or 'strong' bars) in Seyferts compared to non-active galaxies. Given the large dispersion due to small-number statistics, this effect is, strictly speaking, at the 2 sigma level. To obtain this result, the active galaxy samples of near-infrared surface photometry were matched to those of normal galaxies in type, host galaxy ellipticity, absolute magnitude, and, to some extent, in redshift. We discuss possible theoretical explanations of this phenomenon within the framework of galactic evolution and, in particular, of radial gas redistribution in barred galaxies. Our conclusions provide further evidence that Seyfert hosts differ systematically from their non-active counterparts on scales of a few kpc.

  1. Monitoring suspended sediment and associated trace element and nutrient fluxes in large river basins in the USA

    USGS Publications Warehouse

    Horowitz, A.J.

    2004-01-01

    In 1996, the US Geological Survey converted its occurrence and distribution-based National Stream Quality Accounting Network (NASQAN) to a national, flux-based water-quality monitoring programme. The main objective of the revised programme is to characterize large USA river basins by measuring the fluxes of selected constituents at critical nodes in various basins. Each NASQAN site was instrumented to determine daily discharge, but water and suspended sediment samples are collected no more than 12-15 times per year. Due to the limited sampling programme, annual suspended sediment fluxes were determined from site-specific sediment rating (transport) curves. As no significant relationship could be found between either discharge or suspended sediment concentration (SSC) and suspended sediment chemistry, trace element and nutrient fluxes are estimated using site-specific mean or median chemical levels determined from a number of samples collected over a period of years, and under a variety of flow conditions.
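
    A sediment rating (transport) curve of the kind described above is usually a log-log regression of concentration on discharge, applied afterwards to the daily discharge record to obtain annual flux. A minimal sketch on hypothetical data, ignoring the log-regression bias corrections that operational programmes often apply:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    # Hypothetical paired samples: discharge Q (m^3/s) and suspended sediment concentration SSC (mg/L).
    Q_samples = np.array([40, 80, 150, 300, 600, 900, 1500, 2500, 4000, 6000, 9000, 12000], float)
    SSC_samples = 0.5 * Q_samples**0.8 * np.exp(rng.normal(0, 0.2, Q_samples.size))

    # Rating curve: log(SSC) = b * log(Q) + log(a), fitted by least squares.
    b, log_a = np.polyfit(np.log(Q_samples), np.log(SSC_samples), 1)

    # Apply to a hypothetical daily discharge record; 1 mg/L == 1 g/m^3.
    Q_daily = rng.lognormal(mean=6.5, sigma=0.8, size=365)
    SSC_daily = np.exp(log_a) * Q_daily**b
    annual_flux_t = np.sum(SSC_daily * Q_daily * 86400) / 1e6  # grams -> tonnes
    print(f"rating exponent b = {b:.2f}, annual flux ≈ {annual_flux_t:,.0f} t")
    ```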

  2. Dosimetric measurements of Onyx embolization material for stereotactic radiosurgery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberts, Donald A.; Balter, James M.; Chaudhary, Neeraj

    2012-11-15

    Purpose: Arteriovenous malformations are often treated with a combination of embolization and stereotactic radiosurgery. Concern has been expressed in the past regarding the dosimetric properties of materials used in embolization and the effects that the introduction of these materials into the brain may have on the quality of the radiosurgery plan. To quantify these effects, the authors have taken large volumes of Onyx 34 and Onyx 18 (ethylene-vinyl alcohol copolymer doped with tantalum) and measured the attenuation and interface effects of these embolization materials. Methods: The manufacturer provided large cured volumes (≈28 cc) of both Onyx materials. These samples were 8.5 cm in diameter with a nominal thickness of 5 mm. The samples were placed on a block tray above a stack of solid water with an Attix chamber at a depth of 5 cm within the stack. The Attix chamber was used to measure the attenuation. These measurements were made for both 6 and 16 MV beams. Placing the sample directly on the solid water stack and varying the thickness of solid water between the sample and the Attix chamber measured the interface effects. The computed tomography (CT) numbers for bulk material were measured in a phantom using a wide bore CT scanner. Results: The transmission through the Onyx materials relative to solid water was approximately 98% and 97% for 16 and 6 MV beams, respectively. The interface effect shows an enhancement of approximately 2% and 1% downstream for 16 and 6 MV beams. CT numbers of approximately 2600-3000 were measured for both materials, which corresponded to an apparent relative electron density (RED) ρ_e^w to water of approximately 2.7-2.9 if calculated from the commissioning data of the CT scanner. Conclusions: We performed direct measurements of attenuation and interface effects of Onyx 34 and Onyx 18 embolization materials with large samples. The introduction of embolization materials affects the dose distribution of a MV therapeutic beam, but should be of negligible consequence for effective thicknesses of less than 8 mm. The measured interface effects are also small, particularly at 6 MV. Large areas of high-density artifacts and low-density artifacts can cause errors in dose calculations and need to be identified and resolved during planning.

  3. The role of large wood in retaining fine sediment, organic matter and plant propagules in a small, single-thread forest river

    NASA Astrophysics Data System (ADS)

    Osei, Nana A.; Gurnell, Angela M.; Harvey, Gemma L.

    2015-04-01

    This paper investigates associations among large wood accumulations, retained sediment, and organic matter and the establishment of a viable propagule bank within a forested reach of a lowland river, the Highland Water, UK. A wood survey within the 2-km study reach illustrates that the quantity of wood retained within the channel is typical of relatively unmanaged river channels bordered by deciduous woodland and that the wood accumulations (jams) that are present are well developed, typically spanning the river channel and comprised of wood that is well decayed. Sediment samples were obtained in a stratified random design focusing on nine subreaches within which samples were aggregated from five different types of sampling location. Two of these locations were wood-associated (within and on bank faces immediately adjacent to wood jams), and the other three locations represented the broader river environment (gravel bars, bank faces, floodplain). The samples were analysed to establish their calibre, organic matter, and viable plant propagule content. The gravel bar sampling locations retained significantly coarser sediment containing a lower proportion of organic matter and viable propagules than the other four sampling locations. The two wood-related sampling locations retained sediment of intermediate calibre between the gravel bar and the bank-floodplain samples but they retained significantly more organic matter and viable propagules than were found in the other three sampling locations. In particular, the jam bank samples (areas of sediment accumulation against bank faces adjacent to wood jams) contained the highest number of propagules and the largest number of propagule species. These results suggest that retention of propagules, organic matter and relatively fine sediment in and around wood jams has the potential to support vegetation regeneration, further sediment retention, and as a consequence, landform development within woodland streams, although this process is arrested by grazing at the study site. These results also suggest that self-restoration using wood is a potentially cost-effective and far-reaching river restoration strategy but that its full effects develop gradually and require the establishment of a functioning wood budget coupled with grazing levels that are in balance with vegetation growth.

  4. Probability of detecting nematode infestations for quarantine sampling with imperfect extraction efficacy

    PubMed Central

    Chen, Peichen; Liu, Shih-Chia; Liu, Hung-I; Chen, Tse-Wei

    2011-01-01

    For quarantine sampling, it is of fundamental importance to determine the probability of finding an infestation when a specified number of units are inspected. In general, current sampling procedures assume 100% probability (perfect detection) of finding a pest if it is present within a unit. Ideally, a nematode extraction method should remove all stages of all species with 100% efficiency regardless of season, temperature, or other environmental conditions; in practice, however, no method approaches these criteria. In this study we determined the probability of detecting nematode infestations for quarantine sampling with imperfect extraction efficacy. Also, the required sample size and the risk involved in detecting nematode infestations with imperfect extraction efficacy are presented. Moreover, we developed a computer program to calculate confidence levels for different scenarios with varying proportions of infestation and efficacy of detection. In addition, a case study, presenting the extraction efficacy of the modified Baermann's Funnel method on Aphelenchoides besseyi, is used to exemplify the use of our program to calculate the probability of detecting nematode infestations in quarantine sampling with imperfect extraction efficacy. The result has important implications for quarantine programs and highlights the need for a very large number of samples if perfect extraction efficacy is not achieved in such programs. We believe that the results of the study will be useful for the determination of realistic goals in the implementation of quarantine sampling. PMID:22791911
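
    Under the simplest independence assumptions, the detection probability with imperfect extraction is P = 1 − (1 − p·e)^n for infestation proportion p, extraction efficacy e, and n inspected units, and the required n follows by inversion. A minimal sketch of that calculation (a simplified stand-in for, not a reproduction of, the study's computer program):

    ```python
    from math import ceil, log

    def detection_probability(n, p_infested, efficacy):
        """P(at least one positive) when each of n units is infested with probability
        p_infested and an infested unit is detected with probability `efficacy`."""
        return 1 - (1 - p_infested * efficacy) ** n

    def units_needed(confidence, p_infested, efficacy):
        """Smallest n giving the requested detection confidence."""
        return ceil(log(1 - confidence) / log(1 - p_infested * efficacy))

    # With perfect extraction, 95% confidence of detecting a 5% infestation needs 59 units;
    # at 40% extraction efficacy the requirement grows to 149 units.
    print(units_needed(0.95, 0.05, 1.0), units_needed(0.95, 0.05, 0.4))
    ```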

  5. Monitoring a large number of pesticides and transformation products in water samples from Spain and Italy.

    PubMed

    Rousis, Nikolaos I; Bade, Richard; Bijlsma, Lubertus; Zuccato, Ettore; Sancho, Juan V; Hernandez, Felix; Castiglioni, Sara

    2017-07-01

    Assessing the presence of pesticides in environmental waters is particularly challenging because of the huge number of substances used which may end up in the environment. Furthermore, the occurrence of pesticide transformation products (TPs) and/or metabolites makes this task even harder. Most studies dealing with the determination of pesticides in water include only a small number of analytes and in many cases no TPs. The present study applied a screening method for the determination of a large number of pesticides and TPs in wastewater (WW) and surface water (SW) from Spain and Italy. Liquid chromatography coupled to high-resolution mass spectrometry (HRMS) was used to screen a database of 450 pesticides and TPs. Detection and identification were based on specific criteria, i.e. mass accuracy, fragmentation, and comparison of retention times when reference standards were available, or a retention time prediction model when standards were not available. Seventeen pesticides and TPs from different classes (fungicides, herbicides and insecticides) were found in WW in Italy and Spain, and twelve in SW. Generally, in both countries more compounds were detected in effluent WW than in influent WW, and in SW than WW. This might be due to the analytical sensitivity in the different matrices, but also to the presence of multiple sources of pollution. HRMS proved a good screening tool to determine a large number of substances in water and identify some priority compounds for further quantitative analysis. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Large number discrimination by mosquitofish.

    PubMed

    Agrillo, Christian; Piffer, Laura; Bisazza, Angelo

    2010-12-22

    Recent studies have demonstrated that fish display rudimentary numerical abilities similar to those observed in mammals and birds. The mechanisms underlying the discrimination of small quantities (<4) were recently investigated while, to date, no study has examined the discrimination of large numerosities in fish. Subjects were trained to discriminate between two sets of small geometric figures using social reinforcement. In the first experiment mosquitofish were required to discriminate 4 from 8 objects with or without experimental control of the continuous variables that co-vary with number (area, space, density, total luminance). Results showed that fish can use the sole numerical information to compare quantities but that they preferentially use cumulative surface area as a proxy of the number when this information is available. A second experiment investigated the influence of the total number of elements to discriminate large quantities. Fish proved to be able to discriminate up to 100 vs. 200 objects, without showing any significant decrease in accuracy compared with the 4 vs. 8 discrimination. The third experiment investigated the influence of the ratio between the numerosities. Performance was found to decrease when decreasing the numerical distance. Fish were able to discriminate numbers when ratios were 1:2 or 2:3 but not when the ratio was 3:4. The performance of a sample of undergraduate students, tested non-verbally using the same sets of stimuli, largely overlapped that of fish. Fish are able to use pure numerical information when discriminating between quantities larger than 4 units. As observed in human and non-human primates, the numerical system of fish appears to have virtually no upper limit while the numerical ratio has a clear effect on performance. These similarities further reinforce the view of a common origin of non-verbal numerical systems in all vertebrates.

  7. Influence of dietary fiber on xylanolytic and cellulolytic bacteria of adult pigs.

    PubMed Central

    Varel, V H; Robinson, I M; Jung, H J

    1987-01-01

    Xylanolytic and cellulolytic bacteria were enumerated over an 86-day period from fecal samples of ten 8-month-old gilts that were fed either a control or a 40% alfalfa meal (high-fiber) diet. Fecal samples were collected from all pigs on days 0, 3, 5, 12, 25, 37, 58, and 86. Overall, the numbers of xylanolytic bacteria producing greater than 5-mm-diameter zones of clearing on 0.24% xylan roll tube medium after 24 to 36 h of incubation were 1.6 × 10^8 and 4.2 × 10^8/g (dry weight) of feces for the control pigs and those fed the high-fiber diet, respectively. After 1 week of incubation, a large number of smaller zones of clearing (1 to 2 mm) appeared. Besides Bacteroides succinogenes and Ruminococcus flavefaciens, which produced faint zones of clearing in xylan roll tubes, three strains which closely resembled B. ruminicola hydrolyzed and used xylan for growth. The overall numbers of cellulolytic bacteria producing zones of clearing in 0.5% agar roll tube medium were 0.36 × 10^8 and 4.1 × 10^8/g for the control pigs and those fed the high-fiber diet, respectively. B. succinogenes was the predominant cellulolytic isolate from both groups of pigs, and R. flavefaciens was found in a ratio of approximately 1 to 15 with B. succinogenes. Degradation of xylan and cellulose, measured by in vitro dry matter disappearance after inoculation with fecal samples, was significantly greater for pigs fed the high-fiber diet than that for the controls. These data suggest that the number of fibrolytic microorganisms and their activity in the large intestine of the adult pig can be increased by feeding pigs high-alfalfa-fiber diets and that these organisms are similar to those found in the rumen. PMID:3030194

  8. Genetic effects of habitat restoration in the Laurentian Great Lakes: an assessment of lake sturgeon origin and genetic diversity

    USGS Publications Warehouse

    Jamie Marie Marranca,; Amy Welsh,; Roseman, Edward F.

    2015-01-01

    Lake sturgeon (Acipenser fulvescens) have experienced significant habitat loss, resulting in reduced population sizes. Three artificial reefs were built in the Huron-Erie corridor in the Great Lakes to replace lost spawning habitat. Genetic data were collected to determine the source and numbers of adult lake sturgeon spawning on the reefs and to determine if the founder effect resulted in reduced genetic diversity. DNA was extracted from larval tail clips and 12 microsatellite loci were amplified. Larval genotypes were then compared to 22 previously studied spawning lake sturgeon populations in the Great Lakes to determine the source of the parental population. The effective number of breeders (Nb) was calculated for each reef cohort. The larval genotypes were then compared to the source population to determine if there were any losses in genetic diversity that are indicative of the founder effect. The St. Clair and Detroit River adult populations were found to be the source parental population for the larvae collected on all three artificial reefs. There were large numbers of contributing adults relative to the number of sampled larvae. There was no significant difference between levels of genetic diversity in the source population and larval samples from the artificial reefs; however, there is some evidence for a genetic bottleneck in the reef populations likely due to the founder effect. Habitat restoration in the Huron-Erie corridor is likely resulting in increased habitat for the large lake sturgeon population in the system and in maintenance of the population's genetic diversity.

  9. Deep learning with domain adaptation for accelerated projection-reconstruction MR.

    PubMed

    Han, Yoseob; Yoo, Jaejun; Kim, Hak Hee; Shin, Hee Jung; Sung, Kyunghyun; Ye, Jong Chul

    2018-09-01

    The radial k-space trajectory is a well-established sampling trajectory used in conjunction with magnetic resonance imaging. However, the radial k-space trajectory requires a large number of radial lines for high-resolution reconstruction. Increasing the number of radial lines causes longer acquisition time, making it more difficult for routine clinical use. On the other hand, if we reduce the number of radial lines, streaking artifact patterns are unavoidable. To solve this problem, we propose a novel deep learning approach with domain adaptation to restore high-resolution MR images from under-sampled k-space data. The proposed deep network removes the streaking artifacts from the artifact-corrupted images. To address the situation given the limited available data, we propose a domain adaptation scheme that employs a pre-trained network using a large number of X-ray computed tomography (CT) or synthesized radial MR datasets, which is then fine-tuned with only a few radial MR datasets. The proposed method outperforms existing compressed sensing algorithms, such as the total variation and PR-FOCUSS methods. In addition, the calculation time is several orders of magnitude faster than the total variation and PR-FOCUSS methods. Moreover, we found that pre-training using CT or MR data from a similar organ is more important than pre-training using data from the same modality for a different organ. We demonstrate the possibility of domain adaptation when only a limited amount of MR data is available. The proposed method surpasses the existing compressed sensing algorithms in terms of the image quality and computation time. © 2018 International Society for Magnetic Resonance in Medicine.
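
    The two-stage training scheme described above (pre-train on plentiful CT or synthesized radial data, then fine-tune on a handful of radial MR scans) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tiny CNN, the random stand-in tensors, the layer chosen for freezing, and all hyperparameters are assumptions made only to show the workflow.

```python
# Hedged sketch: domain-adaptation fine-tuning for artifact removal. The network,
# data, and hyperparameters are illustrative placeholders, not the paper's setup.
import torch
import torch.nn as nn

def make_denoiser():
    # Small CNN that maps an artifact-corrupted image to an estimate of the artifact.
    return nn.Sequential(
        nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 1, 3, padding=1),
    )

def train(model, corrupted, clean, epochs, lr, freeze_prefixes=()):
    # Freeze any layer whose parameter name starts with one of the given prefixes.
    for name, p in model.named_parameters():
        p.requires_grad = not any(name.startswith(pfx) for pfx in freeze_prefixes)
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        residual = model(corrupted)            # network predicts the streaking artifact
        loss = loss_fn(corrupted - residual, clean)
        loss.backward()
        opt.step()
    return model

# Stage 1: pre-train on a large synthetic/CT-derived set (random stand-in tensors here).
model = make_denoiser()
ct_corrupted, ct_clean = torch.randn(32, 1, 64, 64), torch.randn(32, 1, 64, 64)
train(model, ct_corrupted, ct_clean, epochs=5, lr=1e-3)

# Stage 2: fine-tune on only a few radial MR examples, freezing the earliest layer.
mr_corrupted, mr_clean = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
train(model, mr_corrupted, mr_clean, epochs=5, lr=1e-4, freeze_prefixes=("0.",))
```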

  10. Powerful Identification of Cis-regulatory SNPs in Human Primary Monocytes Using Allele-Specific Gene Expression

    PubMed Central

    Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Maouche, Seraya; Göring, Harald H. H.; Liljedahl, Ulrika; Enström, Camilla; Brocheton, Jessy; Proust, Carole; Godefroy, Tiphaine; Sambrook, Jennifer G.; Jolley, Jennifer; Crisp-Hihn, Abigail; Foad, Nicola; Lloyd-Jones, Heather; Stephens, Jonathan; Gwilliam, Rhian; Rice, Catherine M.; Hengstenberg, Christian; Samani, Nilesh J.; Erdmann, Jeanette; Schunkert, Heribert; Pastinen, Tomi; Deloukas, Panos; Goodall, Alison H.; Ouwehand, Willem H.; Cambien, François; Syvänen, Ann-Christine

    2012-01-01

    A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role for the identified disease-associated SNP is not straightforward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function while allele-specific gene expression (ASE) analysis has not yet gained widespread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide allele-specific gene expression (ASE) analysis with that of traditional expression quantitative trait locus (eQTL) mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers. PMID:23300628

  11. Maui-VIA: A User-Friendly Software for Visual Identification, Alignment, Correction, and Quantification of Gas Chromatography–Mass Spectrometry Data

    PubMed Central

    Kuich, P. Henning J. L.; Hoffmann, Nils; Kempa, Stefan

    2015-01-01

    A current bottleneck in GC–MS metabolomics is the processing of raw machine data into a final data matrix that contains the quantities of identified metabolites in each sample. While there are many bioinformatics tools available to aid the initial steps of the process, their use requires both significant technical expertise and a subsequent manual validation of identifications and alignments if high data quality is desired. The manual validation is tedious and time-consuming, becoming prohibitively so as sample numbers increase. We have, therefore, developed Maui-VIA, a solution based on a visual interface that allows experts and non-experts to simultaneously and quickly process, inspect, and correct large numbers of GC–MS samples. It allows for the visual inspection of identifications and alignments, facilitating a unique and, due to its visualization and keyboard shortcuts, very fast interaction with the data. Therefore, Maui-VIA fills an important niche by (1) providing functionality that optimizes the component of data processing that is currently most labor-intensive to save time and (2) lowering the threshold of expertise required to process GC–MS data. Maui-VIA projects are initiated with baseline-corrected raw data, peak lists, and a database of metabolite spectra and retention indices used for identification. It provides functionality for retention index calculation, a targeted library search, a visual annotation, alignment, and correction interface, and metabolite quantification, as well as the export of the final data matrix. The high quality of data produced by Maui-VIA is illustrated by its comparison to data obtained manually by an expert using vendor software on a previously published dataset concerning the response of Chlamydomonas reinhardtii to salt stress. In conclusion, Maui-VIA provides the opportunity for fast, confident, and high-quality data processing validation of large numbers of GC–MS samples by non-experts. PMID:25654076

  12. Effects of Vegetated Field Borders on Arthropods in Cotton Fields in Eastern North Carolina

    PubMed Central

    Outward, Randy; Sorenson, Clyde E.; Bradley, J. R.

    2008-01-01

    The influence, if any, of 5 m wide, feral, herbaceous field borders on pest and beneficial arthropods in commercial cotton, Gossypium hirsutum (L.) (Malvales: Malvaceae), fields was measured through a variety of sampling techniques over three years. In each year, five fields with managed, feral vegetation borders and five fields without such borders were examined. Sampling was stratified from the field border or edge in each field in an attempt to elucidate any edge effects that might have occurred. Early season thrips populations appeared to be unaffected by the presence of a border. Pitfall sampling disclosed no differences in ground-dwelling predaceous arthropods but did detect increased populations of crickets around fields with borders. Cotton aphid (Aphis gossypii Glover) (Hemiptera: Aphididae) populations were too low during the study to adequately assess border effects. Heliothines, Heliothis virescens (F.) and Helicoverpa zea (Boddie) (Lepidoptera: Noctuidae), egg numbers and damage rates were largely unaffected by the presence or absence of a border, although in one instance egg numbers were significantly lower in fields with borders. Overall, foliage-dwelling predaceous arthropods were somewhat more abundant in fields with borders than in fields without borders. Tarnished plant bugs, Lygus lineolaris (Palisot de Beauvois) (Heteroptera: Miridae) were significantly more abundant in fields with borders, but stink bugs, Acrosternum hilare (Say), and Euschistus servus (Say) (Hemiptera: Pentatomidae) numbers appeared to be largely unaffected by border treatment. Few taxa clearly exhibited distributional edge effects relative to the presence or absence of border vegetation. Field borders like those examined in this study likely will have little impact on insect pest management in cotton under current insect management regimens. PMID:20345293

  13. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing.

    PubMed

    Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I

    2018-01-01

    Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  14. Psychometric Properties of the Problematic Internet Use Questionnaire Short-Form (PIUQ-SF-6) in a Nationally Representative Sample of Adolescents.

    PubMed

    Demetrovics, Zsolt; Király, Orsolya; Koronczai, Beatrix; Griffiths, Mark D; Nagygyörgy, Katalin; Elekes, Zsuzsanna; Tamás, Domokos; Kun, Bernadette; Kökönyei, Gyöngyi; Urbán, Róbert

    2016-01-01

    Despite the large number of measurement tools developed to assess problematic Internet use, numerous studies use measures with only modest investigation into their psychometric properties. The goal of the present study was to validate the short (6-item) version of the Problematic Internet Use Questionnaire (PIUQ) on a nationally representative adolescent sample (n = 5,005; mean age 16.4 years, SD = 0.87) and to determine a statistically established cut-off value. Data were collected within the framework of the European School Survey Project on Alcohol and Other Drugs. Results showed an acceptable fit of the original three-factor structure to the data. In addition, a MIMIC model was carried out to justify the need for three distinct factors. The sample was divided into users at risk of problematic Internet use and those not at risk using a latent profile analysis. Two latent classes were obtained with 14.4% of adolescents belonging to the at-risk group. Concurrent and convergent validity were tested by comparing the two groups across a number of variables (i.e., time spent online, academic achievement, self-esteem, depressive symptoms, and preferred online activities). Using the at-risk latent profile analysis class as the gold standard, a cut-off value of 15 (out of 30) was suggested based on sensitivity and specificity analyses. In conclusion, the brief (6-item) version of the PIUQ also appears to be an appropriate measure to differentiate between Internet users at risk of developing problematic Internet use and those not at risk. Furthermore, due to its brevity, the shortened PIUQ is advantageous to utilize within large-scale surveys assessing many different behaviors and/or constructs by reducing the overall number of survey questions, and as a consequence, likely increasing completion rates.
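
    The cut-off selection step described above can be illustrated with a small sketch: given a binary gold-standard label (here the latent-profile class) and a questionnaire total score, one candidate rule is to pick the threshold that maximizes Youden's J from the ROC curve. The simulated scores, prevalence, and the use of Youden's J are assumptions for illustration only; the abstract states only that sensitivity and specificity analyses were used.

```python
# Hedged sketch: choosing a questionnaire cut-off against a binary "gold standard" label
# using Youden's J from an ROC curve. Data below are simulated stand-ins, not study data.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n = 5000
at_risk = rng.random(n) < 0.15                                    # latent-class label (gold standard)
piuq_total = np.clip(np.round(rng.normal(12 + 6 * at_risk, 3)), 6, 30)  # 6-item total score, range 6-30

fpr, tpr, thresholds = roc_curve(at_risk, piuq_total)
youden_j = tpr - fpr
cutoff = thresholds[np.argmax(youden_j)]
print(f"suggested cut-off: score >= {cutoff:.0f}")
```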

  15. Simulation Studies as Designed Experiments: The Comparison of Penalized Regression Models in the “Large p, Small n” Setting

    PubMed Central

    Chaibub Neto, Elias; Bare, J. Christopher; Margolin, Adam A.

    2014-01-01

    New algorithms are continuously proposed in computational biology. Performance evaluation of novel methods is important in practice. Nonetheless, the field experiences a lack of rigorous methodology aimed at systematically and objectively evaluating competing approaches. Simulation studies are frequently used to show that a particular method outperforms another. Oftentimes, however, simulation studies are not well designed, and it is hard to characterize the particular conditions under which different methods perform better. In this paper we propose the adoption of well-established techniques in the design of computer and physical experiments for developing effective simulation studies. By following best practices in the planning of experiments we are better able to understand the strengths and weaknesses of competing algorithms, leading to more informed decisions about which method to use for a particular task. We illustrate the application of our proposed simulation framework with a detailed comparison of the ridge-regression, lasso and elastic-net algorithms in a large-scale study investigating the effects on predictive performance of sample size, number of features, true model sparsity, signal-to-noise ratio, and feature correlation, in situations where the number of covariates is usually much larger than sample size. Analysis of data sets containing tens of thousands of features but only a few hundred samples is nowadays routine in computational biology, where "omics" features such as gene expression, copy number variation and sequence data are frequently used in the predictive modeling of complex phenotypes such as anticancer drug response. The penalized regression approaches investigated in this study are popular choices in this setting and our simulations corroborate well-established results concerning the conditions under which each one of these methods is expected to perform best while providing several novel insights. PMID:25289666
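
    A single cell of such a designed simulation might look like the sketch below: generate data under chosen factor levels (sample size, number of features, sparsity, noise), fit the three penalized regressions, and record test-set performance. The factor levels, penalty strengths, and scoring metric are illustrative assumptions, not the settings used in the paper.

```python
# Hedged sketch: one cell of a designed simulation comparing penalized regressions in the
# "large p, small n" regime. Factor levels (n, p, sparsity, noise) are illustrative only.
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.metrics import r2_score

def simulate(beta, n, noise_sd, rng):
    # Draw a design matrix and responses from a sparse linear model with Gaussian noise.
    X = rng.normal(size=(n, beta.size))
    y = X @ beta + rng.normal(scale=noise_sd, size=n)
    return X, y

rng = np.random.default_rng(0)
p, n_true = 5000, 20
beta = np.zeros(p)
beta[:n_true] = rng.normal(size=n_true)            # sparse true model

X_train, y_train = simulate(beta, n=200, noise_sd=1.0, rng=rng)
X_test, y_test = simulate(beta, n=200, noise_sd=1.0, rng=rng)

for name, model in [("ridge", Ridge(alpha=1.0)),
                    ("lasso", Lasso(alpha=0.05, max_iter=5000)),
                    ("elastic-net", ElasticNet(alpha=0.05, l1_ratio=0.5, max_iter=5000))]:
    model.fit(X_train, y_train)
    print(name, round(r2_score(y_test, model.predict(X_test)), 3))
```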

  16. A mixture model-based approach to the clustering of microarray expression data.

    PubMed

    McLachlan, G J; Bean, R W; Peel, D

    2002-03-01

    This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
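
    The gene-screening step can be illustrated with a small stand-in: for each gene, fit one- and two-component mixtures across the tissue samples and rank genes by the likelihood-ratio statistic. EMMIX-GENE fits mixtures of t distributions and applies additional cluster-size thresholds; the sketch below substitutes Gaussian mixtures and simulated expression values purely for brevity.

```python
# Hedged sketch of the gene-screening idea: rank genes by the likelihood-ratio statistic
# for one vs. two mixture components fitted across tissue samples. Gaussian mixtures and
# simulated data are used as a stand-in for EMMIX-GENE's t-mixture machinery.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_tissues, n_genes = 60, 500
expr = rng.normal(size=(n_tissues, n_genes))
expr[:30, :25] += 2.0                               # 25 genes separate two tissue groups

def lr_statistic(x):
    # -2 log likelihood ratio comparing a two-component to a one-component fit.
    x = x.reshape(-1, 1)
    ll1 = GaussianMixture(n_components=1, random_state=0).fit(x).score(x) * len(x)
    ll2 = GaussianMixture(n_components=2, n_init=3, random_state=0).fit(x).score(x) * len(x)
    return 2.0 * (ll2 - ll1)

stats = np.array([lr_statistic(expr[:, j]) for j in range(n_genes)])
top_genes = np.argsort(stats)[::-1][:50]            # retain the highest-ranked genes
print(top_genes[:10])
```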

  17. Factor Structure of the Comprehensive Trail Making Test in Children and Adolescents with Brain Dysfunction

    ERIC Educational Resources Information Center

    Allen, Daniel N.; Thaler, Nicholas S.; Barchard, Kimberly A.; Vertinski, Mary; Mayfield, Joan

    2012-01-01

    The Comprehensive Trail Making Test (CTMT) is a relatively new version of the Trail Making Test that has a number of appealing features, including a large normative sample that allows raw scores to be converted to standard "T" scores adjusted for age. Preliminary validity information suggests that CTMT scores are sensitive to brain…

  18. Sex and age composition of Great Gray Owls (Strix nebulosa), winter 1995/1996

    Treesearch

    Robert W. Nero; Herbert W. R. Copland

    1997-01-01

    In winter 1995/1996, a nearly continent-wide movement of Great Gray Owls (Strix nebulosa) occurred. A sample of 126 owls examined during this period, mainly from northeast of Winnipeg, included a large number from the 1994 hatch-year. If our assumptions regarding molt are correct, 51 birds were from this age class. An inhibited molt condition found...

  19. New Resources for Computer-Aided Legal Research: An Assessment of the Usefulness of the DIALOG System in Securities Regulation Studies.

    ERIC Educational Resources Information Center

    Gruner, Richard; Heron, Carol E.

    1984-01-01

    Examines usefulness of DIALOG as legal research tool through use of DIALOG's DIALINDEX database to identify those databases among almost 200 available that contain large numbers of records related to federal securities regulation. Eight databases selected for further study are detailed. Twenty-six footnotes, database statistics, and samples are…

  20. REEP1 Mutation Spectrum and Genotype/Phenotype Correlation in Hereditary Spastic Paraplegia Type 31

    ERIC Educational Resources Information Center

    Beetz, Christian; Schule, Rebecca; Deconinck, Tine; Tran-Viet, Khanh-Nhat; Zhu, Hui; Kremer, Berry P. H.; Frints, Suzanna G. M.; van Zelst-Stams, Wendy A. G.; Byrne, Paula; Otto, Susanne; Nygren, Anders O. H.; Baets, Jonathan; Smets, Katrien; Ceulemans, Berten; Dan, Bernard; Nagan, Narasimhan; Kassubek, Jan; Klimpe, Sven; Klopstock, Thomas; Stolze, Henning; Smeets, Hubert J. M.; Schrander-Stumpel, Constance T. R. M.; Hutchinson, Michael; van de Warrenburg, Bart P.; Braastad, Corey; Deufel, Thomas; Pericak-Vance, Margaret; Schols, Ludger; de Jonghe, Peter; Zuchner, Stephan

    2008-01-01

    Mutations in the receptor expression enhancing protein 1 (REEP1) have recently been reported to cause autosomal dominant hereditary spastic paraplegia (HSP) type SPG31. In a large collaborative effort, we screened a sample of 535 unrelated HSP patients for "REEP1" mutations and copy number variations. We identified 13 novel and 2 known "REEP1"…

  1. The Effect of Small Sample Size on Two-Level Model Estimates: A Review and Illustration

    ERIC Educational Resources Information Center

    McNeish, Daniel M.; Stapleton, Laura M.

    2016-01-01

    Multilevel models are an increasingly popular method to analyze data that originate from a clustered or hierarchical structure. To effectively utilize multilevel models, one must have an adequately large number of clusters; otherwise, some model parameters will be estimated with bias. The goals for this paper are to (1) raise awareness of the…

  2. Advanced proteomic liquid chromatography

    PubMed Central

    Xie, Fang; Smith, Richard D.; Shen, Yufeng

    2012-01-01

    Liquid chromatography coupled with mass spectrometry is the predominant platform used to analyze proteomics samples consisting of large numbers of proteins and their proteolytic products (e.g., truncated polypeptides) and spanning a wide range of relative concentrations. This review provides an overview of advanced capillary liquid chromatography techniques and methodologies that greatly improve separation resolving power and proteomics analysis coverage, sensitivity, and throughput. PMID:22840822

  3. Health Outcomes among Hispanic Subgroups: Data from the National Health Interview Survey, 1992-95. Advance Data, Number 310.

    ERIC Educational Resources Information Center

    Hajat, Anjum; Lucas, Jacqueline B.; Kington, Raynard

    In this report, various health measures are compared across Hispanic subgroups in the United States. National Health Interview Survey (NHIS) data aggregated from 1992 through 1995 were analyzed. NHIS is one of the few national surveys that has a sample sufficiently large to allow such comparisons. Both age-adjusted and unadjusted estimates…

  4. Sea-level rise and archaeological site destruction: An example from the southeastern United States using DINAA (Digital Index of North American Archaeology).

    PubMed

    Anderson, David G; Bissett, Thaddeus G; Yerka, Stephen J; Wells, Joshua J; Kansa, Eric C; Kansa, Sarah W; Myers, Kelsey Noack; DeMuth, R Carl; White, Devin A

    2017-01-01

    The impact of changing climate on terrestrial and underwater archaeological sites, historic buildings, and cultural landscapes can be examined through quantitatively-based analyses encompassing large data samples and broad geographic and temporal scales. The Digital Index of North American Archaeology (DINAA) is a multi-institutional collaboration that allows researchers online access to linked heritage data from multiple sources and data sets. The effects of sea-level rise and concomitant human population relocation are examined using a sample from nine states encompassing much of the Gulf and Atlantic coasts of the southeastern United States. A 1 m rise in sea level will result in the loss of more than 13,000 recorded historic and prehistoric archaeological sites, as well as over 1,000 locations currently eligible for inclusion on the National Register of Historic Places (NRHP), encompassing archaeological sites, standing structures, and other cultural properties. These numbers increase substantially with each additional 1 m rise in sea level, with >32,000 archaeological sites and >2400 NRHP properties lost should a 5 m rise occur. Many more unrecorded archaeological and historic sites will also be lost as large areas of the landscape are flooded. The displacement of millions of people due to rising seas will cause additional impacts where these populations resettle. Sea level rise will thus result in the loss of much of the record of human habitation of the coastal margin in the Southeast within the next one to two centuries, and the numbers indicate the magnitude of the impact on the archaeological record globally. Construction of large linked data sets is essential to developing procedures for sampling, triage, and mitigation of these impacts.

  5. Sea-level rise and archaeological site destruction: An example from the southeastern United States using DINAA (Digital Index of North American Archaeology)

    PubMed Central

    Wells, Joshua J.; Kansa, Eric C.; Kansa, Sarah W.; Myers, Kelsey Noack; DeMuth, R. Carl; White, Devin A.

    2017-01-01

    The impact of changing climate on terrestrial and underwater archaeological sites, historic buildings, and cultural landscapes can be examined through quantitatively-based analyses encompassing large data samples and broad geographic and temporal scales. The Digital Index of North American Archaeology (DINAA) is a multi-institutional collaboration that allows researchers online access to linked heritage data from multiple sources and data sets. The effects of sea-level rise and concomitant human population relocation are examined using a sample from nine states encompassing much of the Gulf and Atlantic coasts of the southeastern United States. A 1 m rise in sea level will result in the loss of more than 13,000 recorded historic and prehistoric archaeological sites, as well as over 1,000 locations currently eligible for inclusion on the National Register of Historic Places (NRHP), encompassing archaeological sites, standing structures, and other cultural properties. These numbers increase substantially with each additional 1 m rise in sea level, with >32,000 archaeological sites and >2400 NRHP properties lost should a 5 m rise occur. Many more unrecorded archaeological and historic sites will also be lost as large areas of the landscape are flooded. The displacement of millions of people due to rising seas will cause additional impacts where these populations resettle. Sea level rise will thus result in the loss of much of the record of human habitation of the coastal margin in the Southeast within the next one to two centuries, and the numbers indicate the magnitude of the impact on the archaeological record globally. Construction of large linked data sets is essential to developing procedures for sampling, triage, and mitigation of these impacts. PMID:29186200

  6. A large point-source outbreak of Salmonella Typhimurium linked to chicken, pork and salad rolls from a Vietnamese bakery in Sydney.

    PubMed

    Norton, Sophie; Huhtinen, Essi; Conaty, Stephen; Hope, Kirsty; Campbell, Brett; Tegel, Marianne; Boyd, Rowena; Cullen, Beth

    2012-04-01

    In January 2011, Sydney South West Public Health Unit was notified of a large number of people presenting with gastroenteritis over two days at a local hospital emergency department (ED). Case-finding was conducted through hospital EDs and general practitioners, which resulted in the notification of 154 possible cases, from which 83 outbreak cases were identified. Fifty-eight cases were interviewed about demographics, symptom profile and food histories. Stool samples were collected and submitted for analysis. An inspection was conducted at a Vietnamese bakery and food samples were collected and submitted for analysis. Further case ascertainment occurred to ensure control measures were successful. Of the 58 interviewed cases, the symptom profile included diarrhoea (100%), fever (79.3%) and vomiting (89.7%). Salmonella Typhimurium multiple-locus variable-number tandem-repeat analysis (MLVA) type 3-10-8-9-523 was identified in 95.9% (47/49) of stool samples. Cases reported consuming chicken, pork or salad rolls from a single Vietnamese bakery. Environmental swabs detected widespread contamination with Salmonella at the premises. This was a large point-source outbreak associated with the consumption of Vietnamese-style pork, chicken and salad rolls. These foods have been responsible for significant outbreaks in the past. The typical ingredients of raw egg butter or mayonnaise and pate are often implicated, as are the food-handling practices in food outlets. This indicates the need for education in better food-handling practices, including the benefits of using safer products. Ongoing surveillance will monitor the success of new food regulations introduced in New South Wales during 2011 for improving food-handling practices and reducing foodborne illness.

  7. A large point-source outbreak of Salmonella Typhimurium linked to chicken, pork and salad rolls from a Vietnamese bakery in Sydney

    PubMed Central

    Huhtinen, Essi; Conaty, Stephen; Hope, Kirsty; Campbell, Brett; Tegel, Marianne; Boyd, Rowena; Cullen, Beth

    2012-01-01

    Introduction In January 2011, Sydney South West Public Health Unit was notified of a large number of people presenting with gastroenteritis over two days at a local hospital emergency department (ED). Methods Case-finding was conducted through hospital EDs and general practitioners, which resulted in the notification of 154 possible cases, from which 83 outbreak cases were identified. Fifty-eight cases were interviewed about demographics, symptom profile and food histories. Stool samples were collected and submitted for analysis. An inspection was conducted at a Vietnamese bakery and food samples were collected and submitted for analysis. Further case ascertainment occurred to ensure control measures were successful. Results Of the 58 interviewed cases, the symptom profile included diarrhoea (100%), fever (79.3%) and vomiting (89.7%). Salmonella Typhimurium multiple-locus variable-number tandem-repeat analysis (MLVA) type 3-10-8-9-523 was identified in 95.9% (47/49) of stool samples. Cases reported consuming chicken, pork or salad rolls from a single Vietnamese bakery. Environmental swabs detected widespread contamination with Salmonella at the premises. Discussion This was a large point-source outbreak associated with the consumption of Vietnamese-style pork, chicken and salad rolls. These foods have been responsible for significant outbreaks in the past. The typical ingredients of raw egg butter or mayonnaise and pate are often implicated, as are the food-handling practices in food outlets. This indicates the need for education in better food-handling practices, including the benefits of using safer products. Ongoing surveillance will monitor the success of new food regulations introduced in New South Wales during 2011 for improving food-handling practices and reducing foodborne illness. PMID:23908908

  8. Glimpsing the imprint of local environment on the galaxy stellar mass function

    NASA Astrophysics Data System (ADS)

    Tomczak, Adam R.; Lemaux, Brian C.; Lubin, Lori M.; Gal, Roy R.; Wu, Po-Feng; Holden, Bradford; Kocevski, Dale D.; Mei, Simona; Pelliccia, Debora; Rumbaugh, Nicholas; Shen, Lu

    2017-12-01

    We investigate the impact of local environment on the galaxy stellar mass function (SMF) spanning a wide range of galaxy densities from the field up to dense cores of massive galaxy clusters. Data are drawn from a sample of eight fields from the Observations of Redshift Evolution in Large-Scale Environments (ORELSE) survey. Deep photometry allows us to select mass-complete samples of galaxies down to 10^9 M⊙. Taking advantage of >4000 secure spectroscopic redshifts from ORELSE and precise photometric redshifts, we construct three-dimensional density maps between 0.55 < z < 1.3 using a Voronoi tessellation approach. We find that the shape of the SMF depends strongly on local environment, exhibited by a smooth, continual increase in the relative numbers of high- to low-mass galaxies towards denser environments. A straightforward implication is that local environment proportionally increases the efficiency of (a) destroying lower mass galaxies and/or (b) growth of higher mass galaxies. We also find the presence of this environmental dependence in the SMFs of star-forming and quiescent galaxies, although not quite as strongly for the quiescent subsample. To characterize the connection between the SMF of field galaxies and that of denser environments, we devise a simple semi-empirical model. The model begins with a sample of ≈10^6 galaxies at z_start = 5 with stellar masses distributed according to the field. Simulated galaxies then evolve down to z_final = 0.8 following empirical prescriptions for star-formation, quenching and galaxy-galaxy merging. We run the simulation multiple times, testing a variety of scenarios with differing overall amounts of merging. Our model suggests that a large number of mergers are required to reproduce the SMF in dense environments. Additionally, a large majority of these mergers would have to occur in intermediate density environments (e.g. galaxy groups).
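
    A heavily simplified, toy version of that semi-empirical experiment is sketched below: draw a field-like distribution of stellar masses, merge a random fraction of galaxy pairs, and compare the ratio of high- to low-mass galaxies before and after. The mass distribution, merger fraction, and mass split are arbitrary illustrative choices and do not follow the ORELSE prescriptions for star formation or quenching.

```python
# Hedged toy illustration: random pairwise mergers shift a field-like stellar mass
# distribution toward relatively more high-mass galaxies. All numbers are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
log_m = rng.normal(9.8, 0.6, size=100_000)           # toy "field" masses, log10(M/Msun)
masses = 10.0 ** log_m

def merge_fraction(masses, fraction, rng):
    # Randomly pair a given fraction of galaxies and replace each pair with its sum.
    n_pairs = int(fraction * len(masses) / 2)
    idx = rng.permutation(len(masses))
    merged = masses[idx[:n_pairs]] + masses[idx[n_pairs:2 * n_pairs]]
    survivors = masses[idx[2 * n_pairs:]]
    return np.concatenate([merged, survivors])

def high_to_low(masses, split=10.0 ** 10.5):
    return (masses > split).sum() / (masses <= split).sum()

print("before mergers:", round(high_to_low(masses), 4))
masses = merge_fraction(masses, fraction=0.4, rng=rng)
print("after mergers: ", round(high_to_low(masses), 4))
```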

  9. Prevalence and predictors of Axis I disorders in a large sample of treatment-seeking victims of sexual abuse and incest

    PubMed Central

    McElroy, Eoin; Shevlin, Mark; Elklit, Ask; Hyland, Philip; Murphy, Siobhan; Murphy, Jamie

    2016-01-01

    Background Childhood sexual abuse (CSA) is a common occurrence and a robust, yet non-specific, predictor of adult psychopathology. While many demographic and abuse factors have been shown to impact this relationship, their common and specific effects remain poorly understood. Objective This study sought to assess the prevalence of Axis I disorders in a large sample of help-seeking victims of sexual trauma, and to examine the common and specific effects of demographic and abuse characteristics across these different diagnoses. Method The participants were attendees at four treatment centres in Denmark that provide psychological therapy for victims of CSA (N=434). Axis I disorders were assessed using the Millon Clinical Multiaxial Inventory-III (MCMI-III). Multivariate logistic regression analysis was used to examine the associations between CSA characteristics (age of onset, duration, number of abusers, number of abusive acts) and 10 adult clinical syndromes. Results There was significant variation in the prevalence of disorders and the abuse characteristics were differentially associated with the outcome variables. Having experienced sexual abuse from more than one perpetrator was the strongest predictor of psychopathology. Conclusions The relationship between CSA and adult psychopathology is complex. Abuse characteristics have both unique and shared effects across different diagnoses. Highlights of the article The prevalence of Axis I disorders was assessed in a large sample of sexual abuse and incest survivors. The impact of demographic and abuse characteristics was also examined. There was significant variation in the prevalence of disorders. Abuse characteristics were differentially associated with the disorders. Abuse from multiple perpetrators was the strongest overall predictor of psychopathology. PMID:27064976

  10. Multitarget-multisensor management for decentralized sensor networks

    NASA Astrophysics Data System (ADS)

    Tharmarasa, R.; Kirubarajan, T.; Sinha, A.; Hernandez, M. L.

    2006-05-01

    In this paper, we consider the problem of sensor resource management in decentralized tracking systems. Due to the availability of cheap sensors, it is possible to use a large number of sensors and a few fusion centers (FCs) to monitor a large surveillance region. Even though a large number of sensors are available, due to frequency, power and other physical limitations, only a few of them can be active at any one time. The problem is then to select sensor subsets that should be used by each FC at each sampling time in order to optimize the tracking performance subject to their operational constraints. In a recent paper, we proposed an algorithm to handle the above issues for joint detection and tracking, without using simplistic clustering techniques that are standard in the literature. However, in that paper, a hierarchical architecture with feedback at every sampling time was considered, and the sensor management was performed only at a central fusion center (CFC). However, in general, it is not possible to communicate with the CFC at every sampling time, and in many cases there may not even be a CFC. Sometimes, communication between CFC and local fusion centers might fail as well. Therefore performing sensor management only at the CFC is not viable in most networks. In this paper, we consider an architecture in which there is no CFC, each FC communicates only with the neighboring FCs, and communications are restricted. In this case, each FC has to decide which sensors are to be used by itself at each measurement time step. We propose an efficient algorithm to handle the above problem in real time. Simulation results illustrating the performance of the proposed algorithm are also presented.

  11. Impact of Probiotics on Necrotizing Enterocolitis

    PubMed Central

    Underwood, Mark A.

    2016-01-01

    A large number of randomized placebo-controlled clinical trials and cohort studies have demonstrated a decrease in the incidence of necrotizing enterocolitis with administration of probiotic microbes. These studies have prompted many neonatologists to adopt routine prophylactic administration of probiotics while others await more definitive studies and/or probiotic products with demonstrated purity and stable numbers of live organisms. Cross-contamination and inadequate sample size limit the value of further traditional placebo-controlled randomized controlled trials. Key areas for future research include mechanisms of protection, optimum probiotic species or strains (or combinations thereof) and duration of treatment, interactions between diet and the administered probiotic, and the influence of genetic polymorphisms in the mother and infant on probiotic response. Next generation probiotics selected based on bacterial genetics rather than ease of production and large cluster-randomized clinical trials hold great promise for NEC prevention. PMID:27836423

  12. Glass sample preparation and performance investigations. [solar x-ray imager

    NASA Technical Reports Server (NTRS)

    Johnson, R. Barry

    1992-01-01

    This final report details the work performed under this delivery order from April 1991 through April 1992. The currently available capabilities for integrated optical performance modeling at MSFC for large and complex systems such as AXAF were investigated. The Integrated Structural Modeling (ISM) program developed by Boeing for the U.S. Air Force was obtained and installed on two DECstations 5000 at MSFC. The structural, thermal and optical analysis programs available in ISM were evaluated. As part of the optomechanical engineering activities, technical support was provided in the design of support structure, mirror assembly, filter wheel assembly and material selection for the Solar X-ray Imager (SXI) program. As part of the fabrication activities, a large number of zerodur glass samples were prepared in different sizes and shapes for acid etching, coating and polishing experiments to characterize the subsurface damage and stresses produced by the grinding and polishing operations. Various optical components for AXAF video microscope and the x-ray test facility were also fabricated. A number of glass fabrication and test instruments such as a scatter plate interferometer, a gravity feed saw and some phenolic cutting blades were fabricated, integrated and tested.

  13. Assay for the simultaneous determination of guanidinoacetic acid, creatinine and creatine in plasma and urine by capillary electrophoresis UV-detection.

    PubMed

    Zinellu, Angelo; Sotgia, Salvatore; Zinellu, Elisabetta; Chessa, Roberto; Deiana, Luca; Carru, Ciriaco

    2006-03-01

    Guanidinoacetic acid (GAA) measurement has recently become of great interest for the diagnosis of creatine (Cn) metabolism disorders, and research calls for rapid and inexpensive methods for its detection in plasma and urine in order to assess a large number of patients. We propose a new assay for the measurement of GAA by a simple CZE UV-detection without previous sample derivatization. Plasma samples were filtered by Microcon-10 microconcentrators and directly injected into the capillary, while for urine specimens a simple water dilution before injection was needed. A baseline separation was obtained in less than 8 min using a 60.2 cm x 75 microm uncoated silica capillary, 75 mmol/L Tris-phosphate buffer pH 2.25 at 15 degrees C. The performance of the developed method was assessed by measuring plasma creatinine and Cn in 32 normal subjects and comparing the data obtained by the new method with those found with the previous CE assay. Our new method seems to be an inexpensive, fast and specific tool to assess a large number of patients both in clinical and in research laboratories.

  14. A simple and rapid microplate assay for glycoprotein-processing glycosidases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kang, M.S.; Zwolshen, J.H.; Harry, B.S.

    1989-08-15

    A simple and convenient microplate assay for glycosidases involved in the glycoprotein-processing reactions is described. The assay is based on specific binding of high-mannose-type oligosaccharide substrates to concanavalin A-Sepharose, while monosaccharides liberated by enzymatic hydrolysis do not bind to concanavalin A-Sepharose. By the use of radiolabeled substrates ((3H)glucose for glucosidases and (3H)mannose for mannosidases), the radioactivity in the liberated monosaccharides can be determined as a measure of the enzymatic activity. This principle was employed earlier for developing assays for glycosidases previously reported. These authors have reported the separation of substrate from the product by concanavalin A-Sepharose column chromatography. This procedure is handicapped by the fact that it cannot be used for a large number of samples and is time consuming. We have simplified this procedure and adapted it to the use of a microplate (96-well plate). This would help in processing a large number of samples in a short time. In this report we show that the assay is comparable to the column assay previously reported. It is linear with time and enzyme concentration and shows expected kinetics with castanospermine, a known inhibitor of alpha-glucosidase I.

  15. Estimating haplotype frequencies by combining data from large DNA pools with database information.

    PubMed

    Gasbarra, Dario; Kulathinal, Sangita; Pirinen, Matti; Sillanpää, Mikko J

    2011-01-01

    We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors.
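
    The core idea, estimating haplotype frequencies that reproduce the observed pooled allele frequencies given a prior list of candidate haplotypes, can be illustrated with a crude point-estimation sketch. The paper's method is Bayesian and accounts for pool sizes and multiple pools; the constrained least-squares toy below, with a hypothetical four-haplotype candidate set, is only meant to show the structure of the problem.

```python
# Hedged sketch: point estimation of haplotype frequencies from pooled allele frequencies,
# given a prior list of candidate haplotypes (e.g. from a database such as HapMap). This
# constrained least-squares toy is a crude stand-in for the Bayesian method in the paper.
import numpy as np
from scipy.optimize import minimize

# Candidate haplotypes over 4 loci (rows = haplotypes, 1 = minor allele). Hypothetical set.
H = np.array([[0, 0, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1]], dtype=float)
true_f = np.array([0.4, 0.3, 0.2, 0.1])
pool_allele_freq = true_f @ H                        # per-locus minor-allele frequency in the pool

def loss(f):
    # Squared mismatch between predicted and observed pooled allele frequencies.
    return np.sum((f @ H - pool_allele_freq) ** 2)

res = minimize(loss, x0=np.full(4, 0.25), bounds=[(0, 1)] * 4,
               constraints=[{"type": "eq", "fun": lambda f: f.sum() - 1.0}])
print(np.round(res.x, 3))                            # estimated haplotype frequencies
```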

  16. A regularized variable selection procedure in additive hazards model with stratified case-cohort design.

    PubMed

    Ni, Ai; Cai, Jianwen

    2018-07-01

    Case-cohort designs are commonly used in large epidemiological studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large. An efficient variable selection method is needed for case-cohort studies where the covariates are only observed in a subset of the sample. Current literature on this topic has been focused on the proportional hazards model. However, in many studies the additive hazards model is preferred over the proportional hazards model either because the proportional hazards assumption is violated or the additive hazards model provides more relevant information to the research question. Motivated by one such study, the Atherosclerosis Risk in Communities (ARIC) study, we investigate the properties of a regularized variable selection procedure in a stratified case-cohort design under an additive hazards model with a diverging number of parameters. We establish the consistency and asymptotic normality of the penalized estimator and prove its oracle property. Simulation studies are conducted to assess the finite sample performance of the proposed method with a modified cross-validation tuning parameter selection method. We apply the variable selection procedure to the ARIC study to demonstrate its practical use.

  17. Exploration of large, rare copy number variants associated with psychiatric and neurodevelopmental disorders in individuals with anorexia nervosa.

    PubMed

    Yilmaz, Zeynep; Szatkiewicz, Jin P; Crowley, James J; Ancalade, NaEshia; Brandys, Marek K; van Elburg, Annemarie; de Kovel, Carolien G F; Adan, Roger A H; Hinney, Anke; Hebebrand, Johannes; Gratacos, Monica; Fernandez-Aranda, Fernando; Escaramis, Georgia; Gonzalez, Juan R; Estivill, Xavier; Zeggini, Eleftheria; Sullivan, Patrick F; Bulik, Cynthia M

    2017-08-01

    Anorexia nervosa (AN) is a serious and heritable psychiatric disorder. To date, studies of copy number variants (CNVs) have been limited and inconclusive because of small sample sizes. We conducted a case-only genome-wide CNV survey in 1,983 female AN cases included in the Genetic Consortium for Anorexia Nervosa. Following stringent quality control procedures, we investigated whether pathogenic CNVs in regions previously implicated in psychiatric and neurodevelopmental disorders were present in AN cases. We observed two instances of well-established pathogenic CNVs in AN cases. In addition, one case had a deletion in the 13q12 region, overlapping with a deletion reported previously in two AN cases. As a secondary aim, we also examined our sample for CNVs over 1 Mbp in size. Out of the 40 instances of such large CNVs that were not implicated previously for AN or neuropsychiatric phenotypes, two of them contained genes with previous neuropsychiatric associations, and only five of them had no associated reports in public CNV databases. Although ours is the largest study of its kind in AN, larger datasets are needed to comprehensively assess the role of CNVs in the etiology of AN.

  18. Attack Detection in Sensor Network Target Localization Systems With Quantized Data

    NASA Astrophysics Data System (ADS)

    Zhang, Jiangfan; Wang, Xiaodong; Blum, Rick S.; Kaplan, Lance M.

    2018-04-01

    We consider a sensor network focused on target localization, where sensors measure the signal strength emitted from the target. Each measurement is quantized to one bit and sent to the fusion center. A general attack is considered at some sensors that attempts to cause the fusion center to produce an inaccurate estimation of the target location with a large mean-squared error. The attack is a combination of man-in-the-middle, hacking, and spoofing attacks that can effectively change both signals going into and coming out of the sensor nodes in a realistic manner. We show that the essential effect of attacks is to alter the estimated distance between the target and each attacked sensor to a different extent, giving rise to a geometric inconsistency among the attacked and unattacked sensors. Hence, with the help of two secure sensors, a class of detectors is proposed to detect the attacked sensors by scrutinizing the existence of the geometric inconsistency. We show that the false alarm and miss probabilities of the proposed detectors decrease exponentially as the number of measurement samples increases, which implies that for a sufficiently large number of samples, the proposed detectors can identify the attacked and unattacked sensors with any required accuracy.
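
    A toy illustration of the geometric-inconsistency idea is sketched below: attacked sensors report distances that cannot be reconciled with any single target location, so their residuals against a robustly fitted location stand out. The sketch uses unquantized distance measurements and a simple robust-fit residual test in place of the paper's one-bit data and secure-sensor detectors; all numbers are illustrative.

```python
# Hedged toy illustration of geometric inconsistency: attacked sensors' reported distances
# disagree with the location implied by the honest sensors. Unquantized distances and a
# robust-fit residual test stand in for the paper's one-bit measurements and detectors.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
target = np.array([3.0, 4.0])
sensors = rng.uniform(0, 10, size=(12, 2))
d = np.linalg.norm(sensors - target, axis=1) + rng.normal(0, 0.05, 12)
attacked = np.array([2, 7])
d[attacked] += 3.0                                   # attacker inflates the perceived distance

# Robustly fit a target location to all reported distances, then flag large residuals.
est = least_squares(lambda x: np.linalg.norm(sensors - x, axis=1) - d,
                    x0=sensors.mean(axis=0), loss="soft_l1").x
residual = np.abs(np.linalg.norm(sensors - est, axis=1) - d)
print("flagged sensors:", np.where(residual > 1.0)[0])   # geometric-inconsistency test
```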

  19. Unbiased multi-fidelity estimate of failure probability of a free plane jet

    NASA Astrophysics Data System (ADS)

    Marques, Alexandre; Kramer, Boris; Willcox, Karen; Peherstorfer, Benjamin

    2017-11-01

    Estimating failure probability related to fluid flows is a challenge because it requires a large number of evaluations of expensive models. We address this challenge by leveraging multiple low fidelity models of the flow dynamics to create an optimal unbiased estimator. In particular, we investigate the effects of uncertain inlet conditions in the width of a free plane jet. We classify a condition as failure when the corresponding jet width is below a small threshold, such that failure is a rare event (failure probability is smaller than 0.001). We estimate failure probability by combining the frameworks of multi-fidelity importance sampling and optimal fusion of estimators. Multi-fidelity importance sampling uses a low fidelity model to explore the parameter space and create a biasing distribution. An unbiased estimate is then computed with a relatively small number of evaluations of the high fidelity model. In the presence of multiple low fidelity models, this framework offers multiple competing estimators. Optimal fusion combines all competing estimators into a single estimator with minimal variance. We show that this combined framework can significantly reduce the cost of estimating failure probabilities, and thus can have a large impact in fluid flow applications. This work was funded by DARPA.
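
    The fusion step can be illustrated compactly: given several unbiased estimates of the same failure probability with known (or estimated) variances, inverse-variance weighting yields the minimum-variance unbiased combination when the estimators are uncorrelated. The numbers below are made up, and the uncorrelatedness assumption is a simplification of the optimal-fusion framework described above.

```python
# Hedged sketch of the fusion step: combine several unbiased failure-probability estimates
# with inverse-variance weights. Values are illustrative; uncorrelated estimators assumed.
import numpy as np

estimates = np.array([8.2e-4, 1.1e-3, 9.5e-4])        # competing multi-fidelity IS estimates
variances = np.array([4.0e-8, 9.0e-8, 2.5e-8])        # their estimated variances

weights = (1.0 / variances) / np.sum(1.0 / variances)
fused = np.sum(weights * estimates)
fused_var = 1.0 / np.sum(1.0 / variances)
print(f"fused estimate: {fused:.3e}  (variance {fused_var:.1e})")
```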

  20. A network of epigenetic modifiers and DNA repair genes controls tissue-specific copy number alteration preference.

    PubMed

    Cramer, Dina; Serrano, Luis; Schaefer, Martin H

    2016-11-10

    Copy number alterations (CNAs) in cancer patients show a large variability in their number, length and position, but the sources of this variability are not known. CNA number and length are linked to patient survival, suggesting clinical relevance. We have identified genes that tend to be mutated in samples that have few or many CNAs, which we term CONIM genes (COpy Number Instability Modulators). CONIM proteins cluster into a densely connected subnetwork of physical interactions and many of them are epigenetic modifiers. Therefore, we investigated how the epigenome of the tissue-of-origin influences the position of CNA breakpoints and the properties of the resulting CNAs. We found that the presence of heterochromatin in the tissue-of-origin contributes to the recurrence and length of CNAs in the respective cancer type.
