Sample records for representative probability sample

  1. Sampling in epidemiological research: issues, hazards and pitfalls.

    PubMed

    Tyrer, Stephen; Heyman, Bob

    2016-04-01

    Surveys of people's opinions are fraught with difficulties. It is easier to obtain information from those who respond to text messages or to emails than to attempt to obtain a representative sample. Samples of the population that are selected non-randomly in this way are termed convenience samples as they are easy to recruit. This introduces a sampling bias. Such non-probability samples have merit in many situations, but an epidemiological enquiry is of little value unless a random sample is obtained. If a sufficient number of those selected actually complete a survey, the results are likely to be representative of the population. This editorial describes probability and non-probability sampling methods and illustrates the difficulties and suggested solutions in performing accurate epidemiological research.

  2. Sampling in epidemiological research: issues, hazards and pitfalls

    PubMed Central

    Tyrer, Stephen; Heyman, Bob

    2016-01-01

    Surveys of people's opinions are fraught with difficulties. It is easier to obtain information from those who respond to text messages or to emails than to attempt to obtain a representative sample. Samples of the population that are selected non-randomly in this way are termed convenience samples as they are easy to recruit. This introduces a sampling bias. Such non-probability samples have merit in many situations, but an epidemiological enquiry is of little value unless a random sample is obtained. If a sufficient number of those selected actually complete a survey, the results are likely to be representative of the population. This editorial describes probability and non-probability sampling methods and illustrates the difficulties and suggested solutions in performing accurate epidemiological research. PMID:27087985

  3. Optimized probability sampling of study sites to improve generalizability in a multisite intervention trial.

    PubMed

    Kraschnewski, Jennifer L; Keyserling, Thomas C; Bangdiwala, Shrikant I; Gizlice, Ziya; Garcia, Beverly A; Johnston, Larry F; Gustafson, Alison; Petrovic, Lindsay; Glasgow, Russell E; Samuel-Hodge, Carmen D

    2010-01-01

    Studies of type 2 translation, the adaption of evidence-based interventions to real-world settings, should include representative study sites and staff to improve external validity. Sites for such studies are, however, often selected by convenience sampling, which limits generalizability. We used an optimized probability sampling protocol to select an unbiased, representative sample of study sites to prepare for a randomized trial of a weight loss intervention. We invited North Carolina health departments within 200 miles of the research center to participate (N = 81). Of the 43 health departments that were eligible, 30 were interested in participating. To select a representative and feasible sample of 6 health departments that met inclusion criteria, we generated all combinations of 6 from the 30 health departments that were eligible and interested. From the subset of combinations that met inclusion criteria, we selected 1 at random. Of 593,775 possible combinations of 6 counties, 15,177 (3%) met inclusion criteria. Sites in the selected subset were similar to all eligible sites in terms of health department characteristics and county demographics. Optimized probability sampling improved generalizability by ensuring an unbiased and representative sample of study sites.
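
    To make the selection protocol above concrete, here is a minimal Python sketch under stated assumptions: it enumerates all C(30, 6) = 593,775 combinations of the eligible, interested sites, keeps those passing a placeholder meets_criteria() check (the trial's real feasibility rules are not given here, so the rule below is hypothetical), and draws one qualifying combination uniformly at random.

      import itertools
      import random

      sites = list(range(30))  # stand-ins for the 30 eligible, interested health departments

      def meets_criteria(combo):
          # Hypothetical placeholder for the trial's actual inclusion rules
          # (e.g., geographic spread, staffing capacity).
          return sum(combo) % 7 == 0

      eligible_combos = [c for c in itertools.combinations(sites, 6)
                         if meets_criteria(c)]
      selected = random.choice(eligible_combos)  # every qualifying subset is equally likely
      print(len(eligible_combos), selected)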

  4. Design and Weighting Methods for a Nationally Representative Sample of HIV-infected Adults Receiving Medical Care in the United States-Medical Monitoring Project

    PubMed Central

    Iachan, Ronaldo; H. Johnson, Christopher; L. Harding, Richard; Kyle, Tonja; Saavedra, Pedro; L. Frazier, Emma; Beer, Linda; L. Mattson, Christine; Skarbinski, Jacek

    2016-01-01

    Background: Health surveys of the general US population are inadequate for monitoring human immunodeficiency virus (HIV) infection because the relatively low prevalence of the disease (<0.5%) leads to small subpopulation sample sizes. Objective: To collect a nationally and locally representative probability sample of HIV-infected adults receiving medical care to monitor clinical and behavioral outcomes, supplementing the data in the National HIV Surveillance System. This paper describes the sample design and weighting methods for the Medical Monitoring Project (MMP) and provides estimates of the size and characteristics of this population. Methods: To develop a method for obtaining valid, representative estimates of the in-care population, we implemented a cross-sectional, three-stage design that sampled 23 jurisdictions, then 691 facilities, then 9,344 HIV patients receiving medical care, using probability-proportional-to-size methods. The data weighting process followed standard methods, accounting for the probabilities of selection at each stage and adjusting for nonresponse and multiplicity. Nonresponse adjustments accounted for differing response at both facility and patient levels. Multiplicity adjustments accounted for visits to more than one HIV care facility. Results: MMP used a multistage stratified probability sampling design that was approximately self-weighting in each of the 23 project areas and nationally. The probability sample represents the estimated 421,186 HIV-infected adults receiving medical care during January through April 2009. Methods were efficient (i.e., induced small, unequal weighting effects and small standard errors for a range of weighted estimates). Conclusion: The information collected through MMP allows monitoring trends in clinical and behavioral outcomes and informs resource allocation for treatment and prevention activities. PMID:27651851
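
    As a rough illustration of one stage of such a design, the following Python sketch performs systematic probability-proportional-to-size (PPS) selection of facilities and computes inverse-probability base weights. The facility sizes, sample size, and single-stage setup are invented for illustration and omit the nonresponse and multiplicity adjustments described above.

      import numpy as np

      rng = np.random.default_rng(0)
      sizes = np.array([120, 340, 75, 680, 410, 260, 155, 610])  # patients per facility (invented)
      n = 3  # facilities to sample

      pi = np.minimum(n * sizes / sizes.sum(), 1.0)  # inclusion probabilities, capped at 1

      # Systematic PPS: cumulate the probabilities and step through with a random start.
      cum = np.cumsum(pi)
      start = rng.uniform(0, 1)
      hits = np.searchsorted(cum, start + np.arange(n))
      base_weights = 1.0 / pi[hits]  # each sampled facility represents 1/pi facilities

      print(hits, np.round(base_weights, 2))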

  5. Modulation Based on Probability Density Functions

    NASA Technical Reports Server (NTRS)

    Williams, Glenn L.

    2009-01-01

    A proposed method of modulating a sinusoidal carrier signal to convey digital information involves the use of histograms representing probability density functions (PDFs) that characterize samples of the signal waveform. The method is based partly on the observation that when a waveform is sampled (whether by analog or digital means) over a time interval at least as long as one half cycle of the waveform, the samples can be sorted by frequency of occurrence, thereby constructing a histogram representing a PDF of the waveform during that time interval.
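
    A minimal Python sketch of the observation the abstract describes, with illustrative parameters: uniformly timed samples of a unit sinusoid taken over one full cycle (satisfying the half-cycle minimum), binned into a histogram, approximate the waveform's amplitude PDF, which for a unit sine is the arcsine density 1/(pi*sqrt(1 - x^2)).

      import numpy as np

      t = np.linspace(0.0, 1.0, 10_000, endpoint=False)  # one full cycle at 1 Hz
      x = np.sin(2 * np.pi * t)

      hist, edges = np.histogram(x, bins=20, range=(-1, 1), density=True)
      centers = 0.5 * (edges[:-1] + edges[1:])
      arcsine = 1.0 / (np.pi * np.sqrt(1.0 - centers**2))

      # The histogram tracks the arcsine density closely away from the +/-1 edges,
      # where the density has an (integrable) singularity.
      print(np.round(hist[:5], 3))
      print(np.round(arcsine[:5], 3))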

  6. Quantifying recent erosion and sediment delivery using probability sampling: A case study

    Treesearch

    Jack Lewis

    2002-01-01

    Estimates of erosion and sediment delivery have often relied on measurements from locations that were selected to be representative of particular terrain types. Such judgement samples are likely to overestimate or underestimate the mean of the quantity of interest. Probability sampling can eliminate the bias due to sample selection, and it permits the...

  7. Estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean.

    PubMed

    Schillaci, Michael A; Schillaci, Mario E

    2009-02-01

    The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process is dependent upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (n < 10) or very small (n ≤ 5) sample sizes. This method can be used by researchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
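
    A normal-theory sketch of the quantity described above, not necessarily the authors' exact estimator: if the population is N(mu, sigma^2), the sample mean is N(mu, sigma^2/n), so P(|sample mean - mu| <= k*sigma) = 2*Phi(k*sqrt(n)) - 1.

      from scipy.stats import norm

      def prob_within(k, n):
          # P(|sample mean - mu| <= k*sigma) for an i.i.d. normal sample of size n,
          # using the fact that the sample mean has standard deviation sigma/sqrt(n).
          return 2 * norm.cdf(k * n**0.5) - 1

      for n in (3, 5, 10):
          print(n, round(prob_within(0.5, n), 3))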

  8. Probabilistic Round Trip Contamination Analysis of a Mars Sample Acquisition and Handling Process Using Markovian Decompositions

    NASA Technical Reports Server (NTRS)

    Hudson, Nicolas; Lin, Ying; Barengoltz, Jack

    2010-01-01

    A method for evaluating the probability of a Viable Earth Microorganism (VEM) contaminating a sample during the sample acquisition and handling (SAH) process of a potential future Mars Sample Return mission is developed. A scenario where multiple core samples would be acquired using a rotary percussive coring tool, deployed from an arm on a MER-class rover, is analyzed. The analysis is conducted in a structured way by decomposing the sample acquisition and handling process into a series of discrete time steps, and breaking the physical system into a set of relevant components. At each discrete time step, two key functions are defined: the probability of a VEM being released from each component, and the transport matrix, which represents the probability of VEM transport from one component to another. By defining the expected number of VEMs on each component at the start of the sampling process, these decompositions allow the expected number of VEMs on each component at each sampling step to be represented as a Markov chain. This formalism provides a rigorous mathematical framework in which to analyze the probability of a VEM entering the sample chain, as well as making the analysis tractable by breaking the process down into small analyzable steps.
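
    The Markov-chain bookkeeping described above can be sketched in a few lines of Python. The component list, release probabilities, and transport matrix below are invented for illustration; each step removes the expected released organisms from every component and redistributes them according to the transport matrix.

      import numpy as np

      components = ["bit", "arm", "sample"]
      v = np.array([100.0, 10.0, 0.0])  # expected VEMs on each component at the start

      release = np.array([0.05, 0.02, 0.0])   # P(release from component) per step (invented)
      # transport[i, j] = P(a released VEM moves from component j to component i);
      # each column sums to 1, so expected counts are conserved.
      transport = np.array([[0.0, 0.1, 0.0],
                            [0.5, 0.0, 0.0],
                            [0.5, 0.9, 1.0]])

      for step in range(5):
          released = release * v                     # expected VEMs leaving each component
          v = (v - released) + transport @ released  # retain the rest, redistribute the released
      print(dict(zip(components, np.round(v, 3))))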

  9. Probabilistic 3D data fusion for multiresolution surface generation

    NASA Technical Reports Server (NTRS)

    Manduchi, R.; Johnson, A. E.

    2002-01-01

    In this paper we present an algorithm for adaptive resolution integration of 3D data collected from multiple distributed sensors. The input to the algorithm is a set of 3D surface points and associated sensor models. Using a probabilistic rule, a surface probability function is generated that represents the probability that a particular volume of space contains the surface. The surface probability function is represented using an octree data structure; regions of space with samples of large covariance are stored at a coarser level than regions of space containing samples with smaller covariance. The algorithm outputs an adaptive resolution surface generated by connecting points that lie on the ridge of surface probability with triangles scaled to match the local discretization of space given by the algorithm. We present results from 3D data generated by scanning lidar and structure from motion.

  10. Trends Concerning Four Misconceptions in Students' Intuitively-Based Probabilistic Reasoning Sourced in the Heuristic of Representativeness

    ERIC Educational Resources Information Center

    Kustos, Paul Nicholas

    2010-01-01

    Student difficulty in the study of probability arises in intuitively-based misconceptions derived from heuristics. One such heuristic, the one of note for this research study, is that of representativeness, in which an individual informally assesses the probability of an event based on the degree to which the event is similar to the sample from…

  11. Sampling Methods in Cardiovascular Nursing Research: An Overview.

    PubMed

    Kandola, Damanpreet; Banner, Davina; O'Keefe-McCarthy, Sheila; Jassal, Debbie

    2014-01-01

    Cardiovascular nursing research covers a wide array of topics from health services to psychosocial patient experiences. The selection of specific participant samples is an important part of the research design and process. The sampling strategy employed is of utmost importance to ensure that a representative sample of participants is chosen. There are two main categories of sampling methods: probability and non-probability. Probability sampling is the random selection of elements from the population, where each element of the population has an equal and independent chance of being included in the sample. There are five main types of probability sampling including simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling. Non-probability sampling methods are those in which elements are chosen through non-random methods for inclusion into the research study and include convenience sampling, purposive sampling, and snowball sampling. Each approach offers distinct advantages and disadvantages and must be considered critically. In this research column, we provide an introduction to these key sampling techniques and draw on examples from cardiovascular research. Understanding the differences in sampling techniques may aid nurses in effective appraisal of research literature and provide a reference point for nurses who engage in cardiovascular research.
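
    For reference alongside this overview, here is a short Python sketch of three of the probability methods listed (simple random, systematic, and stratified sampling) applied to an invented frame of 100 patient IDs.

      import random

      random.seed(1)
      frame = list(range(100))
      n = 10

      # Simple random sampling: every element has an equal, independent chance.
      srs = random.sample(frame, n)

      # Systematic sampling: a random start, then every k-th element.
      k = len(frame) // n
      start = random.randrange(k)
      systematic = frame[start::k]

      # Stratified sampling: sample proportionally within predefined strata (invented here).
      strata = {"ward_a": frame[:40], "ward_b": frame[40:]}
      stratified = [x for name, s in strata.items()
                    for x in random.sample(s, round(n * len(s) / len(frame)))]

      print(sorted(srs), systematic, sorted(stratified))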

  12. A Comparison of EPI Sampling, Probability Sampling, and Compact Segment Sampling Methods for Micro and Small Enterprises

    PubMed Central

    Chao, Li-Wei; Szrek, Helena; Peltzer, Karl; Ramlagan, Shandir; Fleming, Peter; Leite, Rui; Magerman, Jesswill; Ngwenya, Godfrey B.; Pereira, Nuno Sousa; Behrman, Jere

    2011-01-01

    Finding an efficient method for sampling micro- and small-enterprises (MSEs) for research and statistical reporting purposes is a challenge in developing countries, where registries of MSEs are often nonexistent or outdated. This lack of a sampling frame creates an obstacle in finding a representative sample of MSEs. This study uses computer simulations to draw samples from a census of businesses and non-businesses in the Tshwane Municipality of South Africa, using three different sampling methods: the traditional probability sampling method, the compact segment sampling method, and the World Health Organization’s Expanded Programme on Immunization (EPI) sampling method. Three mechanisms by which the methods could differ are tested, the proximity selection of respondents, the at-home selection of respondents, and the use of inaccurate probability weights. The results highlight the importance of revisits and accurate probability weights, but the lesser effect of proximity selection on the samples’ statistical properties. PMID:22582004

  13. Diet- and Body Size-Related Attitudes and Behaviors Associated with Vitamin Supplement Use in a Representative Sample of Fourth-Grade Students in Texas

    ERIC Educational Resources Information Center

    George, Goldy C.; Hoelscher, Deanna M.; Nicklas, Theresa A.; Kelder, Steven H.

    2009-01-01

    Objective: To examine diet- and body size-related attitudes and behaviors associated with supplement use in a representative sample of fourth-grade students in Texas. Design: Cross-sectional data from the School Physical Activity and Nutrition study, a probability-based sample of schoolchildren. Children completed a questionnaire that assessed…

  14. Assessing representativeness of sampling methods for reaching men who have sex with men: a direct comparison of results obtained from convenience and probability samples.

    PubMed

    Schwarcz, Sandra; Spindler, Hilary; Scheer, Susan; Valleroy, Linda; Lansky, Amy

    2007-07-01

    Convenience samples are used to determine HIV-related behaviors among men who have sex with men (MSM) without measuring the extent to which the results are representative of the broader MSM population. We compared results from a cross-sectional survey of MSM recruited from gay bars between June and October 2001 to a random digit dial telephone survey conducted between June 2002 and January 2003. The men in the probability sample were older, better educated, and had higher incomes than men in the convenience sample; the convenience sample enrolled more employed men and men of color. Substance use around the time of sex was higher in the convenience sample but other sexual behaviors were similar. HIV testing was common among men in both samples. Periodic validation, through comparison of data collected by different sampling methods, may be useful when relying on survey data for program and policy development.

  15. RECRUITING FOR A LONGITUDINAL STUDY OF CHILDREN'S HEALTH USING A HOUSEHOLD-BASED PROBABILITY SAMPLING APPROACH

    EPA Science Inventory

    The sampling design for the National Children's Study (NCS) calls for a population-based, multi-stage, clustered household sampling approach (visit our website for more information on the NCS: www.nationalchildrensstudy.gov). The full sample is designed to be representative of ...

  16. Point-Sampling and Line-Sampling Probability Theory, Geometric Implications, Synthesis

    Treesearch

    L.R. Grosenbaugh

    1958-01-01

    Foresters concerned with measuring tree populations on definite areas have long employed two well-known methods of representative sampling. In list or enumerative sampling the entire tree population is tallied with a known proportion being randomly selected and measured for volume or other variables. In area sampling all trees on randomly located plots or strips...

  17. Method for predicting peptide detection in mass spectrometry

    DOEpatents

    Kangas, Lars [West Richland, WA; Smith, Richard D [Richland, WA; Petritis, Konstantinos [Richland, WA

    2010-07-13

    A method of predicting whether a peptide present in a biological sample will be detected by analysis with a mass spectrometer. The method uses at least one mass spectrometer to perform repeated analysis of a sample containing peptides from proteins with known amino acids. The method then generates a data set of peptides identified as contained within the sample by the repeated analysis. The method then calculates the probability that a specific peptide in the data set was detected in the repeated analysis. The method then creates a plurality of vectors, where each vector has a plurality of dimensions, and each dimension represents a property of one or more of the amino acids present in each peptide and adjacent peptides in the data set. Using these vectors, the method then generates an algorithm from the plurality of vectors and the calculated probabilities that specific peptides in the data set were detected in the repeated analysis. The algorithm is thus capable of calculating the probability that a hypothetical peptide represented as a vector will be detected by a mass spectrometry based proteomic platform, given that the peptide is present in a sample introduced into a mass spectrometer.

  18. A comparison of four sampling methods among men having sex with men in China: implications for HIV/STD surveillance and prevention

    PubMed Central

    Guo, Yan; Li, Xiaoming; Fang, Xiaoyi; Lin, Xiuyun; Song, Yan; Jiang, Shuling; Stanton, Bonita

    2011-01-01

    Sample representativeness remains one of the challenges in effective HIV/STD surveillance and prevention targeting MSM worldwide. Although convenience samples are widely used in studies of MSM, previous studies suggested that these samples might not be representative of the broader MSM population. This issue becomes even more critical in many developing countries where needed resources for conducting probability sampling are limited. We examined variations in HIV and Syphilis infections and sociodemographic and behavioral factors among 307 young migrant MSM recruited using four different convenience sampling methods (peer outreach, informal social network, Internet, and venue-based) in Beijing, China in 2009. The participants completed a self-administered survey and provided blood specimens for HIV/STD testing. Among the four MSM samples using different recruitment methods, rates of HIV infections were 5.1%, 5.8%, 7.8%, and 3.4%; rates of Syphilis infection were 21.8%, 36.2%, 11.8%, and 13.8%; rates of inconsistent condom use were 57%, 52%, 58%, and 38%. Significant differences were found in various sociodemographic characteristics (e.g., age, migration history, education, income, places of employment) and risk behaviors (e.g., age at first sex, number of sex partners, involvement in commercial sex, and substance use) among samples recruited by different sampling methods. The results confirmed the challenges of obtaining representative MSM samples and underscored the importance of using multiple sampling methods to reach MSM from diverse backgrounds and in different social segments and to improve the representativeness of the MSM samples when the use of probability sampling approach is not feasible. PMID:21711162

  19. A comparison of four sampling methods among men having sex with men in China: implications for HIV/STD surveillance and prevention.

    PubMed

    Guo, Yan; Li, Xiaoming; Fang, Xiaoyi; Lin, Xiuyun; Song, Yan; Jiang, Shuling; Stanton, Bonita

    2011-11-01

    Sample representativeness remains one of the challenges in effective HIV/STD surveillance and prevention targeting men who have sex with men (MSM) worldwide. Although convenience samples are widely used in studies of MSM, previous studies suggested that these samples might not be representative of the broader MSM population. This issue becomes even more critical in many developing countries where needed resources for conducting probability sampling are limited. We examined variations in HIV and Syphilis infections and sociodemographic and behavioral factors among 307 young migrant MSM recruited using four different convenience sampling methods (peer outreach, informal social network, Internet, and venue-based) in Beijing, China in 2009. The participants completed a self-administered survey and provided blood specimens for HIV/STD testing. Among the four MSM samples using different recruitment methods, rates of HIV infections were 5.1%, 5.8%, 7.8%, and 3.4%; rates of Syphilis infection were 21.8%, 36.2%, 11.8%, and 13.8%; and rates of inconsistent condom use were 57%, 52%, 58%, and 38%. Significant differences were found in various sociodemographic characteristics (e.g., age, migration history, education, income, and places of employment) and risk behaviors (e.g., age at first sex, number of sex partners, involvement in commercial sex, and substance use) among samples recruited by different sampling methods. The results confirmed the challenges of obtaining representative MSM samples and underscored the importance of using multiple sampling methods to reach MSM from diverse backgrounds and in different social segments and to improve the representativeness of the MSM samples when the use of probability sampling approach is not feasible.

  20. Estimating background and threshold nitrate concentrations using probability graphs

    USGS Publications Warehouse

    Panno, S.V.; Kelly, W.R.; Martinsek, A.T.; Hackley, Keith C.

    2006-01-01

    Because of the ubiquitous nature of anthropogenic nitrate (NO3-) in many parts of the world, determining background concentrations of NO3- in shallow ground water from natural sources is probably impossible in most environments. Present-day background must now include diffuse sources of NO3- such as disruption of soils and oxidation of organic matter, and atmospheric inputs from products of combustion and evaporation of ammonia from fertilizer and livestock waste. Anomalies can be defined as NO3- derived from nitrogen (N) inputs to the environment from anthropogenic activities, including synthetic fertilizers, livestock waste, and septic effluent. Cumulative probability graphs were used to identify threshold concentrations separating background and anomalous NO3-N concentrations and to assist in the determination of sources of N contamination for 232 spring water samples and 200 well water samples from karst aquifers. Thresholds were 0.4, 2.5, and 6.7 mg/L for spring water samples, and 0.1, 2.1, and 17 mg/L for well water samples. The 0.4 and 0.1 mg/L values are assumed to represent thresholds for present-day precipitation. Thresholds at 2.5 and 2.1 mg/L are interpreted to represent present-day background concentrations of NO3-N. The population of spring water samples with concentrations between 2.5 and 6.7 mg/L represents an amalgam of all sources of NO3- in the ground water basins that feed each spring; concentrations >6.7 mg/L were typically samples collected soon after springtime application of synthetic fertilizer. The 17 mg/L threshold (adjusted to 15 mg/L) for well water samples is interpreted as the level above which livestock wastes dominate the N sources. Copyright © 2006 The Author(s).
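
    A minimal Python sketch of a cumulative probability graph of this kind, using synthetic data: concentrations are sorted, assigned plotting-position probabilities, and expressed as normal scores; on a log-concentration versus normal-score plot, each lognormal subpopulation plots as a near-straight segment, and breaks in slope mark candidate thresholds.

      import numpy as np
      from scipy.stats import norm

      rng = np.random.default_rng(3)
      # Synthetic mixture of "background" and "anthropogenic" NO3-N populations (mg/L).
      conc = np.sort(np.concatenate([rng.lognormal(0.0, 0.5, 150),
                                     rng.lognormal(2.2, 0.4, 50)]))

      p = (np.arange(1, conc.size + 1) - 0.5) / conc.size  # plotting positions
      z = norm.ppf(p)                                      # normal scores

      # z is the x-coordinate of the probability graph; log(conc) is the y-coordinate.
      print(np.round(z[:3], 2), np.round(conc[:3], 2))
      for q in (0.5, 0.75, 0.9):
          print(q, round(float(np.quantile(conc, q)), 2))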

  1. On sampling biases arising from insufficient bottle flushing

    NASA Astrophysics Data System (ADS)

    Codispoti, L. A.; Paver, C. R.

    2016-02-01

    Collection of representative water samples using carousel bottles is important for accurately determining biological and chemical gradients. The development of more technologically advanced instrumentation and sampling apparatus causes sampling packages to increase in size and "soak times" to decrease, increasing the probability that insufficient bottle flushing will produce biased results. Qualitative evidence from various expeditions suggests that insufficient flushing may be a problem. Here we report on multiple field experiments that were conducted to better quantify the errors that can arise from insufficient bottle flushing. Our experiments suggest that soak times of more than 2 minutes are sometimes required to collect a representative sample.

  2. Moving on From Representativeness: Testing the Utility of the Global Drug Survey.

    PubMed

    Barratt, Monica J; Ferris, Jason A; Zahnow, Renee; Palamar, Joseph J; Maier, Larissa J; Winstock, Adam R

    2017-01-01

    A decline in response rates in traditional household surveys, combined with increased internet coverage and decreased research budgets, has resulted in increased attractiveness of web survey research designs based on purposive and voluntary opt-in sampling strategies. In the study of hidden or stigmatised behaviours, such as cannabis use, web survey methods are increasingly common. However, opt-in web surveys are often heavily criticised due to their lack of sampling frame and unknown representativeness. In this article, we outline the current state of the debate about the relevance of pursuing representativeness, the state of probability sampling methods, and the utility of non-probability web survey methods, especially for accessing hidden or minority populations. Our article has two aims: (1) to present a comprehensive description of the methodology we use at Global Drug Survey (GDS), an annual cross-sectional web survey, and (2) to compare the age and sex distributions of cannabis users who voluntarily completed (a) a household survey or (b) a large web-based purposive survey (GDS), across three countries: Australia, the United States, and Switzerland. We find that within each set of country comparisons, the demographic distributions among recent cannabis users are broadly similar, demonstrating that the age and sex distributions of those who volunteer to be surveyed are not vastly different between these non-probability and probability methods. We conclude that opt-in web surveys of hard-to-reach populations are an efficient way of gaining in-depth understanding of stigmatised behaviours and are appropriate, as long as they are not used to estimate drug use prevalence of the general population.

  3. Moving on From Representativeness: Testing the Utility of the Global Drug Survey

    PubMed Central

    Barratt, Monica J; Ferris, Jason A; Zahnow, Renee; Palamar, Joseph J; Maier, Larissa J; Winstock, Adam R

    2017-01-01

    A decline in response rates in traditional household surveys, combined with increased internet coverage and decreased research budgets, has resulted in increased attractiveness of web survey research designs based on purposive and voluntary opt-in sampling strategies. In the study of hidden or stigmatised behaviours, such as cannabis use, web survey methods are increasingly common. However, opt-in web surveys are often heavily criticised due to their lack of sampling frame and unknown representativeness. In this article, we outline the current state of the debate about the relevance of pursuing representativeness, the state of probability sampling methods, and the utility of non-probability web survey methods, especially for accessing hidden or minority populations. Our article has two aims: (1) to present a comprehensive description of the methodology we use at Global Drug Survey (GDS), an annual cross-sectional web survey, and (2) to compare the age and sex distributions of cannabis users who voluntarily completed (a) a household survey or (b) a large web-based purposive survey (GDS), across three countries: Australia, the United States, and Switzerland. We find that within each set of country comparisons, the demographic distributions among recent cannabis users are broadly similar, demonstrating that the age and sex distributions of those who volunteer to be surveyed are not vastly different between these non-probability and probability methods. We conclude that opt-in web surveys of hard-to-reach populations are an efficient way of gaining in-depth understanding of stigmatised behaviours and are appropriate, as long as they are not used to estimate drug use prevalence of the general population. PMID:28924351

  4. Characteristics of the First Child Predict the Parents' Probability of Having Another Child

    ERIC Educational Resources Information Center

    Jokela, Markus

    2010-01-01

    In a sample of 7,695 families in the prospective, nationally representative British Millennium Cohort Study, this study examined whether characteristics of the 1st-born child predicted parents' timing and probability of having another child within 5 years after the 1st child's birth. Infant temperament was assessed with the Carey Infant…

  5. On the importance of incorporating sampling weights in ...

    EPA Pesticide Factsheets

    Occupancy models are used extensively to assess wildlife-habitat associations and to predict species distributions across large geographic regions. Occupancy models were developed as a tool to properly account for imperfect detection of a species. Current guidelines on survey design requirements for occupancy models focus on the number of sample units and the pattern of revisits to a sample unit within a season. We focus on the sampling design or how the sample units are selected in geographic space (e.g., stratified, simple random, unequal probability, etc.). In a probability design, each sample unit has a sample weight which quantifies the number of sample units it represents in the finite (oftentimes areal) sampling frame. We demonstrate the importance of including sampling weights in occupancy model estimation when the design is not a simple random sample or equal probability design. We assume a finite areal sampling frame as proposed for a national bat monitoring program. We compare several unequal and equal probability designs and varying sampling intensity within a simulation study. We found the traditional single season occupancy model produced biased estimates of occupancy and lower confidence interval coverage rates compared to occupancy models that accounted for the sampling design. We also discuss how our findings inform the analyses proposed for the nascent North American Bat Monitoring Program and other collaborative synthesis efforts that propose h...
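
    A small simulation in the spirit of the abstract shows why the weights matter; the frame, strata, occupancy rates, and inclusion probabilities below are invented. Under an unequal-probability design, the unweighted sample mean is biased, while the design-weighted (Hajek-type) mean recovers the frame-level occupancy rate.

      import numpy as np

      rng = np.random.default_rng(7)
      # Frame: 1,000 sites; high-elevation sites are occupied more often (invented rates).
      high = rng.random(200) < 0.8   # 200 high-elevation sites
      low = rng.random(800) < 0.2    # 800 low-elevation sites
      occupied = np.concatenate([high, low])

      # Oversample high-elevation sites: inclusion probabilities differ by stratum.
      pi = np.concatenate([np.full(200, 0.5), np.full(800, 0.05)])
      sampled = rng.random(1000) < pi
      y, w = occupied[sampled].astype(float), 1.0 / pi[sampled]  # weight = 1/inclusion prob.

      print("true occupancy:", occupied.mean())
      print("unweighted sample mean:", round(y.mean(), 3))           # biased upward
      print("design-weighted mean:", round((w * y).sum() / w.sum(), 3))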

  6. Population-specific FST values for forensic STR markers: A worldwide survey.

    PubMed

    Buckleton, John; Curran, James; Goudet, Jérôme; Taylor, Duncan; Thiery, Alexandre; Weir, B S

    2016-07-01

    The interpretation of matching between DNA profiles of a person of interest and an item of evidence is undertaken using population genetic models to predict the probability of matching by chance. Calculation of matching probabilities is straightforward if allelic probabilities are known, or can be estimated, in the relevant population. It is more often the case, however, that the relevant population has not been sampled and allele frequencies are available only from a broader collection of populations as might be represented in a national or regional database. Variation of allele probabilities among the relevant populations is quantified by the population structure quantity FST and this quantity affects matching proportions. Matching within a population can be interpreted only with respect to matching between populations, and we show here that FST can be estimated from sample allelic matching proportions within and between populations. We report such estimates from data we extracted from 250 papers in the forensic literature, representing STR profiles at up to 24 loci from nearly 500,000 people in 446 different populations. The results suggest that theta values in current forensic use do not have the buffer of conservatism often thought. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
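
    A hedged Python sketch of estimating FST from matching proportions, as the abstract describes. One common population-specific form (in the style of Weir and Goudet) is beta = (Mw - Mb) / (1 - Mb), where Mw and Mb are the within- and between-population allelic matching proportions; treating this as the paper's exact estimator is an assumption, and the genotype data below are invented.

      import numpy as np

      rng = np.random.default_rng(11)
      # Two populations, one locus, alleles coded 0..3; each entry is one allele copy.
      pop1 = rng.choice(4, size=200, p=[0.5, 0.3, 0.1, 0.1])
      pop2 = rng.choice(4, size=200, p=[0.1, 0.2, 0.3, 0.4])

      def matching(a, b=None):
          # Proportion of matching allele pairs within a (or between a and b).
          if b is None:
              n = a.size
              freqs = np.bincount(a, minlength=4) / n
              return (n * (freqs**2).sum() - 1) / (n - 1)  # distinct-pair matching
          fa = np.bincount(a, minlength=4) / a.size
          fb = np.bincount(b, minlength=4) / b.size
          return float(fa @ fb)

      Mw = 0.5 * (matching(pop1) + matching(pop2))
      Mb = matching(pop1, pop2)
      print("beta (FST estimate):", round(float((Mw - Mb) / (1 - Mb)), 3))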

  7. Population-specific FST values for forensic STR markers: A worldwide survey

    PubMed Central

    Buckleton, John; Curran, James; Goudet, Jérôme; Taylor, Duncan; Thiery, Alexandre; Weir, B.S.

    2016-01-01

    The interpretation of matching between DNA profiles of a person of interest and an item of evidence is undertaken using population genetic models to predict the probability of matching by chance. Calculation of matching probabilities is straightforward if allelic probabilities are known, or can be estimated, in the relevant population. It is more often the case, however, that the relevant population has not been sampled and allele frequencies are available only from a broader collection of populations as might be represented in a national or regional database. Variation of allele probabilities among the relevant populations is quantified by the population structure quantity FST and this quantity affects matching proportions. Matching within a population can be interpreted only with respect to matching between populations, and we show here that FST can be estimated from sample allelic matching proportions within and between populations. We report such estimates from data we extracted from 250 papers in the forensic literature, representing STR profiles at up to 24 loci from nearly 500,000 people in 446 different populations. The results suggest that theta values in current forensic use do not have the buffer of conservatism often thought. PMID:27082756

  8. 77 FR 72205 - Testing and Labeling Pertaining to Product Certification Regarding Representative Samples for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-05

    ... testing periodically and when there has been a material change in the product's design or manufacturing... control data during product manufacture; and using manufacturing techniques with intrinsic manufacturing... sample in the production population an equal probability of being selected (75 FR at 28349 through 28350...

  9. Beliefs about women's vibrator use: results from a nationally representative probability survey in the United States.

    PubMed

    Herbenick, Debra; Reece, Michael; Schick, Vanessa; Jozkowski, Kristen N; Middelstadt, Susan E; Sanders, Stephanie A; Dodge, Brian S; Ghassemi, Annahita; Fortenberry, J Dennis

    2011-01-01

    Women's vibrator use is common in the United States, although little is known about beliefs about its use. Elicitation surveys and interviews informed the development of a 10-item scale, the Beliefs About Women's Vibrator Use Scale, which was administered to a nationally representative probability sample of adults ages 18 to 60 years. Most women and men held high positive and low negative beliefs about women's vibrator use. Women with positive beliefs reported higher Female Sexual Function Index scores related to arousal, lubrication, orgasm, satisfaction, and pain (indicating less pain).

  10. Best Practices in Using Large, Complex Samples: The Importance of Using Appropriate Weights and Design Effect Compensation

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2011-01-01

    Large surveys often use probability sampling in order to obtain representative samples, and these data sets are valuable tools for researchers in all areas of science. Yet many researchers are not formally prepared to appropriately utilize these resources. Indeed, users of one popular dataset were generally found "not" to have modeled…

  11. Has Adolescent Suicidality Decreased in the United States? Data from Two National Samples of Adolescents Interviewed in 1995 and 2005

    ERIC Educational Resources Information Center

    Wolitzky-Taylor, Kate B.; Ruggiero, Kenneth J.; McCart, Michael R.; Smith, Daniel W.; Hanson, Rochelle F.; Resnick, Heidi S.; de Arellano, Michael A.; Saunders, Benjamin E.; Kilpatrick, Dean G.

    2010-01-01

    We compared the prevalence and correlates of adolescent suicidal ideation and attempts in two nationally representative probability samples of adolescents interviewed in 1995 (National Survey of Adolescents; N = 4,023) and 2005 (National Survey of Adolescents-Replication; N = 3,614). Participants in both samples completed a telephone survey that…

  12. Statistical evaluation of vibration analysis techniques

    NASA Technical Reports Server (NTRS)

    Milner, G. Martin; Miller, Patrice S.

    1987-01-01

    An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.

  13. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
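
    A minimal sketch of the approach in Python, with invented distributions and coefficients: Latin hypercube samples of the uncertain regression coefficients (model error) and an uncertain explanatory variable (data error) are pushed through the logistic function, and the spread of the resulting probabilities gives the prediction interval.

      import numpy as np
      from scipy.stats import norm, qmc

      n = 1000
      lhs = qmc.LatinHypercube(d=3, seed=42).random(n)  # uniform [0, 1)^3 strata

      # Map LHS strata to input distributions (all values invented for illustration).
      b0 = norm.ppf(lhs[:, 0], loc=-2.0, scale=0.3)   # intercept +/- model error
      b1 = norm.ppf(lhs[:, 1], loc=0.8, scale=0.15)   # slope +/- model error
      x = norm.ppf(lhs[:, 2], loc=1.5, scale=0.5)     # predictor +/- data error

      p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))  # probability of elevated contaminant

      print("median:", round(float(np.median(p)), 3),
            "90% interval:", np.round(np.percentile(p, [5, 95]), 3))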

  14. Nonlinear Spatial Inversion Without Monte Carlo Sampling

    NASA Astrophysics Data System (ADS)

    Curtis, A.; Nawaz, A.

    2017-12-01

    High-dimensional, nonlinear inverse or inference problems usually have non-unique solutions. The distribution of solutions is described by a probability distribution, which is usually found using Monte Carlo (MC) sampling methods. These take pseudo-random samples of models in parameter space, calculate the probability of each sample given available data and other information, and thus map out high or low probability values of model parameters. However, such methods would converge to the solution only as the number of samples tends to infinity; in practice, MC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. We propose a method for Bayesian inversion of categorical variables such as geological facies or rock types in spatial problems, which requires no sampling at all. The method uses a 2-D Hidden Markov Model over a grid of cells, where observations represent localized data constraining the model in each cell. The data in our example application are seismic properties such as P- and S-wave impedances or rock density; our model parameters are the hidden states and represent the geological rock types in each cell. The observations at each location are assumed to depend on the facies at that location only - an assumption referred to as 'localized likelihoods'. However, the facies at a location cannot be determined solely by the observation at that location as it also depends on prior information concerning its correlation with the spatial distribution of facies elsewhere. Such prior information is included in the inversion in the form of a training image which represents a conceptual depiction of the distribution of local geologies that might be expected, but other forms of prior information can be used in the method as desired. The method provides direct (pseudo-analytic) estimates of posterior marginal probability distributions over each variable, so these do not need to be estimated from samples as is required in MC methods. On a 2-D test example the method is shown to outperform previous methods significantly, and at a fraction of the computational cost. In many foreseeable applications there are therefore no serious impediments to extending the method to 3-D spatial models.

  15. Estimating the concordance probability in a survival analysis with a discrete number of risk groups.

    PubMed

    Heller, Glenn; Mo, Qianxing

    2016-04-01

    A clinical risk classification system is an important component of a treatment decision algorithm. A measure used to assess the strength of a risk classification system is discrimination, and when the outcome is survival time, the most commonly applied global measure of discrimination is the concordance probability. The concordance probability represents the pairwise probability of lower patient risk given longer survival time. The c-index and the concordance probability estimate have been used to estimate the concordance probability when patient-specific risk scores are continuous. In the current paper, the concordance probability estimate and an inverse probability censoring weighted c-index are modified to account for discrete risk scores. Simulations are generated to assess the finite sample properties of the concordance probability estimate and the weighted c-index. An application of these measures of discriminatory power to a metastatic prostate cancer risk classification system is examined.
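
    To make the estimand concrete, here is a naive Python sketch of pairwise concordance for discrete risk groups: a pair is usable when the shorter observed time is an uncensored event, and ties in risk score count one half. This is the plain c-index on invented data, not the inverse-probability-of-censoring-weighted estimator the paper modifies.

      import itertools

      # (observed time, event indicator, discrete risk group: higher = riskier); invented data.
      data = [(5, 1, 2), (8, 0, 2), (12, 1, 1), (20, 1, 0), (22, 0, 1), (30, 0, 0)]

      num = den = 0.0
      for (t1, e1, r1), (t2, e2, r2) in itertools.combinations(data, 2):
          # Order the pair so subject 1 has the shorter observed time.
          if t2 < t1:
              (t1, e1, r1), (t2, e2, r2) = (t2, e2, r2), (t1, e1, r1)
          if e1 == 0 or t1 == t2:
              continue  # pair not usable: shorter time censored (or times tied)
          den += 1
          if r1 > r2:
              num += 1      # concordant: the shorter survivor had the higher risk group
          elif r1 == r2:
              num += 0.5    # tied risk groups count half

      print("c-index:", round(num / den, 3))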

  16. Childhood Trauma and Psychiatric Disorders as Correlates of School Dropout in a National Sample of Young Adults

    ERIC Educational Resources Information Center

    Porche, Michelle V.; Fortuna, Lisa R.; Lin, Julia; Alegria, Margarita

    2011-01-01

    The effect of childhood trauma, psychiatric diagnoses, and mental health services on school dropout among U.S.-born and immigrant youth is examined using data from the Collaborative Psychiatric Epidemiology Surveys, a nationally representative probability sample of African Americans, Afro-Caribbeans, Asians, Latinos, and non-Latino Whites,…

  17. An agglomerative hierarchical clustering approach to visualisation in Bayesian clustering problems

    PubMed Central

    Dawson, Kevin J.; Belkhir, Khalid

    2009-01-01

    Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals: the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. Since the number of possible partitions grows very rapidly with the sample size, we cannot visualise this probability distribution in its entirety, unless the sample is very small. As a solution to this visualisation problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package Partition View. The exact linkage algorithm takes the posterior co-assignment probabilities as input, and yields as output a rooted binary tree or, more generally, a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306

  18. Evaluation of nutrient quality-assurance data for Alexanders and Mount Rock Spring basins, Cumberland County, Pennsylvania

    USGS Publications Warehouse

    Witt, E. C.; Hippe, D.J.; Giovannitti, R.M.

    1992-01-01

    A total of 304 nutrient samples were collected from May 1990 through September 1991 to determine concentrations and loads of nutrients in water discharged from two spring basins in Cumberland County, Pa. Fifty-four percent of these nutrient samples were for the evaluation of (1) laboratory consistency, (2) container and preservative cleanliness, (3) maintenance of analyte representativeness as affected by three different preservation methods, and (4) comparison of analyte results with the "Most Probable Value" for Standard Reference Water Samples. Results of 37 duplicate analyses indicate that the Pennsylvania Department of Environmental Resources, Bureau of Laboratories (principal laboratory) remained within its ±10 percent goal for all but one analyte. Results of the blank analysis show that the sampling containers did not compromise the water quality. However, mercuric-chloride-preservation blanks apparently contained measurable ammonium in four of five samples and ammonium plus organic nitrogen in two of five samples. Interlaboratory results indicate substantial differences in the determination of nitrate and ammonium plus organic nitrogen between the principal laboratory and the U.S. Geological Survey National Water-Quality Laboratory. In comparison with the U.S. Environmental Protection Agency Quality-Control Samples, the principal laboratory was sufficiently accurate in its determination of nutrient analytes. Analysis of replicate samples indicated that sulfuric-acid preservative best maintained the representativeness of the analytes nitrate and ammonium plus organic nitrogen, whereas mercuric chloride best maintained the representativeness of orthophosphate. Comparison of nutrient analyte determinations with the Most Probable Value for each preservation method shows that two of five analytes with no chemical preservative compare well, three of five with mercuric-chloride preservative compare well, and three of five with sulfuric-acid preservative compare well.

  19. Taxonomic classification of world map units in crop producing areas of Argentina and Brazil with representative US soil series and major land resource areas in which they occur

    NASA Technical Reports Server (NTRS)

    Huckle, H. F. (Principal Investigator)

    1980-01-01

    The most probable current U.S. taxonomic classification of the soils estimated to dominate world soil map units (WSM) in selected crop producing states of Argentina and Brazil is presented. Representative U.S. soil series for the units are given. The map units occurring in each state are listed with areal extent and major U.S. land resource areas in which similar soils most probably occur. Soil series sampled in LARS Technical Report 111579 and major land resource areas in which they occur, with corresponding similar WSM units at the taxonomic subgroup level, are given.

  20. A country-wide probability sample of public attitudes toward stuttering in Portugal.

    PubMed

    Valente, Ana Rita S; St Louis, Kenneth O; Leahy, Margaret; Hall, Andreia; Jesus, Luis M T

    2017-06-01

    Negative public attitudes toward stuttering have been widely reported, although differences among countries and regions exist. Clear reasons for these differences remain obscure. Published research is unavailable on public attitudes toward stuttering in Portugal as well as a representative sample that explores stuttering attitudes in an entire country. This study sought to (a) determine the feasibility of a country-wide probability sampling scheme to measure public stuttering attitudes in Portugal using a standard instrument (the Public Opinion Survey of Human Attributes-Stuttering [POSHA-S]) and (b) identify demographic variables that predict Portuguese attitudes. The POSHA-S was translated to European Portuguese through a five-step process. Thereafter, a local administrative office-based, three-stage, cluster, probability sampling scheme was carried out to obtain 311 adult respondents who filled out the questionnaire. The Portuguese population held stuttering attitudes that were generally within the average range of those observed from numerous previous POSHA-S samples. Demographic variables that predicted more versus less positive stuttering attitudes were respondents' age, region of the country, years of school completed, working situation, and number of languages spoken. Non-predicting variables were respondents' sex, marital status, and parental status. A local administrative office-based, probability sampling scheme generated a respondent profile similar to census data and indicated that Portuguese attitudes are generally typical. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Exposure to Trauma and Separation Anxiety in Children after the WTC Attack

    ERIC Educational Resources Information Center

    Hoven, Christina W.; Duarte, Cristiane S.; Wu, Ping; Erickson, Elizabeth A.; Musa, George J.; Mandell, Donald J.

    2004-01-01

    The impact of exposure to the World Trade Center attack on children presenting separation anxiety disorder (SAD) 6 months after the attack was studied in a representative sample of New York City public school students (N = 8,236). Probable SAD occurred in 12.3% of the sample and was more frequent in girls, young children, and children who…

  2. Data processing 1: Advancements in machine analysis of multispectral data

    NASA Technical Reports Server (NTRS)

    Swain, P. H.

    1972-01-01

    Multispectral data processing procedures are outlined beginning with the data display process used to accomplish data editing and proceeding through clustering, feature selection criterion for error probability estimation, and sample clustering and sample classification. The effective utilization of large quantities of remote sensing data by formulating a three-stage sampling model for evaluation of crop acreage estimates represents an improvement in determining the cost-benefit relationship associated with remote sensing technology.

  3. Probabilistic methods for rotordynamics analysis

    NASA Technical Reports Server (NTRS)

    Wu, Y.-T.; Torng, T. Y.; Millwater, H. R.; Fossum, A. F.; Rheinfurth, M. H.

    1991-01-01

    This paper summarizes the development of the methods and a computer program to compute the probability of instability of dynamic systems that can be represented by a system of second-order ordinary linear differential equations. Two instability criteria based upon the eigenvalues or Routh-Hurwitz test functions are investigated. Computational methods based on a fast probability integration concept and an efficient adaptive importance sampling method are proposed to perform efficient probabilistic analysis. A numerical example is provided to demonstrate the methods.

  4. THIRD NATIONAL HEALTH AND NUTRITION EXAMINATION SURVEY (NHANES III)

    EPA Science Inventory

    The Third National Health and Nutrition Examination Survey (NHANES III), 1988-94, was conducted on a nationwide probability sample of approximately 33,994 persons 2 months and over. The survey was designed to obtain nationally representative information on the health and nutritio...

  5. Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards.

    PubMed

    Bornstein, Marc H; Jager, Justin; Putnick, Diane L

    2013-12-01

    Sampling is a key feature of every study in developmental science. Although sampling has far-reaching implications, too little attention is paid to sampling. Here, we describe, discuss, and evaluate four prominent sampling strategies in developmental science: population-based probability sampling, convenience sampling, quota sampling, and homogeneous sampling. We then judge these sampling strategies by five criteria: whether they yield representative and generalizable estimates of a study's target population, whether they yield representative and generalizable estimates of subsamples within a study's target population, the recruitment efforts and costs they entail, whether they yield sufficient power to detect subsample differences, and whether they introduce "noise" related to variation in subsamples and whether that "noise" can be accounted for statistically. We use sample composition of gender, ethnicity, and socioeconomic status to illustrate and assess the four sampling strategies. Finally, we tally the use of the four sampling strategies in five prominent developmental science journals and make recommendations about best practices for sample selection and reporting.

  6. Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards

    PubMed Central

    Bornstein, Marc H.; Jager, Justin; Putnick, Diane L.

    2014-01-01

    Sampling is a key feature of every study in developmental science. Although sampling has far-reaching implications, too little attention is paid to sampling. Here, we describe, discuss, and evaluate four prominent sampling strategies in developmental science: population-based probability sampling, convenience sampling, quota sampling, and homogeneous sampling. We then judge these sampling strategies by five criteria: whether they yield representative and generalizable estimates of a study’s target population, whether they yield representative and generalizable estimates of subsamples within a study’s target population, the recruitment efforts and costs they entail, whether they yield sufficient power to detect subsample differences, and whether they introduce “noise” related to variation in subsamples and whether that “noise” can be accounted for statistically. We use sample composition of gender, ethnicity, and socioeconomic status to illustrate and assess the four sampling strategies. Finally, we tally the use of the four sampling strategies in five prominent developmental science journals and make recommendations about best practices for sample selection and reporting. PMID:25580049

  7. An efficient reliability algorithm for locating design point using the combination of importance sampling concepts and response surface method

    NASA Astrophysics Data System (ADS)

    Shayanfar, Mohsen Ali; Barkhordari, Mohammad Ali; Roudak, Mohammad Amin

    2017-06-01

    Monte Carlo simulation (MCS) is a useful tool for computation of probability of failure in reliability analysis. However, the large number of required random samples makes it time-consuming. Response surface method (RSM) is another common method in reliability analysis. Although RSM is widely used for its simplicity, it cannot be trusted in highly nonlinear problems due to its linear nature. In this paper, a new efficient algorithm, employing the combination of importance sampling, as a class of MCS, and RSM is proposed. In the proposed algorithm, the analysis starts with importance sampling concepts, using a proposed two-step rule for updating the design point. This part finishes after a small number of samples are generated. Then RSM starts to work using Bucher experimental design, with the last design point and a proposed effective length as the center point and radius of Bucher's approach, respectively. Through illustrative numerical examples, the simplicity and efficiency of the proposed algorithm and the effectiveness of the proposed rules are shown.
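
    The importance sampling half of such an algorithm can be sketched as follows; the limit-state function and design-point guess are invented for illustration. The failure probability P_f = P(g(X) < 0) is estimated by sampling from a density h centered near the design point and reweighting each draw by f/h.

      import numpy as np
      from scipy.stats import norm

      rng = np.random.default_rng(5)
      g = lambda x: 3.0 - x[:, 0] - x[:, 1]   # invented limit state in standard normal space

      center = np.array([1.5, 1.5])           # rough design-point guess on g = 0
      n = 20_000
      x = rng.normal(loc=center, scale=1.0, size=(n, 2))  # importance density h

      # Weight = f(x)/h(x) for standard-normal f and shifted-normal h, in log space.
      log_w = (norm.logpdf(x).sum(axis=1)
               - norm.logpdf(x, loc=center).sum(axis=1))
      pf = np.mean((g(x) < 0) * np.exp(log_w))

      print("IS estimate:", pf, " exact:", norm.cdf(-3 / np.sqrt(2)))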

  8. Quantifying seining detection probability for fishes of Great Plains sand‐bed rivers

    USGS Publications Warehouse

    Mollenhauer, Robert; Logue, Daniel R.; Brewer, Shannon K.

    2018-01-01

    Species detection error (i.e., imperfect and variable detection probability) is an essential consideration when investigators map distributions and interpret habitat associations. Sand-bed streams of the Great Plains represent a unique challenge for addressing fish detection error because of their highly variable instream environments. We quantified seining detection probability for diminutive Great Plains fishes across a range of sampling conditions in two sand-bed rivers in Oklahoma. Imperfect detection resulted in underestimates of species occurrence using naïve estimates, particularly for less common fishes. Seining detection probability also varied among fishes and across sampling conditions. We observed a quadratic relationship between water depth and detection probability, in which the exact nature of the relationship was species-specific and dependent on water clarity. Similarly, the direction of the relationship between water clarity and detection probability was species-specific and dependent on differences in water depth. The relationship between water temperature and detection probability was also species dependent, where both the magnitude and direction of the relationship varied among fishes. We showed how ignoring detection error confounded an underlying relationship between species occurrence and water depth. Despite imperfect and heterogeneous detection, our results indicate that species absence can be established with two to six spatially replicated seine hauls per 200-m reach under average sampling conditions; required effort would be higher under certain conditions, however. Detection probability was low for the Arkansas River Shiner Notropis girardi, which is federally listed as threatened, and more than 10 seine hauls per 200-m reach would be required to assess presence across sampling conditions. Our model allows scientists to estimate the sampling effort needed to confidently assess species occurrence, which maximizes the use of available resources. Wider adoption of approaches that consider detection error promotes ecological advances and better-informed conservation and management decisions.
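
    As a back-of-the-envelope companion to the effort estimates above, the arithmetic for spatially replicated sampling is simple: k independent hauls with per-haul detection probability p detect a present species at least once with probability 1 - (1 - p)^k. The sketch below inverts this for the required number of hauls; the per-haul probabilities are hypothetical, not the study's estimates.

```python
import math

def hauls_required(p_detect: float, confidence: float = 0.95) -> int:
    """Smallest k with 1 - (1 - p)^k >= confidence,
    i.e., k >= log(1 - confidence) / log(1 - p)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_detect))

# Hypothetical per-haul detection probabilities for three species:
for p in (0.40, 0.25, 0.10):
    print(f"p = {p:.2f}: {hauls_required(p)} hauls for 95% confidence")
```

    Under these assumed values the required effort spans the same few-to-more-than-ten range the abstract reports for easy- versus hard-to-detect species.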

  9. Bipartite discrimination of independently prepared quantum states as a counterexample to a parallel repetition conjecture

    NASA Astrophysics Data System (ADS)

    Akibue, Seiseki; Kato, Go

    2018-04-01

    For distinguishing quantum states sampled from a fixed ensemble, the gap in bipartite and single-party distinguishability can be interpreted as a nonlocality of the ensemble. In this paper, we consider bipartite state discrimination in a composite system consisting of N subsystems, where each subsystem is shared between two parties and the state of each subsystem is randomly sampled from a particular ensemble comprising the Bell states. We show that the success probability of perfectly identifying the state converges to 1 as N → ∞ if the entropy of the probability distribution associated with the ensemble is less than 1, even if the success probability is less than 1 for any finite N. In other words, the nonlocality of the N-fold ensemble asymptotically disappears if the probability distribution associated with each ensemble is concentrated. Furthermore, we show that the disappearance of the nonlocality can be regarded as a remarkable counterexample to a fundamental open question in theoretical computer science, called a parallel repetition conjecture of interactive games with two classically communicating players. Measurements for the discrimination task include a projective measurement of one party represented by stabilizer states, which enables the other party to perfectly distinguish states that are sampled with high probability.

  10. On the quantification and efficient propagation of imprecise probabilities resulting from small datasets

    NASA Astrophysics Data System (ADS)

    Zhang, Jiaxin; Shields, Michael D.

    2018-01-01

    This paper addresses the problem of uncertainty quantification and propagation when data for characterizing probability distributions are scarce. We propose a methodology wherein the full uncertainty associated with probability model form and parameter estimation is retained and efficiently propagated. This is achieved by applying the information-theoretic multimodel inference method to identify plausible candidate probability densities and the associated probabilities that each model is the best in the Kullback-Leibler sense. The joint parameter densities for each plausible model are then estimated using Bayes' rule. We then propagate this full set of probability models by estimating an optimal importance sampling density that is representative of all plausible models, propagating this density, and reweighting the samples according to each of the candidate probability models. This is in contrast with conventional methods that try to identify a single probability model encapsulating the full uncertainty caused by lack of data, and that consequently underestimate uncertainty. The result is a complete probabilistic description of both aleatory and epistemic uncertainty achieved with several orders of magnitude reduction in computational cost. It is shown how the model can be updated to adaptively accommodate added data and added candidate probability models. The method is applied to uncertainty analysis of plate buckling strength, where it is demonstrated how dataset size affects the confidence (or lack thereof) we can place in statistical estimates of response when data are lacking.
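
    A minimal sketch of the reweighting idea, with stand-in candidate models and a toy response rather than the paper's plate-buckling application: draw one sample set from a mixture importance density spanning the fitted candidates, then reuse it for every model by attaching per-model importance weights.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.gamma(shape=5.0, scale=2.0, size=20)        # scarce dataset

# Two plausible candidate models fitted to the same small dataset.
models = [stats.gamma(*stats.gamma.fit(data, floc=0)),
          stats.lognorm(*stats.lognorm.fit(data, floc=0))]

# Importance density representative of all candidates: an equal mixture.
n = 10_000
comp = rng.integers(0, 2, size=n)
x = np.where(comp == 0,
             models[0].rvs(size=n, random_state=rng),
             models[1].rvs(size=n, random_state=rng))
q = 0.5 * (models[0].pdf(x) + models[1].pdf(x))        # mixture density

response = np.sqrt(x)                                  # toy response model
for name, m in zip(("gamma", "lognormal"), models):
    w = m.pdf(x) / q                                   # per-model weights
    print(name, "mean response:", np.average(response, weights=w))
```

    The expensive response evaluation happens once; only the cheap reweighting step is repeated per candidate model, which is the source of the claimed cost reduction.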

  11. Technology Development Risk Assessment for Space Transportation Systems

    NASA Technical Reports Server (NTRS)

    Mathias, Donovan L.; Godsell, Aga M.; Go, Susie

    2006-01-01

    A new approach for assessing development risk associated with technology development projects is presented. The method represents technology evolution in terms of sector-specific discrete development stages. A Monte Carlo simulation is used to generate development probability distributions based on statistical models of the discrete transitions. Development risk is derived from the resulting probability distributions and specific program requirements. Two sample cases are discussed to illustrate the approach, a single rocket engine development and a three-technology space transportation portfolio.
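
    A sketch of the simulation pattern described, discrete stages with statistically modeled transitions; the stage probabilities and need date below are illustrative, not the paper's sector data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed annual probability of advancing past each development stage.
stage_p = [0.6, 0.45, 0.3, 0.5]

def years_to_mature(max_years: int = 40) -> float:
    """Simulate years until all discrete development stages are passed."""
    year, stage = 0, 0
    while stage < len(stage_p) and year < max_years:
        year += 1
        if rng.random() < stage_p[stage]:
            stage += 1
    return year if stage == len(stage_p) else float("inf")

sims = np.array([years_to_mature() for _ in range(10_000)])
# Development risk relative to a hypothetical 8-year program need date:
print("P(technology not ready within 8 years):", np.mean(sims > 8))
```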

  12. A new estimator of the discovery probability.

    PubMed

    Favaro, Stefano; Lijoi, Antonio; Prünster, Igor

    2012-12-01

    Species sampling problems have a long history in ecological and biological studies, where a number of issues, including the evaluation of species richness, the design of sampling experiments, and the estimation of rare species variety, need to be addressed. Such inferential problems have recently emerged in genomic applications as well, where they exhibit some peculiar features that make them more challenging: specifically, one has to deal with very large populations (genomic libraries) containing a huge number of distinct species (genes), of which only a small portion has been sampled (sequenced). These aspects motivate the Bayesian nonparametric approach we undertake, since it allows one to achieve the degree of flexibility typically needed in this framework. Based on an observed sample of size n, focus will be on prediction of a key aspect of the outcome from an additional sample of size m, namely, the so-called discovery probability. In particular, conditionally on an observed basic sample of size n, we derive a novel estimator of the probability of detecting, at the (n+m+1)th observation, species that have been observed with any given frequency in the enlarged sample of size n+m. Such an estimator admits a closed-form expression that can be exactly evaluated. The result we obtain allows us to quantify both the rate at which rare species are detected and the achieved sample coverage of abundant species as m increases. Natural applications are represented by the estimation of the probability of discovering rare genes within genomic libraries, and the results are illustrated by means of two expressed sequence tags datasets. © 2012, The International Biometric Society.

  13. Disentangling the relationship between child maltreatment and violent delinquency: using a nationally representative sample.

    PubMed

    Yun, Ilhong; Ball, Jeremy D; Lim, Hyeyoung

    2011-01-01

    This study uses the National Longitudinal Study of Adolescent Health (Add Health) data, a nationally representative sample of adolescents, to disentangle the relationship between child maltreatment and violent delinquency. Also examined are potential moderating effects of gender, socioeconomic status (SES), and religiosity on the association between child maltreatment and violent delinquency. Contrary to prior research findings, the current analyses reveal that physical abuse is not associated with future violent delinquency, whereas sexual abuse and neglect predict violent delinquency significantly. The current study also did not reveal any moderating effects of gender, SES, or religiosity on the association between maltreatment and violent delinquency. Interpretations of these findings are presented, drawing on the properties of the national probability sample in contrast to the localized samples used in most prior studies.

  14. Propagating Mixed Uncertainties in Cyber Attacker Payoffs: Exploration of Two-Phase Monte Carlo Sampling and Probability Bounds Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chatterjee, Samrat; Tipireddy, Ramakrishna; Oster, Matthew R.

    Securing cyber-systems on a continual basis against a multitude of adverse events is a challenging undertaking. Game-theoretic approaches, which model the actions of strategic decision-makers, are increasingly being applied to address cybersecurity resource allocation challenges. Such game-based models account for multiple player actions and represent cyber attacker payoffs mostly as point utility estimates. Since a cyber-attacker’s payoff generation mechanism is largely unknown, appropriate representation and propagation of uncertainty is a critical task. In this paper we expand on prior work and focus on operationalizing the probabilistic uncertainty quantification framework, for a notional cyber system, through: 1) representation of uncertain attacker and system-related modeling variables as probability distributions and mathematical intervals, and 2) exploration of uncertainty propagation techniques including two-phase Monte Carlo sampling and probability bounds analysis.

  15. Archaeal communities of Arctic methane-containing permafrost.

    PubMed

    Shcherbakova, Victoria; Yoshimura, Yoshitaka; Ryzhmanova, Yana; Taguchi, Yukihiro; Segawa, Takahiro; Oshurkova, Victoria; Rivkina, Elizaveta

    2016-10-01

    In the present study, we used culture-independent methods to investigate the diversity of methanogenic archaea and their distribution in five permafrost samples collected from a borehole in the Kolyma River Lowland (north-east of Russia). Total DNA was extracted from methane-containing permafrost samples of different age and amplified by PCR. The resulting DNA fragments were cloned. Phylogenetic analysis of the sequences showed the presence of archaea in all studied samples; 60%-95% of sequences belonged to the Euryarchaeota. Methanogenic archaea were novel representatives of Methanosarcinales, Methanomicrobiales, Methanobacteriales and Methanocellales orders. Bathyarchaeota (Miscellaneous Crenarchaeota Group) representatives were found among nonmethanogenic archaea in all the samples studied. The Thaumarchaeota representatives were not found in the upper sample, whereas Woesearchaeota (formerly DHVEG-6) were found in the three deepest samples. Unexpectedly, the greatest diversity of archaea was observed at a depth of 22.3 m, probably due to the availability of the labile organic carbon and/or due to the migration of the microbial cells during the freezing front towards the bottom. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Advances in statistics

    Treesearch

    Howard Stauffer; Nadav Nur

    2005-01-01

    The papers included in the Advances in Statistics section of the Partners in Flight (PIF) 2002 Proceedings represent a small sample of statistical topics of current importance to Partners In Flight research scientists: hierarchical modeling, estimation of detection probabilities, and Bayesian applications. Sauer et al. (this volume) examines a hierarchical model...

  17. 75 FR 1415 - Submission for OMB Review: Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-11

    ... Department of Labor--Bureau of Labor Statistics (BLS), Office of Management and Budget, Room 10235... Statistics. Type of Review: Revision of a currently approved collection. Title of Collection: The Consumer... sector. The data are collected from a national probability sample of households designed to represent the...

  18. INDOOR, OUTDOOR, AND PERSONAL AIR EXPOSURES TO PARTICLES, ELEMENTS, AND NICOTINE FOR 178 RESIDENTS OF RIVERSIDE, CALIFORNIA

    EPA Science Inventory

    Personal, indoor, and outdoor concentrations of inhalable particles and 15 elements were measured for a probability sample of 178 persons representing 139,000 nonsmoking residents of Riverside, California. Newly designed personal monitors were employed. Personal exposures often exc...

  19. Flood Frequency Curves - Use of information on the likelihood of extreme floods

    NASA Astrophysics Data System (ADS)

    Faber, B.

    2011-12-01

    Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in that area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a log-Pearson Type III distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate distribution parameters. The procedure makes the assumptions that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How do changes in the watershed or climate over time (non-stationarity) affect the probability distribution of floods? Potential approaches to avoid these assumptions vary from estimating trend and shift and removing them from early data (thereby forming a homogeneous data set), to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically based analysis produces a "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempts to describe an upper bound on flood magnitude in a particular watershed. This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.
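
    The method-of-moments fit described reduces to three sample statistics of the log flows: their mean, standard deviation, and skew. A minimal sketch with fabricated peak flows, assuming scipy's pearson3 parameterization in which loc and scale play the role of the mean and standard deviation:

```python
import numpy as np
from scipy import stats

# Fabricated annual peak flows (cfs); a real study uses a long
# unimpaired gaged record.
peaks = np.array([3200., 4100., 2800., 9800., 5100., 4400., 7600.,
                  3900., 12100., 6200., 2900., 5600., 8800., 4700.])

logq = np.log10(peaks)
mu, sigma = logq.mean(), logq.std(ddof=1)   # method-of-moments estimates
g = stats.skew(logq, bias=False)            # station skew of the log flows

lp3 = stats.pearson3(g, loc=mu, scale=sigma)
for T in (2, 10, 100):                      # return period in years
    print(f"{T:>3}-yr flood: {10 ** lp3.ppf(1.0 - 1.0 / T):,.0f} cfs")
```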

  20. Method of identifying clusters representing statistical dependencies in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    Approach is first to cluster and then to compute spatial boundaries for resulting clusters. Next step is to compute, from set of Monte Carlo samples obtained from scrambled data, estimates of probabilities of obtaining at least as many points within boundaries as were actually observed in original data.

  1. ARACNe-AP: Gene Network Reverse Engineering through Adaptive Partitioning inference of Mutual Information. | Office of Cancer Genomics

    Cancer.gov

    The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space.

  2. Rare event simulation in radiation transport

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kollman, Craig

    1993-10-01

    This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high-dimensional state spaces and irregular geometries, so analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded, the probability of any one particular particle getting through is very small. Because of the large number of particles involved, even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well-known technique for improving the efficiency of rare event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiplied by the likelihood ratio between the true and simulated probabilities so as to keep the estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities is chosen. It is shown that a zero-variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive "learning" algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give, with probability one, a sequence of estimates converging exponentially fast to the true solution.
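
    The importance-sampling mechanics summarized above, simulate under a tilted distribution and multiply by the likelihood ratio, can be shown in a few lines on a deliberately simple stand-in problem (a Gaussian tail probability rather than neutron transport):

```python
import numpy as np

rng = np.random.default_rng(0)
a, n = 5.0, 100_000           # P(X > 5) for X ~ N(0, 1) is about 2.87e-7

# Naive Monte Carlo: virtually no sample ever hits the rare event.
naive = (rng.standard_normal(n) > a).mean()

# Importance sampling: simulate from N(a, 1) so the event is common, and
# unbias with the likelihood ratio phi(x)/phi(x - a) = exp(-a*x + a^2/2).
y = rng.standard_normal(n) + a
est = np.mean((y > a) * np.exp(-a * y + 0.5 * a * a))

print("naive estimate:", naive)         # typically exactly 0.0
print("importance sampling:", est)      # close to 2.87e-7
```

    Shifting the mean to the threshold is a reasonable tilt here; as the abstract notes, a poorly chosen set of simulated probabilities can make the weighted estimator's variance far worse.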

  3. Total coliform and E. coli in public water systems using undisinfected ground water in the United States.

    PubMed

    Messner, Michael J; Berger, Philip; Javier, Julie

    2017-06-01

    Public water systems (PWSs) in the United States generate total coliform (TC) and Escherichia coli (EC) monitoring data, as required by the Total Coliform Rule (TCR). We analyzed data generated in 2011 by approximately 38,000 small (serving fewer than 4101 individuals) undisinfected public water systems (PWSs). We used statistical modeling to characterize a distribution of TC detection probabilities for each of nine groupings of PWSs based on system type (community, non-transient non-community, and transient non-community) and population served (less than 101, 101-1000 and 1001-4100 people). We found that among PWS types sampled in 2011, on average, undisinfected transient PWSs test positive for TC 4.3% of the time as compared with 3% for undisinfected non-transient PWSs and 2.5% for undisinfected community PWSs. Within each type of PWS, the smaller systems have higher median TC detection than the larger systems. All TC-positive samples were assayed for EC. Among TC-positive samples from small undisinfected PWSs, EC is detected in about 5% of samples, regardless of PWS type or size. We evaluated the upper tail of the TC detection probability distributions and found that significant percentages of some system types have high TC detection probabilities. For example, assuming the systems providing data are nationally-representative, then 5.0% of the ∼50,000 small undisinfected transient PWSs in the U.S. have TC detection probabilities of 20% or more. Communities with such high TC detection probabilities may have elevated risk of acute gastrointestinal (AGI) illness - perhaps as great or greater than the attributable risk to drinking water (6-22%) calculated for 14 Wisconsin community PWSs with much lower TC detection probabilities (about 2.3%, Borchardt et al., 2012). Published by Elsevier GmbH.

  4. Translational Genomics Research Institute: Identification of Pathways Enriched with Condition-Specific Statistical Dependencies Across Four Subtypes of Glioblastoma Multiforme | Office of Cancer Genomics

    Cancer.gov

    Evaluation of Differential DependencY (EDDY) is a statistical test for the differential dependency relationship of a set of genes between two given conditions. For each condition, possible dependency network structures are enumerated and their likelihoods are computed to represent a probability distribution of dependency networks. The difference between the probability distributions of dependency networks is computed between conditions, and its statistical significance is evaluated with random permutations of condition labels on the samples.  

  5. Translational Genomics Research Institute (TGen): Identification of Pathways Enriched with Condition-Specific Statistical Dependencies Across Four Subtypes of Glioblastoma Multiforme | Office of Cancer Genomics

    Cancer.gov

    Evaluation of Differential DependencY (EDDY) is a statistical test for the differential dependency relationship of a set of genes between two given conditions. For each condition, possible dependency network structures are enumerated and their likelihoods are computed to represent a probability distribution of dependency networks. The difference between the probability distributions of dependency networks is computed between conditions, and its statistical significance is evaluated with random permutations of condition labels on the samples.  
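
    A stripped-down analog of the permutation scheme both records describe, using the difference between condition-wise correlation matrices as a crude stand-in for EDDY's divergence between dependency-network distributions (toy data, not the glioblastoma subtypes):

```python
import numpy as np

rng = np.random.default_rng(42)

def dependency_stat(x_a, x_b):
    """Frobenius norm of the difference between the two conditions'
    gene-gene correlation matrices (a simplified dependency divergence)."""
    return np.linalg.norm(np.corrcoef(x_a, rowvar=False)
                          - np.corrcoef(x_b, rowvar=False))

# Toy expression matrices: 30 samples x 5 genes per condition.
cond_a = rng.standard_normal((30, 5))
cond_b = rng.standard_normal((30, 5))
cond_b[:, 1] += 0.8 * cond_b[:, 0]       # extra dependency in condition B

observed = dependency_stat(cond_a, cond_b)
pooled = np.vstack([cond_a, cond_b])

n_perm, exceed = 2000, 0
for _ in range(n_perm):
    perm = rng.permutation(pooled.shape[0])   # shuffle condition labels
    exceed += dependency_stat(pooled[perm[:30]], pooled[perm[30:]]) >= observed

print("permutation p-value:", (exceed + 1) / (n_perm + 1))
```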

  6. Statistical analysis of general aviation VG-VGH data

    NASA Technical Reports Server (NTRS)

    Clay, L. E.; Dickey, R. L.; Moran, M. S.; Payauys, K. W.; Severyn, T. P.

    1974-01-01

    To represent the loads spectra of general aviation aircraft operating in the Continental United States, VG and VGH data collected since 1963 in eight operational categories were processed and analyzed. Adequacy of the data sample and current operational categories, and the parameter distributions required for valid data extrapolation, were studied along with envelopes of equal probability of exceeding the normal load factor (n_z) versus airspeed for gust and maneuver loads, and the probability of exceeding current design maneuver, gust, and landing impact n_z limits. The significant findings are included.

  7. Collision rates and impact velocities in the Main Asteroid Belt

    NASA Technical Reports Server (NTRS)

    Farinella, Paolo; Davis, Donald R.

    1992-01-01

    Wetherill's (1967) algorithm is presently used to compute the mutual collision probabilities and impact velocities of a set of 682 asteroids with larger-than-50-km radii, representative of a bias-free sample of asteroid orbits. While collision probabilities are nearly independent of eccentricities, a significant decrease is associated with larger inclinations. Collisional velocities grow steeply with orbital eccentricity and inclination, but with curiously small variation across the asteroid belt. Family asteroids are noted to undergo collisions with other family members 2-3 times more often than with nonmembers.

  8. Fatality Analysis Reporting System, General Estimates System: 2001 Data Summary.

    ERIC Educational Resources Information Center

    2003

    The Fatality Analysis Reporting System (FARS), which became operational in 1975, contains data on a census of fatal traffic crashes within the 50 states, the District of Columbia, and Puerto Rico. The General Estimates System (GES), which began in 1988, provides data from a nationally representative probability sample selected from all…

  9. Height and Weight of Children: United States.

    ERIC Educational Resources Information Center

    Hamill, Peter V. V.; And Others

    This report contains national estimates based on findings from the Health Examination Survey in 1963-65 on height and weight measurements of children 6- to 11-years-old. A nationwide probability sample of 7,119 children was selected to represent the noninstitutionalized children (about 24 million) in this age group. Height was obtained in stocking…

  10. Regolith in the South Pole-Aitken Basin is Mainly Indigenous Material

    NASA Technical Reports Server (NTRS)

    Haskin, L. A.; Gillis, J. J.; Jolliff, B. L.; Korotev, R. L.

    2003-01-01

    This abstract is concerned with the probability that a mission to a site within the South Pole-Aitken basin (SPA) would yield a meaningful sample of typical SPA floor material. The probability seems favorable, barring a highly atypical landing site, because the chemical composition of the SPA interior, as determined remotely from orbit, is different from that of the surrounding lunar surface. How representative would the sample be? To what extent have lateral transport or later events compromised the original chemical and mineralogical composition of the floor material? Where or in what kind of deposit should the mission land to provide the best example? We address these questions from the point of view of modeling of impact ejecta deposits. SPA is the largest lunar impact basin. Shallow for its diameter, it has a subdued gravity signature, a lower albedo, and a more Th- and Fe-rich interior than the surrounding highlands (the Feldspathic Highlands Terrane, FHT). Its floor may represent noritic or perhaps (but less abundant) gabbroic lower crust of the FHT, the upper crust stripped away by the basin-forming impact, possibly an oblique one.

  11. From the field: Efficacy of detecting Chronic Wasting Disease via sampling hunter-killed white-tailed deer

    USGS Publications Warehouse

    Diefenbach, D.R.; Rosenberry, C.S.; Boyd, Robert C.

    2004-01-01

    Surveillance programs for Chronic Wasting Disease (CWD) in free-ranging cervids often use a standard of being able to detect 1% prevalence when determining minimum sample sizes. However, 1% prevalence may represent >10,000 infected animals in a population of 1 million, and most wildlife managers would prefer to detect the presence of CWD when far fewer infected animals exist. We wanted to detect the presence of CWD in white-tailed deer (Odocoileus virginianus) in Pennsylvania when the disease was present in only 1 of 21 wildlife management units (WMUs) statewide. We used computer simulation to estimate the probability of detecting CWD based on a sampling design to detect the presence of CWD at 0.1% and 1.0% prevalence (23-76 and 225-762 infected deer, respectively) using tissue samples collected from hunter-killed deer. The probability of detection at 0.1% prevalence was <30% with sample sizes of ≤6,000 deer, and the probability of detection at 1.0% prevalence was 46-72% with statewide sample sizes of 2,000-6,000 deer. We believe that testing of hunter-killed deer is an essential part of any surveillance program for CWD, but our results demonstrated the importance of a multifaceted surveillance approach for CWD detection rather than sole reliance on testing hunter-killed deer.
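
    The idealized baseline for this kind of power calculation is binomial: with n sampled deer, prevalence p, and test sensitivity s, at least one infected animal is detected with probability 1 - (1 - p*s)^n. The sketch below shows only that baseline; it is deliberately optimistic compared with the paper's simulations, which account for the disease being confined to 1 of 21 WMUs and for the spatial clustering of hunter-killed samples.

```python
def p_detect(n: int, prevalence: float, sensitivity: float = 1.0) -> float:
    """Probability of >= 1 detected infection under simple random
    sampling from a large population (binomial approximation)."""
    return 1.0 - (1.0 - prevalence * sensitivity) ** n

for prev in (0.001, 0.01):
    for n in (2000, 6000):
        print(f"prevalence {prev:.1%}, n = {n}: P = {p_detect(n, prev):.3f}")
```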

  12. Volatile element chemistry of selected lunar, meteoritic, and terrestrial samples

    NASA Technical Reports Server (NTRS)

    Simoneit, B. R.; Christiansen, P. C.; Burlingame, A. L.

    1973-01-01

    Using vacuum pyrolysis and high resolution mass spectrometry, a study is made of the gas release patterns of representative lunar samples, meteorites, terrestrial samples, and synthetic samples doped with various sources of carbon and nitrogen. The pyrolytic gas evolution patterns were intercorrelated, allowing an assessment of the possible sources of the volatilizable material in the lunar samples to be made. Lightly surface adsorbed species and more strongly chemisorbed species are released from ambient to 300 C and from 300 to 500 C, respectively. The low-temperature volatiles (less than 500 C) derived from various chondrites correlate well with the gas evolution patterns of volatile-rich samples, as for example 74220 and 61221. Solar wind entrapped species, and molecules derived from reactions probably at the grain surfaces, are evolved from about 500 to 700 C. Solar wind implanted C, N, and S species are generated from 750 to 1150 C, probably by reaction with the mineral matrix during the annealing process. Possible indigenous and/or refractory carbide, nitride, and sulfide C, N, and S are released in the region from 1200 C to fusion.

  13. Prevalence, risk, and correlates of posttraumatic stress disorder across ethnic and racial minority groups in the United States.

    PubMed

    Alegría, Margarita; Fortuna, Lisa R; Lin, Julia Y; Norris, Fran H; Gao, Shan; Takeuchi, David T; Jackson, James S; Shrout, Patrick E; Valentine, Anne

    2013-12-01

    We assess whether posttraumatic stress disorder (PTSD) varies in prevalence, diagnostic criteria endorsement, and type and frequency of potentially traumatic events (PTEs) among a nationally representative US sample of 5071 non-Latino whites, 3264 Latinos, 2178 Asians, 4249 African Americans, and 1476 Afro-Caribbeans. PTSD and other psychiatric disorders were evaluated using the World Mental Health-Composite International Diagnostic Interview (WMH-CIDI) in a national household sample that oversampled ethnic/racial minorities (n=16,238) but was weighted to produce results representative of the general population. Asians have lower prevalence rates of probable lifetime PTSD, whereas African Americans have higher rates as compared with non-Latino whites, even after adjusting for type and number of exposures to traumatic events, and for sociodemographic, clinical, and social support factors. Afro-Caribbeans and Latinos seem to demonstrate similar risk to non-Latino whites, adjusting for these same covariates. Higher rates of probable PTSD exhibited by African Americans and lower rates for Asians, as compared with non-Latino whites, do not appear related to differential symptom endorsement, differences in risk or protective factors, or differences in types and frequencies of PTEs across groups. There appears to be marked differences in conditional risk of probable PTSD across ethnic/racial groups. Questions remain about what explains risk of probable PTSD. Several factors that might account for these differences are discussed, as well as the clinical implications of our findings. Uncertainty of the PTSD diagnostic assessment for Latinos and Asians requires further evaluation.

  14. Representing Learning With Graphical Models

    NASA Technical Reports Server (NTRS)

    Buntine, Wray L.; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    Probabilistic graphical models are being used widely in artificial intelligence, for instance, in diagnosis and expert systems, as a unified qualitative and quantitative framework for representing and reasoning with probabilities and independencies. Their development and use spans several fields including artificial intelligence, decision theory and statistics, and provides an important bridge between these communities. This paper shows by way of example that these models can be extended to machine learning, neural networks and knowledge discovery by representing the notion of a sample on the graphical model. Not only does this allow a flexible variety of learning problems to be represented, it also provides the means for representing the goal of learning and opens the way for the automatic development of learning algorithms from specifications.

  15. The contribution of threat probability estimates to reexperiencing symptoms: a prospective analog study.

    PubMed

    Regambal, Marci J; Alden, Lynn E

    2012-09-01

    Individuals with posttraumatic stress disorder (PTSD) are hypothesized to have a "sense of current threat." Perceived threat from the environment (i.e., external threat), can lead to overestimating the probability of the traumatic event reoccurring (Ehlers & Clark, 2000). However, it is unclear if external threat judgments are a pre-existing vulnerability for PTSD or a consequence of trauma exposure. We used trauma analog methodology to prospectively measure probability estimates of a traumatic event, and investigate how these estimates were related to cognitive processes implicated in PTSD development. 151 participants estimated the probability of being in car-accident related situations, watched a movie of a car accident victim, and then completed a measure of data-driven processing during the movie. One week later, participants re-estimated the probabilities, and completed measures of reexperiencing symptoms and symptom appraisals/reactions. Path analysis revealed that higher pre-existing probability estimates predicted greater data-driven processing which was associated with negative appraisals and responses to intrusions. Furthermore, lower pre-existing probability estimates and negative responses to intrusions were both associated with a greater change in probability estimates. Reexperiencing symptoms were predicted by negative responses to intrusions and, to a lesser degree, by greater changes in probability estimates. The undergraduate student sample may not be representative of the general public. The reexperiencing symptoms are less severe than what would be found in a trauma sample. Threat estimates present both a vulnerability and a consequence of exposure to a distressing event. Furthermore, changes in these estimates are associated with cognitive processes implicated in PTSD. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Prevalence and correlates of depression among new U.S. immigrants.

    PubMed

    Wong, Eunice C; Miles, Jeremy N V

    2014-06-01

    Although immigrants comprise one of the fastest growing segments of society, information on their adjustment to life in the US remains limited. The present study examined the prevalence of depression and associated correlates among a national sample of immigrants newly admitted to legal permanent residence to the US. Data were derived from the baseline adult cohort of the New Immigrant Survey, a national representative sample of immigrants who had obtained legal permanent residence between May and November 2003. Approximately 3% of respondents met criteria for probable depression in the past 12 months. Respondents who were female, younger in age, in the US for a longer period of time, and exposed to political violence in their country of origin were more likely to meet criteria for probable depression. Both pre-immigration and resettlement related factors were associated with probable depression. Further research is needed to better understand how processes in the country of origin and in the resettlement country influence the adjustment of immigrants.

  17. Palaeodemography of the Atapuerca-SH Middle Pleistocene hominid sample.

    PubMed

    Bermúdez de Castro, J M; Nicolás, M E

    1997-01-01

    We report here on the palaeodemographic analysis of the hominid sample recovered to date from the Sima de los Huesos (SH) Middle Pleistocene cave site in the Sierra de Atapuerca (Burgos, Spain). The analysis of the mandibular, maxillary, and dental remains has made it possible to estimate that a minimum of 32 individuals, who probably belonged to the same biological population, are represented in the current SH human hypodigm. The remains of nine individuals are assigned to males and nine to females, suggesting that a 1:1 sex ratio characterizes this hominid sample. The survivorship curve shows a low representation of infants and children, a high mortality among adolescents and prime-age adults, and a low older-adult mortality. Longevity was probably no greater than 40 years. This mortality pattern (elevated among adolescents and prime-age adults), which in some aspects resembles that observed in Neandertals, is quite different from those reported for recent foraging human groups. The adult age-at-death distribution of the SH hominid sample appears to be neither the consequence of underaging the older adults, nor of differential preservation or recognition of skeletal remains. Thus, if we accept that they had a life history pattern similar to that of modern humans, there would appear to be a clear contradiction between the demographic distribution and the demographic viability of the population represented by the SH hominid fossils. The possible representational bias of the SH hominid sample, as well as some aspects of the reproductive biology of the Pleistocene populations, are also discussed.

  18. Lognormal Approximations of Fault Tree Uncertainty Distributions.

    PubMed

    El-Shanawany, Ashraf Ben; Ardron, Keith H; Walker, Simon P

    2018-01-26

    Fault trees are used in reliability modeling to create logical models of fault combinations that can lead to undesirable events. The output of a fault tree analysis (the top event probability) is expressed in terms of the failure probabilities of basic events that are input to the model. Typically, the basic event probabilities are not known exactly, but are modeled as probability distributions: therefore, the top event probability is also represented as an uncertainty distribution. Monte Carlo methods are generally used for evaluating the uncertainty distribution, but such calculations are computationally intensive and do not readily reveal the dominant contributors to the uncertainty. In this article, a closed-form approximation for the fault tree top event uncertainty distribution is developed, which is applicable when the uncertainties in the basic events of the model are lognormally distributed. The results of the approximate method are compared with results from two sampling-based methods: namely, the Monte Carlo method and the Wilks method based on order statistics. It is shown that the closed-form expression can provide a reasonable approximation to results obtained by Monte Carlo sampling, without incurring the computational expense. The Wilks method is found to be a useful means of providing an upper bound for the percentiles of the uncertainty distribution while being computationally inexpensive compared with full Monte Carlo sampling. The lognormal approximation method and Wilks's method appear attractive, practical alternatives for the evaluation of uncertainty in the output of fault trees and similar multilinear models. © 2018 Society for Risk Analysis.
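
    A compact sketch of the two sampling-based comparators on a toy fault tree; the gate structure and lognormal parameters are invented for illustration. Full Monte Carlo estimates the 95th percentile of the top event probability, while the Wilks order-statistic bound needs only 59 runs because 0.95**59 < 0.05.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy fault tree: TOP = (A AND B) OR C, with lognormal uncertainty on
# each basic-event probability (parameters are illustrative only).
def top_event(n):
    a = rng.lognormal(mean=np.log(1e-3), sigma=0.5, size=n)
    b = rng.lognormal(mean=np.log(2e-2), sigma=0.7, size=n)
    c = rng.lognormal(mean=np.log(5e-5), sigma=0.4, size=n)
    return a * b + c - a * b * c       # OR of independent events

# Full Monte Carlo estimate of the 95th percentile of the top event.
mc = np.percentile(top_event(100_000), 95)

# Wilks one-sided bound: with n = 59 runs, the sample maximum exceeds
# the true 95th percentile with 95% confidence.
wilks = top_event(59).max()

print(f"MC 95th percentile: {mc:.3e}")
print(f"Wilks 95/95 bound:  {wilks:.3e}")
```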

  19. Application of advanced sampling and analysis methods to predict the structure of adsorbed protein on a material surface

    PubMed Central

    Abramyan, Tigran M.; Hyde-Volpe, David L.; Stuart, Steven J.; Latour, Robert A.

    2017-01-01

    The use of standard molecular dynamics simulation methods to predict the interactions of a protein with a material surface has the inherent limitations of being unable to determine the most likely conformations and orientations of the adsorbed protein on the surface or the level of convergence attained by the simulation. In addition, standard mixing rules are typically applied to combine the nonbonded force field parameters of the solution and solid phases of the system to represent interfacial behavior, without validation. As a means to circumvent these problems, the authors demonstrate the application of an efficient advanced sampling method (TIGER2A) for the simulation of the adsorption of hen egg-white lysozyme on a crystalline (110) high-density polyethylene surface plane. Simulations are conducted to generate a Boltzmann-weighted ensemble of sampled states using force field parameters that were validated to represent interfacial behavior for this system. The resulting ensembles of sampled states were then analyzed using an in-house-developed cluster analysis method to predict the most probable orientations and conformations of the protein on the surface based on the amount of sampling performed, from which free energy differences between the adsorbed states could be calculated. In addition, by conducting two independent sets of TIGER2A simulations combined with cluster analyses, the authors demonstrate a method to estimate the degree of convergence achieved for a given amount of sampling. The results from these simulations demonstrate that these methods enable the most probable orientations and conformations of an adsorbed protein to be predicted, and that the use of our validated interfacial force field parameter set provides closer agreement with available experimental results than standard CHARMM force field parameterization for representing molecular behavior at the interface. PMID:28514864

  20. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures.

    PubMed

    Scheid, Anika; Nebel, Markus E

    2012-07-09

    Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.

  1. Evaluating the effect of disturbed ensemble distributions on SCFG based statistical sampling of RNA secondary structures

    PubMed Central

    2012-01-01

    Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037
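
    The distinction these experiments turn on, absolute versus relative disturbance of the sampling probabilities, is easy to see on a toy rule distribution: an absolute error comparable to the smallest probabilities can reorder or erase them, while a relative error of the same nominal size preserves their magnitudes. A minimal sketch, not the paper's SCFG machinery:

```python
import numpy as np

rng = np.random.default_rng(5)

def perturb(probs, eps, mode):
    """Disturb sampling probabilities with absolute or relative errors,
    then renormalize (a simplified analog of the paper's experiments)."""
    p = np.asarray(probs, dtype=float)
    if mode == "absolute":
        p = np.clip(p + rng.uniform(-eps, eps, p.size), 1e-12, None)
    else:                                # relative
        p = p * (1.0 + rng.uniform(-eps, eps, p.size))
    return p / p.sum()

exact = np.array([0.70, 0.20, 0.05, 0.05])   # toy rule probabilities
print(perturb(exact, 0.05, "absolute"))      # small entries badly distorted
print(perturb(exact, 0.05, "relative"))      # all entries within ~5%
```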

  2. Men who have sex with men in Great Britain: comparing methods and estimates from probability and convenience sample surveys

    PubMed Central

    Prah, Philip; Hickson, Ford; Bonell, Chris; McDaid, Lisa M; Johnson, Anne M; Wayal, Sonali; Clifton, Soazig; Sonnenberg, Pam; Nardone, Anthony; Erens, Bob; Copas, Andrew J; Riddell, Julie; Weatherburn, Peter; Mercer, Catherine H

    2016-01-01

    Objective To examine sociodemographic and behavioural differences between men who have sex with men (MSM) participating in recent UK convenience surveys and a national probability sample survey. Methods We compared 148 MSM aged 18–64 years interviewed for Britain's third National Survey of Sexual Attitudes and Lifestyles (Natsal-3) undertaken in 2010–2012, with men in the same age range participating in contemporaneous convenience surveys of MSM: 15 500 British resident men in the European MSM Internet Survey (EMIS); 797 in the London Gay Men's Sexual Health Survey; and 1234 in Scotland's Gay Men's Sexual Health Survey. Analyses compared men reporting at least one male sexual partner (past year) on similarly worded questions and multivariable analyses accounted for sociodemographic differences between the surveys. Results MSM in convenience surveys were younger and better educated than MSM in Natsal-3, and a larger proportion identified as gay (85%–95% vs 62%). Partner numbers were higher and same-sex anal sex more common in convenience surveys. Unprotected anal intercourse was more commonly reported in EMIS. Compared with Natsal-3, MSM in convenience surveys were more likely to report gonorrhoea diagnoses and HIV testing (both past year). Differences between the samples were reduced when restricting analysis to gay-identifying MSM. Conclusions National probability surveys better reflect the population of MSM but are limited by their smaller samples of MSM. Convenience surveys recruit larger samples of MSM but tend to over-represent MSM identifying as gay and reporting more sexual risk behaviours. Because both sampling strategies have strengths and weaknesses, methods are needed to triangulate data from probability and convenience surveys. PMID:26965869

  3. Statistical methods for identifying and bounding a UXO target area or minefield

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McKinstry, Craig A.; Pulsipher, Brent A.; Gilbert, Richard O.

    2003-09-18

    The sampling unit for minefield or UXO area characterization is typically represented by a geographical block or transect swath that lends itself to characterization by geophysical instrumentation such as mobile sensor arrays. New spatially based statistical survey methods and tools, more appropriate for these unique sampling units, have been developed and implemented at PNNL (Visual Sample Plan software, ver. 2.0) with support from the US Department of Defense. Though originally developed to support UXO detection and removal efforts, these tools may also be used in current form or adapted to support demining efforts and aid in the development of new sensors and detection technologies by explicitly incorporating both sampling and detection error in performance assessments. These tools may be used to (1) determine transect designs for detecting and bounding target areas of critical size, shape, and density of detectable items of interest with a specified confidence probability, (2) evaluate the probability that target areas of a specified size, shape and density have not been missed by a systematic or meandering transect survey, and (3) support post-removal verification by calculating the number of transects required to achieve a specified confidence probability that no UXO or mines have been missed.
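
    The transect-design calculation in item (1) has a simple geometric core: with parallel transects of swath width w spaced s apart and a random offset, a circular target area of diameter d is traversed with probability min(1, (d + w)/s). The sketch below shows only this traversal term, under those stated assumptions; a full VSP-style assessment additionally folds in the per-item detection performance of the sensor.

```python
def transect_traversal_prob(target_diameter: float,
                            swath_width: float,
                            transect_spacing: float) -> float:
    """Probability that at least one parallel transect (random offset)
    crosses a circular target: min(1, (d + w) / s). Detection further
    requires sensing items on the traversed strip, modeled separately."""
    return min(1.0, (target_diameter + swath_width) / transect_spacing)

# Hypothetical design: 2 m sensor swath, 30 m transect spacing.
print(transect_traversal_prob(100.0, 2.0, 30.0))  # 1.0: always crossed
print(transect_traversal_prob(20.0, 2.0, 30.0))   # ~0.73: may be missed
```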

  4. Modeled forest inventory data suggest climate benefits from fuels management

    Treesearch

    Jeremy S. Fried; Theresa B. Jain; Jonathan. Sandquist

    2013-01-01

    As part of a recent synthesis addressing fuel management in dry, mixed-conifer forests we analyzed more than 5,000 Forest Inventory and Analysis (FIA) plots, a probability sample that represents 33 million acres of these forests throughout Washington, Oregon, Idaho, Montana, Utah, and extreme northern California. We relied on the BioSum analysis framework that...

  5. Attitudes toward Science among Grades 3 through 12 Arab Students in Qatar: Findings from a Cross-Sectional National Study

    ERIC Educational Resources Information Center

    Said, Ziad; Summers, Ryan; Abd-El-Khalick, Fouad; Wang, Shuai

    2016-01-01

    This study assessed students' attitudes toward science in Qatar. A cross-sectional, nationwide probability sample representing all students enrolled in grades 3 through 12 in the various types of schools in Qatar completed the "Arabic Speaking Students' Attitudes toward Science Survey" (ASSASS). The validity and reliability of the…

  6. On the use of posterior predictive probabilities and prediction uncertainty to tailor informative sampling for parasitological surveillance in livestock.

    PubMed

    Musella, Vincenzo; Rinaldi, Laura; Lagazio, Corrado; Cringoli, Giuseppe; Biggeri, Annibale; Catelan, Dolores

    2014-09-15

    Model-based geostatistics and Bayesian approaches are appropriate in the context of Veterinary Epidemiology when point data have been collected by valid study designs. The aim is to predict a continuous infection risk surface. Little work has been done on the use of predictive infection probabilities at the farm unit level. In this paper we show how to use predictive infection probabilities and the related uncertainty from a Bayesian kriging model to draw informative samples from the 8794 geo-referenced sheep farms of the Campania region (southern Italy). Parasitological data come from a first cross-sectional survey carried out to study the spatial distribution of selected helminths in sheep farms. A grid sampling was performed to select the farms for coprological examinations. Faecal samples were collected for 121 sheep farms and the presence of 21 different helminths was investigated using the FLOTAC technique. The 21 responses are very different in terms of geographical distribution and prevalence of infection. The observed prevalence ranges from 0.83% to 96.69%. The distributions of the posterior predictive probabilities for all 21 parasites are very heterogeneous. We show how the results of the Bayesian kriging model can be used to plan a second-wave survey. Several alternatives can be chosen depending on the purposes of the second survey: weighting by the posterior predictive probabilities, by their uncertainty, or by a combination of both. The proposed Bayesian kriging model is simple, and the proposed sampling strategy represents a useful tool for targeting infection control treatments and surveillance campaigns. It is easily extendable to other fields of research. Copyright © 2014 Elsevier B.V. All rights reserved.
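
    A sketch of how the three weighting alternatives might be operationalized once per-farm posterior summaries are in hand; the probabilities and uncertainties below are simulated stand-ins for actual kriging output, and the weighting scheme is one plausible reading of the paper's proposal rather than its exact procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-ins for per-farm posterior summaries from a Bayesian kriging fit:
# predictive infection probability and its posterior standard deviation.
n_farms = 8794
p_pred = rng.beta(2, 8, size=n_farms)
sd_pred = rng.uniform(0.01, 0.2, size=n_farms)

def draw_second_wave(size, alpha=1.0, beta=1.0):
    """Sample farms with probability proportional to
    p_pred**alpha * sd_pred**beta: alpha favours likely-infected farms,
    beta favours farms where the model is most uncertain."""
    w = (p_pred ** alpha) * (sd_pred ** beta)
    return rng.choice(n_farms, size=size, replace=False, p=w / w.sum())

targeted    = draw_second_wave(121, alpha=1.0, beta=0.0)  # chase infection
exploratory = draw_second_wave(121, alpha=0.0, beta=1.0)  # cut uncertainty
combined    = draw_second_wave(121)                       # both criteria
```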

  7. Estimation of distribution overlap of urn models.

    PubMed

    Hampton, Jerrad; Lladser, Manuel E

    2012-01-01

    A classical problem in statistics is estimating the expected coverage of a sample, which has had applications in gene expression, microbial ecology, optimization, and even numismatics. Here we consider a related extension of this problem to random samples of two discrete distributions. Specifically, we estimate what we call the dissimilarity probability of a sample, i.e., the probability that a draw from one distribution is not observed in a given number of draws from another distribution. We show our estimator of dissimilarity to be a U-statistic and a uniformly minimum variance unbiased estimator of dissimilarity over the largest appropriate range of sample sizes. Furthermore, despite the non-Markovian nature of our estimator when applied sequentially over increasing sample sizes, we show it converges uniformly in probability to the dissimilarity parameter, and we present criteria under which it is approximately normally distributed and admits a consistent jackknife estimator of its variance. As proof of concept, we analyze V35 16S rRNA data to discern between various microbial environments. Other potential applications concern any situation where the dissimilarity of two discrete distributions may be of interest. For instance, in SELEX experiments, each urn could represent a random RNA pool and each draw a possible solution to a particular binding site problem over that pool. The dissimilarity of these pools is then related to the probability of finding binding site solutions in one pool that are absent in the other.
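
    The quantity being estimated is easy to state in sampling terms, and a brute-force Monte Carlo check against the exact urn formula sum_i p_i * (1 - q_i)^m makes the definition concrete (toy three-category urns, not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(11)

def dissimilarity_mc(p, q, m, reps=100_000):
    """Probability that one draw from p is a category unseen in m draws
    from q, estimated by direct simulation."""
    cats = len(p)
    draws_q = rng.choice(cats, size=(reps, m), p=q)
    draw_p = rng.choice(cats, size=reps, p=p)
    return (~(draws_q == draw_p[:, None]).any(axis=1)).mean()

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
m = 3
print("simulated:", dissimilarity_mc(p, q, m))
print("exact:    ", (p * (1.0 - q) ** m).sum())   # 0.3839
```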

  8. Use of Internet panels to conduct surveys.

    PubMed

    Hays, Ron D; Liu, Honghu; Kapteyn, Arie

    2015-09-01

    The use of Internet panels to collect survey data is increasing because it is cost-effective, enables quick access to large and diverse samples, delivers data for analysis faster than traditional methods, and standardizes the data collection process, making studies easy to replicate. A variety of probability-based panels have been created, including Telepanel/CentERpanel, Knowledge Networks (now GFK KnowledgePanel), the American Life Panel, the Longitudinal Internet Studies for the Social Sciences panel, and the Understanding America Study panel. Despite the advantage of having a known denominator (sampling frame), the probability-based Internet panels often have low recruitment participation rates, and some have argued that there is little practical difference between opting out of a probability sample and opting into a nonprobability (convenience) Internet panel. This article provides an overview of both probability-based and convenience panels, discussing potential benefits and cautions for each method, and summarizing the approaches used to weight panel respondents in order to better represent the underlying population. Challenges of using Internet panel data are discussed, including false answers, careless responses, giving the same answer repeatedly, getting multiple surveys from the same respondent, and panelists being members of multiple panels. More is to be learned about Internet panels generally and about Web-based data collection, as well as how to evaluate data collected using mobile devices and social-media platforms.

  9. [Respondent-Driven Sampling: a new sampling method to study visible and hidden populations].

    PubMed

    Mantecón, Alejandro; Juan, Montse; Calafat, Amador; Becoña, Elisardo; Román, Encarna

    2008-01-01

    The paper introduces a variant of chain-referral sampling: respondent-driven sampling (RDS). This sampling method shows that methods based on network analysis can be combined with the statistical validity of standard probability sampling methods. In this sense, RDS appears to be a mathematical improvement of snowball sampling oriented to the study of hidden populations. However, we test its validity with populations that lack a sampling frame but can nonetheless be contacted without difficulty. The basics of RDS are explained through our research on young people (aged 14 to 25) who go clubbing, consume alcohol and other drugs, and have sex. Fieldwork was carried out between May and July 2007 in three Spanish regions: Baleares, Galicia and Comunidad Valenciana. The presentation of the study shows the utility of this type of sampling when the population is accessible but a sampling frame is lacking. However, the sample obtained is not, in statistical terms, a randomly drawn representative sample of the target population. It must be acknowledged that the final sample is representative of a 'pseudo-population' that approximates the target population but is not identical to it.

  10. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures

    PubMed Central

    Sloma, Michael F.; Mathews, David H.

    2016-01-01

    RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. PMID:27852924

  11. Probability versus representativeness in infancy: can infants use naïve physics to adjust population base rates in probabilistic inference?

    PubMed

    Denison, Stephanie; Trikutam, Pallavi; Xu, Fei

    2014-08-01

    A rich tradition in developmental psychology explores physical reasoning in infancy. However, no research to date has investigated whether infants can reason about physical objects that behave probabilistically, rather than deterministically. Physical events are often quite variable, in that similar-looking objects can be placed in similar contexts with different outcomes. Can infants rapidly acquire probabilistic physical knowledge, such as that some leaves fall and some glasses break, simply by observing the statistical regularity with which objects behave, and then apply that knowledge in subsequent reasoning? We taught 11-month-old infants physical constraints on objects and asked them to reason about the probability of different outcomes when objects were drawn from a large distribution. Infants could have reasoned either by using the perceptual similarity between the samples and larger distributions or by applying physical rules to adjust base rates and estimate the probabilities. Infants learned the physical constraints quickly and used them to estimate probabilities, rather than relying on similarity, a version of the representativeness heuristic. These results indicate that infants can rapidly and flexibly acquire physical knowledge about objects following very brief exposure and apply it in subsequent reasoning. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  12. Neural Mechanisms for Integrating Prior Knowledge and Likelihood in Value-Based Probabilistic Inference

    PubMed Central

    Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.

    2015-01-01

    In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
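
    The prior-versus-likelihood weighting manipulated in the task has a compact conjugate illustration. A minimal sketch, assuming a Beta prior on reward probability (the numbers are arbitrary, not the experimental parameters): the posterior mean is a precision-weighted average that shifts toward the sample proportion as sample size grows, mirroring the behavioral result.

    ```python
    # Conjugate Beta-binomial example: the posterior mean interpolates between
    # the prior mean and the observed success rate, with the likelihood
    # gaining weight as the sample size n increases.
    def posterior_mean(alpha, beta, successes, n):
        return (alpha + successes) / (alpha + beta + n)

    alpha, beta = 6.0, 4.0          # prior: mean reward probability 0.6
    for n in (5, 20, 100):
        k = round(0.3 * n)          # observed success rate 0.3
        print(n, round(posterior_mean(alpha, beta, k, n), 3))
    ```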

  13. Metacognition, risk behavior, and risk outcomes: the role of perceived intelligence and perceived knowledge.

    PubMed

    Jaccard, James; Dodge, Tonya; Guilamo-Ramos, Vincent

    2005-03-01

    The present study explores 2 key variables in social metacognition: perceived intelligence and perceived levels of knowledge about a specific content domain. The former represents a judgment of one's knowledge at an abstract level, whereas the latter represents a judgment of one's knowledge in a specific content domain. Data from interviews of approximately 8,411 female adolescents from a national sample were analyzed in a 2-wave panel design with a year between assessments. Higher levels of perceived intelligence at Wave 1 were associated with a lower probability of the occurrence of a pregnancy over the ensuing year independent of actual IQ, self-esteem, and academic aspirations. Higher levels of perceived knowledge about the accurate use of birth control were associated with a higher probability of the occurrence of a pregnancy independent of actual knowledge about accurate use, perceived intelligence, self-esteem, and academic aspirations.

  14. Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent.

    PubMed

    Allman, Elizabeth S; Degnan, James H; Rhodes, John A

    2011-06-01

    Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals, each with many genes, splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for five or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.
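
    For intuition about such topology distributions, the simplest case is the rooted three-taxon species tree, for which the multispecies coalescent gives closed-form gene-tree topology probabilities. A minimal sketch (this is the textbook rooted-triple formula, not the unrooted quartet calculations of the paper):

    ```python
    import math

    def rooted_triple_probs(T):
        """Gene-tree topology probabilities for species tree ((A,B),C) whose
        internal branch is T coalescent units long, under the multispecies
        coalescent: each discordant topology has probability exp(-T)/3."""
        discordant = math.exp(-T) / 3.0
        return {"((A,B),C)": 1.0 - 2.0 * discordant,
                "((A,C),B)": discordant,
                "((B,C),A)": discordant}

    for T in (0.1, 1.0, 3.0):   # short branches leave more discordance
        print(T, rooted_triple_probs(T))
    ```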

  15. Effects of sampling conditions on DNA-based estimates of American black bear abundance

    USGS Publications Warehouse

    Laufenberg, Jared S.; Van Manen, Frank T.; Clark, Joseph D.

    2013-01-01

    DNA-based capture-mark-recapture techniques are commonly used to estimate American black bear (Ursus americanus) population abundance (N). Although the technique is well established, many questions remain regarding study design. In particular, relationships among N, capture probability of heterogeneity mixtures A and B (pA and pB, respectively, or p, collectively), the proportion of each mixture (π), number of capture occasions (k), and probability of obtaining reliable estimates of N are not fully understood. We investigated these relationships using 1) an empirical dataset of DNA samples for which true N was unknown and 2) simulated datasets with known properties that represented a broader array of sampling conditions. For the empirical data analysis, we used the full closed population with heterogeneity data type in Program MARK to estimate N for a black bear population in Great Smoky Mountains National Park, Tennessee. We systematically reduced the number of those samples used in the analysis to evaluate the effect that changes in capture probabilities may have on parameter estimates. Model-averaged N for females and males were 161 (95% CI = 114–272) and 100 (95% CI = 74–167), respectively (pooled N = 261, 95% CI = 192–419), and the average weekly p was 0.09 for females and 0.12 for males. When we reduced the number of samples of the empirical data, support for heterogeneity models decreased. For the simulation analysis, we generated capture data with individual heterogeneity covering a range of sampling conditions commonly encountered in DNA-based capture-mark-recapture studies and examined the relationships between those conditions and accuracy (i.e., probability of obtaining an estimated N that is within 20% of true N), coverage (i.e., probability that 95% confidence interval includes true N), and precision (i.e., probability of obtaining a coefficient of variation ≤20%) of estimates using logistic regression. The capture probability for the larger of 2 mixture proportions of the population (i.e., pA or pB, depending on the value of π) was most important for predicting accuracy and precision, whereas capture probabilities of both mixture proportions (pA and pB) were important to explain variation in coverage. Based on sampling conditions similar to parameter estimates from the empirical dataset (pA = 0.30, pB = 0.05, N = 250, π = 0.15, and k = 10), predicted accuracy and precision were low (60% and 53%, respectively), whereas coverage was high (94%). Increasing pB, the capture probability for the predominant but most difficult to capture proportion of the population, was most effective to improve accuracy under those conditions. However, manipulation of other parameters may be more effective under different conditions. In general, the probabilities of obtaining accurate and precise estimates were best when p ≥ 0.2. Our regression models can be used by managers to evaluate specific sampling scenarios and guide development of sampling frameworks or to assess reliability of DNA-based capture-mark-recapture studies.
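
    A two-mixture capture model of this kind is straightforward to simulate, which is essentially how synthetic datasets like the ones described above are generated. A minimal sketch using the empirical parameter values quoted in the abstract (the data-generation step only, not the Program MARK estimation):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    N, k = 250, 10                    # true abundance, capture occasions
    pi, pA, pB = 0.15, 0.30, 0.05     # mixture proportion and capture probs

    # Assign each bear to a capture-probability mixture, then simulate k
    # independent capture occasions per bear.
    in_A = rng.random(N) < pi
    p = np.where(in_A, pA, pB)
    histories = rng.random((N, k)) < p[:, None]

    detected = histories.any(axis=1)
    print("bears detected at least once:", detected.sum(), "of", N)
    ```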

  16. Injecting asteroid fragments into resonances

    NASA Technical Reports Server (NTRS)

    Farinella, Paolo; Gonczi, R.; Froeschle, Christiane; Froeschle, Claude

    1992-01-01

    We have quantitatively modeled the chance insertion of asteroid collisional fragments into the 3:1 and g = g(sub 6) resonances, through which they can achieve Earth-approaching orbits. Although the results depend on some poorly known parameters, they indicate that most meteorites and near-earth asteroids probably come from a small and non-representative sample of asteroids, located in the neighborhood of the two resonances.

  17. A Ten-Year Analysis of the Post-Secondary Outcomes of Students with Disabilities at the Pennsylvania State University

    ERIC Educational Resources Information Center

    Hong, Barbara S. S.; Herbert, James T.; Petrin, Robert A.

    2011-01-01

    This proposed exploratory study represents the first and largest investigation in the USA that will purposefully analyse and track students who have sought disability services over a 10-year span (academic years 2000-2011). Using "ex post-facto" data on a non-probability purposive sample of approximately 6000 undergraduates, the research…

  18. Online Reinforcement Learning Using a Probability Density Estimation.

    PubMed

    Agostini, Alejandro; Celaya, Enric

    2017-01-01

    Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
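
    A loose sketch of the core idea, density-modulated forgetting in an online Gaussian mixture update, follows. The update rules are simplified stand-ins, not the authors' exact equations, and the functional form tying the forgetting factor to the local density is invented purely for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Tiny one-dimensional two-component mixture; a sketch of the idea only.
    means = np.array([0.0, 5.0])
    vars_ = np.array([1.0, 1.0])
    weights = np.array([0.5, 0.5])
    counts = np.array([1.0, 1.0])       # effective sample count per component

    def density(x):
        comp = np.exp(-(x - means) ** 2 / (2 * vars_)) / np.sqrt(2 * np.pi * vars_)
        return weights @ comp

    def update(x, base_forget=0.99):
        comp = np.exp(-(x - means) ** 2 / (2 * vars_)) / np.sqrt(2 * np.pi * vars_)
        resp = weights * comp
        resp /= resp.sum()
        # Invented modulation: forget faster (smaller factor) where the sample
        # density is high, so heavily visited regions cannot swamp sparse ones.
        lam = base_forget ** density(x)
        counts[:] = lam * counts + resp
        eta = resp / counts
        means[:] += eta * (x - means)
        vars_[:] += eta * ((x - means) ** 2 - vars_)
        weights[:] = counts / counts.sum()

    for x in rng.normal(0.0, 1.0, size=300):    # stream concentrated near 0
        update(x)
    print(means.round(2), vars_.round(2), weights.round(2))
    ```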

  19. Public attitudes toward stuttering in Turkey: probability versus convenience sampling.

    PubMed

    Ozdemir, R Sertan; St Louis, Kenneth O; Topbaş, Seyhun

    2011-12-01

    A Turkish translation of the Public Opinion Survey of Human Attributes-Stuttering (POSHA-S) was used to compare probability versus convenience sampling to measure public attitudes toward stuttering. A convenience sample of adults in Eskişehir, Turkey was compared with two replicates of a school-based, probability cluster sampling scheme. The two replicates of the probability sampling scheme yielded similar demographic samples, both of which were different from the convenience sample. Components of subscores on the POSHA-S were significantly different in more than half of the comparisons between convenience and probability samples, indicating important differences in public attitudes. If POSHA-S users intend to generalize to specific geographic areas, results of this study indicate that probability sampling is a better research strategy than convenience sampling. The reader will be able to: (1) discuss the difference between convenience sampling and probability sampling; (2) describe a school-based probability sampling scheme; and (3) describe differences in POSHA-S results from convenience sampling versus probability sampling. Copyright © 2011 Elsevier Inc. All rights reserved.

  20. A Framework for Final Drive Simultaneous Failure Diagnosis Based on Fuzzy Entropy and Sparse Bayesian Extreme Learning Machine

    PubMed Central

    Ye, Qing; Pan, Hao; Liu, Changhua

    2015-01-01

    This research proposes a novel framework for final drive simultaneous failure diagnosis comprising feature extraction, training of paired diagnostic models, generation of a decision threshold, and recognition of simultaneous failure modes. In the feature extraction module, wavelet packet transform and fuzzy entropy are adopted to reduce noise interference and extract representative features of each failure mode. Single-failure samples are used to construct probability classifiers based on a paired sparse Bayesian extreme learning machine, which is trained only on single failure modes and inherits the high generalization and sparsity of the sparse Bayesian learning approach. To generate an optimal decision threshold that converts the probability outputs of the classifiers into final simultaneous failure modes, this research proposes using samples containing both single and simultaneous failure modes together with a grid search method, which is superior to traditional techniques in global optimization. Compared with other frequently used diagnostic approaches based on support vector machines and probabilistic neural networks, experimental results based on the F1-measure verify that the diagnostic accuracy and efficiency of the proposed framework, which are crucial for simultaneous failure diagnosis, are superior to those of the existing approaches. PMID:25722717
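
    The decision-threshold step is the most self-contained piece of the framework. A minimal sketch, assuming per-mode probability outputs and multi-label ground truth (random placeholders here, not final drive data), that grid-searches a single threshold maximizing micro-F1:

    ```python
    import numpy as np

    def best_threshold(probs, labels, grid=np.linspace(0.05, 0.95, 19)):
        """Grid-search one decision threshold that converts per-mode
        probability outputs into binary failure calls, maximizing micro-F1."""
        best_t, best_f1 = None, -1.0
        for t in grid:
            pred = probs >= t
            tp = np.sum(pred & labels)
            fp = np.sum(pred & ~labels)
            fn = np.sum(~pred & labels)
            f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        return best_t, best_f1

    rng = np.random.default_rng(3)
    labels = rng.random((100, 4)) < 0.3            # multi-label ground truth
    probs = np.clip(labels + rng.normal(0, 0.35, labels.shape), 0, 1)
    print(best_threshold(probs, labels))
    ```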

  1. A method for the extraction and quantitation of phycoerythrin from algae

    NASA Technical Reports Server (NTRS)

    Stewart, D. E.

    1982-01-01

    A summary of a new technique for the extraction and quantitation of phycoerythrin (PHE) from algal samples is described. Results of analysis of four extracts representing three PHE types from algae including cryptomonad and cyanophyte types are presented. The method of extraction and an equation for quantitation are given. A graph showing the relationship between concentration and fluorescence units that may be used with samples fluorescing around 575-580 nm (probably dominated by cryptophytes in estuarine waters) and 560 nm (dominated by cyanophytes characteristic of the open ocean) is provided.

  2. Radionuclides at Descartes in the central highlands

    NASA Technical Reports Server (NTRS)

    Wrigley, R. C.

    1973-01-01

    Thorium, uranium, potassium, aluminium-26, and sodium-22 were measured by nondestructive gamma ray spectrometry in six soil and two rock samples gathered by Apollo 16 in the lunar central highlands. The soil samples probably include both major geologic formations in the vicinity, the Cayley and Descartes Formations, although it is possible that the Descartes Formation is not represented. The rock samples have low concentrations of primordial radionuclides. The aluminium-26 concentrations were lower than could be expected from the high abundance of alumina in the Apollo 16 soils reported earlier, but this could be due to lower concentrations of target elements in these soils, sampling depth variations, or regolithic mixing (exposure age variations).

  3. Adaptive Conditioning of Multiple-Point Geostatistical Facies Simulation to Flow Data with Facies Probability Maps

    NASA Astrophysics Data System (ADS)

    Khodabakhshi, M.; Jafarpour, B.

    2013-12-01

    Characterization of complex geologic patterns that create preferential flow paths in certain reservoir systems requires higher-order geostatistical modeling techniques. Multipoint statistics (MPS) provides a flexible grid-based approach for simulating such complex geologic patterns from a conceptual prior model known as a training image (TI). In this approach, a stationary TI that encodes the higher-order spatial statistics of the expected geologic patterns is used to represent the shape and connectivity of the underlying lithofacies. While MPS is quite powerful for describing complex geologic facies connectivity, the nonlinear and complex relation between the flow data and facies distribution makes flow data conditioning quite challenging. We propose an adaptive technique for conditioning facies simulation from a prior TI to nonlinear flow data. Non-adaptive strategies for conditioning facies simulation to flow data can involve many forward flow model solutions that can be computationally very demanding. To improve the conditioning efficiency, we develop an adaptive sampling approach through a data feedback mechanism based on the sampling history. In this approach, after a short period of sampling burn-in time where unconditional samples are generated and passed through an acceptance/rejection test, an ensemble of accepted samples is identified and used to generate a facies probability map. This facies probability map contains the common features of the accepted samples and provides conditioning information about facies occurrence in each grid block, which is used to guide the conditional facies simulation process. As the sampling progresses, the initial probability map is updated according to the collective information about the facies distribution in the chain of accepted samples to increase the acceptance rate and efficiency of the conditioning. This conditioning process can be viewed as an optimization approach where each new sample is proposed based on the sampling history to improve the data mismatch objective function. We extend the application of this adaptive conditioning approach to the case where multiple training images are proposed to describe the geologic scenario in a given formation. We discuss the advantages and limitations of the proposed adaptive conditioning scheme and use numerical experiments from fluvial channel formations to demonstrate its applicability and performance compared to non-adaptive conditioning techniques.
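
    A grossly simplified sketch of the feedback mechanism follows: a per-cell Bernoulli draw stands in for the MPS simulator, and a synthetic misfit stands in for the flow-data mismatch. Only the accept/reject loop and the probability-map update reflect the scheme described above; everything else is invented for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    ngrid, nburn, niter = 100, 50, 500

    def simulate_facies(prob_map):
        """Stand-in for MPS: draw a binary facies field cell-by-cell from the
        current channel-probability map (real MPS would use a training image)."""
        return (rng.random(ngrid) < prob_map).astype(float)

    def mismatch(field):
        """Stand-in for the flow-data misfit; smaller is better."""
        target = np.zeros(ngrid)
        target[40:60] = 1.0                 # hypothetical true channel location
        return np.abs(field - target).mean()

    prob_map = np.full(ngrid, 0.5)
    accepted = []
    for it in range(niter):
        field = simulate_facies(prob_map)
        if mismatch(field) < 0.45:          # acceptance/rejection test
            accepted.append(field)
            if it >= nburn:                 # after burn-in, feed history back
                prob_map = np.mean(accepted, axis=0)

    print(len(accepted), prob_map[45:55].round(2), prob_map[:5].round(2))
    ```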

  4. Probability of detecting perchlorate under natural conditions in deep groundwater in California and the Southwestern United States

    USGS Publications Warehouse

    Fram, Miranda S.; Belitz, Kenneth

    2011-01-01

    We use data from 1626 groundwater samples collected in California, primarily from public drinking water supply wells, to investigate the distribution of perchlorate in deep groundwater under natural conditions. The wells were sampled for the California Groundwater Ambient Monitoring and Assessment Priority Basin Project. We develop a logistic regression model for predicting probabilities of detecting perchlorate at concentrations greater than multiple threshold concentrations as a function of climate (represented by an aridity index) and potential anthropogenic contributions of perchlorate (quantified as an anthropogenic score, AS). AS is a composite categorical variable including terms for nitrate, pesticides, and volatile organic compounds. Incorporating water-quality parameters in AS permits identification of perturbation of natural occurrence patterns by flushing of natural perchlorate salts from unsaturated zones by irrigation recharge as well as addition of perchlorate from industrial and agricultural sources. The data and model results indicate low concentrations (0.1-0.5 μg/L) of perchlorate occur under natural conditions in groundwater across a wide range of climates, beyond the arid to semiarid climates in which they mostly have been previously reported. The probability of detecting perchlorate at concentrations greater than 0.1 μg/L under natural conditions ranges from 50-70% in semiarid to arid regions of California and the Southwestern United States to 5-15% in the wettest regions sampled (the Northern California coast). The probability of concentrations above 1 μg/L under natural conditions is low (generally <3%).
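
    The model form is an ordinary logistic regression, so its use for prediction is compact. A sketch with invented coefficients (the fitted USGS values are not reproduced in the abstract), chosen only so the two example calls land near the probability ranges quoted above:

    ```python
    import numpy as np

    def detection_probability(aridity, anthro_score, beta=(0.5, -3.0, 0.8)):
        """Hypothetical logistic model: P(perchlorate > 0.1 ug/L) as a function
        of an aridity index (lower = drier) and an anthropogenic score.
        Coefficients are illustrative placeholders, not the fitted values."""
        b0, b_arid, b_as = beta
        z = b0 + b_arid * aridity + b_as * anthro_score
        return 1.0 / (1.0 + np.exp(-z))

    print(detection_probability(aridity=0.1, anthro_score=0))  # arid, natural
    print(detection_probability(aridity=1.2, anthro_score=0))  # wet, natural
    ```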

  5. Quality of Parent-Child Relationship, Family Conflict, Peer Pressure, and Drinking Behaviors of Adolescents in an Asian Context: The Case of Singapore

    ERIC Educational Resources Information Center

    Choo, Hyekyung; Shek, Daniel

    2013-01-01

    Analyzing data from a probability sample representative of secondary school students in Singapore (N = 1,599), this study examined the independent impact between the quality of mother-child relationship, the quality of father-child relationship and family conflict on the frequency of drinking and drunkenness, and whether each dyadic parent-child…

  6. Language and Adjustment Scales for the Thematic Apperception Test for Youths 12-17 Years. Vital Health and Statistics, Series 2, No. 62.

    ERIC Educational Resources Information Center

    Neman, Ronald S.; And Others

    The study represents an extension of previous research involving the development of scales for the five-card, orally administered, and tape-recorded version of the Thematic Apperception Test(TAT). Scale development is documented and national norms are presented based on a national probability sample of 1,398 youths administered the Cycle III test…

  7. Free-Energy Profiles of Membrane Insertion of the M2 Transmembrane Peptide from Influenza A Virus

    DTIC Science & Technology

    2008-12-01

    ABSTRACT The insertion of the M2 transmembrane peptide from influenza A virus into a membrane has been studied with molecular-dynamics simulations ... performed replica-exchange molecular-dynamics simulations with umbrella-sampling techniques to characterize the probability distribution and conformation ... atomic-detailed molecular dynamics (MD) simulation techniques represent a valuable complementary methodology to investigate membrane-insertion of

  8. Methodological considerations in using complex survey data: an applied example with the Head Start Family and Child Experiences Survey.

    PubMed

    Hahs-Vaughn, Debbie L; McWayne, Christine M; Bulotsky-Shearer, Rebecca J; Wen, Xiaoli; Faria, Ann-Marie

    2011-06-01

    Complex survey data are collected by means other than simple random samples. This creates two analytical issues: nonindependence and unequal selection probability. Failing to address these issues results in underestimated standard errors and biased parameter estimates. Using data from the nationally representative Head Start Family and Child Experiences Survey (FACES; 1997 and 2000 cohorts), three diverse multilevel models are presented that illustrate differences in results depending on addressing or ignoring the complex sampling issues. Limitations of using complex survey data are reported, along with recommendations for reporting complex sample results. © The Author(s) 2011
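
    The two analytical issues can be illustrated with a toy dataset: sampling weights handle unequal selection probability, and resampling whole clusters respects nonindependence. A minimal sketch (a cluster bootstrap standing in for the design-based variance estimators typically used with data like FACES):

    ```python
    import numpy as np

    rng = np.random.default_rng(11)

    # Toy complex-survey data: 20 clusters (e.g., centers), 15 children per
    # cluster, each child carrying a sampling weight.
    clusters = np.repeat(np.arange(20), 15)
    weights = rng.uniform(0.5, 2.0, clusters.size)
    y = rng.normal(50 + rng.normal(0, 3, 20)[clusters], 10)

    def weighted_mean(y, w):
        return np.sum(w * y) / np.sum(w)

    # Cluster bootstrap: resample whole clusters, not children, so the
    # standard error reflects within-cluster dependence.
    ids = np.unique(clusters)
    boot = []
    for _ in range(1000):
        pick = rng.choice(ids, size=ids.size, replace=True)
        mask = np.concatenate([np.flatnonzero(clusters == c) for c in pick])
        boot.append(weighted_mean(y[mask], weights[mask]))

    print(weighted_mean(y, weights), np.std(boot))
    ```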

  9. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures.

    PubMed

    Sloma, Michael F; Mathews, David H

    2016-12-01

    RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. © 2016 Sloma and Mathews; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  10. Prevalence of anxiety, depression and post-traumatic stress disorder in the Kashmir Valley

    PubMed Central

    Lenglet, Annick; Ariti, Cono; Shah, Showkat; Shah, Helal; Ara, Shabnum; Viney, Kerri; Janes, Simon; Pintaldi, Giovanni

    2017-01-01

    Background Following the partition of India in 1947, the Kashmir Valley has been subject to continual political insecurity and ongoing conflict; the region remains highly militarised. We conducted a representative cross-sectional population-based survey of adults to estimate the prevalence and predictors of anxiety, depression and post-traumatic stress disorder (PTSD) in the 10 districts of the Kashmir Valley. Methods Between October and December 2015, we interviewed 5519 out of 5600 invited participants, ≥18 years of age, randomly sampled using a probability proportional to size cluster sampling design. We estimated the prevalence of a probable psychological disorder using the Hopkins Symptom Checklist (HSCL-25) and the Harvard Trauma Questionnaire (HTQ-16). Both screening instruments had been culturally adapted and translated. Data were weighted to account for the sampling design, and multivariate logistic regression analysis was conducted to identify risk factors for developing symptoms of psychological distress. Findings The estimated prevalence of mental distress in adults in the Kashmir Valley was 45% (95% CI 42.6 to 47.0). We identified 41% (95% CI 39.2 to 43.4) of adults with probable depression, 26% (95% CI 23.8 to 27.5) with probable anxiety and 19% (95% CI 17.5 to 21.2) with probable PTSD. The three disorders were associated with the following characteristics: being female, over 55 years of age, having had no formal education, living in a rural area and being widowed/divorced or separated. A dose–response association was found between the number of traumatic events experienced or witnessed and all three mental disorders. Interpretation The implementation of mental health awareness programmes, interventions aimed at high risk groups and addressing trauma-related symptoms from all causes are needed in the Kashmir Valley. PMID:29082026
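
    Probability-proportional-to-size cluster selection, as used here, is commonly implemented with the systematic method. A minimal sketch with hypothetical village sizes (not the Kashmir sampling frame):

    ```python
    import numpy as np

    def systematic_pps(sizes, n_clusters, rng):
        """Systematic probability-proportional-to-size selection of clusters.
        A cluster's expected number of hits is n_clusters * size / total."""
        sizes = np.asarray(sizes, dtype=float)
        cum = np.cumsum(sizes)
        step = cum[-1] / n_clusters
        start = rng.uniform(0, step)
        points = start + step * np.arange(n_clusters)
        return np.searchsorted(cum, points)   # cluster index hit by each point

    rng = np.random.default_rng(5)
    village_sizes = rng.integers(200, 5000, size=60)   # hypothetical frame
    print(systematic_pps(village_sizes, n_clusters=10, rng=rng))
    ```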

  11. Collecting cometary soil samples? Development of the ROSETTA sample acquisition system

    NASA Technical Reports Server (NTRS)

    Coste, P. A.; Fenzi, M.; Eiden, Michael

    1993-01-01

    In the reference scenario of the ROSETTA CNSR (Comet Nucleus Sample Return) mission, the Sample Acquisition System is mounted on the Comet Lander. Its tasks are to acquire three kinds of cometary samples and to transfer them to the Earth Return Capsule. Operations are to be performed in vacuum and microgravity, on a probably rough and dusty surface, in a largely unknown material, at temperatures in the order of 100 K. The concept and operation of the Sample Acquisition System are presented. The design of the prototype corer and surface sampling tool, and of the equipment for testing them at cryogenic temperatures in ambient conditions and in vacuum in various materials representing cometary soil, are described. Results of recent preliminary tests performed in low temperature thermal vacuum in a cometary analog ice-dust mixture are provided.

  12. Men who have sex with men in Great Britain: comparing methods and estimates from probability and convenience sample surveys.

    PubMed

    Prah, Philip; Hickson, Ford; Bonell, Chris; McDaid, Lisa M; Johnson, Anne M; Wayal, Sonali; Clifton, Soazig; Sonnenberg, Pam; Nardone, Anthony; Erens, Bob; Copas, Andrew J; Riddell, Julie; Weatherburn, Peter; Mercer, Catherine H

    2016-09-01

    To examine sociodemographic and behavioural differences between men who have sex with men (MSM) participating in recent UK convenience surveys and a national probability sample survey. We compared 148 MSM aged 18-64 years interviewed for Britain's third National Survey of Sexual Attitudes and Lifestyles (Natsal-3) undertaken in 2010-2012, with men in the same age range participating in contemporaneous convenience surveys of MSM: 15 500 British resident men in the European MSM Internet Survey (EMIS); 797 in the London Gay Men's Sexual Health Survey; and 1234 in Scotland's Gay Men's Sexual Health Survey. Analyses compared men reporting at least one male sexual partner (past year) on similarly worded questions and multivariable analyses accounted for sociodemographic differences between the surveys. MSM in convenience surveys were younger and better educated than MSM in Natsal-3, and a larger proportion identified as gay (85%-95% vs 62%). Partner numbers were higher and same-sex anal sex more common in convenience surveys. Unprotected anal intercourse was more commonly reported in EMIS. Compared with Natsal-3, MSM in convenience surveys were more likely to report gonorrhoea diagnoses and HIV testing (both past year). Differences between the samples were reduced when restricting analysis to gay-identifying MSM. National probability surveys better reflect the population of MSM but are limited by their smaller samples of MSM. Convenience surveys recruit larger samples of MSM but tend to over-represent MSM identifying as gay and reporting more sexual risk behaviours. Because both sampling strategies have strengths and weaknesses, methods are needed to triangulate data from probability and convenience surveys. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

  13. Masculinity-femininity predicts sexual orientation in men but not in women.

    PubMed

    Udry, J Richard; Chantala, Kim

    2006-11-01

    Using the nationally representative sample of about 15,000 Add Health respondents in Wave III, the hypothesis is tested that masculinity-femininity in adolescence is correlated with sexual orientation 5 years later and 6 years later: that is, that for adolescent males in 1995 and again in 1996, more feminine males have a higher probability of self-identifying as homosexuals in 2001-02. It is predicted that for adolescent females in 1995 and 1996, more masculine females have a higher probability of self-identifying as homosexuals in 2001-02. Masculinity-femininity is measured by the classical method used by Terman & Miles. For both time periods, the hypothesis was strongly confirmed for males: the more feminine males had several times the probability of being attracted to same-sex partners, several times the probability of having same-sex partners, and several times the probability of self-identifying as homosexuals, compared with more masculine males. For females, no relationship was found at either time period between masculinity and sex of preference. The biological mechanism underlying homosexuality may be different for males and females.

  14. A Bayesian predictive two-stage design for phase II clinical trials.

    PubMed

    Sambucini, Valeria

    2008-04-15

    In this paper, we propose a Bayesian two-stage design for phase II clinical trials, which represents a predictive version of the single threshold design (STD) recently introduced by Tan and Machin. The STD two-stage sample sizes are determined by specifying a minimum threshold for the posterior probability that the true response rate exceeds a pre-specified target value and by assuming that the observed response rate is slightly higher than the target. Unlike the STD, we do not refer to a fixed experimental outcome, but take into account the uncertainty about future data. In both stages, the design aims to control the probability of getting a large posterior probability that the true response rate exceeds the target value. Such a probability is expressed in terms of prior predictive distributions of the data. The performance of the design is based on the distinction between analysis and design priors, recently introduced in the literature. The properties of the method are studied when all the design parameters vary.
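
    The predictive ingredient can be sketched compactly for a single stage: average the posterior decision criterion over the beta-binomial prior predictive distribution of future data. This is a simplified single-stage analog with arbitrary numbers, not Sambucini's two-stage rule:

    ```python
    from scipy import stats

    def predictive_success_prob(a, b, n, p0, lam):
        """Prior predictive probability that, after observing x responses in n
        patients, the posterior P(p > p0 | x) exceeds lam (Beta(a, b) prior)."""
        total = 0.0
        for x in range(n + 1):
            post_tail = 1.0 - stats.beta.cdf(p0, a + x, b + n - x)
            if post_tail > lam:
                total += stats.betabinom.pmf(x, n, a, b)
        return total

    # Illustrative numbers only: flat prior, 25 patients, target rate 0.2.
    print(predictive_success_prob(a=1, b=1, n=25, p0=0.2, lam=0.9))
    ```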

  15. Probability bounds analysis for nonlinear population ecology models.

    PubMed

    Enszer, Joshua A; Andrei Măceș, D; Stadtherr, Mark A

    2015-09-01

    Mathematical models in population ecology often involve parameters that are empirically determined and inherently uncertain, with probability distributions for the uncertainties not known precisely. Propagating such imprecise uncertainties rigorously through a model to determine their effect on model outputs can be a challenging problem. We illustrate here a method for the direct propagation of uncertainties represented by probability bounds though nonlinear, continuous-time, dynamic models in population ecology. This makes it possible to determine rigorous bounds on the probability that some specified outcome for a population is achieved, which can be a core problem in ecosystem modeling for risk assessment and management. Results can be obtained at a computational cost that is considerably less than that required by statistical sampling methods such as Monte Carlo analysis. The method is demonstrated using three example systems, with focus on a model of an experimental aquatic food web subject to the effects of contamination by ionic liquids, a new class of potentially important industrial chemicals. Copyright © 2015. Published by Elsevier Inc.
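
    For monotone models, propagating a probability box reduces to pushing the bounding quantile functions through the model. A minimal sketch, with two normal CDFs standing in for the bounds on an uncertain growth rate (the paper's rigorous interval methods are not reproduced here):

    ```python
    import numpy as np
    from scipy import stats

    # P-box for growth rate r: its unknown CDF is assumed to lie between the
    # CDFs of N(0.07, 0.01**2) and N(0.09, 0.01**2) (stochastic ordering).
    qs = np.linspace(0.005, 0.995, 199)
    r_low = stats.norm.ppf(qs, loc=0.07, scale=0.01)    # quantiles, lower edge
    r_high = stats.norm.ppf(qs, loc=0.09, scale=0.01)   # quantiles, upper edge

    def model(r, n0=10.0, t=20.0):
        """Exponential growth; monotone in r, so quantiles map straight through."""
        return n0 * np.exp(r * t)

    threshold = 40.0
    p_lo = np.mean(model(r_low) > threshold)    # lower bound on P(N_final > 40)
    p_hi = np.mean(model(r_high) > threshold)   # upper bound on P(N_final > 40)
    print(p_lo, p_hi)
    ```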

  16. Attitudes toward Bisexual Men and Women among a Nationally Representative Probability Sample of Adults in the United States.

    PubMed

    Dodge, Brian; Herbenick, Debby; Friedman, M Reuel; Schick, Vanessa; Fu, Tsung-Chieh Jane; Bostwick, Wendy; Bartelt, Elizabeth; Muñoz-Laboy, Miguel; Pletta, David; Reece, Michael; Sandfort, Theo G M

    2016-01-01

    As bisexual individuals in the United States (U.S.) face significant health disparities, researchers have posited that these differences may be fueled, at least in part, by negative attitudes, prejudice, stigma, and discrimination toward bisexual individuals from heterosexual and gay/lesbian individuals. Previous studies of individual and social attitudes toward bisexual men and women have been conducted almost exclusively with convenience samples, with limited generalizability to the broader U.S. population. Our study provides an assessment of attitudes toward bisexual men and women among a nationally representative probability sample of heterosexual, gay, lesbian, and other-identified adults in the U.S. Data were collected from the 2015 National Survey of Sexual Health and Behavior (NSSHB), via an online questionnaire with a probability sample of adults (18 years and over) from throughout the U.S. We included two modified 5-item versions of the Bisexualities: Indiana Attitudes Scale (BIAS), validated sub-scales that were developed to measure attitudes toward bisexual men and women. Data were analyzed using descriptive statistics, gamma regression, and paired t-tests. Gender, sexual identity, age, race/ethnicity, income, and educational attainment were all significantly associated with participants' attitudes toward bisexual individuals. In terms of responses to individual scale items, participants were most likely to "neither agree nor disagree" with all attitudinal statements. Across sexual identities, self-identified other participants reported the most positive attitudes, while heterosexual male participants reported the least positive attitudes. As in previous research on convenience samples, we found that a wide range of demographic characteristics were related to attitudes toward bisexual individuals in our nationally-representative study of heterosexual, gay/lesbian, and other-identified adults in the U.S. In particular, gender emerged as a significant characteristic; female participants' attitudes were more positive than male participants' attitudes, and all participants' attitudes were generally more positive toward bisexual women than bisexual men. While recent population data suggest a marked shift in more positive attitudes toward gay men and lesbian women in the general population of the U.S., the largest proportions of participants in our study reported a relative lack of agreement or disagreement with all affective-evaluative statements in the BIAS scales. Findings document the relative lack of positive attitudes toward bisexual individuals among the general population of adults in the U.S. and highlight the need for developing intervention approaches to promote more positive attitudes toward bisexual individuals, targeted toward not only heterosexual but also gay/lesbian individuals and communities.

  17. Attitudes toward Bisexual Men and Women among a Nationally Representative Probability Sample of Adults in the United States

    PubMed Central

    Herbenick, Debby; Friedman, M. Reuel; Schick, Vanessa; Fu, Tsung-Chieh (Jane); Bostwick, Wendy; Bartelt, Elizabeth; Muñoz-Laboy, Miguel; Pletta, David; Reece, Michael; Sandfort, Theo G. M.

    2016-01-01

    As bisexual individuals in the United States (U.S.) face significant health disparities, researchers have posited that these differences may be fueled, at least in part, by negative attitudes, prejudice, stigma, and discrimination toward bisexual individuals from heterosexual and gay/lesbian individuals. Previous studies of individual and social attitudes toward bisexual men and women have been conducted almost exclusively with convenience samples, with limited generalizability to the broader U.S. population. Our study provides an assessment of attitudes toward bisexual men and women among a nationally representative probability sample of heterosexual, gay, lesbian, and other-identified adults in the U.S. Data were collected from the 2015 National Survey of Sexual Health and Behavior (NSSHB), via an online questionnaire with a probability sample of adults (18 years and over) from throughout the U.S. We included two modified 5-item versions of the Bisexualities: Indiana Attitudes Scale (BIAS), validated sub-scales that were developed to measure attitudes toward bisexual men and women. Data were analyzed using descriptive statistics, gamma regression, and paired t-tests. Gender, sexual identity, age, race/ethnicity, income, and educational attainment were all significantly associated with participants' attitudes toward bisexual individuals. In terms of responses to individual scale items, participants were most likely to “neither agree nor disagree” with all attitudinal statements. Across sexual identities, self-identified other participants reported the most positive attitudes, while heterosexual male participants reported the least positive attitudes. As in previous research on convenience samples, we found that a wide range of demographic characteristics were related to attitudes toward bisexual individuals in our nationally-representative study of heterosexual, gay/lesbian, and other-identified adults in the U.S. In particular, gender emerged as a significant characteristic; female participants’ attitudes were more positive than male participants’ attitudes, and all participants’ attitudes were generally more positive toward bisexual women than bisexual men. While recent population data suggest a marked shift in more positive attitudes toward gay men and lesbian women in the general population of the U.S., the largest proportions of participants in our study reported a relative lack of agreement or disagreement with all affective-evaluative statements in the BIAS scales. Findings document the relative lack of positive attitudes toward bisexual individuals among the general population of adults in the U.S. and highlight the need for developing intervention approaches to promote more positive attitudes toward bisexual individuals, targeted toward not only heterosexual but also gay/lesbian individuals and communities. PMID:27783644

  18. On estimating probability of presence from use-availability or presence-background data.

    PubMed

    Phillips, Steven J; Elith, Jane

    2013-06-01

    A fundamental ecological modeling task is to estimate the probability that a species is present in (or uses) a site, conditional on environmental variables. For many species, available data consist of "presence" data (locations where the species [or evidence of it] has been observed), together with "background" data, a random sample of available environmental conditions. Recently published papers disagree on whether probability of presence is identifiable from such presence-background data alone. This paper aims to resolve the disagreement, demonstrating that additional information is required. We defined seven simulated species representing various simple shapes of response to environmental variables (constant, linear, convex, unimodal, S-shaped) and ran five logistic model-fitting methods using 1000 presence samples and 10 000 background samples; the simulations were repeated 100 times. The experiment revealed a stark contrast between two groups of methods: those based on a strong assumption that species' true probability of presence exactly matches a given parametric form had highly variable predictions and much larger RMS error than methods that take population prevalence (the fraction of sites in which the species is present) as an additional parameter. For six species, the former group grossly under- or overestimated probability of presence. The cause was not model structure or choice of link function, because all methods were logistic with linear and, where necessary, quadratic terms. Rather, the experiment demonstrates that an estimate of prevalence is not just helpful, but is necessary (except in special cases) for identifying probability of presence. We therefore advise against use of methods that rely on the strong assumption, due to Lele and Keim (recently advocated by Royle et al.) and Lancaster and Imbens. The methods are fragile, and their strong assumption is unlikely to be true in practice. We emphasize, however, that we are not arguing against standard statistical methods such as logistic regression, generalized linear models, and so forth, none of which requires the strong assumption. If probability of presence is required for a given application, there is no panacea for lack of data. Presence-background data must be augmented with an additional datum, e.g., species' prevalence, to reliably estimate absolute (rather than relative) probability of presence.
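
    The role of prevalence can be seen directly from Bayes' rule: P(present | x) = prevalence × f(x | presence) / f(x), where the background sample estimates f(x). A minimal sketch on a simulated species, using KDE density estimates (this is an illustration of the identity, not one of the five fitting methods benchmarked in the paper):

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(2)

    # Simulated species: presence probability rises with environmental var x.
    x_all = rng.uniform(-3, 3, 10_000)                 # background: available env
    true_p = 1 / (1 + np.exp(-2 * x_all))
    presence = x_all[rng.random(x_all.size) < true_p]  # presence sample

    prevalence = true_p.mean()   # the extra datum the paper argues is required

    # Bayes' rule: P(present | x) = prevalence * f(x | present) / f(x).
    f_pres = gaussian_kde(presence)
    f_back = gaussian_kde(x_all)

    x_grid = np.array([-2.0, 0.0, 2.0])
    estimate = prevalence * f_pres(x_grid) / f_back(x_grid)
    print(estimate)              # compare with 1 / (1 + exp(-2 * x_grid))
    ```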

  19. Rare Event Simulation in Radiation Transport

    NASA Astrophysics Data System (ADS)

    Kollman, Craig

    This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high dimensional state spaces and irregular geometries so that analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded the probability of any one particular particle getting through is very small. Because of the large number of particles involved, even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well known technique for improving the efficiency of rare event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiplied by the likelihood ratio between the true and simulated probabilities so as to keep our estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities are chosen. It is shown that a zero variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive "learning" algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give, with probability one, a sequence of estimates converging exponentially fast to the true solution. In the final chapter, an attempt to generalize this algorithm to a continuous state space is made. This involves partitioning the space into a finite number of cells. There is a tradeoff between additional computation per iteration and variance reduction per iteration that arises in determining the optimal grid size. All versions of this algorithm can be thought of as a compromise between deterministic and Monte Carlo methods, capturing advantages of both techniques.
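
    The variance reduction at stake is easy to demonstrate on the textbook case of a normal tail probability, using an exponential-tilting (mean-shift) change of measure. A minimal sketch (a simple mean shift standing in for the transport-specific schemes discussed in the dissertation):

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    a = 5.0                  # estimate P(X > 5) for X ~ N(0, 1), about 2.9e-7
    n = 100_000

    # Naive Monte Carlo almost never sees the event.
    naive = np.mean(rng.standard_normal(n) > a)

    # Importance sampling: draw from N(a, 1) and reweight by the likelihood
    # ratio phi(x) / phi(x - a) = exp(a**2 / 2 - a * x) to stay unbiased.
    x = rng.standard_normal(n) + a
    weights = np.exp(a * a / 2.0 - a * x)
    is_est = np.mean((x > a) * weights)

    print(naive, is_est)     # IS recovers ~2.87e-7 with far smaller variance
    ```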

  20. Methods for fitting a parametric probability distribution to most probable number data.

    PubMed

    Williams, Michael S; Ebel, Eric D

    2012-07-02

    Every year hundreds of thousands, if not millions, of samples are collected and analyzed to assess microbial contamination in food and water. The concentration of pathogenic organisms at the end of the production process is low for most commodities, so a highly sensitive screening test is used to determine whether the organism of interest is present in a sample. In some applications, samples that test positive are subjected to quantitation. The most probable number (MPN) technique is a common method to quantify the level of contamination in a sample because it is able to provide estimates at low concentrations. This technique uses a series of dilution count experiments to derive estimates of the concentration of the microorganism of interest. An application for these data is food-safety risk assessment, where the MPN concentration estimates can be fitted to a parametric distribution to summarize the range of potential exposures to the contaminant. Many different methods (e.g., substitution methods, maximum likelihood and regression on order statistics) have been proposed to fit microbial contamination data to a distribution, but the development of these methods rarely considers how the MPN technique influences the choice of distribution function and fitting method. An often overlooked aspect when applying these methods is whether the data represent actual measurements of the average concentration of microorganisms per milliliter or the data are real-valued estimates of the average concentration, as is the case with MPN data. In this study, we propose two methods for fitting MPN data to a probability distribution. The first method uses a maximum likelihood estimator that takes average concentration values as the data inputs. The second is a Bayesian latent variable method that uses the counts of the number of positive tubes at each dilution to estimate the parameters of the contamination distribution. The performance of the two fitting methods is compared for two data sets that represent Salmonella and Campylobacter concentrations on chicken carcasses. The results demonstrate a bias in the maximum likelihood estimator that increases with reductions in average concentration. The Bayesian method provided unbiased estimates of the concentration distribution parameters for all data sets. We provide computer code for the Bayesian fitting method. Published by Elsevier B.V.
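
    The MPN estimate itself comes from a small likelihood: assuming organisms are Poisson-distributed, a tube inoculated with volume v is positive with probability 1 − exp(−c·v). A minimal sketch of the maximum likelihood fit for a hypothetical dilution series (the single-sample MPN step, not the paper's distribution-fitting methods):

    ```python
    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical 3-dilution MPN experiment: volumes in mL, tubes per
    # dilution, and number of positive tubes observed.
    v = np.array([10.0, 1.0, 0.1])
    n = np.array([5, 5, 5])
    y = np.array([5, 3, 1])

    def neg_log_lik(log_c):
        c = np.exp(log_c)
        p = 1.0 - np.exp(-c * v)          # P(tube positive) at each dilution
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (n - y) * np.log(1.0 - p))

    res = minimize_scalar(neg_log_lik, bounds=(-8, 8), method="bounded")
    print("MPN estimate (organisms/mL):", np.exp(res.x))
    ```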

  1. Generalized Maximum Entropy

    NASA Technical Reports Server (NTRS)

    Cheeseman, Peter; Stutz, John

    2005-01-01

    A long-standing mystery in using Maximum Entropy (MaxEnt) is how to deal with constraints whose values are uncertain. This situation arises when constraint values are estimated from data, because of finite sample sizes. One approach to this problem, advocated by E.T. Jaynes [1], is to ignore this uncertainty, and treat the empirically observed values as exact. We refer to this as the classic MaxEnt approach. Classic MaxEnt gives point probabilities (subject to the given constraints), rather than probability densities. We develop an alternative approach that assumes that the uncertain constraint values are represented by a probability density (e.g., a Gaussian), and this uncertainty yields a MaxEnt posterior probability density. That is, the classic MaxEnt point probabilities are regarded as a multidimensional function of the given constraint values, and uncertainty on these values is transmitted through the MaxEnt function to give uncertainty over the MaxEnt probabilities. We illustrate this approach by explicitly calculating the generalized MaxEnt density for a simple but common case, then show how this can be extended numerically to the general case. This paper expands the generalized MaxEnt concept introduced in a previous paper [3].
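
    Classic MaxEnt with a single moment constraint reduces to a one-dimensional root-find for the Lagrange multiplier, which is exactly the map the generalized approach pushes constraint uncertainty through. A minimal sketch on the standard die example (not the paper's Gaussian-constraint calculation):

    ```python
    import numpy as np
    from scipy.optimize import brentq

    values = np.arange(1, 7)          # faces of a die
    target_mean = 4.5                 # the (possibly uncertain) constraint value

    def maxent_probs(lam):
        """Exponential-family form of the MaxEnt solution: p_i ~ exp(lam * i)."""
        w = np.exp(lam * values)
        return w / w.sum()

    def mean_gap(lam):
        return maxent_probs(lam) @ values - target_mean

    lam = brentq(mean_gap, -5, 5)     # solve the moment constraint for lambda
    print(maxent_probs(lam))          # classic MaxEnt point probabilities

    # The generalized approach would place a density on target_mean and push
    # it through this map, yielding a density over the probabilities above.
    ```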

  2. Sexual risk behaviours and sexual health outcomes among heterosexual black Caribbeans: comparing sexually transmitted infection clinic attendees and national probability survey respondents.

    PubMed

    Gerver, S M; Easterbrook, P J; Anderson, M; Solarin, I; Elam, G; Fenton, K A; Garnett, G; Mercer, C H

    2011-02-01

    We compared sociodemographic characteristics, sexual risk behaviours and sexual health experiences of 266 heterosexual black Caribbeans recruited at a London sexual health clinic between September 2005 and January 2006 with 402 heterosexual black Caribbeans interviewed for a British probability survey between May 1999 and August 2001. Male clinic attendees were more likely than men in the national survey to report: ≥10 sexual partners (lifetime; adjusted odds ratio [AOR]: 3.27, 95% confidence interval [CI]: 1.66-6.42), ≥2 partners (last year; AOR: 5.40, 95% CI: 2.64-11.0), concurrent partnerships (AOR: 3.26, 95% CI: 1.61-6.60), sex with partner(s) from the Caribbean (last 5 years; AOR: 7.97, 95% CI: 2.42-26.2) and previous sexually transmitted infection (STI) diagnosis/diagnoses (last 5 years; AOR: 16.2, 95% CI: 8.04-32.6). Similar patterns were observed for women clinic attendees, who also had increased odds of termination of pregnancy (AOR: 3.25, 95% CI: 1.87-5.66). These results highlight the substantially higher levels of several high-risk sexual behaviours among UK black Caribbeans attending a sexual health clinic compared with those in the general population. High-risk individuals are under-represented in probability samples, and it is therefore important that convenience-sample surveys of high-risk individuals are conducted in conjunction with nationally representative surveys to fully understand the risk behaviours and sexual health-care needs of ethnic minority communities.

  3. Probability sampling in legal cases: Kansas cellphone users

    NASA Astrophysics Data System (ADS)

    Kadane, Joseph B.

    2012-10-01

    Probability sampling is a standard statistical technique. This article introduces the basic ideas of probability sampling, and shows in detail how probability sampling was used in a particular legal case.

  4. Review of Literature on Probability of Detection for Liquid Penetrant Nondestructive Testing

    DTIC Science & Technology

    2011-11-01

    increased maintenance costs, or catastrophic failure of safety-critical structure. Knowledge of the reliability achieved by NDT methods, including ... representative components to gather data for statistical analysis, which can be prohibitively expensive. To account for sampling variability inherent in any ... Sioux City and Pensacola. (Those recommendations were discussed in Section 3.4.) Drury et al. report on a factorial experiment aimed at identifying the

  5. Phenols in hydrothermal petroleums and sediment bitumen from Guaymas Basin, Gulf of California

    NASA Technical Reports Server (NTRS)

    Simoneit, B. R.; Leif, R. N.; Ishiwatari, R.

    1996-01-01

    The aliphatic, aromatic and polar (NSO) fractions of seabed petroleums and sediment bitumen extracts from the Guaymas Basin hydrothermal system have been analyzed by gas chromatography and gas chromatography-mass spectrometry (free and silylated). The oils were collected from the interiors and exteriors of high temperature hydrothermal vents and represent hydrothermal pyrolyzates that have migrated to the seafloor by hydrothermal fluid circulation. The downcore sediments are representative of both thermally unaltered and thermally altered sediments. The survey has revealed the presence of oxygenated compounds in samples with a high degree of thermal maturity. Phenols are one class of oxygenated compounds found in these samples. A group of methyl-, dimethyl- and trimethyl-isoprenoidyl phenols (C27-C29) is present in all of the seabed NSO fractions, with the methyl- and dimethyl-isoprenoidyl phenols occurring as major components, and a trimethyl-isoprenoidyl phenol as a minor component. A homologous series of n-alkylphenols (C13-C33) has also been found in the seabed petroleums. These phenols are most likely derived from the hydrothermal alteration of sedimentary organic matter. The n-alkylphenols are probably synthesized under hydrothermal conditions, but the isoprenoidyl phenols are probably hydrothermal alteration products of natural product precursors. The suites of phenols do not appear to be useful tracers of high temperature hydrothermal processes.

  6. Psychopathology among New York city public school children 6 months after September 11.

    PubMed

    Hoven, Christina W; Duarte, Cristiane S; Lucas, Christopher P; Wu, Ping; Mandell, Donald J; Goodwin, Renee D; Cohen, Michael; Balaban, Victor; Woodruff, Bradley A; Bin, Fan; Musa, George J; Mei, Lori; Cantor, Pamela A; Aber, J Lawrence; Cohen, Patricia; Susser, Ezra

    2005-05-01

    Children exposed to a traumatic event may be at higher risk for developing mental disorders. The prevalence of child psychopathology, however, has not been assessed in a population-based sample exposed to different levels of mass trauma or across a range of disorders. To determine the prevalence and correlates of probable mental disorders among New York City, NY, public school students 6 months following the September 11, 2001, World Trade Center attack, a survey was conducted in New York City public schools with a citywide, random, representative sample of 8236 students in grades 4 through 12, including oversampling in closest proximity to the World Trade Center site (ground zero) and other high-risk areas. Children were screened for probable mental disorders with the Diagnostic Interview Schedule for Children Predictive Scales. One or more of 6 probable anxiety/depressive disorders were identified in 28.6% of all children. The most prevalent were probable agoraphobia (14.8%), probable separation anxiety (12.3%), and probable posttraumatic stress disorder (10.6%). Higher levels of exposure corresponded to higher prevalence for all probable anxiety/depressive disorders. Girls and children in grades 4 and 5 were the most affected. In logistic regression analyses, the child's exposure (adjusted odds ratio, 1.62), exposure of a child's family member (adjusted odds ratio, 1.80), and the child's prior trauma (adjusted odds ratio, 2.01) were related to increased likelihood of probable anxiety/depressive disorders. Results were adjusted for different types of exposure, sociodemographic characteristics, and child mental health service use. A high proportion of New York City public school children had a probable mental disorder 6 months after September 11, 2001. The data suggest a relationship between level of exposure to trauma and likelihood of child anxiety/depressive disorders in the community. The results support the need to apply wide-area epidemiological approaches to mental health assessment after any large-scale disaster.

  7. A risk assessment method for multi-site damage

    NASA Astrophysics Data System (ADS)

    Millwater, Harry Russell, Jr.

    This research focused on developing probabilistic methods suitable for computing small probabilities of failure, e.g., 10^-6, of structures subject to multi-site damage (MSD). MSD is defined as the simultaneous development of fatigue cracks at multiple sites in the same structural element such that the fatigue cracks may coalesce to form one large crack. MSD is modeled as an array of collinear cracks with random initial crack lengths, with the centers of the initial cracks spaced uniformly apart. The data used were chosen to be representative of aluminum structures. The structure is considered failed whenever any two adjacent cracks link up. A fatigue computer model is developed that can accurately and efficiently grow a collinear array of cracks of arbitrary length from initial size until failure. An algorithm is developed to compute the stress intensity factors of all cracks considering all interaction effects. The probability of failure of two to 100 cracks is studied. Lower bounds on the probability of failure are developed based upon the probability of the largest crack exceeding a critical crack size. The critical crack size is based on the initial crack size that will grow across the ligament when the neighboring crack has zero length. The probability is evaluated using extreme value theory. An upper bound is based on the probability of the maximum sum of initial cracks being greater than a critical crack size. A weakest-link sampling approach is developed that can accurately and efficiently compute small probabilities of failure. This methodology is based on predicting the weakest link, i.e., the two cracks to link up first, for a realization of initial crack sizes, and computing the cycles-to-failure using these two cracks. Criteria to determine the weakest link are discussed. Probability results using the weakest-link sampling method are compared to Monte Carlo-based benchmark results. The results indicate that very small probabilities can be computed accurately in a few minutes on a Hewlett-Packard workstation.
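
    The extreme-value lower bound is straightforward to evaluate; the sketch below does so in closed form and checks it by Monte Carlo. The lognormal crack-size parameters and the critical size are hypothetical, not values from the dissertation.

        # Lower-bound sketch: probability that the largest of n initial cracks
        # exceeds a critical size, P = 1 - F(a_crit)^n. Parameters hypothetical.
        import numpy as np
        from scipy.stats import lognorm

        n_cracks = 20
        a_crit   = 2.5                          # critical initial crack size (mm)
        dist     = lognorm(s=0.5, scale=0.5)    # initial crack size distribution

        p_exact = 1.0 - dist.cdf(a_crit) ** n_cracks

        # Monte Carlo check
        rng = np.random.default_rng(1)
        sims = dist.rvs(size=(200_000, n_cracks), random_state=rng)
        p_mc = np.mean(sims.max(axis=1) > a_crit)
        print(f"closed form: {p_exact:.2e}   Monte Carlo: {p_mc:.2e}")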

  8. Faster computation of exact RNA shape probabilities.

    PubMed

    Janssen, Stefan; Giegerich, Robert

    2010-03-01

    Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, accumulating the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities has evaluated all shapes simultaneously and comes with a computational cost that is exponential in the length of the sequence. We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing specialized folding programs for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime, while still computing exact probability values. Evaluating this approach and several substrategies, we find that only a small proportion of shapes have to be computed explicitly. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10- to 138-fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome. RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes
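
    A minimal sketch of the accumulation step, assuming a precomputed list of structures with energies and shape labels; real shape analysis enumerates structures by dynamic programming rather than explicit listing.

        # Toy illustration of shape probabilities: Boltzmann weights of individual
        # structures accumulated per abstract shape. Energies and labels hypothetical.
        import math
        from collections import defaultdict

        RT = 0.616  # kcal/mol at 37 C
        structures = [  # (free energy in kcal/mol, abstract shape)
            (-12.3, "[][]"), (-11.8, "[][]"), (-11.5, "[[][]]"),
            (-10.9, "[]"),   (-9.7,  "[[][]]"), (-8.2, "[]"),
        ]

        weights = defaultdict(float)
        for energy, shape in structures:
            weights[shape] += math.exp(-energy / RT)

        Z = sum(weights.values())
        for shape, w in sorted(weights.items(), key=lambda kv: -kv[1]):
            print(f"{shape:10s} P = {w / Z:.3f}")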

  9. Identification of Species and Sources of Cryptosporidium Oocysts in Storm Waters with a Small-Subunit rRNA-Based Diagnostic and Genotyping Tool

    PubMed Central

    Xiao, Lihua; Alderisio, Kerri; Limor, Josef; Royer, Michael; Lal, Altaf A.

    2000-01-01

    The identification of Cryptosporidium oocysts in environmental samples is largely made by use of an immunofluorescent assay. In this study, we used a small-subunit rRNA-based PCR-restriction fragment length polymorphism technique to identify the species and sources of Cryptosporidium oocysts present in 29 storm water samples collected from a stream in New York. A total of 12 genotypes were found in 27 positive samples; for 4 of these genotypes, the species and probable origins were identified by sequence analysis, whereas the rest represented new genotypes from wildlife. Thus, this technique provides an alternative method for the detection and differentiation of Cryptosporidium parasites in environmental samples. PMID:11097935

  10. A Looping-Based Model for Quenching Repression

    PubMed Central

    Pollak, Yaroslav; Goldberg, Sarah; Amit, Roee

    2017-01-01

    We model the regulatory role of proteins bound to looped DNA using a simulation in which dsDNA is represented as a self-avoiding chain, and proteins as spherical protrusions. We simulate long self-avoiding chains using a sequential importance sampling Monte-Carlo algorithm, and compute the probabilities for chain looping with and without a protrusion. We find that a protrusion near one of the chain’s termini reduces the probability of looping, even for chains much longer than the protrusion–chain-terminus distance. This effect increases with protrusion size, and decreases with protrusion-terminus distance. The reduced probability of looping can be explained via an eclipse-like model, which provides a novel inhibitory mechanism. We test the eclipse model on two possible transcription-factor occupancy states of the D. melanogaster eve 3/7 enhancer, and show that it provides a possible explanation for the experimentally-observed eve stripe 3 and 7 expression patterns. PMID:28085884

  11. REPORT FOR COMMERCIAL GRADE NICKEL CHARACTERIZATION AND BENCHMARKING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2012-12-20

    Oak Ridge Associated Universities (ORAU), under the Oak Ridge Institute for Science and Education (ORISE) contract, has completed the collection, sample analysis, and review of analytical results to benchmark the concentrations of gross alpha-emitting radionuclides, gross beta-emitting radionuclides, and technetium-99 in commercial grade nickel. This report presents methods, change management, observations, and statistical analysis of materials procured from sellers representing nine countries on four continents. The data suggest there is a low probability of detecting alpha- and beta-emitting radionuclides in commercial nickel. Technetium-99 was not detected in any samples, thus suggesting it is not present in commercial nickel.

  12. Inadequate Iodine Intake in Population Groups Defined by Age, Life Stage and Vegetarian Dietary Practice in a Norwegian Convenience Sample.

    PubMed

    Brantsæter, Anne Lise; Knutsen, Helle Katrine; Johansen, Nina Cathrine; Nyheim, Kristine Aastad; Erlund, Iris; Meltzer, Helle Margrete; Henjum, Sigrun

    2018-02-17

    Inadequate iodine intake has been identified in populations considered iodine replete for decades. The objective of the current study is to evaluate urinary iodine concentration (UIC) and the probability of adequate iodine intake in subgroups of the Norwegian population defined by age, life stage and vegetarian dietary practice. In a cross-sectional survey, we assessed the probability of adequate iodine intake by two 24-h food diaries and UIC from two fasting morning spot urine samples in 276 participants. The participants included children (n = 47), adolescents (n = 46), adults (n = 71), the elderly (n = 23), pregnant women (n = 45), ovo-lacto vegetarians (n = 25), and vegans (n = 19). In all participants combined, the median (95% CI) UIC was 101 (90, 110) µg/L, median (25th, 75th percentile) calculated iodine intake was 112 (77, 175) µg/day and median (25th, 75th percentile) estimated usual iodine intake was 101 (75, 150) µg/day. According to WHO's criteria for evaluation of median UIC, iodine intake was inadequate in the elderly, pregnant women, vegans and non-pregnant women of childbearing age. Children had the highest (82%) and vegans the lowest (14%) probability of adequate iodine intake according to reported food and supplement intakes. This study confirms the need for monitoring iodine intake and status in nationally representative study samples in Norway.

  13. A sampling design and model for estimating abundance of Nile crocodiles while accounting for heterogeneity of detectability of multiple observers

    USGS Publications Warehouse

    Shirley, Matthew H.; Dorazio, Robert M.; Abassery, Ekramy; Elhady, Amr A.; Mekki, Mohammed S.; Asran, Hosni H.

    2012-01-01

    As part of the development of a management program for Nile crocodiles in Lake Nasser, Egypt, we used a dependent double-observer sampling protocol with multiple observers to compute estimates of population size. To analyze the data, we developed a hierarchical model that allowed us to assess variation in detection probabilities among observers and survey dates, as well as account for variation in crocodile abundance among sites and habitats. We conducted surveys from July 2008 to June 2009 in 15 areas of Lake Nasser that were representative of 3 main habitat categories. During these surveys, we sampled 1,086 km of lake shore wherein we detected 386 crocodiles. Analysis of the data revealed significant variability in both inter- and intra-observer detection probabilities. Our raw encounter rate was 0.355 crocodiles/km. When we accounted for observer effects and habitat, we estimated a surface population abundance of 2,581 (2,239-2,987, 95% credible intervals) crocodiles in Lake Nasser. Our results underscore the importance of well-trained, experienced monitoring personnel in order to decrease heterogeneity in intra-observer detection probability and to better detect changes in the population based on survey indices. This study will assist the Egyptian government in establishing a monitoring program as an integral part of future crocodile harvest activities in Lake Nasser.

  14. Inadequate Iodine Intake in Population Groups Defined by Age, Life Stage and Vegetarian Dietary Practice in a Norwegian Convenience Sample

    PubMed Central

    Knutsen, Helle Katrine; Johansen, Nina Cathrine; Nyheim, Kristine Aastad; Erlund, Iris; Meltzer, Helle Margrete

    2018-01-01

    Inadequate iodine intake has been identified in populations considered iodine replete for decades. The objective of the current study is to evaluate urinary iodine concentration (UIC) and the probability of adequate iodine intake in subgroups of the Norwegian population defined by age, life stage and vegetarian dietary practice. In a cross-sectional survey, we assessed the probability of adequate iodine intake by two 24-h food diaries and UIC from two fasting morning spot urine samples in 276 participants. The participants included children (n = 47), adolescents (n = 46), adults (n = 71), the elderly (n = 23), pregnant women (n = 45), ovo-lacto vegetarians (n = 25), and vegans (n = 19). In all participants combined, the median (95% CI) UIC was 101 (90, 110) µg/L, median (25th, 75th percentile) calculated iodine intake was 112 (77, 175) µg/day and median (25th, 75th percentile) estimated usual iodine intake was 101 (75, 150) µg/day. According to WHO's criteria for evaluation of median UIC, iodine intake was inadequate in the elderly, pregnant women, vegans and non-pregnant women of childbearing age. Children had the highest (82%) and vegans the lowest (14%) probability of adequate iodine intake according to reported food and supplement intakes. This study confirms the need for monitoring iodine intake and status in nationally representative study samples in Norway. PMID:29462974

  15. Comparison of electrofishing techniques to detect larval lampreys in wadeable streams in the Pacific Northwest

    USGS Publications Warehouse

    Dunham, Jason B.; Chelgren, Nathan D.; Heck, Michael P.; Clark, Steven M.

    2013-01-01

    We evaluated the probability of detecting larval lampreys using different methods of backpack electrofishing in wadeable streams in the U.S. Pacific Northwest. Our primary objective was to compare capture of lampreys using electrofishing with standard settings for salmon and trout to settings specifically adapted for capture of lampreys. Field work consisted of removal sampling by means of backpack electrofishing in 19 sites in streams representing a broad range of conditions in the region. Captures of lampreys at these sites were analyzed with a modified removal-sampling model and Bayesian estimation to measure the relative odds of capture using the lamprey-specific settings compared with the standard salmonid settings. We found that the odds of capture were 2.66 (95% credible interval, 0.87–78.18) times greater for the lamprey-specific settings relative to standard salmonid settings. When estimates of capture probability were applied to estimating the probabilities of detection, we found high (>0.80) detectability when the actual number of lampreys in a site was greater than 10 individuals and effort was at least two passes of electrofishing, regardless of the settings used. Further work is needed to evaluate key assumptions in our approach, including individual-specific capture probabilities and population closure. For now, our results suggest that comparable detection of lampreys is possible using backpack electrofishing with either salmonid- or lamprey-specific settings.
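
    For intuition about removal sampling, the sketch below applies the classic two-pass removal estimator and the implied detection probability. The catch counts are hypothetical, and the study itself used a modified removal model with Bayesian estimation rather than this closed form.

        # Classic two-pass removal estimator (Zippin), shown for intuition only.
        c1, c2 = 14, 5                  # lampreys captured on passes 1 and 2

        p_hat = 1 - c2 / c1             # per-pass capture probability
        N_hat = c1**2 / (c1 - c2)       # abundance estimate

        # Detection probability for a site with N individuals and k passes:
        # detection fails only if every individual evades every pass.
        def detection_prob(N, p, k):
            return 1 - (1 - p) ** (N * k)

        print(f"p = {p_hat:.2f}, N = {N_hat:.1f}")
        print(f"P(detect | N=10, 2 passes) = {detection_prob(10, p_hat, 2):.3f}")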

  16. Probabilistic hindcasts and projections of the coupled climate, carbon cycle and Atlantic meridional overturning circulation system: a Bayesian fusion of century-scale observations with a simple model

    NASA Astrophysics Data System (ADS)

    Urban, Nathan M.; Keller, Klaus

    2010-10-01

    How has the Atlantic Meridional Overturning Circulation (AMOC) varied over the past centuries, and what is the risk of an anthropogenic AMOC collapse? We report probabilistic projections of the future climate that improve on previous AMOC projection studies by (i) greatly expanding the considered observational constraints and (ii) carefully sampling the tail areas of the parameter probability density function (pdf). We use a Bayesian inversion to constrain a simple model of the coupled climate, carbon cycle and AMOC systems using observations to derive multicentury hindcasts and projections. Our hindcasts show considerable skill in representing the observational constraints. We show that robust AMOC risk estimates can require carefully sampling the parameter pdfs. We find a low probability of experiencing an AMOC collapse within the 21st century for a business-as-usual emissions scenario. The probability of experiencing an AMOC collapse within two centuries is 1/10. The probability of crossing a forcing threshold and triggering a future AMOC collapse (by 2300) is approximately 1/30 in the 21st century and over 1/3 in the 22nd. Given the simplicity of the model structure and uncertainty in the forcing assumptions, our analysis should be considered a proof of concept and the quantitative conclusions subject to severe caveats.

  17. Probabilistic liver atlas construction.

    PubMed

    Dura, Esther; Domingo, Juan; Ayala, Guillermo; Marti-Bonmati, Luis; Goceri, E

    2017-01-13

    Anatomical atlases are 3D volumes or shapes representing an organ or structure of the human body. They contain either the prototypical shape of the object of interest together with other shapes representing its statistical variations (statistical atlas) or a probability map of belonging to the object (probabilistic atlas). Probabilistic atlases are mostly built with simple estimations only involving the data at each spatial location. A new method for probabilistic atlas construction that uses a generalized linear model is proposed. This method aims to improve the estimation of the probability that each location is covered by the liver. Furthermore, all methods to build an atlas involve prior coregistration of the available sample of shapes. The influence of the geometrical transformation adopted for registration on the quality of the final atlas has not been sufficiently investigated. The ability of an atlas to adapt to a new case is one of the most important quality criteria that should be taken into account. The presented experiments show that some methods for atlas construction are severely affected by the previous coregistration step. We show the good performance of the new approach. Furthermore, the results suggest that extremely flexible registration methods are not always beneficial, since they can reduce the variability of the atlas and hence its ability to give sensible probability values when used as an aid in segmentation of new cases.

  18. Smaller than expected cognitive deficits in schizophrenia patients from the population-representative ABC catchment cohort.

    PubMed

    Lennertz, Leonhard; An der Heiden, Wolfram; Kronacher, Regina; Schulze-Rauschenbach, Svenja; Maier, Wolfgang; Häfner, Heinz; Wagner, Michael

    2016-08-01

    Most neuropsychological studies on schizophrenia suffer from sample selection bias, with male and chronic patients being overrepresented. This probably leads to an overestimation of cognitive impairments. The present study aimed to provide a less biased estimate of cognitive functions in schizophrenia using a population-representative catchment area sample. Schizophrenia patients (N = 89) from the prospective Mannheim ABC cohort were assessed 14 years after disease onset and first diagnosis, using a comprehensive neuropsychological test battery. A healthy control group (N = 90) was carefully matched according to age, gender, and geographic region (city, rural surrounds). The present sample was representative of the initial ABC cohort. In the comprehensive neuropsychological assessment, the schizophrenia patients were only moderately impaired compared to the healthy control group (d = 0.56 for a general cognitive index, d = 0.42 for verbal memory, d = 0.61 for executive functions, d = 0.69 for attention). Only 33% of the schizophrenia patients scored at least one standard deviation below the healthy control group on the general cognitive index. Neuropsychological performance did not correlate with measures of the clinical course, including age at onset, number of hospital admissions, and time in paid work. Thus, in this population-representative sample of schizophrenia patients, neuropsychological deficits were less pronounced than expected from meta-analyses. In agreement with other epidemiological studies, this suggests a less devastating picture of cognition in schizophrenia.

  19. Genetic stock identification of Russian honey bees.

    PubMed

    Bourgeois, Lelania; Sheppard, Walter S; Sylvester, H Allen; Rinderer, Thomas E

    2010-06-01

    A genetic stock certification assay was developed to distinguish Russian honey bees from other European (Apis mellifera L.) stocks that are commercially produced in the United States. In total, 11 microsatellite and five single-nucleotide polymorphism loci were used. Loci were selected for relatively high levels of homogeneity within each group and for differences in allele frequencies between groups. A baseline sample consisted of the 18 lines of Russian honey bees released to the Russian Bee Breeders Association and bees from 34 queen breeders representing commercially produced European honey bee stocks. Suitability tests of the baseline sample pool showed high levels of accuracy. The probability of correct assignment was 94.2% for non-Russian bees and 93.3% for Russian bees. A neighbor-joining phenogram representing genetic distance data showed clear distinction between Russian and non-Russian honey bee stocks. Furthermore, a test of appropriate sample size showed that a sample of eight bees per colony maximizes accuracy and consistency of the results. An additional 34 samples were tested as blind samples (origin unknown to those collecting data) to determine the accuracy of individual assignment tests. Only one of these samples was incorrectly assigned. All 18 current breeding lines were represented among the 2009 blind samples, demonstrating temporal stability of the genetic stock identification assay. The certification assay will be offered through a service laboratory and used by the Russian Bee Breeders Association to genetically certify their stock. The genetic certification will be used in conjunction with continued selection for favorable traits, such as honey production and varroa and tracheal mite resistance.
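
    A toy version of likelihood-based stock assignment, assuming Hardy-Weinberg proportions and made-up allele frequencies at three loci; the actual assay uses 11 microsatellite and 5 SNP loci and its own assignment procedure.

        # Toy likelihood-based stock assignment: which population better explains
        # a multilocus genotype? Frequencies and genotype are hypothetical.
        import numpy as np

        # frequency of allele "A" at three loci in each stock
        freq = {"Russian":     np.array([0.80, 0.10, 0.65]),
                "non-Russian": np.array([0.30, 0.55, 0.20])}

        genotype = np.array([2, 0, 1])   # copies of allele "A" at each diploid locus

        def log_lik(p, g):
            # Hardy-Weinberg genotype probabilities: AA=p^2, Aa=2p(1-p), aa=(1-p)^2
            probs = np.where(g == 2, p**2, np.where(g == 1, 2*p*(1-p), (1-p)**2))
            return np.log(probs).sum()

        scores = {stock: log_lik(p, genotype) for stock, p in freq.items()}
        print(scores, "->", max(scores, key=scores.get))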

  20. Adaptive skin segmentation via feature-based face detection

    NASA Astrophysics Data System (ADS)

    Taylor, Michael J.; Morris, Tim

    2014-05-01

    Variations in illumination can have significant effects on the apparent colour of skin, which can be damaging to the efficacy of any colour-based segmentation approach. We attempt to overcome this issue by presenting a new adaptive approach, capable of generating skin colour models at run-time. Our approach adopts a Viola-Jones feature-based face detector, in a moderate-recall, high-precision configuration, to sample faces within an image, with an emphasis on avoiding potentially detrimental false positives. From these samples, we extract a set of pixels that are likely to be from skin regions, filter them according to their relative luma values in an attempt to eliminate typical non-skin facial features (eyes, mouths, nostrils, etc.), and hence establish a set of pixels that we can be confident represent skin. Using this representative set, we train a unimodal Gaussian function to model the skin colour in the given image in the normalised rg colour space - a combination of modelling approach and colour space that benefits us in a number of ways. A generated function can subsequently be applied to every pixel in the given image, and, hence, the probability that any given pixel represents skin can be determined. Segmentation of the skin, therefore, can be as simple as applying a binary threshold to the calculated probabilities. In this paper, we touch upon a number of existing approaches, describe the methods behind our new system, present the results of its application to arbitrary images of people with detectable faces, which we have found to be extremely encouraging, and investigate its potential to be used as part of real-time systems.
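
    A minimal sketch of the colour-modelling step, assuming skin pixels have already been sampled from detected faces and filtered; the arrays below are random placeholders and the threshold is arbitrary.

        # Fit a unimodal Gaussian to skin pixels in normalised rg space, then
        # score every pixel of an image. Face sampling and luma filtering omitted.
        import numpy as np

        def to_rg(rgb):
            s = rgb.sum(axis=-1, keepdims=True).clip(min=1e-6)
            return (rgb / s)[..., :2]              # keep normalised r and g only

        def fit_gaussian(samples_rg):
            mu = samples_rg.mean(axis=0)
            cov = np.cov(samples_rg.T)
            return mu, np.linalg.inv(cov), np.linalg.det(cov)

        def skin_probability(image_rgb, mu, cov_inv, cov_det):
            d = to_rg(image_rgb.astype(float)) - mu
            mahal = np.einsum("...i,ij,...j->...", d, cov_inv, d)
            return np.exp(-0.5 * mahal) / (2 * np.pi * np.sqrt(cov_det))

        # usage with placeholder data
        rng = np.random.default_rng(2)
        face_pixels = rng.uniform(80, 255, size=(500, 3))     # stand-in skin samples
        image = rng.uniform(0, 255, size=(120, 160, 3))
        mu, cov_inv, cov_det = fit_gaussian(to_rg(face_pixels))
        mask = skin_probability(image, mu, cov_inv, cov_det) > 1.0  # arbitrary cutoff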

  1. Exploring Representativeness and Informativeness for Active Learning.

    PubMed

    Du, Bo; Wang, Zengmao; Zhang, Lefei; Zhang, Liangpei; Liu, Wei; Shen, Jialie; Tao, Dacheng

    2017-01-01

    How can we find a general way to choose the most suitable samples for training a classifier, even with very limited prior information? Active learning, which can be regarded as an iterative optimization procedure, plays a key role in constructing a refined training set to improve the classification performance in a variety of applications, such as text analysis, image recognition, and social network modeling. Although combining representativeness and informativeness of samples has been proven promising for active sampling, state-of-the-art methods perform well only under certain data structures. Can we, then, find a way to fuse the two active sampling criteria without any assumption on the data? This paper proposes a general active learning framework that effectively fuses the two criteria. Inspired by a two-sample discrepancy problem, triple measures are elaborately designed to guarantee that the query samples not only possess the representativeness of the unlabeled data but also reveal the diversity of the labeled data. Any appropriate similarity measure can be employed to construct the triple measures. Meanwhile, an uncertainty measure is leveraged to generate the informativeness criterion, which can be carried out in different ways. Rooted in this framework, a practical active learning algorithm is proposed, which exploits a radial basis function together with the estimated probabilities to construct the triple measures and a modified best-versus-second-best strategy to construct the uncertainty measure, respectively. Experimental results on benchmark datasets demonstrate that our algorithm consistently achieves superior performance over the state-of-the-art active learning algorithms.
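
    A sketch of the best-versus-second-best (margin) idea that the algorithm modifies, with made-up class probabilities: the smaller the gap between the two largest predicted probabilities, the more uncertain, and hence more informative, the sample.

        # Best-versus-second-best (margin) uncertainty, shown on toy probabilities.
        import numpy as np

        probs = np.array([[0.50, 0.45, 0.05],    # ambiguous -> small margin
                          [0.90, 0.07, 0.03],    # confident -> large margin
                          [0.45, 0.30, 0.25]])

        top2 = np.sort(probs, axis=1)[:, -2:]
        margin = top2[:, 1] - top2[:, 0]         # best minus second best
        query_order = np.argsort(margin)         # query smallest margins first
        print(margin, query_order)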

  2. New Concepts in the Evaluation of Biodegradation/Persistence of Chemical Substances Using a Microbial Inoculum

    PubMed Central

    Thouand, Gérald; Durand, Marie-José; Maul, Armand; Gancet, Christian; Blok, Han

    2011-01-01

    The European REACH Regulation (Registration, Evaluation, Authorization of CHemical substances) implies, among other things, the evaluation of the biodegradability of chemical substances produced by industry. A large set of test methods is available including detailed information on the appropriate conditions for testing. However, the inoculum used for these tests constitutes a “black box.” If biodegradation is achievable from the growth of a small group of specific microbial species with the substance as the only carbon source, the result of the test depends largely on the cell density of this group at “time zero.” If these species are relatively rare in an inoculum that is normally used, the likelihood of inoculating a test with sufficient specific cells becomes a matter of probability. Normally this probability increases with total cell density and with the diversity of species in the inoculum. Furthermore, the history of the inoculum, e.g., a possible pre-exposure to the test substance or similar substances, will have a significant influence on the probability. A high probability can be expected for substances that are widely used and regularly released into the environment, whereas a low probability can be expected for new xenobiotic substances that have not yet been released into the environment. Be that as it may, once the inoculum sample contains sufficient specific degraders, the performance of the biodegradation will follow a typical S-shaped growth curve, which depends on the specific growth rate under laboratory conditions, the so-called F/M ratio (ratio between food and biomass), and the possible formation of more or less toxic or recalcitrant metabolites. Normally regulators require the evaluation of the growth curve using a simple approach such as half-time. Unfortunately, probability and biodegradation half-time are very often confused. As the half-time values reflect laboratory conditions which are quite different from environmental conditions (after a substance is released), these values should not be used to quantify and predict environmental behavior. The probability value could be of much greater benefit for predictions under realistic conditions. The main issue in the evaluation of probability is that the result is not based on a single inoculum from an environmental sample, but on a variety of samples. These samples can be representative of regional or local areas, climate regions, water types, and history, e.g., pristine or polluted. The above concept has provided us with a new approach, namely “Probabio.” With this approach, persistence is not only regarded as a simple intrinsic property of a substance, but also as the capability of various environmental samples to degrade a substance under realistic exposure conditions and F/M ratio. PMID:21863143
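
    The probability argument can be made concrete under a simple Poisson assumption: if specific degraders occur at some density, the chance that an inoculum volume contains at least one of them is 1 - exp(-density x volume). The densities below are hypothetical.

        # Poisson sketch of inoculation probability: chance that an inoculum volume
        # contains at least one specific degrader. Densities are hypothetical.
        import math

        def p_at_least_one(degrader_density_per_ml, volume_ml):
            return 1 - math.exp(-degrader_density_per_ml * volume_ml)

        for density in (0.01, 0.1, 1.0):          # specific degraders per mL
            print(density, round(p_at_least_one(density, 10.0), 3))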

  3. Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.

    PubMed

    Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R

    2018-05-26

    Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population, and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations, we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. The simulations show that the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study, designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information, in addition to the data themselves, must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
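
    A minimal sketch of the weighting idea, shown for plain logistic regression on simulated data rather than a full occupancy model: each sampled site's log-likelihood contribution is weighted by the inverse of its inclusion probability.

        # Pseudo-likelihood with inverse-inclusion-probability weights (sketch).
        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(3)
        n = 400
        x = rng.normal(size=n)
        y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x))))
        incl_prob = np.where(x > 0, 0.8, 0.2)    # unequal selection probabilities
        sampled = rng.uniform(size=n) < incl_prob
        w = 1.0 / incl_prob[sampled]             # Horvitz-Thompson style weights

        def neg_pseudo_ll(beta, x, y, w):
            eta = beta[0] + beta[1] * x
            return -np.sum(w * (y * eta - np.log1p(np.exp(eta))))

        fit = minimize(neg_pseudo_ll, x0=[0.0, 0.0], args=(x[sampled], y[sampled], w))
        print("weighted estimates:", fit.x)      # close to (0.5, 1.2) on average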

  4. The neural correlates of subjective utility of monetary outcome and probability weight in economic and in motor decision under risk

    PubMed Central

    Wu, Shih-Wei; Delgado, Mauricio R.; Maloney, Laurence T.

    2011-01-01

    In decision under risk, people choose between lotteries that contain a list of potential outcomes paired with their probabilities of occurrence. We previously developed a method for translating such lotteries to mathematically equivalent motor lotteries. The probability of each outcome in a motor lottery is determined by the subject’s noise in executing a movement. In this study, we used functional magnetic resonance imaging in humans to compare the neural correlates of monetary outcome and probability in classical lottery tasks where information about probability was explicitly communicated to the subjects and in mathematically equivalent motor lottery tasks where probability was implicit in the subjects’ own motor noise. We found that activity in the medial prefrontal cortex (mPFC) and the posterior cingulate cortex (PCC) quantitatively represent the subjective utility of monetary outcome in both tasks. For probability, we found that the mPFC significantly tracked the distortion of such information in both tasks. Specifically, activity in mPFC represents probability information but not the physical properties of the stimuli correlated with this information. Together, the results demonstrate that mPFC represents probability from two distinct forms of decision under risk. PMID:21677166

  5. The neural correlates of subjective utility of monetary outcome and probability weight in economic and in motor decision under risk.

    PubMed

    Wu, Shih-Wei; Delgado, Mauricio R; Maloney, Laurence T

    2011-06-15

    In decision under risk, people choose between lotteries that contain a list of potential outcomes paired with their probabilities of occurrence. We previously developed a method for translating such lotteries to mathematically equivalent "motor lotteries." The probability of each outcome in a motor lottery is determined by the subject's noise in executing a movement. In this study, we used functional magnetic resonance imaging in humans to compare the neural correlates of monetary outcome and probability in classical lottery tasks in which information about probability was explicitly communicated to the subjects and in mathematically equivalent motor lottery tasks in which probability was implicit in the subjects' own motor noise. We found that activity in the medial prefrontal cortex (mPFC) and the posterior cingulate cortex quantitatively represent the subjective utility of monetary outcome in both tasks. For probability, we found that the mPFC significantly tracked the distortion of such information in both tasks. Specifically, activity in mPFC represents probability information but not the physical properties of the stimuli correlated with this information. Together, the results demonstrate that mPFC represents probability from two distinct forms of decision under risk.

  6. [Socio-demographic and health factors associated with the institutionalization of dependent people].

    PubMed

    Ayuso Gutiérrez, Mercedes; Pozo Rubio, Raúl Del; Escribano Sotos, Francisco

    2010-01-01

    The effect of different variables on the probability that dependent people are institutionalized has scarcely been studied in Spain. The aim of this work is to analyze how certain socio-demographic and health factors influence the probability of a dependent person living in a residential institution. A cross-sectional study was conducted on a representative sample of the dependent population of Cuenca (Spain) in February 2009. We obtained information on people with dependence levels II and III. A binary logit regression model was estimated to identify the factors related to the institutionalization of dependent people. People aged 65-74 years are six times more likely to be institutionalized than younger people (<65 years); this probability increases sixteen-fold for individuals aged 95 years or older. The probability of institutionalization for people living in an urban area is three times that of people living in a rural area. People who need pharmacological, psychotherapy or rehabilitation treatments are between two and four times more likely to be institutionalized than those who do not. Age, marital status, place of residence, cardiovascular and musculoskeletal diseases, and type of medical treatment are the principal variables associated with the institutionalization of dependent people.
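
    For readers unfamiliar with the model, a binary logit's coefficients are log odds ratios, and predicted probabilities follow from the logistic function; the intercept and coefficients below are hypothetical stand-ins for the reported ratios.

        # Sketch of a binary logit: coefficients are log odds ratios.
        import math

        b0 = -2.0                      # baseline log-odds (hypothetical)
        b_age_65_74 = math.log(6.0)    # corresponds to the reported OR of ~6
        b_urban = math.log(3.0)        # corresponds to the reported OR of ~3

        def p_institutionalized(age_65_74, urban):
            eta = b0 + b_age_65_74 * age_65_74 + b_urban * urban
            return 1 / (1 + math.exp(-eta))

        print(p_institutionalized(0, 0))   # reference group
        print(p_institutionalized(1, 1))   # aged 65-74, urban resident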

  7. Hyper-polyhedron model applied to molecular screening of guanidines as Na/H exchange inhibitors.

    PubMed

    Bao, Xin-Hua; Lu, Wen-Cong; Liu, Liang; Chen, Nian-Yi

    2003-05-01

    To investigate the structure-activity relationships of N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines as Na/H exchange inhibitors and to explore a new method for computer-aided molecular screening. The hyper-polyhedron model (HPM) was proposed in our lab. Samples with probably higher activities are identified as those whose representative points fall within the hyper-polyhedron region in which all known high-activity samples are distributed, and the predictive ability of the available methods was tested by cross-validation. For the data set available here, the accuracy of molecular screening of N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines by HPM was much higher than that obtained by PCA (principal component analysis) and Fisher methods. Therefore, HPM could be used as a powerful tool for screening new compounds with probably higher activities.
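
    One concrete reading of the screening rule, assuming the hyper-polyhedron is the convex hull of the active compounds' descriptor vectors; the published HPM may define the region differently, and the descriptor values here are synthetic.

        # Screen candidates by testing whether their descriptors fall inside the
        # convex hull of known highly active compounds. Values are synthetic.
        import numpy as np
        from scipy.spatial import Delaunay

        rng = np.random.default_rng(4)
        active = rng.normal(0, 1, size=(30, 3))      # descriptors of active compounds
        candidates = rng.normal(0, 2, size=(10, 3))  # descriptors of new compounds

        hull = Delaunay(active)                      # triangulates the convex hull
        inside = hull.find_simplex(candidates) >= 0  # -1 means outside the hull
        print("screened in:", np.flatnonzero(inside))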

  8. Performance Tested Method multiple laboratory validation study of ELISA-based assays for the detection of peanuts in food.

    PubMed

    Park, Douglas L; Coates, Scott; Brewer, Vickery A; Garber, Eric A E; Abouzied, Mohamed; Johnson, Kurt; Ritter, Bruce; McKenzie, Deborah

    2005-01-01

    Performance Tested Method multiple laboratory validations for the detection of peanut protein in 4 different food matrixes were conducted under the auspices of the AOAC Research Institute. In this blind study, 3 commercially available ELISA test kits were validated: Neogen Veratox for Peanut, R-Biopharm RIDASCREEN FAST Peanut, and Tepnel BioKits for Peanut Assay. The food matrixes used were breakfast cereal, cookies, ice cream, and milk chocolate spiked at 0 and 5 ppm peanut. Analyses of the samples were conducted by laboratories representing industry and international and U.S. governmental agencies. All 3 commercial test kits successfully identified spiked and peanut-free samples. The validation study required 60 analyses on test samples at the target level of 5 microg peanut/g food and 60 analyses at a peanut-free level, which was designed to ensure that the lower 95% confidence limit for the sensitivity and specificity would not be <90%. Given a prevalence rate of 5%, the probability that a sample testing positive on a single test kit with 95% sensitivity and 95% specificity, as demonstrated for these test kits, actually contains the allergen would be 50%. When 2 test kits are run simultaneously on all samples, the probability becomes 95%. It is therefore recommended that all field samples be analyzed with at least 2 of the validated kits.
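
    The 50% and 95% figures follow directly from Bayes' theorem, assuming the two kits err independently; the few lines below reproduce them.

        # Positive predictive value from Bayes' theorem, for 1 or 2 independent kits.
        def ppv(prevalence, sensitivity, specificity, n_kits=1):
            tp = prevalence * sensitivity ** n_kits          # all kits read positive
            fp = (1 - prevalence) * (1 - specificity) ** n_kits
            return tp / (tp + fp)

        print(f"one kit:  {ppv(0.05, 0.95, 0.95, 1):.2f}")   # 0.50
        print(f"two kits: {ppv(0.05, 0.95, 0.95, 2):.2f}")   # 0.95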

  9. Analysis of sequences from field samples reveals the presence of the recently described pepper vein yellows virus (genus Polerovirus) in six additional countries.

    PubMed

    Knierim, Dennis; Tsai, Wen-Shi; Kenyon, Lawrence

    2013-06-01

    Polerovirus infection was detected by reverse transcription polymerase chain reaction (RT-PCR) in 29 pepper plants (Capsicum spp.) and one black nightshade plant (Solanum nigrum) sample collected from fields in India, Indonesia, Mali, Philippines, Thailand and Taiwan. At least two representative samples for each country were selected to generate a general polerovirus RT-PCR product of 1.4 kb length for sequencing. Sequence analysis of the partial genome sequences revealed the presence of pepper vein yellows virus (PeVYV) in all 13 samples. A 1990 Australian herbarium sample of pepper described by serological means as infected with capsicum yellows virus (CYV) was identified by sequence analysis of a partial CP sequence as probably infected with a potato leaf roll virus (PLRV) isolate.

  10. The estimation of tree posterior probabilities using conditional clade probability distributions.

    PubMed

    Larget, Bret

    2013-07-01

    In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample.
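
    A toy version of the idea with a made-up posterior sample over four taxa: a tree's probability is the product of the conditional probabilities of its splits given their parent clades, so trees built from sampled clades receive estimates even when their exact combination is rare in the sample.

        # Conditional clade probabilities from a (made-up) posterior sample.
        from collections import Counter, defaultdict

        # each sampled tree listed as its (parent clade -> chosen split) decisions
        sample = [
            [("ABCD", "AB|CD")], [("ABCD", "AB|CD")], [("ABCD", "AB|CD")],
            [("ABCD", "A|BCD"), ("BCD", "BC|D")],
            [("ABCD", "A|BCD"), ("BCD", "B|CD")],
        ]

        parent_counts = Counter()
        split_counts = defaultdict(Counter)
        for tree in sample:
            for parent, split in tree:
                parent_counts[parent] += 1
                split_counts[parent][split] += 1

        def tree_probability(tree):
            p = 1.0
            for parent, split in tree:
                p *= split_counts[parent][split] / parent_counts[parent]
            return p

        print(tree_probability([("ABCD", "A|BCD"), ("BCD", "BC|D")]))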

  11. A scenario tree model for the Canadian Notifiable Avian Influenza Surveillance System and its application to estimation of probability of freedom and sample size determination.

    PubMed

    Christensen, Jette; Stryhn, Henrik; Vallières, André; El Allaki, Farouk

    2011-05-01

    In 2008, Canada designed and implemented the Canadian Notifiable Avian Influenza Surveillance System (CanNAISS) with six surveillance activities in a phased-in approach. CanNAISS was a surveillance system because it had more than one surveillance activity or component in 2008: passive surveillance; pre-slaughter surveillance; and voluntary enhanced notifiable avian influenza surveillance. Our objectives were to give a short overview of two active surveillance components in CanNAISS and to describe the CanNAISS scenario tree model and its application to estimating the probability of populations being free of NAI virus infection and to sample size determination. Our data from the pre-slaughter surveillance component included diagnostic test results from 6296 serum samples representing 601 commercial chicken and turkey farms collected from 25 August 2008 to 29 January 2009. In addition, we included data from a sub-population of farms with high biosecurity standards: 36,164 samples from 55 farms sampled repeatedly over the 24-month study period from January 2007 to December 2008. All submissions were negative for Notifiable Avian Influenza (NAI) virus infection. We developed the CanNAISS scenario tree model to estimate the surveillance component sensitivity and the probability of a population being free of NAI at a 0.01 farm-level and 0.3 within-farm-level design prevalence. We propose that a general model, such as the CanNAISS scenario tree model, may have a broader application than more detailed models that require disease-specific input parameters, such as relative risk estimates. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
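
    The core scenario-tree arithmetic can be sketched in a few lines, assuming a perfect-specificity test, a hypothetical test sensitivity and prior, and the stated design prevalences; the real CanNAISS model has many more branches.

        # Surveillance component sensitivity and probability of freedom (sketch).
        def component_sensitivity(n_farms, n_per_farm, se_test,
                                  p_farm=0.01, p_within=0.3):
            p_pos_animal = p_within * se_test
            se_farm = 1 - (1 - p_pos_animal) ** n_per_farm   # farm-level sensitivity
            return 1 - (1 - p_farm * se_farm) ** n_farms     # component sensitivity

        def prob_freedom(prior_free, sse):
            # posterior P(free) after an all-negative surveillance round
            return prior_free / (prior_free + (1 - prior_free) * (1 - sse))

        sse = component_sensitivity(n_farms=601, n_per_farm=10, se_test=0.9)
        print(f"SSe = {sse:.3f}, P(free) = {prob_freedom(0.5, sse):.3f}")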

  12. A multi-source probabilistic hazard assessment of tephra dispersal in the Neapolitan area

    NASA Astrophysics Data System (ADS)

    Sandri, Laura; Costa, Antonio; Selva, Jacopo; Folch, Arnau; Macedonio, Giovanni; Tonini, Roberto

    2015-04-01

    In this study we present the results obtained from a long-term Probabilistic Hazard Assessment (PHA) of tephra dispersal in the Neapolitan area. Usual PHA for tephra dispersal requires defining eruptive scenarios (usually by grouping eruption sizes and possible vent positions in a limited number of classes) with associated probabilities, a meteorological dataset covering a representative time period, and a tephra dispersal model. PHA then results from combining simulations considering different volcanological and meteorological conditions through weights associated with their specific probability of occurrence. However, volcanological parameters (i.e., erupted mass, eruption column height, eruption duration, bulk granulometry, fraction of aggregates) typically encompass a wide range of values. Because of such natural variability, single representative scenarios or size classes cannot be adequately defined using single values for the volcanological inputs. In the present study, we use a method that accounts for this within-size-class variability in the framework of Event Trees. The variability of each parameter is modeled with specific Probability Density Functions, and meteorological and volcanological input values are chosen by using a stratified sampling method. This procedure allows for quantifying hazard without relying on the definition of scenarios, thus avoiding potential biases introduced by selecting single representative scenarios. Embedding this procedure into the Bayesian Event Tree scheme enables quantification of tephra fall PHA and its epistemic uncertainties. We have applied this scheme to analyze long-term tephra fall PHA from Vesuvius and Campi Flegrei, in a multi-source paradigm. We integrate two tephra dispersal models (the analytical HAZMAP and the numerical FALL3D) into BET_VH. The ECMWF reanalysis dataset is used to explore different meteorological conditions. The results show that PHA maps accounting for the whole natural variability are broadly consistent with previous probability maps produced for Vesuvius and Campi Flegrei on the basis of single representative scenarios, but show significant differences. In particular, the area characterized by a 300 kg/m2-load exceedance probability larger than 5%, accounting for the whole range of variability (that is, from small violent strombolian to plinian eruptions), is similar to that displayed in the maps based on the medium magnitude reference eruption, but it is of a smaller extent. This is due to the relatively higher weight of the small magnitude eruptions considered in this study, but neglected in the reference scenario maps. On the other hand, in our new maps the area characterized by a 300 kg/m2-load exceedance probability larger than 1% is much larger than that of the medium magnitude reference eruption, due to the contribution of plinian eruptions at lower probabilities, again neglected in the reference scenario maps.
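
    A minimal sketch of the stratified sampling step, assuming placeholder distributions for two volcanological inputs; the published study uses its own PDFs and couples the draws to tephra dispersal simulations.

        # Latin hypercube over unit space mapped through each input's distribution.
        from scipy.stats import qmc, lognorm, uniform

        sampler = qmc.LatinHypercube(d=2, seed=5)
        u = sampler.random(n=100)                      # stratified in [0, 1)^2

        erupted_mass = lognorm(s=1.0, scale=1e11).ppf(u[:, 0])   # kg (placeholder)
        column_height = uniform(loc=5, scale=25).ppf(u[:, 1])    # km (placeholder)

        print(erupted_mass[:3], column_height[:3])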

  13. The Estimation of Tree Posterior Probabilities Using Conditional Clade Probability Distributions

    PubMed Central

    Larget, Bret

    2013-01-01

    In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample. [Bayesian phylogenetics; conditional clade distributions; improved accuracy; posterior probabilities of trees.] PMID:23479066

  14. Has Adolescent Suicidality Decreased in the United States? Data From Two National Samples of Adolescents Interviewed in 1995 and 2005

    PubMed Central

    Wolitzky-Taylor, Kate B.; Ruggiero, Kenneth J.; McCart, Michael R.; Smith, Daniel W.; Hanson, Rochelle F.; Resnick, Heidi S.; de Arellano, Michael A.; Saunders, Benjamin E.; Kilpatrick, Dean G.

    2011-01-01

    We compared the prevalence and correlates of adolescent suicidal ideation and attempts in two nationally representative probability samples of adolescents interviewed in 1995 (National Survey of Adolescents; N = 4,023) and 2005 (National Survey of Adolescents-Replication; N = 3,614). Participants in both samples completed a telephone survey that assessed major depressive episode (MDE), post-traumatic stress disorder, suicidal ideation and attempts, violence exposure, and substance use. Results demonstrated that the lifetime prevalence of suicidal ideation among adolescents was lower in 2005 than 1995, whereas the prevalence of suicide attempts remained stable. MDE was the strongest predictor of suicidality in both samples. In addition, several demographic, substance use, and violence exposure variables were significantly associated with increased risk of suicidal ideation and attempts in both samples, with female gender, nonexperimental drug use, and direct violence exposure being consistent risk factors in both samples. PMID:20390799

  15. Detection of Classical swine fever virus infection by individual oral fluid of pigs following experimental inoculation.

    PubMed

    Petrini, Stefano; Pierini, Ilaria; Giammarioli, Monica; Feliziani, Francesco; De Mia, Gian Mario

    2017-03-01

    We evaluated the use of oral fluid as an alternative to serum samples for Classical swine fever virus (CSFV) detection. Individual oral fluid and serum samples were collected at different times post-infection from pigs that were experimentally inoculated with CSFV Alfort 187 strain. We found no evidence of CSFV neutralizing antibodies in swine oral fluid samples under our experimental conditions. In contrast, real-time reverse transcription-polymerase chain reaction could detect CSFV nucleic acid from the oral fluid as early as 8 d postinfection, which also coincided with the time of initial detection in blood samples. The probability of CSFV detection in oral fluid was identical or even higher than in the corresponding blood sample. Our results support the feasibility of using this sampling method for CSFV genome detection, which may represent an additional cost-effective tool for CSF control.

  16. Constructing diagnostic likelihood: clinical decisions using subjective versus statistical probability.

    PubMed

    Kinnear, John; Jackson, Ruth

    2017-07-01

    Although physicians are highly trained in the application of evidence-based medicine, and are assumed to make rational decisions, there is evidence that their decision making is prone to biases. One of the biases that has been shown to affect accuracy of judgements is that of representativeness and base-rate neglect, where the saliency of a person's features leads to overestimation of their likelihood of belonging to a group. This results in the substitution of 'subjective' probability for statistical probability. This study examines clinicians' propensity to make estimations of subjective probability when presented with clinical information that is considered typical of a medical condition. The strength of the representativeness bias is tested by presenting choices in textual and graphic form. Understanding of statistical probability is also tested by omitting all clinical information. For the questions that included clinical information, 46.7% and 45.5% of clinicians made judgements consistent with statistical probability for the textual and graphic versions, respectively. Where the question omitted clinical information, 79.9% of clinicians made a judgement consistent with statistical probability. There was a statistically significant difference in responses to the questions with and without representativeness information (χ²(1, n = 254) = 54.45, p < 0.0001). Physicians are strongly influenced by a representativeness bias, leading to base-rate neglect, even though they understand the application of statistical probability. One cause of this representativeness bias may be the way clinical medicine is taught, where stereotypic presentations are emphasised in diagnostic decision making. Published by the BMJ Publishing Group Limited.
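
    The arithmetic behind base-rate neglect is a one-line application of Bayes' theorem; the hypothetical numbers below show how a highly "typical" presentation can still imply a low posterior probability when the condition is rare.

        # Bayes' theorem vs. the representativeness heuristic. Values hypothetical.
        def posterior(base_rate, p_features_given_disease, p_features_given_healthy):
            num = base_rate * p_features_given_disease
            den = num + (1 - base_rate) * p_features_given_healthy
            return num / den

        # features strongly suggest the disease, but only 1 in 1000 patients has it
        print(f"{posterior(0.001, 0.95, 0.05):.3f}")   # ~0.019, not ~0.95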

  17. More than Just Convenient: The Scientific Merits of Homogeneous Convenience Samples

    PubMed Central

    Jager, Justin; Putnick, Diane L.; Bornstein, Marc H.

    2017-01-01

    Despite their disadvantaged generalizability relative to probability samples, non-probability convenience samples are the standard within developmental science, and likely will remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examine developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability relative to conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science. PMID:28475254

  18. Diet- and body size-related attitudes and behaviors associated with vitamin supplement use in a representative sample of fourth-grade students in Texas.

    PubMed

    George, Goldy C; Hoelscher, Deanna M; Nicklas, Theresa A; Kelder, Steven H

    2009-01-01

    To examine diet- and body size-related attitudes and behaviors associated with supplement use in a representative sample of fourth-grade students in Texas. Cross-sectional data from the School Physical Activity and Nutrition study, a probability-based sample of schoolchildren. Children completed a questionnaire that assessed supplement use, food choices, diet-related attitudes, and physical activity; height and weight were measured. School classrooms. Representative sample of fourth-grade students in Texas (n = 5967; mean age = 9.7 years, standard error of the mean [SEM] = 0.03 years; 46% Hispanic, 11% African-American). Previous-day vitamin supplement consumption, diet- and body size-related attitudes, food choices, demographic factors, and physical activity. Multivariable logistic regression models, P < .05. The prevalence of supplement use was 29%. Supplement intake was associated with physical activity. Girls who used supplements were more likely to report a positive body image and greater interest in trying new foods. Relative to nonusers, supplement users were less likely to perceive that they always ate healthful food, although supplement use was associated with more healthful food choices in boys and girls (P < .001). The widespread use of supplements and the clustering of supplement use with a healthful diet and greater physical activity in fourth graders suggest that supplement use be closely investigated in studies of diet-disease precursor relations and lifestyle factors in children.

  19. Areal distribution and concentration of contaminants of concern in surficial streambed and lakebed sediments, Lake St. Clair and tributaries, Michigan, 1990-2003

    USGS Publications Warehouse

    Rachol, Cynthia M.; Button, Daniel T.

    2006-01-01

    As part of the Lake St. Clair Regional Monitoring Project, the U.S. Geological Survey evaluated data collected from surficial streambed and lakebed sediments in the Lake Erie-Lake St. Clair drainages. This study incorporates data collected from 1990 through 2003 and focuses primarily on the U.S. part of the Lake St. Clair Basin, including Lake St. Clair, the St. Clair River, and tributaries to Lake St. Clair. Comparable data from the Canadian part of the study area are included where available. The data are compiled into 4 chemical classes and consist of 21 compounds. The data are compared to effects-based sediment-quality guidelines, where the Threshold Effect Level and Lowest Effect Level represent concentrations below which adverse effects on biota are not expected and the Probable Effect Level and Severe Effect Level represent concentrations above which adverse effects on biota are expected to be frequent.

    Maps in the report show the spatial distribution of the sampling locations and illustrate the concentrations relative to the selected sediment-quality guidelines. These maps indicate that sediment samples from certain areas routinely had contaminant concentrations greater than the Threshold Effect Concentration or Lowest Effect Level. These locations are the upper reach of the St. Clair River, the main stem and mouth of the Clinton River, Big Beaver Creek, Red Run, and Paint Creek. Maps also indicated areas that routinely contained sediment contaminant concentrations that were greater than the Probable Effect Concentration or Severe Effect Level. These locations include the upper reach of the St. Clair River, the main stem and mouth of the Clinton River, Red Run, within direct tributaries along Lake St. Clair and in marinas within the lake, and within the Clinton River headwaters in Oakland County.

    Although most samples collected within Lake St. Clair were from sites adjacent to the mouths of its tributaries, samples analyzed for trace-element concentrations were collected throughout the lake. The distribution of trace-element concentrations corresponded well with the results of a two-dimensional hydrodynamic model of flow patterns from the Clinton River into Lake St. Clair. The model was developed independently from the bed-sediment analysis described in this report; yet it showed a zone of deposition for outflow from the Clinton River into Lake St. Clair that corresponded well with the spatial distribution of trace-element concentrations. This zone runs along the western shoreline of Lake St. Clair from L'Anse Creuse Bay to St. Clair Shores, Michigan, and is reflected in the samples analyzed for mercury and cadmium.

    Statistical summaries of the concentration data are presented for most contaminants, and selected statistics are compared to effects-based sediment-quality guidelines. Summaries were not computed for dieldrin, chlordane, hexachlorocyclohexane, lindane, and mirex because insufficient data are available for these contaminants. A statistical comparison showed that the median concentrations for hexachlorobenzene, anthracene, benz[a]anthracene, chrysene, and pyrene are greater than the Threshold Effect Concentration or Lowest Effect Level.

    Probable Effect Concentration Quotients provide a mechanism for comparing the concentrations of contaminant mixtures against effects-based biota data. Probable Effect Concentration Quotients were calculated for individual samples and compared to effects-based toxicity ranges. 
The toxicity-range categories used in this study were nontoxic (quotients < 0.5) and toxic (quotients > 0.5). Of the 546 individual samples for which Probable Effect Concentration Quotients were calculated, 469 (86 percent) were categorized as being nontoxic and 77 (14 percent) were categorized as being toxic. Bed-sediment samples with toxic Probable Effect Concentration Quotients were collected from Paint Creek, Galloway Creek, the main stem of the Clinton River, Big Beaver Creek, Red Run, Clinton River towards the mouth, Lake St. Clair along the western shore, and the St. Clair River near Sarnia.

  20. The nature of terrains of different types on the surface of Venus and selection of potential landing sites for a descent probe of the Venera-D Mission

    NASA Astrophysics Data System (ADS)

    Ivanov, M. A.; Zasova, L. V.; Gerasimov, M. V.; Korablev, O. I.; Marov, M. Ya.; Zelenyi, L. M.; Ignat'ev, N. I.; Tuchin, A. G.

    2017-01-01

    We discuss a change in the resurfacing regimes of Venus and probable ways of forming the terrain types that make up the surface of the planet. The interpretation of the nature of the terrain types and their morphologic features allows us to characterize their scientific priority and to estimate the risk of landing on their surface. From the scientific point of view, two terrain types are of special interest and represent easily achievable targets: the lower unit of regional plains and the smooth plains associated with impact craters. The material of regional plains probably represents melts derived from the fertile upper mantle. The material of smooth plains of impact origin is a well-mixed and representative sample of the Venusian crust. The lower unit of regional plains is the most widespread unit on the surface of Venus, and it occurs within the boundaries of all of the precalculated approach trajectories of the lander. Smooth plains of impact origin are crossed by the approach trajectories precalculated for 2018 and 2026.

  1. Petrography of impact glasses and melt breccias from the El'gygytgyn impact structure, Russia

    NASA Astrophysics Data System (ADS)

    Pittarello, Lidia; Koeberl, Christian

    2013-07-01

    The El'gygytgyn impact structure, 18 km in diameter and 3.6 Ma old, in Arctic Siberia, Russia, is the only impact structure on Earth mostly excavated in acidic volcanic rocks. The Late Cretaceous volcanic target includes lavas, tuffs, and ignimbrites of rhyolitic, dacitic, and andesitic composition, with local occurrences of basalt. Although the ejecta blanket around the crater is nearly completely eroded, bomb-shaped impact glasses, redeposited after the impact event, occur in lacustrine terraces within the crater. Here we present detailed petrographic descriptions of newly collected impact glass-bearing samples. The observed features help to constrain the formation of the melt and its cooling history within the framework of the impact process. The collected samples can be grouped into two types, each characterized by specific features: (1) "pure" glasses, containing very few clasts or new crystals, which likely formed during the early stages of cratering, and (2) composite samples with impact melt breccia lenses embedded in silicate glass. These mixed samples probably resulted from the inclusion of unmelted impact debris during ejection and deposition. After deposition, the glassy portions continued to deform, whereas the impact melt breccia inclusions, which probably had already cooled, behaved as rigid bodies in the flow.

  2. Detecting temporal trends in species assemblages with bootstrapping procedures and hierarchical models

    USGS Publications Warehouse

    Gotelli, Nicholas J.; Dorazio, Robert M.; Ellison, Aaron M.; Grossman, Gary D.

    2010-01-01

    Quantifying patterns of temporal trends in species assemblages is an important analytical challenge in community ecology. We describe methods of analysis that can be applied to a matrix of counts of individuals that is organized by species (rows) and time-ordered sampling periods (columns). We first developed a bootstrapping procedure to test the null hypothesis of random sampling from a stationary species abundance distribution with temporally varying sampling probabilities. This procedure can be modified to account for undetected species. We next developed a hierarchical model to estimate species-specific trends in abundance while accounting for species-specific probabilities of detection. We analysed two long-term datasets on stream fishes and grassland insects to demonstrate these methods. For both assemblages, the bootstrap test indicated that temporal trends in abundance were more heterogeneous than expected under the null model. We used the hierarchical model to estimate trends in abundance and identified sets of species in each assemblage that were steadily increasing, decreasing or remaining constant in abundance over more than a decade of standardized annual surveys. Our methods of analysis are broadly applicable to other ecological datasets, and they represent an advance over most existing procedures, which do not incorporate effects of incomplete sampling and imperfect detection.
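
    As a rough illustration of the bootstrap test described above, the sketch below simulates the null hypothesis of random sampling from a stationary species-abundance distribution with period-specific sampling totals. The test statistic, the toy data, and all names are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch of a bootstrap null-model test for temporal trend
# heterogeneity in a species (rows) x sampling periods (columns) count matrix.
import numpy as np

rng = np.random.default_rng(42)

def trend_heterogeneity(counts):
    """Test statistic (assumed here): summed variance of each species'
    proportional abundance across periods; large when trends differ."""
    props = counts / counts.sum(axis=0, keepdims=True)
    return np.var(props, axis=1).sum()

def bootstrap_pvalue(counts, n_boot=999):
    totals = counts.sum(axis=0)            # individuals per period (held fixed)
    p = counts.sum(axis=1) / counts.sum()  # stationary relative abundances
    obs = trend_heterogeneity(counts)
    null = np.array([
        trend_heterogeneity(np.column_stack([rng.multinomial(n, p) for n in totals]))
        for _ in range(n_boot)
    ])
    return (1 + np.sum(null >= obs)) / (n_boot + 1)

# Toy matrix: 5 species observed over 10 annual surveys.
counts = rng.poisson(lam=20, size=(5, 10))
print("bootstrap p-value:", bootstrap_pvalue(counts))
```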

  3. Mining Rare Events Data for Assessing Customer Attrition Risk

    NASA Astrophysics Data System (ADS)

    Au, Tom; Chin, Meei-Ling Ivy; Ma, Guangqin

    Customer attrition refers to the phenomenon whereby a customer leaves a service provider. As competition intensifies, preventing customers from leaving is a major challenge for many businesses, such as telecom service providers. Research has shown that retaining existing customers is more profitable than acquiring new customers, due primarily to savings on acquisition costs, the higher volume of service consumption, and customer referrals. For a large enterprise whose customer base consists of tens of millions of service subscribers, events such as switching to competitors or canceling services are often large in absolute number but rare in percentage terms, far less than 5%. Based on a simple random sample, popular statistical procedures, such as logistic regression, tree-based methods and neural networks, can sharply underestimate the probability of rare events and often result in a null model (no significant predictors). To improve the efficiency and accuracy of event probability estimation, a case-based data collection technique is then considered. A case-based sample is formed by taking all available events and a small, but representative, fraction of nonevents from a dataset of interest. In this article we show a consistent prior correction method for event probability estimation and demonstrate the performance of the above data collection techniques in predicting customer attrition with actual telecommunications data.
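
    One well-known form of prior correction for case-based (choice-based) samples, in the spirit of King and Zeng's rare-events correction, adjusts only the fitted intercept by an offset determined by the true event rate and the sample event rate. Whether this matches the authors' exact method is an assumption; the data and rates below are invented for illustration.

```python
# Hedged sketch: intercept prior correction after fitting a logistic model
# to a case-based sample (all events + a small fraction of nonevents).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

tau, ybar = 0.02, 0.50   # true population event rate vs. sample event rate

# Illustrative case-based sample: one predictor, 50/50 events/non-events.
n = 2000
y = np.repeat([1, 0], n // 2)
x = rng.normal(loc=y, scale=1.0).reshape(-1, 1)   # events shifted upward

model = LogisticRegression().fit(x, y)

# Slopes are consistent under case-based sampling; only the intercept
# needs the offset ln[((1-tau)/tau) * (ybar/(1-ybar))].
correction = np.log(((1 - tau) / tau) * (ybar / (1 - ybar)))
b0_corrected = model.intercept_[0] - correction

def event_probability(x_new):
    """Population-scale event probability using the corrected intercept."""
    z = b0_corrected + x_new @ model.coef_[0]
    return 1.0 / (1.0 + np.exp(-z))

print(event_probability(np.array([[0.0], [2.0]])))
```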

  4. Estimation of occupancy, breeding success, and predicted abundance of golden eagles (Aquila chrysaetos) in the Diablo Range, California, 2014

    USGS Publications Warehouse

    Wiens, J. David; Kolar, Patrick S.; Fuller, Mark R.; Hunt, W. Grainger; Hunt, Teresa

    2015-01-01

    We used a multistate occupancy sampling design to estimate occupancy, breeding success, and abundance of territorial pairs of golden eagles (Aquila chrysaetos) in the Diablo Range, California, in 2014. This method uses the spatial pattern of detections and non-detections over repeated visits to survey sites to estimate probabilities of occupancy and successful reproduction while accounting for imperfect detection of golden eagles and their young during surveys. The estimated probability of detecting territorial pairs of golden eagles and their young was less than 1 and varied with time of the breeding season, as did the probability of correctly classifying a pair’s breeding status. Imperfect detection and breeding classification led to a sizeable difference between the uncorrected, naïve estimate of the proportion of occupied sites where successful reproduction was observed (0.20) and the model-based estimate (0.30). The analysis further indicated a relatively high overall probability of landscape occupancy by pairs of golden eagles (0.67, standard error = 0.06), but that areas with the greatest occupancy and reproductive potential were patchily distributed. We documented a total of 138 territorial pairs of golden eagles during surveys completed in the 2014 breeding season, which represented about one-half of the 280 pairs we estimated to occur in the broader 5,169-square kilometer region sampled. The study results emphasize the importance of accounting for imperfect detection and spatial heterogeneity in studies of site occupancy, breeding success, and abundance of golden eagles.

  5. Computer models of social processes: the case of migration.

    PubMed

    Beshers, J M

    1967-06-01

    The demographic model is a program for representing births, deaths, migration, and social mobility as social processes in a non-stationary stochastic process (Markovian). Transition probabilities for each age group are stored and then retrieved at the next appearance of that age cohort. In this way new transition probabilities can be calculated as a function of the old transition probabilities and of two successive distribution vectors. Transition probabilities can be calculated to represent effects of the whole age-by-state distribution at any given time period, too. Such effects as saturation or queuing may be represented by a market mechanism; for example, migration between metropolitan areas can be represented as depending upon job supplies and labor markets. Within metropolitan areas, migration can be represented as invasion and succession processes with tipping points (acceleration curves), and the market device has been extended to represent this phenomenon. Thus, the demographic model makes possible the representation of alternative classes of models of demographic processes. With each class of model one can deduce implied time series (varying parameters within the class) and the output of the several classes can be compared to each other and to outside criteria, such as empirical time series.
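
    A minimal sketch of the kind of non-stationary Markovian projection described: per-age-group transition matrices over regions, with transition probabilities updated each period as a function of the evolving distribution (a crude "market" feedback). All names, sizes, and the feedback rule are illustrative assumptions; the sketch does not model aging, births, or deaths.

```python
# Non-stationary Markov projection: store each age group's transition matrix
# and update it at every step from the current population distribution.
import numpy as np

n_regions, n_ages, n_steps = 3, 4, 10
rng = np.random.default_rng(1)

# pop[a] is age group a's distribution vector over regions.
pop = rng.uniform(100, 200, size=(n_ages, n_regions))

# One row-stochastic transition matrix per age group, initially uniform.
P = np.full((n_ages, n_regions, n_regions), 1.0 / n_regions)

capacity = np.array([500.0, 400.0, 300.0])   # e.g., job supply per region

for t in range(n_steps):
    total = pop.sum(axis=0)
    # Market feedback: destinations near capacity attract fewer migrants.
    attractiveness = np.clip(1.0 - total / capacity, 0.05, None)
    for a in range(n_ages):
        Q = P[a] * attractiveness            # scale destination columns
        Q /= Q.sum(axis=1, keepdims=True)    # renormalize each row
        pop[a] = pop[a] @ Q                  # advance this cohort one period
        P[a] = Q                             # store for the cohort's next pass

print(pop.round(1))
```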

  6. Multiple murder and criminal careers: a latent class analysis of multiple homicide offenders.

    PubMed

    Vaughn, Michael G; DeLisi, Matt; Beaver, Kevin M; Howard, Matthew O

    2009-01-10

    To construct an empirically rigorous typology of multiple homicide offenders (MHOs). The current study conducted latent class analysis of the official records of 160 MHOs sampled from eight states to evaluate their criminal careers. A 3-class solution best fit the data (-2LL=-1123.61, Bayesian Information Criterion (BIC)=2648.15, df=81, L(2)=1179.77). Class 1 (n=64, class assignment probability=.999) was the low-offending group marked by little criminal record and delayed arrest onset. Class 2 (n=51, class assignment probability=.957) was the severe group that represents the most violent and habitual criminals. Class 3 (n=45, class assignment probability=.959) was the moderate group whose offending careers were similar to Class 2. A sustained criminal career with involvement in versatile forms of crime was observed for two of three classes of MHOs. Linkages to extant typologies and recommendations for additional research that incorporates clinical constructs are proffered.

  7. Probability Distributions for Random Quantum Operations

    NASA Astrophysics Data System (ADS)

    Schultz, Kevin

    Motivated by uncertainty quantification and inference of quantum information systems, in this work we draw connections between the notions of random quantum states and operations in quantum information with probability distributions commonly encountered in the field of orientation statistics. This approach identifies natural sample spaces and probability distributions upon these spaces that can be used in the analysis, simulation, and inference of quantum information systems. The theory of exponential families on Stiefel manifolds provides the appropriate generalization to the classical case. Furthermore, this viewpoint motivates a number of additional questions into the convex geometry of quantum operations relative to both the differential geometry of Stiefel manifolds as well as the information geometry of exponential families defined upon them. In particular, we draw on results from convex geometry to characterize which quantum operations can be represented as the average of a random quantum operation. This project was supported by the Intelligence Advanced Research Projects Activity via Department of Interior National Business Center Contract Number 2012-12050800010.

  8. Assessing the impact of antidrug advertising on adolescent drug consumption: results from a behavioral economic model.

    PubMed

    Block, Lauren G; Morwitz, Vicki G; Putsis, William P; Sen, Subrata K

    2002-08-01

    This study examined whether adolescents' recall of antidrug advertising is associated with a decreased probability of using illicit drugs and, given drug use, a reduced volume of use. A behavioral economic model of influences on drug consumption was developed with survey data from a nationally representative sample of adolescents to determine the incremental impact of antidrug advertising. The findings provided evidence that recall of antidrug advertising was associated with a lower probability of marijuana and cocaine/crack use. Recall of such advertising was not associated with the decision of how much marijuana or cocaine/crack to use. Results suggest that individuals predisposed to try marijuana are also predisposed to try cocaine/crack. The present results provide support for the effectiveness of antidrug advertising programs.

  9. Transmuted of Rayleigh Distribution with Estimation and Application on Noise Signal

    NASA Astrophysics Data System (ADS)

    Ahmed, Suhad; Qasim, Zainab

    2018-05-01

    This paper deals with transforming the one-parameter Rayleigh distribution into a transmuted probability distribution by introducing a new parameter (λ), since the resulting distribution is useful for representing signal data and failure data models. The transmuted parameter, which satisfies |λ| ≤ 1, is estimated along with the original parameter (θ) by the methods of moments and maximum likelihood, using different sample sizes (n = 25, 50, 75, 100), and the estimation results are compared by a statistical measure (mean square error, MSE).
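
    The standard way to build a transmuted distribution is the quadratic rank transmutation map, F(x) = (1 + λ)G(x) − λG(x)², with G the baseline CDF; the sketch below applies it to a Rayleigh baseline. Using a scale parameter sigma in place of the paper's θ is an assumption, as is the inverse-transform sampler.

```python
# Hedged sketch of a transmuted Rayleigh distribution via the quadratic
# rank transmutation map, with inverse-transform sampling.
import numpy as np

def rayleigh_cdf(x, sigma):
    return 1.0 - np.exp(-x**2 / (2.0 * sigma**2))

def transmuted_cdf(x, sigma, lam):
    G = rayleigh_cdf(x, sigma)
    return (1.0 + lam) * G - lam * G**2          # requires |lam| <= 1

def transmuted_sample(n, sigma, lam, rng):
    """Solve lam*G^2 - (1+lam)*G + u = 0 for G, then invert the Rayleigh CDF."""
    u = rng.uniform(size=n)
    if lam == 0.0:
        G = u
    else:
        G = ((1 + lam) - np.sqrt((1 + lam)**2 - 4 * lam * u)) / (2 * lam)
    return sigma * np.sqrt(-2.0 * np.log1p(-G))

rng = np.random.default_rng(7)
x = transmuted_sample(100_000, sigma=1.0, lam=0.5, rng=rng)
# Empirical CDF at a test point should match the analytic CDF.
print((x <= 1.0).mean(), transmuted_cdf(1.0, 1.0, 0.5))
```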

  10. Recruiting and retaining youth and young adults: challenges and opportunities in survey research for tobacco control.

    PubMed

    Cantrell, Jennifer; Hair, Elizabeth C; Smith, Alexandria; Bennett, Morgane; Rath, Jessica Miller; Thomas, Randall K; Fahimi, Mansour; Dennis, J Michael; Vallone, Donna

    2018-03-01

    Evaluation studies of population-based tobacco control interventions often rely on large-scale survey data from numerous respondents across many geographic areas to provide evidence of their effectiveness. Significant challenges for survey research have emerged with the evolving communications landscape, particularly for surveying hard-to-reach populations such as youth and young adults. This study combines the comprehensive coverage of an address-based sampling (ABS) frame with the timeliness of online data collection to develop a nationally representative longitudinal cohort of young people aged 15-21. We constructed an ABS frame, partially supplemented with auxiliary data, to recruit this hard-to-reach sample. Branded and tested mail-based recruitment materials were designed to bring respondents online for screening, consent and surveying. Once enrolled, respondents completed online surveys every 6 months via computer, tablet or smartphone. Numerous strategies were utilized to enhance retention and representativeness. Results detail sample performance, representativeness and retention rates as well as device utilization trends for survey completion among youth and young adult respondents. Panel development efforts resulted in a large, nationally representative sample with high retention rates. This study is among the first to employ this hybrid ABS-to-online methodology to recruit and retain youth and young adults in a probability-based online cohort panel. The approach is particularly valuable for conducting research among younger populations as it capitalizes on their increasing access to and comfort with digital communication. We discuss challenges and opportunities of panel recruitment and retention methods in an effort to provide valuable information for tobacco control researchers seeking to obtain representative, population-based samples of youth and young adults in the U.S. as well as across the globe. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  11. Detection of mastitis in dairy cattle by use of mixture models for repeated somatic cell scores: a Bayesian approach via Gibbs sampling.

    PubMed

    Odegård, J; Jensen, J; Madsen, P; Gianola, D; Klemetsdal, G; Heringstad, B

    2003-11-01

    The distribution of somatic cell scores could be regarded as a mixture of at least two components depending on a cow's udder health status. A heteroscedastic two-component Bayesian normal mixture model with random effects was developed and implemented via Gibbs sampling. The model was evaluated using datasets consisting of simulated somatic cell score records. Somatic cell score was simulated as a mixture representing two alternative udder health statuses ("healthy" or "diseased"). Animals were assigned randomly to the two components according to the probability of group membership (Pm). Random effects (additive genetic and permanent environment), when included, had identical distributions across mixture components. Posterior probabilities of putative mastitis were estimated for all observations, and model adequacy was evaluated using measures of sensitivity, specificity, and posterior probability of misclassification. Fitting different residual variances in the two mixture components caused some bias in estimation of parameters. When the components were difficult to disentangle, so were their residual variances, causing bias in estimation of Pm and of location parameters of the two underlying distributions. When all variance components were identical across mixture components, the mixture model analyses returned parameter estimates essentially without bias and with a high degree of precision. Including random effects in the model increased the probability of correct classification substantially. No sizable differences in probability of correct classification were found between models in which a single cow effect (ignoring relationships) was fitted and models where this effect was split into genetic and permanent environmental components, utilizing relationship information. When genetic and permanent environmental effects were fitted, the between-replicate variance of estimates of posterior means was smaller because the model accounted for random genetic drift.
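
    A stripped-down sketch of the modeling idea: a two-component heteroscedastic normal mixture fitted by Gibbs sampling, alternating between component memberships, the mixing proportion Pm, and each component's mean and variance. Random effects are omitted, and all priors, hyperparameters, and data are illustrative assumptions rather than the paper's model.

```python
# Gibbs sampler for a two-component normal mixture with unequal variances.
import numpy as np

rng = np.random.default_rng(3)

# Simulated somatic-cell-score-like data: "diseased" vs "healthy" components.
n, p_true = 500, 0.3
z_true = rng.uniform(size=n) < p_true
y = np.where(z_true, rng.normal(5.0, 1.5, n), rng.normal(2.0, 0.8, n))

n_iter = 2000
mu, var, pm = np.array([1.0, 6.0]), np.array([1.0, 1.0]), 0.5
draws = []
for it in range(n_iter):
    # 1. Sample each record's component membership given current parameters.
    like0 = (1 - pm) * np.exp(-0.5 * (y - mu[0])**2 / var[0]) / np.sqrt(var[0])
    like1 = pm * np.exp(-0.5 * (y - mu[1])**2 / var[1]) / np.sqrt(var[1])
    z = rng.uniform(size=n) < like1 / (like0 + like1)
    # 2. Sample the mixing proportion Pm | z (Beta(1,1) prior).
    pm = rng.beta(1 + z.sum(), 1 + n - z.sum())
    # 3. Sample each component's variance (inverse-gamma) and mean (normal).
    for k in range(2):
        members = y[z] if k == 1 else y[~z]
        m = len(members)
        if m == 0:
            continue                          # keep previous values if empty
        rate = 1.0 + 0.5 * ((members - mu[k])**2).sum()
        var[k] = 1.0 / rng.gamma(2.0 + m / 2.0, 1.0 / rate)
        mu[k] = rng.normal(members.mean(), np.sqrt(var[k] / m))
    draws.append(pm)

burn = n_iter // 2
print("posterior mean Pm:", np.mean(draws[burn:]).round(3))
```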

  12. Reconstruction of three-dimensional porous media using generative adversarial neural networks

    NASA Astrophysics Data System (ADS)

    Mosser, Lukas; Dubrule, Olivier; Blunt, Martin J.

    2017-10-01

    To evaluate the variability of multiphase flow properties of porous media at the pore scale, it is necessary to acquire a number of representative samples of the void-solid structure. While modern x-ray computer tomography has made it possible to extract three-dimensional images of the pore space, assessment of the variability in the inherent material properties is often experimentally not feasible. We present a method to reconstruct the solid-void structure of porous media by applying a generative neural network that allows an implicit description of the probability distribution represented by three-dimensional image data sets. We show, by using an adversarial learning approach for neural networks, that this method of unsupervised learning is able to generate representative samples of porous media that honor their statistics. We successfully compare measures of pore morphology, such as the Euler characteristic, two-point statistics, and directional single-phase permeability of synthetic realizations with the calculated properties of a bead pack, Berea sandstone, and Ketton limestone. Results show that generative adversarial networks can be used to reconstruct high-resolution three-dimensional images of porous media at different scales that are representative of the morphology of the images used to train the neural network. The fully convolutional nature of the trained neural network allows the generation of large samples while maintaining computational efficiency. Compared to classical stochastic methods of image reconstruction, the implicit representation of the learned data distribution can be stored and reused to generate multiple realizations of the pore structure very rapidly.

  13. Comparison of Bootstrapping and Markov Chain Monte Carlo for Copula Analysis of Hydrological Droughts

    NASA Astrophysics Data System (ADS)

    Yang, P.; Ng, T. L.; Yang, W.

    2015-12-01

    Effective water resources management depends on the reliable estimation of the uncertainty of drought events. Confidence intervals (CIs) are commonly applied to quantify this uncertainty. A CI should be of the minimal length necessary to cover the true value of the estimated variable with the desired probability. In drought analysis, where two or more variables (e.g., duration and severity) are often used to describe a drought, copulas have been found suitable for representing the joint probability behavior of these variables. However, the comprehensive assessment of the parameter uncertainties of copulas of droughts has been largely ignored, and the few studies that have recognized this issue have not explicitly compared the various methods to produce the best CIs. Thus, the objective of this study is to compare the CIs generated using two widely applied uncertainty estimation methods, bootstrapping and Markov Chain Monte Carlo (MCMC). To achieve this objective, (1) the marginal distributions lognormal, Gamma, and Generalized Extreme Value, and the copula functions Clayton, Frank, and Plackett are selected to construct joint probability functions of two drought-related variables; (2) the resulting joint functions are then fitted to 200 sets of simulated realizations of drought events with known distribution and extreme parameters; and (3) from there, using bootstrapping and MCMC, CIs of the parameters are generated and compared. The effect of an informative prior on the CIs generated by MCMC is also evaluated. CIs are produced for different sample sizes (50, 100, and 200) of the simulated drought events for fitting the joint probability functions. Preliminary results assuming lognormal marginal distributions and the Clayton copula function suggest that for cases with small or medium sample sizes (~50-100), MCMC is the superior method if an informative prior exists. Where an informative prior is unavailable, for small sample sizes (~50), both bootstrapping and MCMC yield the same level of performance, and for medium sample sizes (~100), bootstrapping is better. For cases with a large sample size (~200), there is little difference between the CIs generated using bootstrapping and MCMC, regardless of whether or not an informative prior exists.
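
    The bootstrapping half of such a comparison can be sketched compactly for the Clayton copula, whose parameter relates to Kendall's tau by θ = 2τ/(1 − τ). The sampler, the sample size, and the "true" parameter below are illustrative assumptions, not the authors' full experimental design.

```python
# Parametric bootstrap CI for the Clayton copula parameter, estimated by
# inverting Kendall's tau.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(11)

def clayton_sample(n, theta):
    """Conditional-inversion sampler for the Clayton copula."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = (u**(-theta) * (w**(-theta / (1 + theta)) - 1) + 1)**(-1 / theta)
    return u, v

def theta_hat(u, v):
    tau, _ = kendalltau(u, v)
    return 2 * tau / (1 - tau)

# "Observed" drought duration/severity pairs on the copula scale.
u_obs, v_obs = clayton_sample(100, theta=2.0)
est = theta_hat(u_obs, v_obs)

# Resample from the fitted copula and re-estimate theta each time.
boot = np.array([theta_hat(*clayton_sample(len(u_obs), est))
                 for _ in range(1000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"theta_hat = {est:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```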

  14. Estimating true human and animal host source contribution in quantitative microbial source tracking using the Monte Carlo method.

    PubMed

    Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan

    2010-09-01

    Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
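
    A schematic Monte Carlo sketch of the general approach described: propagate distributions of assay error and replicate-level measurement error through a simple decomposition to recover a distribution of "true" marker concentrations. The structural equation and every distribution choice below are invented for illustration and are not the paper's model.

```python
# Monte Carlo propagation of qPCR error distributions to a "true"
# concentration distribution (schematic only).
import numpy as np

rng = np.random.default_rng(5)
n_draws = 100_000

measured_log10 = 4.2          # measured marker concentration (log10 copies/L)

# Distributions of error terms, as if estimated from reference fecal samples
# of known origin (parameters made up for illustration):
sensitivity = rng.beta(40, 4, n_draws)            # P(signal | marker present)
false_pos_log10 = rng.normal(2.0, 0.4, n_draws)   # background signal level
precision_err = rng.normal(0.0, 0.1, n_draws)     # replicate error (sd 0.1)

# Simple decomposition on the linear scale:
# measured = true * sensitivity + false-positive background.
measured_lin = 10.0**(measured_log10 + precision_err)
true_lin = np.clip((measured_lin - 10.0**false_pos_log10) / sensitivity, 0, None)

true_log10 = np.log10(true_lin[true_lin > 0])
print("expected true conc (log10):", true_log10.mean().round(2))
print("95% interval:", np.percentile(true_log10, [2.5, 97.5]).round(2))
```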

  15. Using hidden Markov models to align multiple sequences.

    PubMed

    Mount, David W

    2009-07-01

    A hidden Markov model (HMM) is a probabilistic model of a multiple sequence alignment (msa) of proteins. In the model, each column of symbols in the alignment is represented by a frequency distribution of the symbols (called a "state"), and insertions and deletions are represented by other states. One moves through the model along a particular path from state to state in a Markov chain (i.e., random choice of next move), trying to match a given sequence. The next matching symbol is chosen from each state, recording its probability (frequency) and also the probability of going to that state from a previous one (the transition probability). State and transition probabilities are multiplied to obtain a probability of the given sequence. The hidden nature of the HMM is due to the lack of information about the value of a specific state, which is instead represented by a probability distribution over all possible values. This article discusses the advantages and disadvantages of HMMs in msa and presents algorithms for calculating an HMM and the conditions for producing the best HMM.
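
    The multiplication of state and transition probabilities described above is easy to show on a toy model. The two-match-state profile HMM below is invented for illustration; insert and delete states are omitted for brevity.

```python
# Toy profile-HMM path probability: multiply, for each position, the
# transition probability into the state and that state's emission probability.
emissions = {
    "M1": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "M2": {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
}
transitions = {
    ("BEGIN", "M1"): 0.9,
    ("M1", "M2"): 0.8,
    ("M2", "END"): 0.9,
}

def path_probability(sequence, path):
    """P(sequence, path | model): product of transition and emission terms."""
    prob = 1.0
    prev = "BEGIN"
    for symbol, state in zip(sequence, path):
        prob *= transitions[(prev, state)] * emissions[state][symbol]
        prev = state
    return prob * transitions[(prev, "END")]

print(path_probability("AG", ["M1", "M2"]))   # 0.9*0.7 * 0.8*0.7 * 0.9
```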

  16. Detection of a Serum Siderophore by LC-MS/MS as a Potential Biomarker of Invasive Aspergillosis

    PubMed Central

    Carroll, Cassandra S.; Amankwa, Lawrence N.; Pinto, Linda J.; Fuller, Jeffrey D.; Moore, Margo M.

    2016-01-01

    Invasive aspergillosis (IA) is a life-threatening systemic mycosis caused primarily by Aspergillus fumigatus. Early diagnosis of IA is based, in part, on an immunoassay for circulating fungal cell wall carbohydrate, galactomannan (GM). However, a wide range of sensitivity and specificity rates have been reported for the GM test across various patient populations. To obtain iron in vivo, A. fumigatus secretes the siderophore, N,N',N"-triacetylfusarinine C (TAFC) and we hypothesize that TAFC may represent a possible biomarker for early detection of IA. We developed an ultra performance liquid chromatography tandem mass spectrometry (UPLC-MS/MS) method for TAFC analysis from serum, and measured TAFC in serum samples collected from patients at risk for IA. The method showed lower and upper limits of quantitation (LOQ) of 5 ng/ml and 750 ng/ml, respectively, and complete TAFC recovery from spiked serum. As proof of concept, we evaluated 76 serum samples from 58 patients with suspected IA that were investigated for the presence of GM. Fourteen serum samples obtained from 11 patients diagnosed with probable or proven IA were also analyzed for the presence of TAFC. Control sera (n = 16) were analyzed to establish a TAFC cut-off value (≥6 ng/ml). Of the 36 GM-positive samples (≥0.5 GM index) from suspected IA patients, TAFC was considered positive in 25 (69%). TAFC was also found in 28 additional GM-negative samples. TAFC was detected in 4 of the 14 samples (28%) from patients with proven/probable aspergillosis. Log-transformed TAFC and GM values from patients with proven/probable IA, healthy individuals and SLE patients showed a significant correlation with a Pearson r value of 0.77. In summary, we have developed a method for the detection of TAFC in serum that revealed this fungal product in the sera of patients at risk for invasive aspergillosis. A prospective study is warranted to determine whether this method provides improved early detection of IA. PMID:26974544

  17. Characterization of the Sukinda and Nausahi ultramafic complexes, Orissa, India by platinum-group element geochemistry

    USGS Publications Warehouse

    Page, N.J.; Banerji, P.K.; Haffty, J.

    1985-01-01

    Samples of 20 chromitite, 14 ultramafic and mafic rock, and 9 laterite and soil samples from the Precambrian Sukinda and Nausahi ultramafic complexes, Orissa, India were analyzed for platinum-group elements (PGE). The maximum concentrations are: palladium, 13 parts per billion (ppb); platinum, 120 ppb; rhodium, 21 ppb; iridium, 210 ppb; and ruthenium, 630 ppb. Comparison of chondrite-normalized ratios of PGE for the chromitite samples of lower Proterozoic to Archean age with similar data from Paleozoic and Mesozoic ophiolite complexes strongly implies that these complexes represent Precambrian analogs of ophiolite complexes. This finding is consistent with the geology and petrology of the Indian complexes and suggests that plate-tectonic and ocean basin development models probably apply to some parts of Precambrian shield areas. © 1985.

  18. Geospatial techniques for developing a sampling frame of watersheds across a region

    USGS Publications Warehouse

    Gresswell, Robert E.; Bateman, Douglas S.; Lienkaemper, George; Guy, T.J.

    2004-01-01

    Current land-management decisions that affect the persistence of native salmonids are often influenced by studies of individual sites that are selected based on judgment and convenience. Although this approach is useful for some purposes, extrapolating results to areas that were not sampled is statistically inappropriate because the sampling design is usually biased. Therefore, in recent investigations of coastal cutthroat trout (Oncorhynchus clarki clarki) located above natural barriers to anadromous salmonids, we used a methodology for extending the statistical scope of inference. The purpose of this paper is to apply geospatial tools to identify a population of watersheds and develop a probability-based sampling design for coastal cutthroat trout in western Oregon, USA. The population of mid-size watersheds (500-5800 ha) west of the Cascade Range divide was derived from watershed delineations based on digital elevation models. Because a database with locations of isolated populations of coastal cutthroat trout did not exist, a sampling frame of isolated watersheds containing cutthroat trout had to be developed. After the sampling frame of watersheds was established, isolated watersheds with coastal cutthroat trout were stratified by ecoregion and erosion potential based on dominant bedrock lithology (i.e., sedimentary and igneous). A stratified random sample of 60 watersheds was selected with proportional allocation in each stratum. By comparing watershed drainage areas of streams in the general population to those in the sampling frame and the resulting sample (n = 60), we were able to evaluate how representative the subset of watersheds was in relation to the population of watersheds. Geospatial tools provided a relatively inexpensive means to generate the information necessary to develop a statistically robust, probability-based sampling design.

  19. A statistical treatment of bioassay pour fractions

    NASA Astrophysics Data System (ADS)

    Barengoltz, Jack; Hughes, David

    A bioassay is a method for estimating the number of bacterial spores on a spacecraft surface for the purpose of demonstrating compliance with planetary protection (PP) requirements (Ref. 1). The details of the process may be seen in the appropriate PP document (e.g., for NASA, Ref. 2). In general, the surface is mechanically sampled with a damp sterile swab or wipe. The completion of the process is colony formation in a growth medium in a plate (Petri dish); the colonies are counted. Consider a set of samples from randomly selected, known areas of one spacecraft surface, for simplicity. One may calculate the mean and standard deviation of the bioburden density, which is the ratio of counts to area sampled. The standard deviation represents an estimate of the variation from place to place of the true bioburden density commingled with the precision of the individual sample counts. The accuracy of individual sample results depends on the equipment used, the collection method, and the culturing method. One aspect that greatly influences the result is the pour fraction, which is the quantity of fluid added to the plates divided by the total fluid used in extracting spores from the sampling equipment. In an analysis of a single sample's counts due to the pour fraction, one seeks to answer the question: What is the probability that, if a certain number of spores is counted with a known pour fraction, there is some additional number of spores in the part of the rinse not poured? This is given for specific values by the binomial distribution density, where detection (of culturable spores) is success and the probability of success is the pour fraction. A special summation over the binomial distribution, equivalent to adding over all possible values of the true total number of spores, is performed. This distribution, when normalized, will almost yield the desired quantity: the probability that the additional number of spores does not exceed a certain value. Of course, for a desired value of uncertainty, one must invert the calculation. However, this probability of finding exactly the number of spores in the poured part is correct only in the case where all values of the true number of spores greater than or equal to the adjusted count are equally probable. This is not realistic, of course, but the result can only overestimate the uncertainty, so it is useful. In probability terms, one has the conditional probability given any true total number of spores; therefore one must multiply it by the probability of each possible true count before the summation. If the counts for a sample set (of which this is one sample) are available, one may use the calculated variance and the normal probability distribution. In this approach, one assumes a normal distribution and neglects the contribution from spatial variation. The former is a common assumption. The latter can only add to the conservatism (overestimate the number of spores at some level of confidence). A more straightforward approach is to assume a Poisson probability distribution for the measured total sample set counts, and use the product of the number of samples and the mean number of counts per sample as the mean of the Poisson distribution. It is necessary to set the total count to 1 in the Poisson distribution when the actual total count is zero. 
    Finally, even when the planetary protection requirements for spore burden refer only to the mean values, they require an adjustment for pour fraction and method efficiency (a PP specification based on independent data). The adjusted mean values are a 50/50 proposition (i.e., the probability of the true total counts in the sample set exceeding the estimate is 0.50). However, this is highly unconservative when the total counts are zero, since no adjustment to the mean values occurs for either pour fraction or efficiency. The recommended approach is once again to set the total counts to 1, but now applied to the mean values; then one may apply the corrections to the revised counts. It can be shown by the methods developed in this work that this change is usually conservative enough to increase the level of confidence in the estimate to 0.5.
    1. NASA. (2005) Planetary protection provisions for robotic extraterrestrial missions. NPR 8020.12C, April 2005, National Aeronautics and Space Administration, Washington, DC.
    2. NASA. (2010) Handbook for the Microbiological Examination of Space Hardware, NASA-HDBK-6022, National Aeronautics and Space Administration, Washington, DC.
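
    The binomial summation described above has a closed form worth noting: with a flat prior over the true spore count, the posterior for the number of additional (unpoured) spores m, given n counted and pour fraction f, is proportional to C(n+m, n) f^n (1−f)^m, which normalizes to a negative binomial. The sketch below computes the resulting probabilities; the specific n and f values are illustrative.

```python
# Posterior probability that at most k additional spores remain unpoured,
# given n counted spores and pour fraction f (flat prior over true count).
from scipy.stats import nbinom

n = 3          # spores counted in the poured fraction
f = 0.8        # pour fraction (fluid plated / total rinse fluid)

# nbinom.pmf(m, r, p) = C(m+r-1, m) p^r (1-p)^m; with r = n+1 and p = f this
# is exactly the normalized posterior over additional spores m.
for k in range(6):
    print(f"P(additional spores <= {k}) = {nbinom.cdf(k, n + 1, f):.4f}")
```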

  20. Sampling considerations for disease surveillance in wildlife populations

    USGS Publications Warehouse

    Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

    2008-01-01

    Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.

  1. Approximation of Failure Probability Using Conditional Sampling

    NASA Technical Reports Server (NTRS)

    Giesy, Daniel P.; Crespo, Luis G.; Kenney, Sean P.

    2008-01-01

    In analyzing systems which depend on uncertain parameters, one technique is to partition the uncertain parameter domain into a failure set and its complement, and judge the quality of the system by estimating the probability of failure. If this is done by a sampling technique such as Monte Carlo and the probability of failure is small, accurate approximation can require so many sample points that the computational expense is prohibitive. Previous work of the authors has shown how to bound the failure event by sets of such simple geometry that their probabilities can be calculated analytically. In this paper, it is shown how to make use of these failure bounding sets and conditional sampling within them to substantially reduce the computational burden of approximating failure probability. It is also shown how the use of these sampling techniques improves the confidence intervals for the failure probability estimate for a given number of sample points and how they reduce the number of sample point analyses needed to achieve a given level of confidence.
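
    The core identity behind this approach is P(F) = P(F | B) · P(B): if a bounding set B with analytically known probability contains the whole failure region F, Monte Carlo points need only be drawn inside B. The failure function and bounding box below are invented examples, not the authors' benchmark problems.

```python
# Conditional sampling inside an analytic failure-bounding set.
import numpy as np

rng = np.random.default_rng(9)

def fails(x):
    # Invented failure criterion: both uniform parameters are extreme.
    return x[:, 0] + x[:, 1] > 1.85

# Bounding set B = [0.85, 1] x [0.85, 1] contains every failure point
# (x0 + x1 > 1.85 forces both coordinates above 0.85), and P(B) is analytic.
lo = 0.85
p_B = (1 - lo)**2

n = 10_000
x_cond = rng.uniform(lo, 1.0, size=(n, 2))    # sample conditionally within B
p_fail = fails(x_cond).mean() * p_B           # P(F) = P(F|B) * P(B)
print(f"conditional estimate: {p_fail:.2e}")  # true value is 0.15^2/2 = 1.125e-2

# For comparison, the same budget of unconditional samples mostly misses the
# failure region, giving a much noisier estimate of this small probability.
x_naive = rng.uniform(0, 1, size=(n, 2))
print(f"naive estimate:       {fails(x_naive).mean():.2e}")
```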

  2. Data Analysis with Graphical Models: Software Tools

    NASA Technical Reports Server (NTRS)

    Buntine, Wray L.

    1994-01-01

    Probabilistic graphical models (directed and undirected Markov fields, and combined in chain graphs) are used widely in expert systems, image processing and other areas as a framework for representing and reasoning with probabilities. They come with corresponding algorithms for performing probabilistic inference. This paper discusses an extension to these models by Spiegelhalter and Gilks, called plates, which graphically model the notion of a sample. This offers a graphical specification language for representing data analysis problems. When combined with general methods for statistical inference, it also offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper outlines the framework and then presents some basic tools for the task: a graphical version of the Pitman-Koopman Theorem for the exponential family, problem decomposition, and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.

  3. Human West Nile virus infection in Bosnia and Herzegovina.

    PubMed

    Ahmetagić, Sead; Petković, Jovan; Hukić, Mirsada; Smriko-Nuhanović, Arnela; Piljić, Dilista

    2015-02-01

    To describe the first two cases of West Nile virus (WNV) neuroinvasive infection in Bosnia and Herzegovina. At the Clinic for Infectious Diseases of the University Clinical Centre Tuzla, Bosnia and Herzegovina (BiH), specific screening for WNV infection was performed on patients with neuroinvasive diseases from 1 August to 31 October 2013. Serum samples were tested for the presence of WNV IgM and IgG antibodies using enzyme-linked immunosorbent assay (ELISA); positive serum samples were further analyzed for WNV nucleic acid of two distinct lineages (lineage 1 and lineage 2) by RT-PCR. Three (out of nine) patients met the clinical criteria, and two of them had high serum titres of WNV-specific IgM antibodies (3.5 and 5.2). Serum RT-PCR testing was negative, and confirmation by neutralization testing was not performed. Both cases presented with encephalitis. Neither case had a recent travel history to WNV-endemic areas or a history of blood transfusion or organ transplantation, so they represent autochthonous cases. There were no previous reports of flavivirus infections in BiH; although the described cases had high titres of WNV-specific antibodies in serum and negative flavivirus-vaccination histories, they were defined as probable cases because the recommended testing for case confirmation was not performed. The West Nile virus should be considered a possible causative pathogen in this area, particularly in patients with mild influenza-like disease of unknown origin and in those with neuroinvasive disease during late summer and early autumn.

  4. Jimsphere wind and turbulence exceedance statistic

    NASA Technical Reports Server (NTRS)

    Adelfang, S. I.; Court, A.

    1972-01-01

    Exceedance statistics of winds and gusts observed over Cape Kennedy with Jimsphere balloon sensors are described. Gust profiles containing positive and negative departures from smoothed profiles, in the wavelength ranges 100-2500, 100-1900, 100-860, and 100-460 meters, were computed from 1578 profiles with four 41-weight digital high-pass filters. Extreme values of the square root of gust speed are normally distributed. Monthly and annual exceedance probability distributions of normalized rms gust speeds in three altitude bands (2-7, 6-11, and 9-14 km) are log-normal. The rms gust speeds are largest in the 100-2500 m wavelength band between 9 and 14 km in late winter and early spring. A study of monthly and annual exceedance probabilities and the number of occurrences per kilometer of level crossings with positive slope indicates significant variability with season, altitude, and filter configuration. A decile sampling scheme is tested, and an optimum approach is suggested for drawing a relatively small random sample that represents the characteristic extreme wind speeds and shears of a large parent population of Jimsphere wind profiles.
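
    The log-normal exceedance calculation implied above is straightforward to sketch: fit a log-normal to rms gust speeds in one altitude band and evaluate the survival function at a design threshold. The data, threshold, and parameter values below are synthetic stand-ins.

```python
# Fit a two-parameter log-normal to rms gust speeds and compute an
# exceedance probability.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
rms_gust = rng.lognormal(mean=0.5, sigma=0.4, size=500)  # synthetic sample (m/s)

# Fix the location at zero, consistent with a two-parameter log-normal.
shape, loc, scale = stats.lognorm.fit(rms_gust, floc=0)

threshold = 3.0  # m/s, illustrative design value
p_exceed = stats.lognorm.sf(threshold, shape, loc=loc, scale=scale)
print(f"P(rms gust > {threshold} m/s) = {p_exceed:.4f}")
```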

  5. Epidemiology of major depression in four cities in Mexico.

    PubMed

    Slone, Laurie B; Norris, Fran H; Murphy, Arthur D; Baker, Charlene K; Perilla, Julia L; Diaz, Dayna; Rodriguez, Francisco Gutiérrez; Gutiérrez Rodriguez, José de Jesús

    2006-01-01

    Analyses were conducted to estimate lifetime and current prevalence of major depressive disorder (MDD) for four representative cities of Mexico, to identify variables that influence the probability of MDD, and to further describe depression in Mexican culture. A multistage probability sampling design was used to draw a sample of 2,509 adults in four different regions of Mexico. MDD was assessed according to DSM-IV criteria by using the Composite International Diagnostic Interview collected by trained lay interviewers. The prevalence of MDD in these four cities averaged 12.8% for lifetime and 6.1% for the previous 12 months. MDD was highly comorbid with other mental disorders. Women were more likely to have lifetime MDD than were men. Being divorced, separated, or widowed (compared to married or never married) and having experienced childhood trauma were related to higher lifetime prevalence but not to current prevalence. In addition, age and education level were related to current 12-month MDD. Data on the profile of MDD in urban Mexico are provided. This research expands our understanding of MDD across cultures.

  6. Flipping Out: Calculating Probability with a Coin Game

    ERIC Educational Resources Information Center

    Degner, Kate

    2015-01-01

    In the author's experience with this activity, students struggle with the idea of representativeness in probability. Therefore, this student misconception is part of the classroom discussion about the activities in this lesson. Representativeness is related to the (incorrect) idea that outcomes that seem more random are more likely to happen. This…

  7. Microplastic analysis in the South Funen Archipelago, Baltic Sea, implementing manta trawling and bulk sampling.

    PubMed

    Tamminga, Matthias; Hengstmann, Elena; Fischer, Elke Kerstin

    2018-03-01

    Microplastic contamination in surface waters of the South Funen Archipelago in Denmark was assessed. To this end, ten manta trawls were conducted in June 2015. Moreover, 31 low-volume bulk samples were taken to evaluate whether results consistent with the net-based approach can be obtained. Microplastic contamination in the South Funen Archipelago (0.07 ± 0.02 particles/m3) is slightly below values reported before. The sheltered position of the study area, low population pressure on adjacent islands and the absence of any major potential point sources were identified as major factors explaining the low concentration of microplastics. Within the Archipelago, harbors or marinas and the associated vessel traffic are the most probable sources of microplastics. The concentration of microplastics in low-volume bulk samples is not comparable to manta trawl results, mainly due to the insufficient representativeness of the bulk sample volumes.

  8. Reproducibility of preclinical animal research improves with heterogeneity of study samples

    PubMed Central

    Vogt, Lucile; Sena, Emily S.; Würbel, Hanno

    2018-01-01

    Single-laboratory studies conducted under highly standardized conditions are the gold standard in preclinical animal research. Using simulations based on 440 preclinical studies across 13 different interventions in animal models of stroke, myocardial infarction, and breast cancer, we compared the accuracy of effect size estimates between single-laboratory and multi-laboratory study designs. Single-laboratory studies generally failed to predict effect size accurately, and larger sample sizes rendered effect size estimates even less accurate. By contrast, multi-laboratory designs including as few as 2 to 4 laboratories increased coverage probability by up to 42 percentage points without a need for larger sample sizes. These findings demonstrate that within-study standardization is a major cause of poor reproducibility. More representative study samples are required to improve the external validity and reproducibility of preclinical animal research and to prevent wasting animals and resources for inconclusive research. PMID:29470495

  9. School Progress Among Children of Same-Sex Couples.

    PubMed

    Watkins, Caleb S

    2018-06-01

    This study uses logit regressions on a pooled sample of children from the 2012, 2013, and 2014 American Community Survey to perform a nationally representative analysis of school progress for a large sample of 4,430 children who reside with same-sex couples. Odds ratios from regressions that compare children between different-sex married couples and same-sex couples fail to show significant differences in normal school progress between households across a variety of sample compositions. Likewise, marginal effects from regressions that compare children with similar family dynamics between different-sex married couples and same-sex couples fail to predict significantly higher probabilities of grade retention for children of same-sex couples. Significantly lower grade retention rates are sometimes predicted for children of same-sex couples than for different-sex married couples, but these differences are sensitive to sample exclusions and do not indicate causal benefits to same-sex parenting.

  10. New color-based tracking algorithm for joints of the upper extremities

    NASA Astrophysics Data System (ADS)

    Wu, Xiangping; Chow, Daniel H. K.; Zheng, Xiaoxiang

    2007-11-01

    To track the joints of the upper limb of stroke sufferers for rehabilitation assessment, a new tracking algorithm is proposed that combines a color-based particle filter with a novel strategy for handling occlusions. Objects are represented by their color histogram models, and a particle filter is introduced to track the objects within a probabilistic framework. A Kalman filter, acting as a local optimizer, is integrated into the sampling stage of the particle filter; it steers samples toward regions of high likelihood, so fewer samples are required. A color clustering method and anatomic constraints are used to deal with the occlusion problem. Compared with the basic particle filtering method, the experimental results show that the new algorithm reduces the number of samples required, and hence the computational cost, and is better able to handle complete occlusion over a few frames.
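
    A compact sketch of the color-based weighting step in such a filter: each particle's candidate region is scored by the Bhattacharyya similarity between its color histogram and the target model. The Kalman-steered proposal of the paper is omitted, and the histograms here are random stand-ins for ones extracted from image patches.

```python
# Color-histogram particle-filter weight update via Bhattacharyya similarity.
import numpy as np

rng = np.random.default_rng(4)
n_particles, n_bins = 100, 16

def bhattacharyya(p, q):
    return np.sum(np.sqrt(p * q))

target_hist = rng.dirichlet(np.ones(n_bins))        # color model of the joint

particles = rng.normal(0.0, 5.0, size=(n_particles, 2))   # (x, y) hypotheses

# Stand-in: histogram extracted at each particle position (random here).
candidate_hists = rng.dirichlet(np.ones(n_bins), size=n_particles)

# Likelihood exp(-lambda * (1 - rho)): a common color-likelihood choice.
lam = 20.0
rho = np.array([bhattacharyya(target_hist, h) for h in candidate_hists])
weights = np.exp(-lam * (1.0 - rho))
weights /= weights.sum()

estimate = weights @ particles        # posterior-mean position estimate
n_eff = 1.0 / np.sum(weights**2)      # effective sample size for resampling
print("position estimate:", estimate.round(2), " ESS:", round(n_eff, 1))
```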

  11. Probability of coincidental similarity among the orbits of small bodies - I. Pairing

    NASA Astrophysics Data System (ADS)

    Jopek, Tadeusz Jan; Bronikowska, Małgorzata

    2017-09-01

    The probability of coincidental clustering among orbits of comets, asteroids and meteoroids depends on many factors, such as the size of the orbital sample searched for clusters and the size of the identified group; it differs for groups of 2, 3, 4, … members. Because the probability of coincidental clustering is assessed by numerical simulation, it also depends on the method used to generate the synthetic orbits. We have tested the impact of some of these factors. For a given size of the orbital sample, we assessed the probability of random pairing among several orbital populations of different sizes, and we determined how these probabilities vary with sample size. Finally, keeping the size of the orbital sample fixed, we show that the probability of random pairing can differ significantly between orbital samples obtained by different observation techniques. For the user's convenience, we also provide several formulae that, for a given orbital sample size, can be used to calculate the similarity threshold corresponding to a small probability of coincidental similarity between two orbits.

  12. Predicting the probability of mortality of gastric cancer patients using decision tree.

    PubMed

    Mohammadzadeh, F; Noorkojuri, H; Pourhoseingholi, M A; Saadat, S; Baghestani, A R

    2015-06-01

    Gastric cancer is the fourth most common cancer worldwide. This motivated us to investigate gastric cancer risk factors using statistical methods. The aim of this study was to identify the most important factors influencing mortality in patients suffering from gastric cancer and to introduce a classification approach, based on a decision tree model, for predicting the probability of mortality from this disease. Data on 216 patients with gastric cancer, who were registered in Taleghani hospital in Tehran, Iran, were analyzed. First, patients were divided into two groups: the dead and the alive. Then, to fit the decision tree model, we randomly assigned 20% of the dataset to the test sample and used the remainder as the training sample. Finally, the validity of the model was examined with sensitivity, specificity, diagnostic accuracy, and the area under the receiver operating characteristic curve. CART version 6.0 and SPSS version 19.0 software were used for the analysis. Diabetes, ethnicity, tobacco use, tumor size, surgery, pathologic stage, age at diagnosis, exposure to chemical weapons, and alcohol consumption were identified as factors affecting mortality from gastric cancer. The sensitivity, specificity, and accuracy of the decision tree were 0.72, 0.75, and 0.74, respectively, indicating that the decision tree model has acceptable accuracy for predicting the probability of mortality in gastric cancer patients. A simple decision tree built from the factors affecting gastric cancer mortality may therefore help clinicians as a reliable and practical tool for predicting the probability of mortality in these patients.
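
    As a hedged illustration of the workflow this abstract describes (an 80/20 train/test split, a classification tree, and validation by sensitivity, specificity, accuracy, and AUC), here is a minimal sketch in Python using scikit-learn rather than the authors' CART 6.0/SPSS tooling; the synthetic data and all variable names are placeholders, not the hospital data.

    ```python
    # Minimal sketch (not the authors' CART 6.0 pipeline): 80/20 split,
    # a decision tree, and the three reported validity metrics plus AUC.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import confusion_matrix, roc_auc_score

    rng = np.random.default_rng(0)
    # Hypothetical stand-in for a 216-patient dataset: 9 risk factors, binary mortality.
    X = rng.normal(size=(216, 9))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=216) > 0).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=0)   # 20% held out, as in the study

    tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

    tn, fp, fn, tp = confusion_matrix(y_test, tree.predict(X_test)).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    auc = roc_auc_score(y_test, tree.predict_proba(X_test)[:, 1])
    print(sensitivity, specificity, accuracy, auc)
    ```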

  13. The Cognitive Substrate of Subjective Probability

    ERIC Educational Resources Information Center

    Nilsson, Hakan; Olsson, Henrik; Juslin, Peter

    2005-01-01

    The prominent cognitive theories of probability judgment were primarily developed to explain cognitive biases rather than to account for the cognitive processes in probability judgment. In this article the authors compare 3 major theories of the processes and representations in probability judgment: the representativeness heuristic, implemented as…

  14. Method- and species-specific detection probabilities of fish occupancy in Arctic lakes: Implications for design and management

    USGS Publications Warehouse

    Haynes, Trevor B.; Rosenberger, Amanda E.; Lindberg, Mark S.; Whitman, Matthew; Schmutz, Joel A.

    2013-01-01

    Studies examining species occurrence often fail to account for false absences in field sampling. We investigate detection probabilities of five gear types for six fish species in a sample of lakes on the North Slope, Alaska. We used an occupancy modeling approach to provide estimates of detection probabilities for each method. Variation in gear- and species-specific detection probability was considerable. For example, detection probabilities for the fyke net ranged from 0.82 (SE = 0.05) for least cisco (Coregonus sardinella) to 0.04 (SE = 0.01) for slimy sculpin (Cottus cognatus). Detection probabilities were also affected by site-specific variables such as depth of the lake, year, day of sampling, and lake connection to a stream. With the exception of the dip net and shore minnow traps, each gear type provided the highest detection probability of at least one species. Results suggest that a multimethod approach may be most effective when attempting to sample the entire fish community of Arctic lakes. Detection probability estimates will be useful for designing optimal fish sampling and monitoring protocols in Arctic lakes.
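
    The occupancy modeling approach referenced here can be made concrete with a small sketch. Below is a minimal single-season occupancy likelihood in the spirit of MacKenzie-type models (not necessarily the authors' exact specification, which included site covariates): it jointly estimates the occupancy probability psi and the detection probability p from repeated-survey detection histories.

    ```python
    # Hedged sketch of a single-season occupancy model: maximum-likelihood
    # estimation of occupancy (psi) and detection (p) from detection histories
    # (rows = sites, columns = repeat surveys, 1 = species detected).
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    histories = np.array([[1, 0, 1], [0, 0, 0], [1, 1, 0],
                          [0, 0, 0], [0, 1, 1], [0, 0, 0]])  # toy data

    def neg_log_lik(params, y):
        psi, p = expit(params)            # keep probabilities in (0, 1)
        k = y.shape[1]
        det = y.sum(axis=1)
        # Sites with >=1 detection are certainly occupied; all-zero histories
        # mix "occupied but always missed" with "truly unoccupied".
        ll_detected = np.log(psi) + det * np.log(p) + (k - det) * np.log(1 - p)
        ll_allzero = np.log(psi * (1 - p) ** k + (1 - psi))
        return -np.sum(np.where(det > 0, ll_detected, ll_allzero))

    fit = minimize(neg_log_lik, x0=[0.0, 0.0], args=(histories,))
    psi_hat, p_hat = expit(fit.x)
    print(f"psi = {psi_hat:.2f}, p = {p_hat:.2f}")
    ```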

  15. Probabilistic treatment of the uncertainty from the finite size of weighted Monte Carlo data

    NASA Astrophysics Data System (ADS)

    Glüsenkamp, Thorsten

    2018-06-01

    Parameter estimation in HEP experiments often involves Monte Carlo simulation to model the experimental response function. Typical applications are forward-folding likelihood analyses with re-weighting, or time-consuming minimization schemes with a new simulation set for each parameter value. Problematically, the finite size of such Monte Carlo samples carries intrinsic uncertainty that can lead to a substantial bias in parameter estimation if it is neglected and the sample size is small. We introduce a probabilistic treatment of this problem by replacing the usual likelihood functions with novel generalized probability distributions that incorporate the finite statistics via suitable marginalization. These new PDFs are analytic, and can be used to replace the Poisson, multinomial, and sample-based unbinned likelihoods, which covers many use cases in high-energy physics. In the limit of infinite statistics, they reduce to the respective standard probability distributions. In the general case of arbitrary Monte Carlo weights, the expressions involve the fourth Lauricella function F_D, for which we find a new finite-sum representation in a certain parameter setting. The result also represents an exact form for Carlson's Dirichlet average R_n with n > 0, and thereby an efficient way to calculate the probability generating function of the Dirichlet-multinomial distribution, the extended divided difference of a monomial, or arbitrary moments of univariate B-splines. We demonstrate the bias reduction of our approach with a typical toy Monte Carlo problem, estimating the normalization of a peak in a falling energy spectrum, and compare the results with previously published methods from the literature.

  16. Does part-time sick leave help individuals with mental disorders recover lost work capacity?

    PubMed

    Andrén, Daniela

    2014-06-01

    This paper aims to answer the question of whether combining sick leave with some hours of work can help employees diagnosed with a mental disorder (MD) increase their probability of returning to work. Given the available data, this paper analyzes the impact of part-time sick leave (PTSL) on the probability of fully recovering lost work capacity for employees diagnosed with an MD. The effects of PTSL on the probability of fully recovering lost work capacity are estimated by a discrete choice one-factor model using data on a nationally representative sample extracted from the register of the National Agency of Social Insurance in Sweden and supplemented with information from questionnaires. All individuals in the sample were 20-64 years old and started a sickness spell of at least 15 days between 1 and 16 February 2001. We selected all employed individuals diagnosed with an MD, giving a final sample of 629 individuals. The results show that PTSL is associated with a low likelihood of full recovery, yet the timing of the assignment is important. PTSL's effect is relatively small (0.015) when it is assigned at the beginning of the spell but relatively large (0.387), and statistically significant, when assigned after 60 days of full-time sick leave (FTSL). This suggests efficiency improvements from assigning employees with an MD diagnosis, when possible, to PTSL. The employment gains will be enhanced if employees with an MD diagnosis are encouraged to return to work part-time after 60 days or more of FTSL.

  17. Intermediate Pond Sizes Contain the Highest Density, Richness, and Diversity of Pond-Breeding Amphibians

    PubMed Central

    Semlitsch, Raymond D.; Peterman, William E.; Anderson, Thomas L.; Drake, Dana L.; Ousterhout, Brittany H.

    2015-01-01

    We present data on amphibian density, species richness, and diversity from a 7140-ha area consisting of 200 ponds in the Midwestern U.S. that represents most of the possible lentic aquatic breeding habitats common in this region. Our study includes all possible breeding sites with natural and anthropogenic disturbance processes that can be missing from studies where sampling intensity is low, sample area is small, or partial disturbance gradients are sampled. We tested whether pond area was a significant predictor of density, species richness, and diversity of amphibians and if values peaked at intermediate pond areas. We found that in all cases a quadratic model fit our data significantly better than a linear model. Because small ponds have a high probability of pond drying and large ponds have a high probability of fish colonization and accumulation of invertebrate predators, drying and predation may be two mechanisms driving the peak of density and diversity towards intermediate values of pond size. We also found that not all intermediate sized ponds produced many larvae; in fact, some had low amphibian density, richness, and diversity. Further analyses of the subset of ponds represented in the peak of the area distribution showed that fish, hydroperiod, invertebrate density, and canopy are additional factors that drive density, richness and diversity of ponds up or down, when extremely small or large ponds are eliminated. Our results indicate that fishless ponds at intermediate sizes are more diverse, produce more larvae, and have greater potential to recruit juveniles into adult populations of most species sampled. Further, hylid and chorus frogs are found predictably more often in ephemeral ponds whereas bullfrogs, green frogs, and cricket frogs are found most often in permanent ponds with fish. Our data increase understanding of what factors structure and maintain amphibian diversity across large landscapes. PMID:25906355
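
    The model comparison at the heart of this study (a quadratic versus a linear response to pond area) is straightforward to reproduce in outline. The sketch below uses synthetic data and an AIC comparison as an illustrative stand-in for the authors' actual model-selection procedure.

    ```python
    # Illustrative sketch (toy data, not the 200-pond dataset): compare a
    # linear and a quadratic fit of richness against log pond area via AIC.
    import numpy as np

    rng = np.random.default_rng(1)
    log_area = np.linspace(-2, 4, 60)
    richness = 8 - 1.5 * (log_area - 1.0) ** 2 + rng.normal(0, 1, 60)  # humped truth

    def aic(y, yhat, n_params):
        n = len(y)
        rss = np.sum((y - yhat) ** 2)
        return n * np.log(rss / n) + 2 * n_params  # Gaussian-likelihood AIC

    lin = np.polyval(np.polyfit(log_area, richness, 1), log_area)
    quad = np.polyval(np.polyfit(log_area, richness, 2), log_area)
    print("AIC linear:   ", aic(richness, lin, 2))
    print("AIC quadratic:", aic(richness, quad, 3))  # lower AIC -> peak at intermediate area
    ```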

  18. Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading

    PubMed Central

    Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf

    2008-01-01

    Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. PMID:18698346

  19. Evaluation of radio-tracking and strip transect methods for determining foraging ranges of Black-Legged Kittiwakes

    USGS Publications Warehouse

    Ostrand, William D.; Drew, G.S.; Suryan, R.M.; McDonald, L.L.

    1998-01-01

    We compared strip transect and radio-tracking methods of determining foraging range of Black-legged Kittiwakes (Rissa tridactyla). The mean distance birds were observed from their colony determined by radio-tracking was significantly greater than the mean value calculated from strip transects. We determined that this difference was due to two sources of bias: (1) as distance from the colony increased, the area of available habitat also increased resulting in decreasing bird densities (bird spreading). Consequently, the probability of detecting birds during transect surveys also would decrease as distance from the colony increased, and (2) the maximum distance birds were observed from the colony during radio-tracking exceeded the extent of the strip transect survey. We compared the observed number of birds seen on the strip transect survey to the predictions of a model of the decreasing probability of detection due to bird spreading. Strip transect data were significantly different from modeled data; however, the field data were consistently equal to or below the model predictions, indicating a general conformity to the concept of declining detection at increasing distance. We conclude that radio-tracking data gave a more representative indication of foraging distances than did strip transect sampling. Previous studies of seabirds that have used strip transect sampling without accounting for bird spreading or the effects of study-area limitations probably underestimated foraging range.

  20. Measurement of the top quark mass using template methods on dilepton events in p anti-p collisions at √s = 1.96 TeV

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abulencia, A.; Acosta, D.; Adelman, Jahred A.

    2006-02-01

    The authors describe a measurement of the top quark mass from events produced in p anti-p collisions at a center-of-mass energy of 1.96 TeV, using the Collider Detector at Fermilab. They identify t anti-t candidates where both W bosons from the top quarks decay into leptons (eν, μν, or τν) from a data sample of 360 pb⁻¹. The top quark mass is reconstructed in each event separately by three different methods, which draw upon simulated distributions of the neutrino pseudorapidity, t anti-t longitudinal momentum, or neutrino azimuthal angle in order to extract probability distributions for the top quark mass. For each method, representative mass distributions, or templates, are constructed from simulated samples of signal and background events, and parameterized to form continuous probability density functions. A likelihood fit incorporating these parameterized templates is then performed on the data sample masses in order to derive a final top quark mass. Combining the three template methods, taking into account correlations in their statistical and systematic uncertainties, results in a top quark mass measurement of 170.1 ± 6.0 (stat.) ± 4.1 (syst.) GeV/c².
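
    A drastically simplified sketch of the template idea follows: treat the templates as parameterized probability density functions of a per-event reconstructed mass and maximize the likelihood of the observed masses over the true-mass hypothesis. The Gaussian template shape, its width, and the toy data are assumptions for illustration only, not the CDF parameterization.

    ```python
    # Schematic template-likelihood fit (much simplified relative to the CDF
    # analysis): templates are parameterized PDFs of a reconstructed mass, and
    # the top mass is the value maximizing the likelihood of the observed data.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    data = rng.normal(170.0, 15.0, size=64)   # toy per-event reconstructed masses

    def template_pdf(x, m_top):
        # Assumed parameterization: resolution-smeared peak tracking the true mass.
        return norm.pdf(x, loc=m_top, scale=15.0)

    scan = np.arange(150.0, 190.0, 0.1)
    log_lik = [np.sum(np.log(template_pdf(data, m))) for m in scan]
    print("fitted m_top =", scan[np.argmax(log_lik)])
    ```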

  1. Invited commentary: recruiting for epidemiologic studies using social media.

    PubMed

    Allsworth, Jenifer E

    2015-05-15

    Social media-based recruitment for epidemiologic studies has the potential to expand the demographic and geographic reach of investigators and identify potential participants more cost-effectively than traditional approaches. In fact, social media are particularly appealing for their ability to engage traditionally "hard-to-reach" populations, including young adults and low-income populations. Despite their great promise as a tool for epidemiologists, social media-based recruitment approaches do not currently compare favorably with gold-standard probability-based sampling approaches. Sparse data on the demographic characteristics of social media users, patterns of social media use, and appropriate sampling frames limit our ability to implement probability-based sampling strategies. In a well-conducted study, Harris et al. (Am J Epidemiol. 2015;181(10):737-746) examined the cost-effectiveness of social media-based recruitment (advertisements and promotion) in the Contraceptive Use, Pregnancy Intention, and Decisions (CUPID) Study, a cohort study of 3,799 young adult Australian women, and the approximate representativeness of the CUPID cohort. Implications for social media-based recruitment strategies for cohort assembly, data accuracy, implementation, and human subjects concerns are discussed.

  2. Composition of the crust beneath the Kenya rift

    USGS Publications Warehouse

    Mooney, W.D.; Christensen, N.I.

    1994-01-01

    We infer the composition of the crust beneath and on the flanks of the Kenya rift based on a comparison of the KRISP-90 crustal velocity structure with laboratory measurements of compressional-wave velocities of rock samples from Kenya. The rock samples studied, which are representative of the major lithologies exposed in Kenya, include volcanic tuffs and flows (primarily basalts and phonolites), and felsic to intermediate composition gneisses. This comparison indicates that the upper crust (5-12 km depth) consists primarily of quartzo-feldspathic gneisses and schists similar to rocks exposed on the flanks of the rift, whereas the middle crust (12-22 km depth) consists of more mafic, hornblende-rich metamorphic rocks, probably intruded by mafic rocks beneath the rift axis. The lower crust on the flanks of the rift may consist of mafic granulite facies rocks. Along the rift axis, the lower crust varies in thickness from 9 km in the southern rift to only 2-3 km in the north, and has a seismic velocity substantially higher than the samples investigated in this study. The lower crust of the rift probably consists of a crust/mantle mix of high-grade metamorphic rocks, mafic intrusives, and an igneous mafic residuum accreted to the base of the crust during differentiation of a melt derived from the upper mantle.

  3. Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics

    NASA Technical Reports Server (NTRS)

    Pohorille, Andrew

    2006-01-01

    The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well-separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods have proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described by rate constants. These problems are isomorphic with chemical kinetics problems. Recently, several efficient techniques for this purpose have been developed based on the approach originally proposed by Gillespie. Although the utility of the techniques mentioned above for Bayesian problems has not been determined, further research along these lines is warranted.
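
    Of the techniques surveyed above, parallel tempering is the simplest to sketch. The toy example below runs several Metropolis walkers at different inverse temperatures on a bimodal distribution and occasionally proposes replica swaps accepted by the Metropolis criterion; all tuning constants are arbitrary choices for illustration.

    ```python
    # Minimal parallel-tempering sketch: Metropolis walkers at different
    # temperatures sample a bimodal pdf, with occasional replica swaps.
    import numpy as np

    rng = np.random.default_rng(3)

    def log_p(x):                       # bimodal target with well-separated modes
        return np.logaddexp(-0.5 * (x - 4) ** 2, -0.5 * (x + 4) ** 2)

    betas = np.array([1.0, 0.4, 0.1])   # inverse "temperatures"
    x = np.zeros(len(betas))
    samples = []
    for step in range(20000):
        for i, b in enumerate(betas):   # Metropolis update within each replica
            prop = x[i] + rng.normal(0, 1.5)
            if np.log(rng.random()) < b * (log_p(prop) - log_p(x[i])):
                x[i] = prop
        i = rng.integers(len(betas) - 1)  # attempt a swap between neighbors
        dlog = (betas[i] - betas[i + 1]) * (log_p(x[i + 1]) - log_p(x[i]))
        if np.log(rng.random()) < dlog:
            x[i], x[i + 1] = x[i + 1], x[i]
        samples.append(x[0])            # keep only the beta = 1 chain
    print("fraction in right-hand mode:", np.mean(np.array(samples[2000:]) > 0))
    ```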

  4. Multinomial logistic regression in workers' health

    NASA Astrophysics Data System (ADS)

    Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana

    2017-11-01

    In European countries, namely in Portugal, it is common to hear people mention that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the services sector. A representative sample was collected from a Portuguese services organization by applying an internationally validated survey whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, in which, among other independent variables, burnout appears as statistically significant.
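
    As a hedged sketch of the modeling step described here, the following fits a multinomial logit for a five-category outcome with burnout among the predictors; the data and variable names are synthetic placeholders, not the survey's actual items.

    ```python
    # Toy multinomial logistic regression for a 5-category outcome
    # (illustrative only; not the authors' data or full covariate set).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(4)
    n = 300
    burnout = rng.normal(size=n)
    age = rng.normal(size=n)
    latent = -0.8 * burnout + 0.2 * age + rng.logistic(size=n)
    health = np.digitize(latent, bins=[-2, -0.7, 0.7, 2])   # categories 0..4

    X = np.column_stack([burnout, age])
    model = LogisticRegression(max_iter=1000).fit(X, health)
    print(model.predict_proba(X[:3]))   # per-category probability estimates
    ```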

  5. A survey of natural terrestrial and airborne radionuclides in moss samples from the peninsular Thailand.

    PubMed

    Wattanavatee, Komrit; Krmar, Miodrag; Bhongsuwan, Tripob

    2017-10-01

    The aim of this study was to determine the activity concentrations of natural terrestrial radionuclides (²³⁸U, ²²⁶Ra, ²³²Th and ⁴⁰K) and airborne radionuclides (²¹⁰Pb, ²¹⁰Pb_ex and ⁷Be) in natural terrestrial mosses. The 46 moss samples, representing 17 species, were collected from 17 sampling localities in the National Parks and Wildlife Sanctuaries of Thailand, situated in the mountainous areas between the northern and southern ends of peninsular Thailand (∼7-12 °N, 99-102 °E). Activity concentrations of radionuclides in the samples were measured using a low-background gamma spectrometer. The results revealed non-uniform spatial distributions of all the radionuclides in the study area. Principal component analysis and cluster analysis revealed two distinct origins for the studied radionuclides; furthermore, the Pearson correlations were strong among ²²⁶Ra, ²³²Th, ²³⁸U and ⁴⁰K, as well as between ²¹⁰Pb and ²¹⁰Pb_ex, but there was no significant correlation between these two groups. ⁷Be was also uncorrelated with the others, as expected given the different origins of the airborne and terrestrial radionuclides. The radionuclide activities of moss samples varied with moss species, topography, geology, and meteorology of each sampling area. The abnormally high concentrations observed for some radionuclides probably indicate that the concentrations of airborne and terrestrial radionuclides in moss samples were directly related to local geological features of the sampling site, or that high levels of ⁷Be were most probably linked to topography and the regional NE monsoonal winds from mainland China.

  6. Evaluation of a Class of Simple and Effective Uncertainty Methods for Sparse Samples of Random Variables and Functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin

    When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities -- without being overly conservative -- when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: the central 95% of response, and the 10⁻⁴ probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depend on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large data base and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.
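
    One standard, distribution-free building block for this kind of conservative estimation is the order-statistic bound on a quantile. The snippet below (an illustration, not necessarily one of the report's methods) computes the confidence with which the sample maximum bounds the p-th quantile for a few sparse sample sizes.

    ```python
    # Distribution-free quantile bounding from sparse samples: the probability
    # that the sample maximum exceeds the p-quantile is 1 - p^n, i.e. the
    # binomial probability that at least one observation falls above it.
    from scipy.stats import binom

    p = 0.95          # quantile to bound (e.g., an edge of the central 95%)
    for n in (5, 10, 20, 59):
        conf = 1 - binom.pmf(0, n, 1 - p)   # = 1 - p**n
        print(f"n = {n:3d}: max bounds the {p:.0%} quantile with {conf:.1%} confidence")
    ```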

  7. A Statistical Framework for Microbial Source Attribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Velsko, S P; Allen, J E; Cunningham, C T

    2009-04-28

    This report presents a general approach to inferring transmission and source relationships among microbial isolates from their genetic sequences. The outbreak transmission graph (also called the transmission tree or transmission network) is the fundamental structure which determines the statistical distributions relevant to source attribution. The nodes of this graph are infected individuals or aggregated sub-populations of individuals in which transmitted bacteria or viruses undergo clonal expansion, leading to a genetically heterogeneous population. Each edge of the graph represents a transmission event in which one or a small number of bacteria or virions infects another node thus increasing the size of the transmission network. Recombination and re-assortment events originate in nodes which are common to two distinct networks. In order to calculate the probability that one node was infected by another, given the observed genetic sequences of microbial isolates sampled from them, we require two fundamental probability distributions. The first is the probability of obtaining the observed mutational differences between two isolates given that they are separated by M steps in a transmission network. The second is the probability that two nodes sampled randomly from an outbreak transmission network are separated by M transmission events. We show how these distributions can be obtained from the genetic sequences of isolates obtained by sampling from past outbreaks combined with data from contact tracing studies. Realistic examples are drawn from the SARS outbreak of 2003, the FMDV outbreak in Great Britain in 2001, and HIV transmission cases. The likelihood estimators derived in this report, and the underlying probability distribution functions required to calculate them possess certain compelling general properties in the context of microbial forensics. These include the ability to quantify the significance of a sequence 'match' or 'mismatch' between two isolates; the ability to capture non-intuitive effects of network structure on inferential power, including the 'small world' effect; the insensitivity of inferences to uncertainties in the underlying distributions; and the concept of rescaling, i.e. ability to collapse sub-networks into single nodes and examine transmission inferences on the rescaled network.
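
    A toy version of the inference this report describes can be built from its two ingredients. Assuming, purely for illustration, a Poisson model in which mutational differences accumulate at a rate mu per transmission step, and an assumed prior over the number of steps M separating two sampled nodes (the report derives both distributions empirically from past outbreaks and contact tracing), Bayes' rule gives a posterior over M:

    ```python
    # Hedged toy combination of the two distributions described above:
    # P(d | M) from a Poisson mutation model, P(M) from an assumed prior,
    # yielding the posterior P(M | observed differences d).
    import numpy as np
    from scipy.stats import poisson

    mu = 2.0                                  # assumed mutations per transmission step
    prior_M = np.array([0.30, 0.25, 0.20, 0.15, 0.10])  # assumed P(M), M = 1..5
    d_obs = 5                                 # observed differences between isolates

    M = np.arange(1, 6)
    lik = poisson.pmf(d_obs, mu * M)          # mutations accumulate per step
    post = lik * prior_M / np.sum(lik * prior_M)
    for m, pr in zip(M, post):
        print(f"P(M = {m} | d = {d_obs}) = {pr:.3f}")
    ```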

  8. Probable late Lyme disease: a variant manifestation of untreated Borrelia burgdorferi infection

    PubMed Central

    2012-01-01

    Background Lyme disease, a bacterial infection with the tick-borne spirochete Borrelia burgdorferi, can cause early and late manifestations. The category of probable Lyme disease was recently added to the CDC surveillance case definition to describe patients with serologic evidence of exposure and physician-diagnosed disease in the absence of objective signs. We present a retrospective case series of 13 untreated patients with persistent symptoms of greater than 12 weeks duration who meet these criteria and suggest a label of ‘probable late Lyme disease’ for this presentation. Methods The sample for this analysis draws from a retrospective chart review of consecutive, adult patients presenting between August 2002 and August 2007 to the author (JA), an infectious disease specialist. Patients were included in the analysis if their current illness had lasted greater than or equal to 12 weeks duration at the time of evaluation. Results Probable late Lyme patients with positive IgG serology but no history of previous physician-documented Lyme disease or appropriate Lyme treatment were found to represent 6% of our heterogeneous sample presenting with ≥ 12 weeks of symptom duration. Patients experienced a range of symptoms including fatigue, widespread pain, and cognitive complaints. Approximately one-third of this subset reported a patient-observed rash at illness onset, with a similar proportion having been exposed to non-recommended antibiotics or glucocorticosteroid treatment for their initial disease. A clinically significant response to antibiotics treatment was noted in the majority of patients with probable late Lyme disease, although post-treatment symptom recurrence was common. Conclusions We suggest that patients with probable late Lyme disease share features with both confirmed late Lyme disease and post-treatment Lyme disease syndrome. Physicians should consider the recent inclusion of probable Lyme disease in the CDC Lyme disease surveillance criteria when evaluating patients, especially in patients with a history suggestive of misdiagnosed or inadequately treated early Lyme disease. Further studies are warranted to delineate later manifestations of Lyme disease and to quantify treatment benefit in this population. PMID:22853630

  9. Social surveys in HIV/AIDS: telling or writing? A comparison of interview and postal methods.

    PubMed

    McEwan, R T; Harrington, B E; Bhopal, R S; Madhok, R; McCallum, A

    1992-06-01

    We compare a probability sample postal questionnaire survey and a quota controlled interview survey, and review the literature on these subjects. In contrast to other studies, where quota samples were not representative because of biased selection of respondents by interviewers, our quota sample was representative. Response rates were similar in our postal and interview surveys (74 and 77%, respectively), although many previous similar postal surveys had poor response rates. As in other comparison studies, costs were higher in our interview survey, substantive responses and the quality of responses to closed-ended questions were similar, and responses to open-ended questions were better in the interview survey. 'Socially unacceptable' responses on sexual behaviour were less likely in interviews. Quota controlled surveys are appropriate in surveys on HIV/AIDS under certain circumstances, e.g. where the population parameters are well known, and where interviewers can gain access to the entire population. Postal questionnaires are better for obtaining information on sexual behaviour, if adequate steps are taken to improve response rates, and when in-depth answers are not needed. For most surveys in the HIV/AIDS field we recommend the postal method.

  10. Probable Posttraumatic Stress Disorder in the US Veteran Population According to DSM-5: Results From the National Health and Resilience in Veterans Study.

    PubMed

    Wisco, Blair E; Marx, Brian P; Miller, Mark W; Wolf, Erika J; Mota, Natalie P; Krystal, John H; Southwick, Steven M; Pietrzak, Robert H

    2016-11-01

    With the publication of DSM-5, important changes were made to the diagnostic criteria for posttraumatic stress disorder (PTSD), including the addition of 3 new symptoms. Some have argued that these changes will further increase the already high rates of comorbidity between PTSD and other psychiatric disorders. This study examined the prevalence of DSM-5 PTSD, conditional probability of PTSD given certain trauma exposures, endorsement of specific PTSD symptoms, and psychiatric comorbidities in the US veteran population. Data were analyzed from the National Health and Resilience in Veterans Study (NHRVS), a Web-based survey of a cross-sectional, nationally representative, population-based sample of 1,484 US veterans, which was fielded from September through October 2013. Probable PTSD was assessed using the PTSD Checklist-5. The weighted lifetime and past-month prevalence of probable DSM-5 PTSD was 8.1% (SE = 0.7%) and 4.7% (SE = 0.6%), respectively. Conditional probability of lifetime probable PTSD ranged from 10.1% (sudden death of close family member or friend) to 28.0% (childhood sexual abuse). The DSM-5 PTSD symptoms with the lowest prevalence among veterans with probable PTSD were trauma-related amnesia and reckless and self-destructive behavior. Probable PTSD was associated with increased odds of mood and anxiety disorders (OR = 7.6-62.8, P < .001), substance use disorders (OR = 3.9-4.5, P < .001), and suicidal behaviors (OR = 6.7-15.1, P < .001). In US veterans, the prevalence of DSM-5 probable PTSD, conditional probability of probable PTSD, and odds of psychiatric comorbidity were similar to prior findings with DSM-IV-based measures; we found no evidence that changes in DSM-5 increase psychiatric comorbidity. Results underscore the high rates of exposure to both military and nonmilitary trauma and the high public health burden of DSM-5 PTSD and comorbid conditions in veterans.

  11. Estimating the breeding population of long-billed curlew in the United States

    USGS Publications Warehouse

    Stanley, T.R.; Skagen, S.K.

    2007-01-01

    Determining population size and long-term trends in population size for species of high concern is a priority of international, national, and regional conservation plans. Long-billed curlews (Numenius americanus) are a species of special concern in North America due to apparent declines in their population. Because long-billed curlews are not adequately monitored by existing programs, we undertook a 2-year study with the goals of 1) determining present long-billed curlew distribution and breeding population size in the United States and 2) providing recommendations for a long-term long-billed curlew monitoring protocol. We selected a stratified random sample of survey routes in 16 western states for sampling in 2004 and 2005, and we analyzed count data from these routes to estimate detection probabilities and abundance. In addition, we evaluated habitat along roadsides to determine how well roadsides represented habitat throughout the sampling units. We estimated there were 164,515 (SE = 42,047) breeding long-billed curlews in 2004, and 109,533 (SE = 31,060) breeding individuals in 2005. These estimates far exceed currently accepted estimates based on expert opinion. We found that habitat along roadsides was representative of long-billed curlew habitat in general. We make recommendations for improving sampling methodology, and we present power curves to provide guidance on minimum sample sizes required to detect trends in abundance.

  12. II. MORE THAN JUST CONVENIENT: THE SCIENTIFIC MERITS OF HOMOGENEOUS CONVENIENCE SAMPLES.

    PubMed

    Jager, Justin; Putnick, Diane L; Bornstein, Marc H

    2017-06-01

    Despite their disadvantaged generalizability relative to probability samples, nonprobability convenience samples are the standard within developmental science, and likely will remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examine developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability relative to conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science.

  13. Nonprobability and probability-based sampling strategies in sexual science.

    PubMed

    Catania, Joseph A; Dolcini, M Margaret; Orellana, Roberto; Narayanan, Vasudah

    2015-01-01

    With few exceptions, much of sexual science builds upon data from opportunistic nonprobability samples of limited generalizability. Although probability-based studies are considered the gold standard in terms of generalizability, they are costly to apply to many of the hard-to-reach populations of interest to sexologists. The present article discusses recent conclusions by sampling experts that have relevance to sexual science that advocates for nonprobability methods. In this regard, we provide an overview of Internet sampling as a useful, cost-efficient, nonprobability sampling method of value to sex researchers conducting modeling work or clinical trials. We also argue that probability-based sampling methods may be more readily applied in sex research with hard-to-reach populations than is typically thought. In this context, we provide three case studies that utilize qualitative and quantitative techniques directed at reducing limitations in applying probability-based sampling to hard-to-reach populations: indigenous Peruvians, African American youth, and urban men who have sex with men (MSM). Recommendations are made with regard to presampling studies, adaptive and disproportionate sampling methods, and strategies that may be utilized in evaluating nonprobability and probability-based sampling methods.

  14. Importance Sampling in the Evaluation and Optimization of Buffered Failure Probability

    DTIC Science & Technology

    2015-07-01

    Presented at the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP12), Vancouver, Canada, July 12-15, 2015, by Marwan M. Harajli (Graduate Student, Dept. of Civil and Environ…). From the surviving abstract fragment: […] criterion is usually the failure probability. In this paper, we examine the buffered failure probability as an attractive alternative to the failure probability.
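
    For context, a generic importance-sampling estimate of a small (unbuffered) failure probability looks like the following; the shifted-proposal construction is standard, and everything here is illustrative rather than drawn from the paper.

    ```python
    # Generic importance sampling for a rare failure event: sample from a
    # proposal shifted toward the failure region, and reweight by the ratio
    # of target to proposal densities to keep the estimator unbiased.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(5)
    g = lambda x: 4.5 - x            # failure when g(x) <= 0, i.e. x >= 4.5
    n = 10_000

    shift = 4.5                      # proposal centered on the failure boundary
    x = rng.normal(loc=shift, size=n)
    w = norm.pdf(x) / norm.pdf(x, loc=shift)
    p_fail = np.mean((g(x) <= 0) * w)
    print("IS estimate:", p_fail, " exact:", norm.sf(4.5))
    ```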

  15. Assessing the Impact of Antidrug Advertising on Adolescent Drug Consumption: Results From a Behavioral Economic Model

    PubMed Central

    Block, Lauren G.; Morwitz, Vicki G.; Putsis, William P.; Sen, Subrata K.

    2002-01-01

    Objectives. This study examined whether adolescents’ recall of antidrug advertising is associated with a decreased probability of using illicit drugs and, given drug use, a reduced volume of use. Methods. A behavioral economic model of influences on drug consumption was developed with survey data from a nationally representative sample of adolescents to determine the incremental impact of antidrug advertising. Results. The findings provided evidence that recall of antidrug advertising was associated with a lower probability of marijuana and cocaine/crack use. Recall of such advertising was not associated with the decision of how much marijuana or cocaine/crack to use. Results suggest that individuals predisposed to try marijuana are also predisposed to try cocaine/crack. Conclusions. The present results provide support for the effectiveness of antidrug advertising programs. (Am J Public Health. 2002;92:1346–1351) PMID:12144995

  16. Random function representation of stationary stochastic vector processes for probability density evolution analysis of wind-induced structures

    NASA Astrophysics Data System (ADS)

    Liu, Zhangjun; Liu, Zenghui

    2018-06-01

    This paper develops a hybrid approach of spectral representation and random function for simulating stationary stochastic vector processes. In the proposed approach, the high-dimensional random variables included in the original spectral representation (OSR) formula are effectively reduced to only two elementary random variables by introducing random functions that serve as random constraints. On this basis, satisfactory simulation accuracy can be guaranteed by selecting a small representative point set of the elementary random variables. The probability information of the stochastic excitations can be fully captured with just several hundred sample functions generated by the proposed approach. Therefore, combined with the probability density evolution method (PDEM), the approach can be used for dynamic response analysis and reliability assessment of engineering structures. For illustrative purposes, a stochastic turbulence wind velocity field acting on a frame-shear-wall structure is simulated by constructing three types of random functions to demonstrate the accuracy and efficiency of the proposed approach. Careful and in-depth studies concerning the probability density evolution analysis of the wind-induced structure have been conducted to better illustrate the application prospects of the proposed approach. Numerical examples also show that the proposed approach possesses good robustness.
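
    For orientation, the classic single-variate spectral-representation simulator that the proposed approach starts from can be sketched in a few lines (random phases only; the paper's random-function constraint reducing the randomness to two elementary variables is not reproduced here, and the target spectrum below is an arbitrary choice):

    ```python
    # Classic spectral representation of a stationary process:
    # X(t) = sum_k sqrt(2 S(w_k) dw) cos(w_k t + phi_k), phi_k ~ U(0, 2*pi).
    import numpy as np

    rng = np.random.default_rng(6)
    def spectrum(w):                       # assumed one-sided target spectrum S(w)
        return np.where(w > 0, w ** 2 * np.exp(-w), 0.0)

    N, dw = 256, 0.05
    w = (np.arange(N) + 0.5) * dw
    phi = rng.uniform(0, 2 * np.pi, N)     # independent random phases
    t = np.linspace(0, 60, 1200)

    X = np.sqrt(2 * spectrum(w) * dw) @ np.cos(np.outer(w, t) + phi[:, None])
    print("sample variance:", X.var(), " target:", np.sum(spectrum(w) * dw))
    ```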

  17. Responsiveness-informed multiple imputation and inverse probability-weighting in cohort studies with missing data that are non-monotone or not missing at random.

    PubMed

    Doidge, James C

    2018-02-01

    Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents an opportunity for participants' responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement, but conventional wisdom has held that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.
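
    A minimal sketch of the basic inverse probability-weighting step, with responsiveness as the auxiliary variable, is given below; the variable names and the missingness model are illustrative assumptions, and the paper's non-monotone extensions are not shown.

    ```python
    # Basic IPW: model the probability of being a complete case from an
    # auxiliary ("responsiveness"), then weight complete cases by its inverse.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(7)
    n = 2000
    responsiveness = rng.uniform(size=n)        # e.g., fraction of prior waves answered
    y = 1.0 + 2.0 * responsiveness + rng.normal(size=n)
    observed = rng.random(n) < 0.2 + 0.7 * responsiveness   # MAR given responsiveness

    probs = LogisticRegression().fit(
        responsiveness.reshape(-1, 1), observed).predict_proba(
        responsiveness.reshape(-1, 1))[:, 1]
    w = 1.0 / probs[observed]

    print("true mean:      ", y.mean())
    print("complete cases: ", y[observed].mean())            # biased upward
    print("IPW estimate:   ", np.average(y[observed], weights=w))
    ```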

  18. Association Between Persistent Pain and Memory Decline and Dementia in a Longitudinal Cohort of Elders.

    PubMed

    Whitlock, Elizabeth L; Diaz-Ramirez, L Grisell; Glymour, M Maria; Boscardin, W John; Covinsky, Kenneth E; Smith, Alexander K

    2017-08-01

    Chronic pain is common among the elderly and is associated with cognitive deficits in cross-sectional studies; the population-level association between chronic pain and longitudinal cognition is unknown. Our objective was to determine the population-level association between persistent pain, which may reflect chronic pain, and subsequent cognitive decline. We conducted a cohort study with biennial interviews of 10 065 community-dwelling older adults in the nationally representative Health and Retirement Study who were 62 years or older in 2000 and answered pain and cognition questions in both 1998 and 2000. Data analysis was conducted between June 24 and October 31, 2016. The exposure was "persistent pain," defined as a participant reporting that he or she was often troubled with moderate or severe pain in both the 1998 and 2000 interviews. Coprimary outcomes were composite memory score and dementia probability, estimated by combining neuropsychological test results and informant and proxy interviews, which were tracked from 2000 through 2012. Linear mixed-effects models, with random slope and intercept for each participant, were used to estimate the association of persistent pain with slope of the subsequent cognitive trajectory, adjusting for demographic characteristics and comorbidity measures in 2000 and applying sampling weights to represent the 2000 US population. We hypothesized that persistent pain would predict accelerated memory decline and increased probability of dementia. To quantify the impact of persistent pain on functional independence, we combined our primary results with information on the association between memory and ability to manage medications and finances independently. Of the 10 065 eligible HRS sample members, 60% were female, and median baseline age was 73 years (interquartile range, 67-78 years). At baseline, persistent pain affected 10.9% of participants and was associated with worse depressive symptoms and more limitations in activities of daily living. After covariate adjustment, persistent pain was associated with 9.2% (95% CI, 2.8%-15.0%) more rapid memory decline compared with those without persistent pain. After 10 years, this accelerated memory decline implied a 15.9% higher relative risk of inability to manage medications and an 11.8% higher relative risk of inability to manage finances independently. Adjusted dementia probability increased 7.7% faster (95% CI, 0.55%-14.2%); after 10 years, this translates to an absolute 2.2% increase in dementia probability for those with persistent pain. Persistent pain was associated with accelerated memory decline and increased probability of dementia.
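
    The analytic core of the study, a linear mixed-effects model with a random intercept and slope per participant, can be sketched as follows using statsmodels on synthetic data; the study's covariates, sampling weights, and dementia-probability outcome are omitted, and all names are placeholders.

    ```python
    # Hedged sketch: memory trajectories with random intercept and slope per
    # participant and a pain-by-time interaction (synthetic data only).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(8)
    n_id, n_wave = 200, 6
    df = pd.DataFrame({
        "pid": np.repeat(np.arange(n_id), n_wave),
        "years": np.tile(np.arange(n_wave) * 2.0, n_id),      # biennial waves
        "pain": np.repeat(rng.random(n_id) < 0.11, n_wave).astype(int),
    })
    u0 = np.repeat(rng.normal(0, 0.5, n_id), n_wave)          # random intercepts
    u1 = np.repeat(rng.normal(0, 0.02, n_id), n_wave)         # random slopes
    df["memory"] = (u0 + (-0.05 + u1 - 0.005 * df["pain"]) * df["years"]
                    + rng.normal(0, 0.3, len(df)))

    fit = smf.mixedlm("memory ~ years * pain", df,
                      groups=df["pid"], re_formula="~years").fit()
    print(fit.params["years:pain"])   # extra decline per year with persistent pain
    ```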

  19. Finding SDSS Galaxy Clusters in 4-dimensional Color Space Using the False Discovery Rate

    NASA Astrophysics Data System (ADS)

    Nichol, R. C.; Miller, C. J.; Reichart, D.; Wasserman, L.; Genovese, C.; SDSS Collaboration

    2000-12-01

    We describe a recently developed statistical technique that provides a meaningful cut-off in probability-based decision making. We are concerned with multiple testing, where each test produces a well-defined probability (or p-value). By well-defined, we mean that the null hypothesis used to determine the p-value is fully understood and appropriate. The method is called the False Discovery Rate (FDR), and its largest advantage over other measures is that it allows one to specify a maximal amount of acceptable error. As an example of this tool, we apply FDR to a four-dimensional clustering algorithm using SDSS data. For each galaxy (or test galaxy), we count the number of neighbors that fit within one standard deviation of a four-dimensional Gaussian centered on that test galaxy. The mean and standard deviation of that Gaussian are determined from the colors and errors of the test galaxy. We then take that same Gaussian and place it on a random selection of n galaxies and make a similar count. In the limit of large n, we expect the median count around these random galaxies to represent a typical field galaxy. For every test galaxy we determine the probability (or p-value) that it is a field galaxy based on these counts. A low p-value implies that the test galaxy is in a cluster environment. Once we have a p-value for every galaxy, we use FDR to determine at what level we should make our probability cut-off. Once this cut-off is made, we have a final sample of cluster-like galaxies, and, using FDR, we also know the maximum amount of field contamination in our cluster galaxy sample. We present our preliminary galaxy clustering results using these methods.
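
    The Benjamini-Hochberg step-up procedure underlying FDR is compact enough to state in code; the sketch below gives the generic cut-off selection (the SDSS-specific p-value construction from neighbor counts is omitted, and the toy p-values are synthetic).

    ```python
    # Standard Benjamini-Hochberg FDR cut-off: find the largest sorted p-value
    # satisfying p_(i) <= alpha * i / m, and reject everything at or below it.
    import numpy as np

    def fdr_threshold(pvals, alpha=0.05):
        """Return the p-value cut-off controlling the false discovery rate at alpha."""
        p = np.sort(np.asarray(pvals))
        m = len(p)
        below = p <= alpha * np.arange(1, m + 1) / m   # BH step-up condition
        return p[below].max() if below.any() else 0.0

    pvals = np.concatenate([np.random.default_rng(9).uniform(size=900),
                            np.random.default_rng(10).beta(0.1, 10, size=100)])
    cut = fdr_threshold(pvals, alpha=0.10)
    print(f"reject {np.sum(pvals <= cut)} of {len(pvals)} tests at cut-off {cut:.4g}")
    ```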

  20. Methodology Series Module 5: Sampling Strategies.

    PubMed

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the 'sampling method'. There are essentially two types of sampling methods: 1) probability sampling - based on chance events (such as random numbers, flipping a coin, etc.); and 2) non-probability sampling - based on the researcher's choice, using a population that is accessible and available. Some of the non-probability sampling methods are purposive sampling, convenience sampling, and quota sampling. Random sampling methods (such as the simple random sample or the stratified random sample) are forms of probability sampling. It is important to understand the different sampling methods used in clinical studies and to state the method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term 'random sample' when a convenience sample was actually used). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the 'generalizability' of the results. In such a scenario, the researcher may want to use 'purposive sampling' for the study.
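
    The contrast between the two probability sampling designs named above can be made concrete with a toy sampling frame; the strata and sizes below are arbitrary illustrations, not a recommendation from the module.

    ```python
    # Toy contrast: a simple random sample versus a proportionate stratified
    # random sample drawn from a synthetic sampling frame.
    import numpy as np

    rng = np.random.default_rng(11)
    frame = np.arange(10_000)
    strata = rng.integers(0, 3, size=frame.size)      # e.g., three clinics

    srs = rng.choice(frame, size=300, replace=False)  # simple random sample

    stratified = np.concatenate([
        rng.choice(frame[strata == s],
                   size=round(300 * np.mean(strata == s)), replace=False)
        for s in np.unique(strata)])                  # proportional allocation
    print(len(srs), len(stratified))
    ```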

  1. Prediction of beta-turns from amino acid sequences using the residue-coupled model.

    PubMed

    Guruprasad, K; Shukla, S

    2003-04-01

    We evaluated the prediction of beta-turns from amino acid sequences using the residue-coupled model with an enlarged representative protein data set selected from the Protein Data Bank. Our results show that probability values derived from a data set comprising 425 protein chains yielded an overall beta-turn prediction accuracy of 68.74%, compared with the 94.7% reported earlier on a data set of 30 proteins using the same method. However, we noted that the overall beta-turn prediction accuracy using probability values derived from the 30-protein data set drops to 40.74% when tested on the 425-chain data set. In contrast, using probability values derived from the 425-chain data set used in this analysis, the overall beta-turn prediction accuracy was consistent whether tested on the 30-protein data set used earlier (64.62%), on a more recent representative data set comprising 619 protein chains (64.66%), or on a jackknife data set comprising 476 representative protein chains (63.38%). We therefore recommend the use of the probability values derived from the 425 representative protein chains reported here, which give more realistic and consistent predictions of beta-turns from amino acid sequences.
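
    For flavor, a heavily simplified positional-propensity scorer in the spirit of turn prediction is sketched below; it is not the residue-coupled model itself, and the probability table is randomly generated for illustration rather than derived from a real training set.

    ```python
    # Simplified sketch: score each 4-residue window by the product of assumed
    # position-specific probabilities and flag a turn above a threshold.
    import numpy as np

    AAS = "ACDEFGHIKLMNPQRSTVWY"
    rng = np.random.default_rng(12)
    # Hypothetical P(residue at turn position j); in practice these would be
    # estimated from observed beta-turns in a representative data set.
    prob = rng.dirichlet(np.ones(20), size=4)   # shape (4 positions, 20 residues)

    def turn_scores(seq, threshold=1e-5):
        idx = [AAS.index(a) for a in seq]
        out = []
        for i in range(len(seq) - 3):
            score = np.prod([prob[j, idx[i + j]] for j in range(4)])
            out.append((i, score, score > threshold))
        return out

    for i, s, hit in turn_scores("ACDGPNSTK")[:3]:
        print(f"window {i}: score={s:.2e} turn={hit}")
    ```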

  2. Late Paleocene Arctic Ocean shallow-marine temperatures from mollusc stable isotopes

    USGS Publications Warehouse

    Bice, Karen L.; Arthur, Michael A.; Marincovich, Louie

    1996-01-01

    Late Paleocene high-latitude (80°N) Arctic Ocean shallow-marine temperatures are estimated from molluscan δ18O time series. Sampling of individual growth increments of two specimens of the bivalve Camptochlamys alaskensis provides a high-resolution record of shell stable isotope composition. The heavy carbon isotopic values of the specimens support a late Paleocene age for the youngest marine beds of the Prince Creek Formation exposed near Ocean Point, Alaska. The oxygen isotopic composition of regional freshwater runoff is estimated from the mean δ18O value of two freshwater bivalves collected from approximately coeval fluviatile beds. Over a 30 – 34‰ range of salinity, values assumed to represent the tolerance of C. alaskensis, the mean annual shallow-marine temperature recorded by these individuals is between 11° and 22°C. These values could represent maximum estimates of the mean annual temperature because of a possible warm-month bias imposed on the average δ18O value by slowing or cessation of growth in winter months. The amplitude of the molluscan δ18O time series probably records most of the seasonality in shallow-marine temperature. The annual temperature range indicated is approximately 6°C, suggesting very moderate high-latitude marine temperature seasonality during the late Paleocene. On the basis of analogy with modern Chlamys species, C. alaskensis probably inhabited water depths of 30–50 m. The seasonal temperature range derived from δ18O is therefore likely to be damped relative to the full range of annual sea surface temperatures. High-resolution sampling of molluscan shell material across inferred growth bands represents an important proxy record of seasonality of marine and freshwater conditions applicable at any latitude. If applied to other regions and time periods, the approach used here would contribute substantially to the paleoclimate record of seasonality.

  3. Red-shouldered hawk occupancy surveys in central Minnesota, USA

    USGS Publications Warehouse

    Henneman, C.; McLeod, M.A.; Andersen, D.E.

    2007-01-01

    Forest-dwelling raptors are often difficult to detect because many species occur at low density or are secretive. Broadcasting conspecific vocalizations can increase the probability of detecting forest-dwelling raptors and has been shown to be an effective method for locating raptors and assessing their relative abundance. Recent advances in statistical techniques based on presence-absence data use probabilistic arguments to derive probability of detection when it is <1 and to provide a model and likelihood-based method for estimating proportion of sites occupied. We used these maximum-likelihood models with data from red-shouldered hawk (Buteo lineatus) call-broadcast surveys conducted in central Minnesota, USA, in 1994-1995 and 2004-2005. Our objectives were to obtain estimates of occupancy and detection probability 1) over multiple sampling seasons (yr), 2) incorporating within-season time-specific detection probabilities, 3) with call type and breeding stage included as covariates in models of probability of detection, and 4) with different sampling strategies. We visited individual survey locations 2-9 times per year, and estimates of both probability of detection (range = 0.28-0.54) and site occupancy (range = 0.81-0.97) varied among years. Detection probability was affected by inclusion of a within-season time-specific covariate, call type, and breeding stage. In 2004 and 2005 we used survey results to assess the effect that number of sample locations, double sampling, and discontinued sampling had on parameter estimates. We found that estimates of probability of detection and proportion of sites occupied were similar across different sampling strategies, and we suggest ways to reduce sampling effort in a monitoring program.

  4. Methodology Series Module 5: Sampling Strategies

    PubMed Central

    Setia, Maninder Singh

    2016-01-01

    Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the ‘sampling method’. There are essentially two types of sampling methods: 1) probability sampling, based on chance events (such as random numbers, flipping a coin, etc.); and 2) non-probability sampling, based on the researcher's choice or on whichever population is accessible and available. Some of the non-probability sampling methods are purposive sampling, convenience sampling, and quota sampling. Random sampling methods (such as simple random sampling or stratified random sampling) are forms of probability sampling. It is important to understand the different sampling methods used in clinical studies and to state the method clearly in the manuscript. The researcher should not misrepresent the sampling method (for example, by using the term ‘random sample’ when a convenience sample was actually used). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the ‘generalizability’ of the results. In such a scenario, the researcher may want to use ‘purposive sampling’ for the study. PMID:27688438
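
    The two families of methods named in this abstract are easy to make concrete. A minimal sketch (the frame and sample sizes are invented for illustration):

```python
import random

frame = [f"patient_{i:03d}" for i in range(500)]   # hypothetical sampling frame
random.seed(42)

# Probability sampling: selection is driven by chance with known inclusion
# probabilities. Simple random sample of 50:
srs = random.sample(frame, k=50)

# Stratified random sample: simple random samples within predefined strata.
strata = {"clinic_A": frame[:250], "clinic_B": frame[250:]}
stratified = [u for s in strata.values() for u in random.sample(s, k=25)]

# Non-probability (convenience) sample: whoever is easiest to reach, e.g. the
# first 50 on the list; inclusion probabilities are unknown.
convenience = frame[:50]

# Non-probability (purposive) sample: units deliberately chosen by a criterion.
purposive = [u for u in frame if u.endswith("7")][:50]
```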

  5. Ensembles of Spiking Neurons with Noise Support Optimal Probabilistic Inference in a Dynamically Changing Environment

    PubMed Central

    Legenstein, Robert; Maass, Wolfgang

    2014-01-01

    It has recently been shown that networks of spiking neurons with noise can emulate simple forms of probabilistic inference through “neural sampling”, i.e., by treating spikes as samples from a probability distribution of network states that is encoded in the network. Deficiencies of the existing model are its reliance on single neurons for sampling from each random variable, and the resulting limitation in representing quickly varying probabilistic information. We show that both deficiencies can be overcome by moving to a biologically more realistic encoding of each salient random variable through the stochastic firing activity of an ensemble of neurons. The resulting model demonstrates that networks of spiking neurons with noise can easily track and carry out basic computational operations on rapidly varying probability distributions, such as the odds of getting rewarded for a specific behavior. We demonstrate the viability of this new approach towards neural coding and computation, which makes use of the inherent parallelism of generic neural circuits, by showing that this model can explain experimentally observed firing activity of cortical neurons for a variety of tasks that require rapid temporal integration of sensory information. PMID:25340749

  6. Exponentially-Biased Ground-State Sampling of Quantum Annealing Machines with Transverse-Field Driving Hamiltonians

    NASA Technical Reports Server (NTRS)

    Mandra, Salvatore

    2017-01-01

    We study the performance of the D-Wave 2X quantum annealing machine on systems with well-controlled ground-state degeneracy. While obtaining the ground state of a spin-glass benchmark instance represents a difficult task, the gold standard for any optimization algorithm or machine is to sample all solutions that minimize the Hamiltonian with more or less equal probability. Our results show that while naive transverse-field quantum annealing on the D-Wave 2X device can find the ground-state energy of the problems, it is not well suited to identifying all degenerate ground-state configurations associated with a particular instance. Even worse, some states are exponentially suppressed, in agreement with previous studies on toy model problems [New J. Phys. 11, 073021 (2009)]. These results suggest that more complex driving Hamiltonians are needed in future quantum annealing machines to ensure a fair sampling of the ground-state manifold.

  7. What Are Probability Surveys used by the National Aquatic Resource Surveys?

    EPA Pesticide Factsheets

    The National Aquatic Resource Surveys (NARS) use probability-survey designs to assess the condition of the nation’s waters. In probability surveys (also known as sample-surveys or statistical surveys), sampling sites are selected randomly.

  8. Phosphatized algal-bacterial assemblages in Late Cretaceous phosphorites of the Voronezh Anteclise

    NASA Astrophysics Data System (ADS)

    Maleonkina, Svetlana Y.

    2003-01-01

    Late Cretaceous phosphogenesis in the Voronezh Anteclise occurred during the Cenomanian and Early Campanian. SEM studies show the presence of phosphatized algal-bacterial assemblages in both Cenomanian and Campanian phosphorites. Some Cenomanian nodular phosphorite samples revealed empty tubes 1–5 microns in diameter, which are most likely trichomes of cyanobacterial filaments. Other samples contained accumulations of spheres 0.5–3 microns across, similar to coccoidal bacteria. Complicated tubular forms of variable diameter (2–5 microns) occur on the surfaces of some quartz grains in nodules; they are probably pseudomorphs after algae. We found similar formations in the Campanian phosphate grains. Frequently, a grain represents a cyanobacterial mat, sometimes concentrically coated by phosphatic films. The films of some grains retain the primary structure: their concentric layers are formed by pseudomorphs after different bacterial types, and they evidently represent oncolites. In other cases, the primary structure is unobservable because recrystallization has erased it. Occasionally, the central part retains the coccoidal structure and the recrystallization affects only the films. Moreover, the core of such an oncolite can be formed not only by a phosphatic grain but also by grains of other minerals, such as quartz, glauconite and heavy minerals, which serve as substrates for cyanobacterial colonies. Bacteria could also settle on cavity surfaces and on the interior frames of sponge fragments, teeth and bones.

  9. Variation of Time Domain Failure Probabilities of Jack-up with Wave Return Periods

    NASA Astrophysics Data System (ADS)

    Idris, Ahmad; Harahap, Indra S. H.; Ali, Montassir Osman Ahmed

    2018-04-01

    This study evaluated failure probabilities of jack-up units within the framework of time-dependent reliability analysis, using uncertainty from different sea states representing different return periods of the design wave. Surface elevation for each sea state was represented by the Karhunen-Loeve expansion method, using the eigenfunctions of prolate spheroidal wave functions, in order to obtain the wave load. The stochastic wave load was propagated through a simplified jack-up model developed in commercial software to obtain the structural response due to the wave loading. The stochastic response was then analyzed with Matlab codes developed on a personal computer to determine the probability of failure by excessive deck displacement within the time-dependent reliability framework. Results from the study indicate that the failure probability increases with the severity of the sea state, i.e., with a longer return period. Although these results agree with a study of a similar jack-up model using a time-independent method at higher values of the maximum allowable deck displacement, they contrast with it at lower values of the criterion, where that study reported failure probability decreasing as the severity of the sea state increases.
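
    The simulation loop sketched in this abstract (stochastic sea surface, structural response, exceedance counting) can be reduced to a few lines. The sketch below substitutes a random-phase spectral sum for the paper's Karhunen-Loeve expansion and a static gain for the jack-up structural model, so the spectrum constants, `gain` and `limit` are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1800, 1800)   # a 30-minute storm realization, 1 s step

def surface_elevation(hs, tp, n_comp=50):
    """Random sea-surface elevation as a sum of cosines with random phases.
    (A simpler stand-in for the Karhunen-Loeve expansion used in the study.)"""
    w = np.linspace(0.2, 2.0, n_comp)        # angular frequencies (rad/s)
    dw = w[1] - w[0]
    wp = 2.0 * np.pi / tp                    # peak frequency
    s = 0.3125 * hs**2 * wp**4 / w**5 * np.exp(-1.25 * (wp / w)**4)  # PM-type spectrum
    amp = np.sqrt(2.0 * s * dw)
    phase = rng.uniform(0.0, 2.0 * np.pi, n_comp)
    return (amp[:, None] * np.cos(np.outer(w, t) + phase[:, None])).sum(axis=0)

def failure_probability(hs, tp, limit=0.30, n_sim=200, gain=0.04):
    """P(max 'deck displacement' exceeds `limit` m during the storm); `gain` is
    a hypothetical static transfer from wave elevation to displacement."""
    fails = sum(np.max(np.abs(gain * surface_elevation(hs, tp))) > limit
                for _ in range(n_sim))
    return fails / n_sim

# Failure probability should rise with sea-state severity (longer return period):
for hs, tp in [(6.0, 10.0), (9.0, 12.0), (12.0, 14.0)]:
    print(f"Hs = {hs:4.1f} m: Pf ~ {failure_probability(hs, tp):.3f}")
```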

  10. Size scales over which ordinary chondrites and their parent asteroids are homogeneous in oxidation state and oxygen-isotopic composition

    NASA Astrophysics Data System (ADS)

    Rubin, Alan E.; Ziegler, Karen; Young, Edward D.

    2008-02-01

    Literature data demonstrate that on a global, asteroid-wide scale (plausibly on the order of 100 km), ordinary chondrites (OC) have heterogeneous oxidation states and O-isotopic compositions (represented, respectively, by the mean olivine Fa and bulk Δ17O compositions of equilibrated samples). Samples analyzed here include: (a) two H5 chondrite Antarctic finds (ALHA79046 and TIL 82415) that have the same cosmic-ray exposure age (7.6 Ma) and were probably within ˜1 km of each other when they were excavated from the H-chondrite parent body, (b) different individual stones from the Holbrook L/LL6 fall that were probably within ˜1 m of each other when their parent meteoroid penetrated the Earth's atmosphere, and (c) drill cores from a large slab of the Estacado H6 find located within a few tens of centimeters of each other. Our results indicate that OC are heterogeneous in their bulk oxidation state and O-isotopic composition on 100-km-size scales, but homogeneous on meter-, decimeter- and centimeter-size scales. (On kilometer size scales, oxidation state is heterogeneous, but O isotopes appear to be homogeneous.) The asteroid-wide heterogeneity in oxidation state and O-isotopic composition was inherited from the solar nebula. The homogeneity on small size scales was probably caused in part by fluid-assisted metamorphism and mainly by impact-gardening processes (which are most effective at mixing target materials on scales of ⩽1 m).

  11. Systematic sampling for suspended sediment

    Treesearch

    Robert B. Thomas

    1991-01-01

    Abstract - Because of high costs or complex logistics, scientific populations cannot be measured entirely and must be sampled. Accepted scientific practice holds that sample selection be based on statistical principles to assure objectivity when estimating totals and variances. Probability sampling--obtaining samples with known probabilities--is the only method that...

  12. On the use of secondary capture-recapture samples to estimate temporary emigration and breeding proportions

    USGS Publications Warehouse

    Kendall, W.L.; Nichols, J.D.; North, P.M.; Nichols, J.D.

    1995-01-01

    The use of the Cormack-Jolly-Seber model under a standard sampling scheme of one sample per time period, when the Jolly-Seber assumption that all emigration is permanent does not hold, leads to the confounding of temporary emigration probabilities with capture probabilities. This biases the estimates of capture probability when temporary emigration is a completely random process, and of both capture and survival probabilities when temporary emigration involves a temporary trap response or is Markovian. The use of secondary capture samples over a shorter interval within each period, during which the population is assumed to be closed (Pollock's robust design), provides a second source of information on capture probabilities. This solves the confounding problem, and thus temporary emigration probabilities can be estimated. This can be accomplished in an ad hoc fashion for completely random temporary emigration, and to some extent in the temporary trap-response case, but modelling the complete sampling process provides more flexibility and permits direct estimation of variances. For the case of Markovian temporary emigration, a full likelihood is required.
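
    The confounding argument can be demonstrated numerically. In the sketch below (all parameter values invented), the robust design's secondary samples identify the capture probability p on its own, after which the temporary-emigration probability falls out of the confounded product p(1 - gamma); the known simulated population size is used only to keep the toy short:

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(3)
N, p, gamma, k = 2000, 0.5, 0.3, 5   # population, capture prob, P(absent), secondary samples

present = rng.random(N) < (1.0 - gamma)            # in the study area this primary period
caps = rng.random((N, k)) < (p * present[:, None]) # k secondary capture occasions
seen = caps.any(axis=1)

# Robust design: closed-population data within the period identify p alone.
# Conditional-on-detection moment equation: E[caps | seen] = k*p / (1-(1-p)^k)
mean_caps = caps[seen].sum() / seen.sum()
p_hat = brentq(lambda q: mean_caps - k * q / (1.0 - (1.0 - q)**k), 1e-6, 1 - 1e-6)

# A single sample per period only reveals the confounded product p*(1-gamma):
p_eff_hat = caps[:, 0].sum() / N
gamma_hat = 1.0 - p_eff_hat / p_hat
print(f"p_hat ~ {p_hat:.2f}, gamma_hat ~ {gamma_hat:.2f}")   # ~0.5 and ~0.3
```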

  13. [Adult mortality differentials in Argentina].

    PubMed

    Rofman, R

    1994-06-01

    Adult mortality differentials in Argentina are estimated and analyzed using data from the National Social Security Administration. The study of adult mortality has attracted little attention in developing countries because of the scarcity of reliable statistics and the greater importance assigned to demographic phenomena traditionally associated with development, such as infant mortality and fertility. A sample of 39,421 records of retired persons surviving as of June 30, 1988, was analyzed by age, sex, region of residence, relative amount of pension, and social security fund of membership prior to the consolidation of the system in 1967. The thirteen former funds were grouped into the five categories of government, commerce, industry, self-employed, and other, which were assumed to be proxies for the activity sector in which the individual spent his active life. The sample is not representative of the Argentine population, since it excludes the lowest and highest socioeconomic strata and overrepresents men and urban residents. It is, however, believed to be adequate for explaining mortality differentials for most of the population covered by the social security system. The study methodology was based on the technique of logistic analysis and on the use of regional model life tables developed by Coale and others. To evaluate the effect of the study variables on the probability of dying, a maximum-likelihood regression model was estimated. The model relates the logit of the probability of death between ages 65 and 95 to the available explanatory variables, including their possible interactions. Life tables were constructed by sex, region of residence, previous pension fund, and income. As a test of external consistency, a model including only age and sex as explanatory variables was constructed using the same methodology. The results confirmed consistency between the estimated values and other published estimates. A significant conclusion of the study was that social security data are a satisfactory source for study of adult mortality, a finding of importance in cases where vital statistics systems are deficient. Mortality differentials by income level and activity sector were significant, representing up to 11.5 years in life expectancy at age 20 and 4.4 years at age 65. Mortality differentials by region were minor, probably due to the nature of the sample. The lowest observed mortality levels were in own-account workers, independent professionals, and small businessmen.
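
    The 'logit of the probability of death between ages 65 and 95' has a compact algebraic form. As a sketch of the model type described (the covariate coding shown is a placeholder, not the study's actual specification):

```latex
% q = probability of dying between exact ages 65 and 95 (30q65 in life-table notation)
\operatorname{logit}(q) = \ln\frac{q}{1-q}
  = \beta_0 + \beta_1\,\mathrm{sex} + \beta_2\,\mathrm{region}
    + \beta_3\,\mathrm{fund} + \beta_4\,\mathrm{income} + \text{(interactions)}
```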

  14. Sexual diversity in the United States: Results from a nationally representative probability sample of adult women and men

    PubMed Central

    Herbenick, Debby; Bowling, Jessamyn; Fu, Tsung-Chieh (Jane); Dodge, Brian; Guerra-Reyes, Lucia; Sanders, Stephanie

    2017-01-01

    In 2015, we conducted a cross-sectional, Internet-based, U.S. nationally representative probability survey of 2,021 adults (975 men, 1,046 women) focused on a broad range of sexual behaviors. Individuals invited to participate were from the GfK KnowledgePanel®. The survey was titled the 2015 Sexual Exploration in America Study and survey completion took about 12 to 15 minutes. The survey was confidential and the researchers never had access to respondents’ identifiers. Respondents reported on demographic items, lifetime and recent sexual behaviors, and the appeal of 50+ sexual behaviors. Most (>80%) reported lifetime masturbation, vaginal sex, and oral sex. Lifetime anal sex was reported by 43% of men (insertive) and 37% of women (receptive). Common lifetime sexual behaviors included wearing sexy lingerie/underwear (75% women, 26% men), sending/receiving digital nude/semi-nude photos (54% women, 65% men), reading erotic stories (57% of participants), public sex (≥43%), role-playing (≥22%), tying/being tied up (≥20%), spanking (≥30%), and watching sexually explicit videos/DVDs (60% women, 82% men). Having engaged in threesomes (10% women, 18% men) and playful whipping (≥13%) were less common. Lifetime group sex, sex parties, taking a sexuality class/workshop, and going to BDSM parties were uncommon (each <8%). More Americans identified behaviors as “appealing” than had engaged in them. Romantic/affectionate behaviors were among those most commonly identified as appealing for both men and women. The appeal of particular behaviors was associated with greater odds that the individual had ever engaged in the behavior. This study contributes to our understanding of more diverse adult sexual behaviors than has previously been captured in U.S. nationally representative probability surveys. Implications for sexuality educators, clinicians, and individuals in the general population are discussed. PMID:28727762

  15. Pairing call-response surveys and distance sampling for a mammalian carnivore

    USGS Publications Warehouse

    Hansen, Sara J. K.; Frair, Jacqueline L.; Underwood, Harold B.; Gibbs, James P.

    2015-01-01

    Density estimates accounting for differential animal detectability are difficult to acquire for wide-ranging and elusive species such as mammalian carnivores. Pairing distance sampling with call-response surveys may provide an efficient means of tracking changes in populations of coyotes (Canis latrans), a species of particular interest in the eastern United States. Blind field trials in rural New York State indicated 119-m linear error for triangulated coyote calls, and a 1.8-km distance threshold for call detectability, which was sufficient to estimate a detection function with precision using distance sampling. We conducted statewide road-based surveys with sampling locations spaced ≥6 km apart from June to August 2010. Each detected call (whether from a single animal or a group) counted as a single object, representing 1 territorial pair, because of uncertainty in the number of vocalizing animals. From 524 survey points and 75 detections, we estimated the probability of detecting a calling coyote to be 0.17 ± 0.02 SE, yielding a detection-corrected index of 0.75 pairs/10 km² (95% CI: 0.52–1.1, 18.5% CV) for a minimum of 8,133 pairs across rural New York State. Importantly, we consider this an index rather than a true estimate of abundance given the unknown probability of coyote availability for detection during our surveys. Even so, pairing distance sampling with call-response surveys provided a novel, efficient, and noninvasive means of monitoring populations of wide-ranging and elusive, albeit reliably vocal, mammalian carnivores. Our approach offers an effective new means of tracking species like coyotes, one that is readily extendable to other species and geographic extents, provided key assumptions of distance sampling are met.
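
    The distance sampling step referred to here fits a detection function to the observed distances and converts counts to a detection-corrected density. A minimal sketch using a half-normal detection function on a point-based survey (the toy distances and the resulting values are illustrative, not the study's data):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

w = 1.8                                           # detection threshold (km)
g = lambda r, s: np.exp(-r**2 / (2.0 * s**2))     # half-normal detection function

def neg_log_lik(s, r_obs):
    # pdf of observed distances on a point survey: f(r) proportional to 2*pi*r*g(r)
    norm_const = quad(lambda r: 2.0 * np.pi * r * g(r, s), 0.0, w)[0]
    return -np.sum(np.log(2.0 * np.pi * r_obs * g(r_obs, s) / norm_const))

# Toy radial distances (a Rayleigh draw has exactly the r*half-normal shape):
rng = np.random.default_rng(7)
r_obs = rng.rayleigh(0.8, 120)
r_obs = r_obs[r_obs < w][:75]                     # truncate at w, keep 75 detections

sigma = minimize_scalar(neg_log_lik, bounds=(0.05, 5.0), args=(r_obs,),
                        method="bounded").x
p_det = quad(lambda r: 2.0 * np.pi * r * g(r, sigma), 0.0, w)[0] / (np.pi * w**2)

n_detections, n_points = len(r_obs), 524
density = n_detections / (n_points * np.pi * w**2 * p_det)   # pairs per km^2
print(f"p_det = {p_det:.2f}, index = {10 * density:.2f} pairs per 10 km^2")
```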

  16. Propensity, Probability, and Quantum Theory

    NASA Astrophysics Data System (ADS)

    Ballentine, Leslie E.

    2016-08-01

    Quantum mechanics and probability theory share one peculiarity. Both have well established mathematical formalisms, yet both are subject to controversy about the meaning and interpretation of their basic concepts. Since probability plays a fundamental role in QM, the conceptual problems of one theory can affect the other. We first classify the interpretations of probability into three major classes: (a) inferential probability, (b) ensemble probability, and (c) propensity. Class (a) is the basis of inductive logic; (b) deals with the frequencies of events in repeatable experiments; (c) describes a form of causality that is weaker than determinism. An important, but neglected, paper by P. Humphreys demonstrated that propensity must differ mathematically, as well as conceptually, from probability, but he did not develop a theory of propensity. Such a theory is developed in this paper. Propensity theory shares many, but not all, of the axioms of probability theory. As a consequence, propensity supports the Law of Large Numbers from probability theory, but does not support Bayes' theorem. Although there are particular problems within QM to which any of the classes of probability may be applied, it is argued that the intrinsic quantum probabilities (calculated from a state vector or density matrix) are most naturally interpreted as quantum propensities. This does not alter the familiar statistical interpretation of QM. But the interpretation of quantum states as representing knowledge is untenable. Examples show that a density matrix fails to represent knowledge.
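
    The mathematical divergence attributed to Humphreys can be stated in one line. Bayes' theorem requires conditionals to invert:

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
```

    A propensity, being a weak form of causality, runs from condition to outcome; if B is a later outcome of condition A, the inverted conditional "propensity of A given B" has no causal reading. That is why a propensity calculus can retain the Law of Large Numbers while rejecting the inversion above. (This is a paraphrase of the argument as summarized in this abstract, not of Humphreys' original paper.)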

  17. Large landslides from oceanic volcanoes

    USGS Publications Warehouse

    Holcomb, R.T.; Searle, R.C.

    1991-01-01

    Large landslides are ubiquitous around the submarine flanks of Hawaiian volcanoes, and GLORIA has also revealed large landslides offshore from Tristan da Cunha and El Hierro. On both of the latter islands, steep flanks formerly attributed to tilting or marine erosion have been reinterpreted as landslide headwalls mantled by younger lava flows. These landslides occur in a wide range of settings and probably represent only a small sample from a large population. They may explain the large volumes of archipelagic aprons and the stellate shapes of many oceanic volcanoes. Large landslides and associated tsunamis pose hazards to many islands. -from Authors

  18. Turnover among Filipino nurses in Ministry of Health hospitals in Saudi Arabia: causes and recommendations for improvement.

    PubMed

    Aljohani, Khalid Abdullah; Alomari, Omar

    2018-01-01

    Nurse turnover is a critical challenge for healthcare organizations as it results in a decreasing nurse/patient ratio and increasing costs. The study aimed to identify factors influencing the termination of Filipino nurses in Ministry of Health (MOH) hospitals and to record nurse recommendations for improving retention. The design was cross-sectional. Data were gathered from a convenience sample of Filipino nurses with previous experience in MOH hospitals in Saudi Arabia who attended recruitment interviews at the Saudi employment office in Manila. The sample included 124 nurses. Major turnover factors included low salary (18.3%), low nurse/patient ratio (15%), end of contract (14.5%), discrimination (13.5%), and bad accommodations (9%). Suggested areas of improvement included financial motivations (34%), administration support (25%), quality of life (25%), and work environment (16%). Nurse turnover can be managed at the organizational as well as the MOH level, and the recommendations given by the participants provide direct targets for improving retention. Because of the convenience sampling, the sample is probably not representative of the Filipino nursing population.

  19. Critical review of the United Kingdom's "gold standard" survey of public attitudes to science.

    PubMed

    Smith, Benjamin K; Jensen, Eric A

    2016-02-01

    Since 2000, the UK government has funded surveys aimed at understanding the UK public's attitudes toward science, scientists, and science policy. Known as the Public Attitudes to Science series, these surveys and their predecessors have long been used in UK science communication policy, practice, and scholarship as a source of authoritative knowledge about science-related attitudes and behaviors. Given their importance and the significant public funding investment they represent, detailed academic scrutiny of the studies is needed. In this essay, we critically review the most recently published Public Attitudes to Science survey (2014), assessing the robustness of its methods and claims. The review casts doubt on the quality of key elements of the Public Attitudes to Science 2014 survey data and analysis while highlighting the importance of robust quantitative social research methodology. Our analysis comparing the main sample and booster sample for young people demonstrates that quota sampling cannot be assumed equivalent to probability-based sampling techniques. © The Author(s) 2016.

  1. Probability Issues in without Replacement Sampling

    ERIC Educational Resources Information Center

    Joarder, A. H.; Al-Sabah, W. S.

    2007-01-01

    Sampling without replacement is an important aspect in teaching conditional probabilities in elementary statistics courses. Different methods proposed in different texts for calculating probabilities of events in this context are reviewed and their relative merits and limitations in applications are pinpointed. An alternative representation of…

  2. Meteorite Dunite Breccia MIL 03443: A Probable Crustal Cumulate Closely Related to Diogenites from the HED Parent Asteroid

    NASA Technical Reports Server (NTRS)

    Mittlefehldt, David W.

    2008-01-01

    There are numerous types of differentiated meteorites, but most represent either the crusts or cores of their parent asteroids. Ureilites, olivine-pyroxene-graphite rocks, are exceptions; they are mantle restites [1]. Dunite is expected to be a common mantle lithology in differentiated asteroids. In particular, models of the eucrite parent asteroid contain large volumes of dunite mantle [2-4]. Yet dunites are very rare among meteorites, and none are known to be associated with the howardite, eucrite, diogenite (HED) suite. Spectroscopic measurements of 4 Vesta, the probable HED parent asteroid, show one region with an olivine signature [5], although the surface is dominated by basaltic and orthopyroxenitic material equated with eucrites and diogenites [6]. One might expect that a small number of dunitic or olivine-rich meteorites would be delivered along with the HED suite. The 46-gram meteoritic dunite MIL 03443 (Fig. 1) was recovered from the Miller Range ice field of Antarctica. This meteorite was tentatively classified as a mesosiderite because large, dunitic clasts are found in this type of meteorite, but it was noted that MIL 03443 could represent a dunite sample of the HED suite [7]. Here I present a preliminary petrologic study of two thin sections of this meteorite.

  3. Dermoscopic clues to differentiate facial lentigo maligna from pigmented actinic keratosis.

    PubMed

    Lallas, A; Tschandl, P; Kyrgidis, A; Stolz, W; Rabinovitz, H; Cameron, A; Gourhant, J Y; Giacomel, J; Kittler, H; Muir, J; Argenziano, G; Hofmann-Wellenhof, R; Zalaudek, I

    2016-05-01

    Dermoscopy is of limited accuracy in differentiating between pigmented lentigo maligna (LM) and pigmented actinic keratosis (PAK). This might be related to the fact that most studies have focused on pigmented criteria only, without considering additional recognizable features. To investigate the diagnostic accuracy of established dermoscopic criteria for pigmented LM and PAK, we included in the evaluation features previously associated with nonpigmented facial actinic keratosis. Retrospectively enrolled cases of histopathologically diagnosed LM, PAK and solar lentigo/early seborrhoeic keratosis (SL/SK) were dermoscopically evaluated for the presence of predefined criteria. Univariate and multivariate regression analyses were performed and receiver operating characteristic curves were used. The study sample consisted of 70 LMs, 56 PAKs and 18 SL/SKs. In a multivariate analysis, the most potent predictors of LM were grey rhomboids (sixfold increased probability of LM), nonevident follicles (fourfold) and intense pigmentation (twofold). In contrast, white circles, scales and red colour were significantly correlated with PAK, conferring a 14-fold, eightfold and fourfold increase in the probability of PAK, respectively. The absence of evident follicles also represented a frequent LM criterion, characterizing 71% of LMs. White circles, evident follicles, scales and red colour represent significant diagnostic clues for PAK. Conversely, intense pigmentation and grey rhomboidal lines appear highly suggestive of LM. © 2015 British Association of Dermatologists.

  4. Autonomous learning derived from experimental modeling of physical laws.

    PubMed

    Grabec, Igor

    2013-05-01

    This article deals with the experimental description of physical laws by the probability density function of measured data. The Gaussian mixture model, specified by representative data and related probabilities, is utilized for this purpose. The information cost function of the model is expressed, in terms of information entropy, as the sum of the estimation error and the redundancy. A new method is proposed for searching for the minimum of the cost function. The number of resulting prototype data depends on the accuracy of measurement. Their adaptation resembles a self-organized, highly non-linear cooperation between neurons in an artificial neural network. A prototype datum corresponds to the memorized content, while the related probability corresponds to the excitability of the neuron. The method does not include any free parameters except the objectively determined accuracy of the measurement system and is therefore convenient for autonomous execution. Since representative data are generally less numerous than the measured ones, the method is applicable to a rather general and objective compression of overwhelming experimental data in automatic data-acquisition systems. Such compression is demonstrated on analytically determined random noise and on measured traffic flow data. The flow over a day is described by a vector of 24 components. The set of 365 vectors measured over one year is compressed by autonomous learning to just 4 representative vectors and related probabilities. These vectors represent the flow in normal working days and in weekends or holidays, while the related probabilities correspond to the relative frequencies of these days. This example reveals that autonomous learning yields a new basis for the interpretation of representative data and the optimal model structure. Copyright © 2012 Elsevier Ltd. All rights reserved.
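
    The compression described (365 daily 24-component flow vectors reduced to a few prototypes with probabilities) maps naturally onto Gaussian mixture fitting. A minimal sketch with synthetic stand-in data; the paper selects the number of prototypes from its information cost function, whereas here it is simply fixed:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic year of daily traffic-flow profiles (24 hourly values per day):
rng = np.random.default_rng(0)
hours = np.arange(24)
workday = 50 + 40 * np.exp(-(hours - 8)**2 / 8) + 35 * np.exp(-(hours - 17)**2 / 8)
weekend = 30 + 20 * np.exp(-(hours - 14)**2 / 40)
X = np.vstack([workday + rng.normal(0, 5, (260, 24)),    # ~working days
               weekend + rng.normal(0, 5, (105, 24))])   # ~weekends/holidays

# Compress the year to a handful of prototype vectors with probabilities:
gmm = GaussianMixture(n_components=4, covariance_type="diag",
                      random_state=0).fit(X)
for proto, prob in zip(gmm.means_, gmm.weights_):
    print(f"probability {prob:.2f}, peak flow {proto.max():.0f}")
```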

  5. The future of Stardust science

    NASA Astrophysics Data System (ADS)

    Westphal, A. J.; Bridges, J. C.; Brownlee, D. E.; Butterworth, A. L.; de Gregorio, B. T.; Dominguez, G.; Flynn, G. J.; Gainsforth, Z.; Ishii, H. A.; Joswiak, D.; Nittler, L. R.; Ogliore, R. C.; Palma, R.; Pepin, R. O.; Stephan, T.; Zolensky, M. E.

    2017-09-01

    Recent observations indicate that >99% of the small bodies in the solar system reside in its outer reaches—in the Kuiper Belt and Oort Cloud. Kuiper Belt bodies are probably the best-preserved representatives of the icy planetesimals that dominated the bulk of the solid mass in the early solar system. They likely contain preserved materials inherited from the protosolar cloud, held in cryogenic storage since the formation of the solar system. Despite their importance, they are relatively underrepresented in our extraterrestrial sample collections by many orders of magnitude (~10^13 by mass) as compared with the asteroids, represented by meteorites, which are composed of materials that have generally been strongly altered by thermal and aqueous processes. We have only begun to scratch the surface in understanding Kuiper Belt objects, but it is already clear that the very limited samples of them that we have in our laboratories hold the promise of dramatically expanding our understanding of the formation of the solar system. Stardust returned the first samples from a known small solar system body, the Jupiter-family comet 81P/Wild 2, and, in a separate collector, the first solid samples from the local interstellar medium. The first decade of Stardust research resulted in more than 142 peer-reviewed publications, including 15 papers in Science. Analyses of these amazing samples continue to yield unexpected discoveries and to raise new questions about the history of the early solar system. We identify nine high-priority scientific objectives for future Stardust analyses that address important unsolved problems in planetary science.

  6. Assessing performance and validating finite element simulations using probabilistic knowledge

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dolin, Ronald M.; Rodriguez, E. A.

    Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability each event causes failure, along with the event's likelihood of occurrence, contributes to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur, while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
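
    The second assessment method lends itself to a compact sketch. Below, Latin-hypercube samples drive two hypothetical limit states and the overall failure probability is taken as the maximum over events, as the abstract describes; the distributions and limit states are invented, since the study's actual events are not given here:

```python
import numpy as np
from scipy.stats import norm, qmc

sampler = qmc.LatinHypercube(d=2, seed=1)
u = sampler.random(n=10_000)                 # stratified uniforms on [0, 1)^2
load = norm(100.0, 15.0).ppf(u[:, 0])        # hypothetical load (MPa)
strength = norm(130.0, 10.0).ppf(u[:, 1])    # hypothetical strength (MPa)

# Failure probability of each postulated event:
p_event_a = np.mean(load > strength)         # static overload
p_event_b = np.mean(1.25 * load > strength)  # overload with a 25% dynamic factor

# Stochastic-sampling rule from the abstract: overall failure probability is
# the maximum over all assessed events.
print(max(p_event_a, p_event_b))
```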

  7. Quantifying Mixed Uncertainties in Cyber Attacker Payoffs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chatterjee, Samrat; Halappanavar, Mahantesh; Tipireddy, Ramakrishna

    Representation and propagation of uncertainty in cyber attacker payoffs is a key aspect of security games. Past research has primarily focused on representing the defender’s beliefs about attacker payoffs as point utility estimates. More recently, within the physical security domain, attacker payoff uncertainties have been represented as Uniform and Gaussian probability distributions, and intervals. Within cyber-settings, continuous probability distributions may still be appropriate for addressing statistical (aleatory) uncertainties where the defender may assume that the attacker’s payoffs differ over time. However, systematic (epistemic) uncertainties may exist, where the defender may not have sufficient knowledge or there is insufficient information about the attacker’s payoff generation mechanism. Such epistemic uncertainties are more suitably represented as probability boxes with intervals. In this study, we explore the mathematical treatment of such mixed payoff uncertainties.
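
    A probability box is simple to realize in code: epistemic uncertainty widens a single CDF into an envelope, and every query returns an interval. A minimal sketch (the payoff family and all numbers are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

# Attacker payoff: aleatory part Normal(mu, 5); epistemically, mu is only
# known to lie in [40, 60]. The p-box is the envelope of the CDF family.
mu_lo, mu_hi, sigma = 40.0, 60.0, 5.0
t = np.linspace(20, 80, 200)
cdf_upper = norm.cdf(t, loc=mu_lo, scale=sigma)  # smallest mean: leftmost CDF
cdf_lower = norm.cdf(t, loc=mu_hi, scale=sigma)  # largest mean: rightmost CDF

# Queries return intervals rather than points, e.g. P(payoff > 55):
thresh = 55.0
p_min = 1.0 - norm.cdf(thresh, mu_lo, sigma)
p_max = 1.0 - norm.cdf(thresh, mu_hi, sigma)
print(f"P(payoff > {thresh}) lies in [{p_min:.4f}, {p_max:.4f}]")
```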

  8. Exploring the full natural variability of eruption sizes within probabilistic hazard assessment of tephra dispersal

    NASA Astrophysics Data System (ADS)

    Selva, Jacopo; Sandri, Laura; Costa, Antonio; Tonini, Roberto; Folch, Arnau; Macedonio, Giovanni

    2014-05-01

    The intrinsic uncertainty and variability associated with the size of the next eruption strongly affect short- to long-term tephra hazard assessment. Often, emergency plans are established accounting for the effects of one or a few representative scenarios (meant as a specific combination of eruptive size and vent position), selected with subjective criteria. On the other hand, probabilistic hazard assessments (PHA) consistently explore the natural variability of such scenarios. PHA for tephra dispersal needs the definition of eruptive scenarios (usually by grouping possible eruption sizes and vent positions in classes) with associated probabilities, a meteorological dataset covering a representative time period, and a tephra dispersal model. PHA results from combining simulations considering different volcanological and meteorological conditions through a weight given by their specific probability of occurrence. However, volcanological parameters, such as erupted mass, eruption column height and duration, bulk granulometry, and fraction of aggregates, typically encompass a wide range of values. Because of such variability, single representative scenarios or size classes cannot be adequately defined using single values for the volcanological inputs. Here we propose a method that accounts for this within-size-class variability in the framework of Event Trees. The variability of each parameter is modeled with specific Probability Density Functions, and meteorological and volcanological inputs are chosen by using a stratified sampling method. This procedure avoids the bias introduced by selecting single representative scenarios and thus neglecting most of the intrinsic eruptive variability. When considering within-size-class variability, attention must be paid to appropriately weighting events falling within the same size class. While a uniform weight for all the events belonging to a size class is the most straightforward idea, it implies a strong dependence on the thresholds dividing classes: under this choice, the largest event of a size class has a much larger weight than the smallest event of the subsequent size class. In order to overcome this problem, in this study, we propose an innovative solution able to smoothly link the weight variability within each size class to the variability among the size classes through a common power law, and, simultaneously, respect the probability of different size classes conditional on the occurrence of an eruption. Embedding this procedure into the Bayesian Event Tree scheme enables tephra fall PHA, quantified through hazard curves and maps that provide readable results applicable in planning risk-mitigation actions, together with the quantification of its epistemic uncertainties. As examples, we analyze long-term tephra fall PHA at Vesuvius and Campi Flegrei. We integrate two tephra dispersal models (the analytical HAZMAP and the numerical FALL3D) into BET_VH. The ECMWF reanalysis dataset is used for exploring different meteorological conditions. The results obtained clearly show that PHA accounting for the whole natural variability significantly differs from that based on representative scenarios, as is common practice in volcanic hazard assessment.
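
    The weighting scheme proposed in this abstract (a common power law within and across size classes, rescaled so each class keeps its conditional probability) can be sketched in a few lines. All masses, class boundaries, class probabilities and the exponent below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# Size classes as (lower mass, upper mass, conditional class probability):
classes = {"small":  (1e10, 1e11, 0.60),
           "medium": (1e11, 1e12, 0.30),
           "large":  (1e12, 1e13, 0.10)}
alpha = 1.0   # common power-law exponent linking within- and between-class weights

masses, weights = [], []
for lo, hi, p_class in classes.values():
    m = np.exp(rng.uniform(np.log(lo), np.log(hi), 50))  # stratified in log-mass
    w = m**-alpha                  # common power law across all classes ...
    w *= p_class / w.sum()         # ... rescaled to respect the class probability
    masses.append(m)
    weights.append(w)

masses = np.concatenate(masses)
weights = np.concatenate(weights)
assert np.isclose(weights.sum(), 1.0)
# Hazard curves are then built by weighting each simulation with `weights`
# instead of giving every member of a size class the same weight.
```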

  9. Correlation between pubic hair grooming and STIs: results from a nationally representative probability sample.

    PubMed

    Osterberg, E Charles; Gaither, Thomas W; Awad, Mohannad A; Truesdale, Matthew D; Allen, Isabel; Sutcliffe, Siobhan; Breyer, Benjamin N

    2017-05-01

    STIs are the most common infections among adults. Concurrently, pubic hair grooming is prevalent. Small-scale studies have demonstrated a relationship between pubic hair grooming and STIs. We aim to examine this relationship in a large sample of men and women. We conducted a probability survey of US residents aged 18-65 years. The survey ascertained self-reported pubic hair grooming practices, sexual behaviours and STI history. We defined extreme grooming as removal of all pubic hair more than 11 times per year and high-frequency grooming as daily/weekly trimming. Cutaneous STIs included herpes, human papillomavirus, syphilis and molluscum. Secretory STIs included gonorrhoea, chlamydia and HIV. We analysed lice separately. Of 7580 respondents who completed the survey, 74% reported grooming their pubic hair, 66% of men and 84% of women. After adjusting for age and lifetime sexual partners, ever having groomed was positively associated with a history of self-reported STIs (OR 1.8; 95% CI 1.4 to 2.2), including cutaneous STIs (OR 2.6; CI 1.8 to 3.7), secretory STIs (OR 1.7; CI 1.3 to 2.2) and lice (OR 1.9; CI 1.3 to 2.9). These positive associations were stronger for extreme groomers (OR 4.4; CI 2.9 to 6.8) and high-frequency groomers (OR 3.5; CI 2.3 to 5.4) with cutaneous STIs, and for non-extreme groomers (OR 2.0; CI 1.3 to 3.0) and low-frequency groomers (OR 2.0; CI 1.3 to 3.1) with lice. Among a representative sample of US residents, pubic hair grooming was positively related to self-reported STI history. Further research is warranted to gain insight into STI risk-reduction strategies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  10. Extensional versus Intuitive Reasoning: The Conjunction Fallacy in Probability Judgment.

    ERIC Educational Resources Information Center

    Tversky, Amos; Kahneman, Daniel

    1983-01-01

    Judgments under uncertainty are often mediated by intuitive heuristics that are not bound by the conjunction rule of probability. Representativeness and availability heuristics can make a conjunction appear more probable than one of its constituents. Alternative interpretations of this conjunction fallacy are discussed and attempts to combat it…

  11. Texas Adolescent Tobacco and Marketing Surveillance System’s Design

    PubMed Central

    Pérez, Adriana; Harrell, Melissa B.; Malkani, Raja I.; Jackson, Christian D.; Delk, Joanne; Allotey, Prince A.; Matthews, Krystin J.; Martinez, Pablo; Perry, Cheryl L.

    2017-01-01

    Objectives: To provide a full methodological description of the design of the wave I and II (6-month follow-up) surveys of the Texas Adolescent Tobacco and Marketing Surveillance System (TATAMS), a longitudinal surveillance study of 6th, 8th, and 10th grade students who attended schools in Bexar, Dallas, Tarrant, Harris, or Travis counties, where the five largest cities in Texas (San Antonio, Dallas, Fort Worth, Houston, and Austin, respectively) are located. Methods: TATAMS used a complex probability design, yielding representative estimates of these students in these counties during the 2014–2015 academic year. Weighted prevalence of the use of tobacco products, drugs and alcohol in wave I, and the percent of: (i) bias, (ii) relative bias, and (iii) relative bias ratio, between waves I and II are estimated. Results: The wave I sample included 79 schools and 3,907 students. The prevalence of current cigarette, e-cigarette and hookah use at wave I was 3.5%, 7.4%, and 2.5%, respectively. Small biases, mostly less than 3.5%, were observed for nonrespondents in wave II. Conclusions: Even with adaptations to the sampling methodology, the resulting sample adequately represents the target population. Results from TATAMS will have important implications for future tobacco policy in Texas and federal regulation. PMID:29098172
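
    The wave-to-wave bias metrics named here are straightforward once survey weights are in hand. A hedged sketch (indicator values, weights and the response pattern are all simulated; the paper's exact bias definitions may differ in detail):

```python
import numpy as np

def weighted_prevalence(y, w):
    """Survey-weighted prevalence of a 0/1 outcome."""
    return np.sum(w * y) / np.sum(w)

def percent_relative_bias(est_sub, est_full):
    return 100.0 * (est_sub - est_full) / est_full

# Toy stand-in: wave I e-cigarette-use indicators, survey weights, and a
# simulated wave II response pattern.
rng = np.random.default_rng(6)
y = rng.random(3907) < 0.074
w = rng.uniform(0.5, 2.0, 3907)
responded = rng.random(3907) < 0.85

full = weighted_prevalence(y, w)
wave2 = weighted_prevalence(y[responded], w[responded])
print(f"wave I: {full:.3f}; relative bias among wave II respondents: "
      f"{percent_relative_bias(wave2, full):+.1f}%")
```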

  12. Extended Importance Sampling for Reliability Analysis under Evidence Theory

    NASA Astrophysics Data System (ADS)

    Yuan, X. K.; Chen, B.; Zhang, B. Q.

    2018-05-01

    In early engineering practice, the lack of data and information makes uncertainty difficult to deal with. However, evidence theory has been proposed as an alternative to traditional probability theory for handling uncertainty under limited information. In this contribution, a simulation-based approach called ‘extended importance sampling’ is proposed, based on evidence theory, to handle problems with epistemic uncertainty. The proposed approach stems from traditional importance sampling for reliability analysis under probability theory and is developed to handle problems with epistemic uncertainty. It first introduces a nominal instrumental probability density function (PDF) for every epistemic uncertainty variable, so that an ‘equivalent’ reliability problem under probability theory is obtained. Samples of these variables are then generated by importance sampling. Based on these samples, the plausibility and belief (upper and lower bounds on the probability) can be estimated. The approach is more efficient than direct Monte Carlo simulation. Numerical and engineering examples are given to illustrate the efficiency and feasibility of the proposed approach.
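
    The belief/plausibility bookkeeping at the heart of the approach can be sketched very simply. Below, samples from one nominal instrumental PDF are reused across all focal elements of an evidence structure; the limit state, intervals and masses are invented, and the real method's variance-reduction details are omitted:

```python
import numpy as np
from scipy.stats import norm

g = lambda x: 4.0 - x          # toy limit state: g(x) <= 0 means failure

# Evidence structure on the epistemic input: focal intervals with BPA masses.
focal = [((0.0, 3.0), 0.5), ((2.0, 5.0), 0.3), ((4.0, 6.0), 0.2)]

# One nominal instrumental PDF covering the variable's range:
x = norm(3.0, 2.0).rvs(size=20_000, random_state=1)

bel = pl = 0.0
for (lo, hi), mass in focal:
    inside = x[(x >= lo) & (x <= hi)]   # samples landing in this focal set
    if inside.size == 0:
        continue
    fails = g(inside) <= 0
    pl += mass * fails.any()            # some point of the set can fail
    bel += mass * fails.all()           # every sampled point of the set fails
print(f"belief {bel:.2f} <= P(failure) <= plausibility {pl:.2f}")
```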

  13. Relationship between blood manganese and blood pressure in the Korean general population according to KNHANES 2008

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Byung-Kook; Kim, Yangho, E-mail: yanghokm@nuri.net

    Introduction: We present data on the association of manganese (Mn) level with hypertension in a representative sample of the adult Korean population who participated in the Korean National Health and Nutrition Examination Survey (KNHANES) 2008. Methods: This study was based on the data obtained by KNHANES 2008, which was conducted for three years (2007-2009) using a rolling sampling design involving a complex, stratified, multistage, probability-cluster survey of a representative sample of the noninstitutionalized civilian population of South Korea. Results: Multiple regression analysis after controlling for covariates, including gender, age, regional area, education level, smoking, drinking status, hemoglobin, and serum creatinine, showed that the beta coefficients of log blood Mn were 3.514, 1.878, and 2.517 for diastolic blood pressure, and 3.593, 2.449, and 2.440 for systolic blood pressure in female, male, and all participants, respectively. Multiple regression analysis including three other blood metals, lead, mercury, and cadmium, revealed no significant effects of the three metals on blood pressure and showed no effect on the association between blood Mn and blood pressure. In addition, doubling the blood Mn increased the risk of hypertension 1.828-, 1.573-, and 1.567-fold in women, men, and all participants, respectively, after adjustment for covariates. The addition of blood lead, mercury, and cadmium as covariates did not affect the association between blood Mn and the prevalence of hypertension. Conclusion: Blood Mn level was associated with an increased risk of hypertension in a representative sample of the Korean adult population. Highlights: We showed the association of manganese with hypertension in a Korean population. This study was based on the data obtained by KNHANES 2008. Blood manganese level was associated with an increased risk of hypertension.

  14. Comparative study on the bioaccumulation and biotransformation of arsenic by some northeastern Atlantic and northwestern Mediterranean sponges.

    PubMed

    Orani, Anna Maria; Barats, Aurélie; Zitte, Wendy; Morrow, Christine; Thomas, Olivier P

    2018-06-01

    The bioaccumulation and biotransformation of arsenic (As) were studied in six representative marine sponges from the French Mediterranean and Irish Atlantic coasts. Methodologies were carefully optimized on one of the species, Haliclona fulva, for two critical steps: sample mineralization for total As analysis by ICP-MS, and extraction of As species for HPLC-ICP-MS analysis. During the optimization, extractions performed with 0.6 mol L-1 H3PO4 were shown to be the most efficient. An extraction recovery of 81% was obtained, which represents the best result obtained to date in sponge samples. Total As analyses and As speciation were performed on certified reference materials, confirming measurement quality during both sample preparation and analysis. Additionally, this study represents an environmental survey demonstrating a high variability of total As concentrations among the different species, probably related to different physiological or microbial features. As speciation results showed the predominance of arsenobetaine (AsB) regardless of the sponge species, as well as the occurrence of low amounts of dimethylarsinic acid (DMA), arsenate (As(+V)), and unknown As species in some samples. The process responsible for As transformation in sponges is most likely related to the sponges' own metabolism or to the action of symbiont organisms. AsB is thought to be involved in protection against osmotic stress. This study demonstrates the ability of sponges to accumulate and biotransform As, showing that sponges are relevant biomonitors for As contamination in the marine environment and potential tools in environmental bioremediation. Copyright © 2018 Elsevier Ltd. All rights reserved.

  15. PROBABILITY SAMPLING AND POPULATION INFERENCE IN MONITORING PROGRAMS

    EPA Science Inventory

    A fundamental difference between probability sampling and conventional statistics is that "sampling" deals with real, tangible populations, whereas "conventional statistics" usually deals with hypothetical populations that have no real-world realization. The focus here is on real ...

  16. Type I error probabilities based on design-stage strategies with applications to noninferiority trials.

    PubMed

    Rothmann, Mark

    2005-01-01

    When testing the equality of means from two different populations, a t-test or large-sample normal test tends to be performed. For these tests, when the sample size or design for the second sample is dependent on the results of the first sample, the type I error probability is altered for each specific possibility in the null hypothesis. We will examine the impact on the type I error probabilities of two confidence interval procedures and of procedures using test statistics when the design for the second sample or experiment is dependent on the results from the first sample or experiment (or series of experiments). Ways of controlling a desired maximum type I error probability or a desired type I error rate will be discussed. Results are applied to the setting of noninferiority comparisons in active controlled trials, where the use of a placebo is unethical.
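
    The phenomenon is easy to reproduce by simulation: make the second sample size depend on the first result, test naively, and watch the rejection rate drift from the nominal level. The adaptation rule below is a made-up example, not one from the paper:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, reps = 0.05, 20_000
rejections = 0
for _ in range(reps):
    x = rng.normal(0, 1, 50)                  # first sample; H0 (equal means) true
    # Design-stage adaptation: second sample size depends on the first result.
    n2 = 20 if abs(x.mean()) > 0.2 else 200   # hypothetical adaptation rule
    y = rng.normal(0, 1, n2)
    t_stat, p_val = stats.ttest_ind(x, y)
    rejections += (p_val < alpha)
print(rejections / reps)   # generally not the nominal 0.05
```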

  17. A constrained multinomial Probit route choice model in the metro network: Formulation, estimation and application

    PubMed Central

    Zhang, Yongsheng; Wei, Heng; Zheng, Kangning

    2017-01-01

    Considering that metro network expansion provides more alternative routes, it is attractive to integrate into route choice modeling the impacts of the route set and of the interdependency among alternative routes on route choice probability. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model for the metro network are carried out in this paper. The utility function is formulated with three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impact of the route set on utility; and the error component follows a multivariate normal distribution whose covariance is structured into three parts, representing the correlation among routes, the transfer variance of routes, and the unobserved variance, respectively. Because the model involves multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in hierarchical Bayes form and a Metropolis-Hastings (M-H) sampling based Markov chain Monte Carlo approach is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model also shows good forecasting performance for route choice probability calculation and good application performance for transfer flow volume prediction. PMID:28591188
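
    The estimation strategy (hierarchical Bayes plus Metropolis-Hastings) is illustrated below on a deliberately stripped-down relative: a two-parameter binary probit with a random-walk M-H sampler and synthetic data. The structured covariance and route-set constraints of the actual CMNP model are omitted:

```python
import numpy as np
from scipy.stats import norm

# Synthetic route-choice-style data: choice ~ probit(b0 + b1 * time_diff)
rng = np.random.default_rng(4)
time_diff = rng.normal(0, 1, 300)
beta_true = np.array([0.5, -1.0])
y = rng.random(300) < norm.cdf(beta_true[0] + beta_true[1] * time_diff)

def log_post(beta):
    eta = beta[0] + beta[1] * time_diff
    prob = np.clip(norm.cdf(eta), 1e-12, 1 - 1e-12)
    loglik = np.sum(y * np.log(prob) + (~y) * np.log(1 - prob))
    return loglik + norm.logpdf(beta, 0, 10).sum()   # vague normal prior

# Random-walk Metropolis-Hastings:
beta = np.zeros(2)
lp = log_post(beta)
chain = []
for _ in range(20_000):
    prop = beta + rng.normal(0, 0.1, 2)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:      # accept/reject step
        beta, lp = prop, lp_prop
    chain.append(beta)
print(np.mean(chain[5000:], axis=0))             # posterior mean, near beta_true
```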

  18. Evidence That a Psychopathology Interactome Has Diagnostic Value, Predicting Clinical Needs: An Experience Sampling Study

    PubMed Central

    van Os, Jim; Lataster, Tineke; Delespaul, Philippe; Wichers, Marieke; Myin-Germeys, Inez

    2014-01-01

    Background: For the purpose of diagnosis, psychopathology can be represented as categories of mental disorder, symptom dimensions or symptom networks. Also, psychopathology can be assessed at different levels of temporal resolution (monthly episodes, daily fluctuating symptoms, momentary fluctuating mental states). We tested the diagnostic value, in terms of prediction of treatment needs, of the combination of symptom networks and momentary assessment level. Method: Fifty-seven patients with a psychotic disorder participated in an experience sampling method (ESM) study, capturing psychotic experiences, emotions and circumstances at 10 semi-random moments in the flow of daily life over a period of 6 days. Symptoms were assessed by interview with the Positive and Negative Syndrome Scale (PANSS); treatment needs were assessed using the Camberwell Assessment of Need (CAN). Results: Psychotic symptoms assessed with the PANSS (Clinical Psychotic Symptoms) were strongly associated with psychotic experiences assessed with ESM (Momentary Psychotic Experiences). However, the degree to which Momentary Psychotic Experiences manifested as Clinical Psychotic Symptoms was determined by level of momentary negative affect (higher levels increasing probability of Momentary Psychotic Experiences manifesting as Clinical Psychotic Symptoms), momentary positive affect (higher levels decreasing probability of Clinical Psychotic Symptoms), greater persistence of Momentary Psychotic Experiences (persistence predicting increased probability of Clinical Psychotic Symptoms) and momentary environmental stress associated with events and activities (higher levels increasing probability of Clinical Psychotic Symptoms). Similarly, the degree to which momentary visual or auditory hallucinations manifested as Clinical Psychotic Symptoms was strongly contingent on the level of accompanying momentary paranoid delusional ideation. Momentary Psychotic Experiences were associated with CAN unmet treatment needs, over and above PANSS measures of psychopathology, similarly moderated by momentary interactions with emotions and context. Conclusion: The results suggest that psychopathology, represented as an interactome at the momentary level of temporal resolution, is informative in diagnosing clinical needs, over and above traditional symptom measures. PMID:24466189

  19. Appraisal of geodynamic inversion results: a data mining approach

    NASA Astrophysics Data System (ADS)

    Baumann, T. S.

    2016-11-01

    Bayesian sampling-based inversions require many thousands or even millions of forward models, depending on how nonlinear or non-unique the inverse problem is, and how many unknowns are involved. The result of such a probabilistic inversion is not a single ‘best-fit’ model, but rather a probability distribution that is represented by the entire model ensemble. Often, a geophysical inverse problem is non-unique, and the corresponding posterior distribution is multimodal, meaning that the distribution consists of clusters with similar models that represent the observations equally well. In these cases, we would like to visualize the characteristic model properties within each of these clusters of models. However, even for a moderate number of inversion parameters, a manual appraisal for a large number of models is not feasible. This poses the question whether it is possible to extract end-member models that represent each of the best-fit regions including their uncertainties. Here, I show how a machine learning tool can be used to characterize end-member models, including their uncertainties, from a complete model ensemble that represents a posterior probability distribution. The model ensemble used here results from a nonlinear geodynamic inverse problem, where rheological properties of the lithosphere are constrained from multiple geophysical observations. It is demonstrated that by taking vertical cross-sections through the effective viscosity structure of each of the models, the entire model ensemble can be classified into four end-member model categories that have a similar effective viscosity structure. These classification results are helpful to explore the non-uniqueness of the inverse problem and can be used to compute representative data fits for each of the end-member models. Conversely, these insights also reveal how new observational constraints could reduce the non-uniqueness. The method is not limited to geodynamic applications and a generalized MATLAB code is provided to perform the appraisal analysis.
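
    The classification step described (grouping ensemble members by the similarity of their viscosity cross-sections) can be sketched with a generic clustering tool. The abstract does not name the algorithm, so k-means with k = 4 and the synthetic profiles below are assumptions of this sketch:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in ensemble: each row is one model's vertical log-viscosity profile.
rng = np.random.default_rng(5)
depth = np.linspace(0, 300, 60)                       # km
ensemble = np.vstack([
    21 - 2 * np.exp(-(depth - d0)**2 / 800) + rng.normal(0, 0.1, 60)
    for d0 in rng.choice([80.0, 150.0, 220.0, 260.0], 4000)
])

# Classify the ensemble into end-member categories:
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(ensemble)
for k in range(4):
    members = ensemble[km.labels_ == k]
    share = len(members) / len(ensemble)
    print(f"end-member {k}: {share:.0%} of ensemble, "
          f"weak-zone centre near {depth[members.mean(axis=0).argmin()]:.0f} km")
```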

  20. Semiparametric temporal process regression of survival-out-of-hospital.

    PubMed

    Zhan, Tianyu; Schaubel, Douglas E

    2018-05-23

    The recurrent/terminal event data structure has undergone considerable methodological development in the last 10-15 years. An example of the data structure that has arisen with increasing frequency involves the recurrent event being hospitalization and the terminal event being death. We consider the response Survival-Out-of-Hospital, defined as a temporal process (indicator function) taking the value 1 when the subject is currently alive and not hospitalized, and 0 otherwise. Survival-Out-of-Hospital is a useful alternative strategy for the analysis of hospitalization/survival in the chronic disease setting, with the response variate representing a refinement to survival time through the incorporation of an objective quality-of-life component. The semiparametric model we consider assumes multiplicative covariate effects and leaves unspecified the baseline probability of being alive-and-out-of-hospital. Using zero-mean estimating equations, the proposed regression parameter estimator can be computed without estimating the unspecified baseline probability process, although baseline probabilities can subsequently be estimated for any time point within the support of the censoring distribution. We demonstrate that the regression parameter estimator is asymptotically normal, and that the baseline probability function estimator converges to a Gaussian process. Simulation studies are performed to show that our estimating procedures have satisfactory finite sample performances. The proposed methods are applied to the Dialysis Outcomes and Practice Patterns Study (DOPPS), an international end-stage renal disease study.

  1. In favor of general probability distributions: lateral prefrontal and insular cortices respond to stimulus inherent, but irrelevant differences.

    PubMed

    Mestres-Missé, Anna; Trampel, Robert; Turner, Robert; Kotz, Sonja A

    2016-04-01

    A key aspect of optimal behavior is the ability to predict what will come next. To achieve this, we must have a fairly good idea of the probability of occurrence of possible outcomes. This is based both on prior knowledge about a particular or similar situation and on immediately relevant new information. One question that arises is: when considering converging prior probability and external evidence, is the most probable outcome selected or does the brain represent degrees of uncertainty, even highly improbable ones? Using functional magnetic resonance imaging, the current study explored these possibilities by contrasting words that differ in their probability of occurrence, namely, unbalanced ambiguous words and unambiguous words. Unbalanced ambiguous words have a strong frequency-based bias towards one meaning, while unambiguous words have only one meaning. The current results reveal larger activation in lateral prefrontal and insular cortices in response to dominant ambiguous compared to unambiguous words even when prior and contextual information biases one interpretation only. These results suggest a probability distribution, whereby all outcomes and their associated probabilities of occurrence--even if very low--are represented and maintained.

  2. 78 FR 20137 - Probable Economic Effect of Certain Modifications to the North American Free Trade Agreement...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-04-03

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. TA-103-027] Probable Economic Effect of Certain... investigation No. TA-103-027, Probable Economic Effect of Certain Modifications to the North American Free Trade... reached agreement in principle with representatives of the governments of Canada and Mexico on proposed...

  3. On the Determinants of the Conjunction Fallacy: Probability versus Inductive Confirmation

    ERIC Educational Resources Information Center

    Tentori, Katya; Crupi, Vincenzo; Russo, Selena

    2013-01-01

    Major recent interpretations of the conjunction fallacy postulate that people assess the probability of a conjunction according to (non-normative) averaging rules as applied to the constituents' probabilities or represent the conjunction fallacy as an effect of random error in the judgment process. In the present contribution, we contrast such…
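
    For readers unfamiliar with the averaging account, a short numeric illustration (with made-up probabilities) shows why averaging the constituents' probabilities violates the normative conjunction bound:

    ```python
    # Illustrative numbers only: why an averaging rule is non-normative.
    p_a = 0.05   # P(A), e.g., an unlikely category
    p_b = 0.80   # P(B), e.g., a likely category

    # Normative bound: P(A and B) can never exceed min(P(A), P(B)).
    normative_max = min(p_a, p_b)        # 0.05

    # An averaging rule instead predicts a conjunction judgment near the mean
    # of the constituents, which violates that bound.
    averaged = (p_a + p_b) / 2           # 0.425
    print(normative_max, averaged)
    ```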

  4. Probability model for analyzing fire management alternatives: theory and structure

    Treesearch

    Frederick W. Bratten

    1982-01-01

    A theoretical probability model has been developed for analyzing program alternatives in fire management. It includes submodels or modules for predicting probabilities of fire behavior, fire occurrence, fire suppression, effects of fire on land resources, and financial effects of fire. Generalized "fire management situations" are used to represent actual fire...

  5. Seeing the Forest when Entry Is Unlikely: Probability and the Mental Representation of Events

    ERIC Educational Resources Information Center

    Wakslak, Cheryl J.; Trope, Yaacov; Liberman, Nira; Alony, Rotem

    2006-01-01

    Conceptualizing probability as psychological distance, the authors draw on construal level theory (Y. Trope & N. Liberman, 2003) to propose that decreasing an event's probability leads individuals to represent the event by its central, abstract, general features (high-level construal) rather than by its peripheral, concrete, specific features…

  6. Nuclear-effects model embedded stochastically in simulation (NEMESIS) summary report. Technical paper

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Youngren, M.A.

    1989-11-01

    An analytic probability model of tactical nuclear warfare in the theater is presented in this paper. The model addresses major problems associated with representing nuclear warfare in the theater. Current theater representations of a potential nuclear battlefield are developed in the context of low-resolution, theater-level models or scenarios. These models or scenarios provide insufficient resolution in time and space for modeling a nuclear exchange. The model presented in this paper handles the spatial uncertainty in potentially targeted unit locations by proposing two-dimensional multivariate probability models for the actual and perceived locations of units subordinate to the major (division-level) units represented in theater scenarios. The temporal uncertainty in the activities of interest represented in our theater-level Force Evaluation Model (FORCEM) is handled through probability models of the acquisition and movement of potential nuclear target units.

  7. Investigation of Dielectric Breakdown Characteristics for Double-break Vacuum Interrupter and Dielectric Breakdown Probability Distribution in Vacuum Interrupter

    NASA Astrophysics Data System (ADS)

    Shioiri, Tetsu; Asari, Naoki; Sato, Junichi; Sasage, Kosuke; Yokokura, Kunio; Homma, Mitsutaka; Suzuki, Katsumi

    To investigate the reliability of vacuum-insulated equipment, a study was carried out to clarify breakdown probability distributions in a vacuum gap. A double-break vacuum circuit breaker was also investigated for its breakdown probability distribution. The test results show that the breakdown probability distribution of the vacuum gap can be represented by a Weibull distribution with a location parameter, which gives the voltage at which the breakdown probability is zero. The location parameter obtained from the Weibull plot depends on electrode area. The shape parameter obtained from the Weibull plot of the vacuum gap was 10∼14 and is constant irrespective of the non-uniform field factor. The breakdown probability distribution after no-load switching can likewise be represented by a Weibull distribution with a location parameter. The shape parameter after no-load switching was 6∼8.5 and is constant irrespective of gap length, indicating that the scatter of the breakdown voltage is increased by no-load switching. If the vacuum circuit breaker uses a double break, the breakdown probability at low voltage becomes lower than the single-break probability. Although the potential distribution is a concern in the double-break vacuum circuit breaker, its insulation reliability is better than that of the single-break vacuum interrupter even when the bias of the interrupters' voltage sharing is taken into account.
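
    As a hedged sketch of the model described, a three-parameter Weibull puts the location parameter at the voltage with zero breakdown probability; the parameter values below are illustrative, not the paper's fitted values.

    ```python
    # Sketch: breakdown probability as a three-parameter Weibull, where the
    # location parameter is the voltage permitting zero breakdown probability.
    from scipy.stats import weibull_min

    shape, loc_kV, scale_kV = 12.0, 40.0, 25.0   # shape ~10-14 as in the abstract
    gap = weibull_min(c=shape, loc=loc_kV, scale=scale_kV)

    for v in (40, 55, 65, 75):                   # applied voltage in kV
        print(f"{v} kV: breakdown probability {gap.cdf(v):.3f}")
    ```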

  8. Self-Supervised Dynamical Systems

    NASA Technical Reports Server (NTRS)

    Zak, Michail

    2003-01-01

    Some progress has been made in a continuing effort to develop mathematical models of the behaviors of multi-agent systems known in biology, economics, and sociology (e.g., systems ranging from single or a few biomolecules to many interacting higher organisms). Living systems can be characterized by nonlinear evolution of probability distributions over different possible choices of the next steps in their motions. One of the main challenges in mathematical modeling of living systems is to distinguish between random walks of purely physical origin (for instance, Brownian motions) and those of biological origin. Following a line of reasoning from prior research, it has been assumed, in the present development, that a biological random walk can be represented by a nonlinear mathematical model that represents coupled mental and motor dynamics incorporating the psychological concept of reflection or self-image. The nonlinear dynamics impart the lifelike ability to behave in ways and to exhibit patterns that depart from thermodynamic equilibrium. Reflection or self-image has traditionally been recognized as a basic element of intelligence. The nonlinear mathematical models of the present development are denoted self-supervised dynamical systems. They include (1) equations of classical dynamics, including random components caused by uncertainties in initial conditions and by Langevin forces, coupled with (2) the corresponding Liouville or Fokker-Planck equations that describe the evolutions of probability densities that represent the uncertainties. The coupling is effected by fictitious information-based forces, denoted supervising forces, composed of probability densities and functionals thereof. The equations of classical mechanics represent motor dynamics, that is, dynamics in the traditional sense, signified by Newton's equations of motion. The evolution of the probability densities represents mental dynamics or self-image. Then the interaction between the physical and mental aspects of a monad is implemented by feedback from mental to motor dynamics, as represented by the aforementioned fictitious forces. This feedback is what makes the evolution of probability densities nonlinear. The deviation from linear evolution can be characterized, in a sense, as an expression of free will. It has been demonstrated that probability densities can approach prescribed attractors while exhibiting such patterns as shock waves, solitons, and chaos in probability space. The concept of self-supervised dynamical systems has been considered for application to diverse phenomena, including information-based neural networks, cooperation, competition, deception, games, and control of chaos. In addition, a formal similarity between the mathematical structures of self-supervised dynamical systems and of quantum-mechanical systems has been investigated.

  9. Meta-analysis of the effect of natural frequencies on Bayesian reasoning.

    PubMed

    McDowell, Michelle; Jacobs, Perke

    2017-12-01

    The natural frequency facilitation effect describes the finding that people are better able to solve descriptive Bayesian inference tasks when represented as joint frequencies obtained through natural sampling, known as natural frequencies, than as conditional probabilities. The present meta-analysis reviews 20 years of research seeking to address when, why, and for whom natural frequency formats are most effective. We review contributions from research associated with the 2 dominant theoretical perspectives, the ecological rationality framework and nested-sets theory, and test potential moderators of the effect. A systematic review of relevant literature yielded 35 articles representing 226 performance estimates. These estimates were statistically integrated using a bivariate mixed-effects model that yields summary estimates of average performances across the 2 formats and estimates of the effects of different study characteristics on performance. These study characteristics range from moderators representing individual characteristics (e.g., numeracy, expertise), to methodological differences (e.g., use of incentives, scoring criteria) and features of problem representation (e.g., short menu format, visual aid). Short menu formats (less computationally complex representations showing joint-events) and visual aids demonstrated some of the strongest moderation effects, improving performance for both conditional probability and natural frequency formats. A number of methodological factors (e.g., exposure to both problem formats) were also found to affect performance rates, emphasizing the importance of a systematic approach. We suggest how research on Bayesian reasoning can be strengthened by broadening the definition of successful Bayesian reasoning to incorporate choice and process and by applying different research methodologies. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
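
    A worked example (classic screening numbers, not taken from the meta-analysis) shows the two formats side by side:

    ```python
    # The same Bayesian task in both formats (illustrative numbers).
    # Conditional-probability format: P(D)=0.01, P(+|D)=0.8, P(+|~D)=0.096.
    p_d, sens, fpr = 0.01, 0.8, 0.096
    posterior = (p_d * sens) / (p_d * sens + (1 - p_d) * fpr)

    # Natural-frequency format: "of 1000 people, 10 have the disease; 8 of them
    # test positive; of the 990 without it, 95 test positive."
    positives_with, positives_without = 8, 95
    nf_posterior = positives_with / (positives_with + positives_without)

    print(f"{posterior:.3f} vs {nf_posterior:.3f}")  # ~0.078 either way
    ```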

  10. Sm-Nd and Rb-Sr Isotopic Studies of Meteorite Kalahari 009: An Old VLT Mare Basalt

    NASA Technical Reports Server (NTRS)

    Shih, C.-Y.; Nyquist, L. E.; Reese, Y.; Bischoff, A.

    2008-01-01

    Lunar meteorite Kalahari 009 is a fragmental basaltic breccia contain ing various very-low-Ti (VLT) mare basalt clasts embedded in a fine-g rained matrix of similar composition. This meteorite and lunar meteorite Kalahari 008, an anorthositic breccia, were suggested to be paired mainly due to the presence of similar fayalitic olivines in fragment s found in both meteorites. Thus, Kalahari 009 probably represents a VLT basalt that came from a locality near a mare-highland boundary r egion of the Moon, as compared to the typical VLT mare basalt samples collected at Mare Crisium during the Luna-24 mission. The concordant Sm-Nd and Ar-Ar ages of such a VLT basalt (24170) suggest that the extrusion of VLT basalts at Mare Crisium occurred 3.30 +/- 0.05 Ga ag o. Previous age results for Kalahari 009 range from approximately 4.2 Ga by its Lu-Hf isochron age to 1.70?0.04 Ga of its Ar-Ar plateau ag e. However, recent in-situ U-Pb dating of phosphates in Kalahari 009 defined an old crystallization age of 4.35+/- 0.15 Ga. The authors su ggested that Kalahari 009 represents a cryptomaria basalt. In this r eport, we present Sm-Nd and Rb-Sr isotopic results for Kalahari 009, discuss the relationship of its age and isotopic characteristics to t hose of other L-24 VLT mare basalts and other probable cryptomaria ba salts represented by Apollo 14 aluminous mare basalts, and discuss it s petrogenesis.

  11. Tail mean and related robust solution concepts

    NASA Astrophysics Data System (ADS)

    Ogryczak, Włodzimierz

    2014-01-01

    Robust optimisation might be viewed as a multicriteria optimisation problem where the objectives correspond to the scenarios, although their probabilities are unknown or imprecise. The simplest robust solution concept represents a conservative approach focused on optimisation of the worst-case scenario results. A softer concept allows one to optimise the tail mean, thus combining performances under multiple worst scenarios. We show that for robust models that allow the probabilities to vary only within given intervals, the tail mean represents the robust solution only when the probabilities are upper bounded. For arbitrary intervals of probabilities, the corresponding robust solution may be expressed by the optimisation of appropriately combined mean and tail mean criteria, thus remaining easily implementable with auxiliary linear inequalities. Moreover, we use the tail mean concept to develop linear programming implementable robust solution concepts related to risk-averse optimisation criteria.
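
    A minimal sketch of the tail mean criterion, read here as the average of the worst beta-fraction of scenario outcomes; names and data are illustrative.

    ```python
    # Sketch: tail mean over scenario costs (costs to be minimized).
    import numpy as np

    def tail_mean(costs, beta):
        """Mean of the worst ceil(beta * n) scenario costs."""
        costs = np.sort(np.asarray(costs))[::-1]     # worst (largest) first
        k = max(1, int(np.ceil(beta * len(costs))))
        return costs[:k].mean()

    scenario_costs = [3.0, 7.5, 2.1, 9.4, 5.2, 8.8]
    print(tail_mean(scenario_costs, beta=1.0))   # plain mean over all scenarios
    print(tail_mean(scenario_costs, beta=1/3))   # mean of the two worst scenarios
    print(tail_mean(scenario_costs, beta=1e-9))  # approaches the worst case
    ```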

  12. Electrofishing capture probability of smallmouth bass in streams

    USGS Publications Warehouse

    Dauwalter, D.C.; Fisher, W.L.

    2007-01-01

    Abundance estimation is an integral part of understanding the ecology and advancing the management of fish populations and communities. Mark-recapture and removal methods are commonly used to estimate the abundance of stream fishes. Alternatively, abundance can be estimated by dividing the number of individuals sampled by the probability of capture. We conducted a mark-recapture study and used multiple repeated-measures logistic regression to determine the influence of fish size, sampling procedures, and stream habitat variables on the cumulative capture probability for smallmouth bass Micropterus dolomieu in two eastern Oklahoma streams. The predicted capture probability was used to adjust the number of individuals sampled to obtain abundance estimates. The observed capture probabilities were higher for larger fish and decreased with successive electrofishing passes for larger fish only. Model selection suggested that the number of electrofishing passes, fish length, and mean thalweg depth affected capture probabilities the most; there was little evidence for any effect of electrofishing power density and woody debris density on capture probability. Leave-one-out cross validation showed that the cumulative capture probability model predicts smallmouth abundance accurately. © Copyright by the American Fisheries Society 2007.
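
    The adjustment itself is one line; a hedged sketch with made-up numbers:

    ```python
    # Sketch: abundance estimated as catch divided by the predicted cumulative
    # capture probability, as described above. Values are illustrative.
    n_caught = 42        # smallmouth bass sampled in a stream reach
    p_capture = 0.56     # predicted cumulative capture probability

    abundance = n_caught / p_capture
    print(f"estimated abundance: {abundance:.0f} fish")   # ~75 fish
    ```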

  13. Sampling designs matching species biology produce accurate and affordable abundance indices

    PubMed Central

    Farley, Sean; Russell, Gareth J.; Butler, Matthew J.; Selinger, Jeff

    2013-01-01

    Wildlife biologists often use grid-based designs to sample animals and generate abundance estimates. Although sampling in grids is theoretically sound, in application, the method can be logistically difficult and expensive when sampling elusive species inhabiting extensive areas. These factors make it challenging to sample animals and meet the statistical assumption of all individuals having an equal probability of capture. Violating this assumption biases results. Does an alternative exist? Perhaps sampling only where resources attract animals (i.e., targeted sampling) would provide accurate abundance estimates more efficiently and affordably. However, biases from this approach would also arise if individuals have an unequal probability of capture, especially if some failed to visit the sampling area. Since most biological programs are resource limited, and acquiring abundance data drives many conservation and management applications, it becomes imperative to identify economical and informative sampling designs. Therefore, we evaluated abundance estimates generated from grid and targeted sampling designs using simulations based on geographic positioning system (GPS) data from 42 Alaskan brown bears (Ursus arctos). Migratory salmon drew brown bears from the wider landscape, concentrating them at anadromous streams. This provided a scenario for testing the targeted approach. Grid and targeted sampling varied by number of traps, trap location (placed randomly, systematically, or by expert opinion), and whether traps were stationary or moved between capture sessions. We began by identifying when to sample, and whether bears had equal probability of capture. We compared abundance estimates against seven criteria: bias, precision, accuracy, effort, plus encounter rates, and probabilities of capture and recapture. One grid (49 km2 cells) and one targeted configuration provided the most accurate results. Both placed traps by expert opinion and moved traps between capture sessions, which raised capture probabilities. The grid design was least biased (−10.5%), but imprecise (CV 21.2%), and used most effort (16,100 trap-nights). The targeted configuration was more biased (−17.3%), but most precise (CV 12.3%), with least effort (7,000 trap-nights). Targeted sampling generated encounter rates four times higher, and capture and recapture probabilities 11% and 60% higher than grid sampling, in a sampling frame 88% smaller. Bears had unequal probability of capture with both sampling designs, partly because some bears never had traps available to sample them. Hence, grid and targeted sampling generated abundance indices, not estimates. Overall, targeted sampling provided the most accurate and affordable design to index abundance. Targeted sampling may offer an alternative method to index the abundance of other species inhabiting expansive and inaccessible landscapes elsewhere, provided they are attracted to resource concentrations.

  14. Estimating Whether Replacing Time in Active Outdoor Play and Sedentary Video Games With Active Video Games Influences Youth's Mental Health.

    PubMed

    Janssen, Ian

    2016-11-01

    The primary objective was to use isotemporal substitution models to estimate whether replacing time spent in sedentary video games (SVGs) and active outdoor play (AOP) with active video games (AVGs) would be associated with changes in youth's mental health. A representative sample of 20,122 Canadian youth in Grades 6-10 was studied. The exposure variables were average hours/day spent playing AVGs, SVGs, and AOP. The outcomes consisted of a negative and internalizing mental health indicator (emotional problems), a positive and internalizing mental health indicator (life satisfaction), and a positive and externalizing mental health indicator (prosocial behavior). Isotemporal substitution models estimated the extent to which replacing time spent in SVGs and AOP with an equivalent amount of time in AVGs had on the mental health indicators. Replacing 1 hour/day of SVGs with 1 hour/day of AVGs was associated with a 6% (95% confidence interval: 3%-9%) reduced probability of high emotional problems, a 4% (2%-7%) increased probability of high life satisfaction, and a 13% (9%-16%) increased probability of high prosocial behavior. Replacing 1 hour/day of AOP with 1 hour/day of AVGs was associated with a 7% (3%-11%) increased probability of high emotional problems, a 3% (1%-5%) reduced probability of high life satisfaction, and a 6% (2%-9%) reduced probability of high prosocial behavior. Replacing SVGs with AVGs was associated with more preferable mental health indicators. Conversely, replacing AOP with AVGs was associated with more deleterious mental health indicators. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
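
    A hedged sketch of how an isotemporal substitution model is typically set up (simulated data; not the study's variables or estimates): the displaced activity is dropped from the model while total activity time is retained, so the coefficient on the added activity estimates the effect of the swap.

    ```python
    # Sketch: isotemporal substitution via logistic regression. To estimate the
    # effect of replacing SVG time with AVG time, include AVG, AOP, and total
    # time but omit SVG; exp(beta_AVG) is then the odds ratio for a one-hour
    # SVG -> AVG swap. All data below are simulated.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 2000
    avg = rng.exponential(0.5, n)     # hours/day active video games
    svg = rng.exponential(1.0, n)     # hours/day sedentary video games
    aop = rng.exponential(1.0, n)     # hours/day active outdoor play
    total = avg + svg + aop
    logit = -0.5 - 0.2 * avg + 0.3 * svg - 0.1 * aop
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    X = sm.add_constant(np.column_stack([avg, aop, total]))
    fit = sm.Logit(y, X).fit(disp=False)
    print(np.exp(fit.params[1]))      # odds ratio: 1 h/day SVG replaced by AVG
    ```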

  15. Ignition probability of polymer-bonded explosives accounting for multiple sources of material stochasticity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, S.; Barua, A.; Zhou, M., E-mail: min.zhou@me.gatech.edu

    2014-05-07

    Accounting for the combined effect of multiple sources of stochasticity in material attributes, we develop an approach that computationally predicts the probability of ignition of polymer-bonded explosives (PBXs) under impact loading. The probabilistic nature of the specific ignition processes is assumed to arise from two sources of stochasticity. The first source involves random variations in material microstructural morphology; the second source involves random fluctuations in grain-binder interfacial bonding strength. The effect of the first source of stochasticity is analyzed with multiple sets of statistically similar microstructures and constant interfacial bonding strength. Subsequently, each of the microstructures in the multiple sets is assigned multiple instantiations of randomly varying grain-binder interfacial strengths to analyze the effect of the second source of stochasticity. Critical hotspot size-temperature states reaching the threshold for ignition are calculated through finite element simulations that explicitly account for microstructure and bulk and interfacial dissipation to quantify the time to criticality (t_c) of individual samples, allowing the probability distribution of the time to criticality that results from each source of stochastic variation for a material to be analyzed. Two probability superposition models are considered to combine the effects of the multiple sources of stochasticity. The first is a parallel and series combination model, and the second is a nested probability function model. Results show that the nested Weibull distribution provides an accurate description of the combined ignition probability. The approach developed here represents a general framework for analyzing the stochasticity in the material behavior that arises out of multiple types of uncertainty associated with the structure, design, synthesis and processing of materials.
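
    As a sketch of the two superposition ideas on illustrative Weibull forms (not the paper's calibrated models; the 'either source triggers' reading of the parallel/series combination is an assumption here):

    ```python
    # Sketch: two generic ways to superpose ignition-probability distributions
    # from two stochastic sources. Forms and parameters are illustrative.
    import numpy as np

    rng = np.random.default_rng(2)

    def weibull_cdf(t, shape, scale):
        return 1.0 - np.exp(-(t / scale) ** shape)

    t = 5.0  # time-to-criticality threshold, arbitrary units

    # (1) Series/parallel-style combination of two marginal distributions.
    p1 = weibull_cdf(t, shape=2.0, scale=8.0)    # microstructure variability
    p2 = weibull_cdf(t, shape=3.0, scale=10.0)   # interfacial-strength variability
    print(1 - (1 - p1) * (1 - p2))               # 'either source triggers' reading

    # (2) Nested form: the scale of the inner Weibull is itself random, so the
    # combined CDF is a Monte Carlo average over the outer source.
    scales = rng.weibull(3.0, size=100_000) * 10.0
    print(np.mean(weibull_cdf(t, 2.0, scales)))
    ```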

  16. Haze in Apple-Based Beverages: Detailed Polyphenol, Polysaccharide, Protein, and Mineral Compositions.

    PubMed

    Millet, Melanie; Poupard, Pascal; Le Quéré, Jean-Michel; Bauduin, Remi; Guyot, Sylvain

    2017-08-09

    Producers of apple-based beverages are confronted with colloidal instability. Haze is caused by interactions between molecules that lead to the formation of aggregates. Haze composition in three apple-based beverages, namely, French sparkling cider, apple juice, and pommeau, was studied. Phenolic compounds, proteins, polysaccharides, and minerals were analyzed using global and detailed analytical methods. The results explained <75% (w/w) of haze dry mass. Polyphenols, represented mainly by procyanidins, were the main compounds identified and accounted for 10-31% of haze. However, oxidized phenolic compounds were probably underestimated and may represent a high proportion of haze. Proteins were present in all of the samples in proportions of <6% of haze except in two apple juice hazes, where they were the main constituents (18 and 24%). Polysaccharides accounted for 0-30% of haze. Potassium and calcium were the main minerals.

  17. High probability neurotransmitter release sites represent an energy efficient design

    PubMed Central

    Lu, Zhongmin; Chouhan, Amit K.; Borycz, Jolanta A.; Lu, Zhiyuan; Rossano, Adam J; Brain, Keith L.; Zhou, You; Meinertzhagen, Ian A.; Macleod, Gregory T.

    2016-01-01

    Nerve terminals contain multiple sites specialized for the release of neurotransmitters. Release usually occurs with low probability, a design thought to confer many advantages. High probability release sites are not uncommon but their advantages are not well understood. Here we test the hypothesis that high probability release sites represent an energy efficient design. We examined release site probabilities and energy efficiency at the terminals of two glutamatergic motor neurons synapsing on the same muscle fiber in Drosophila larvae. Through electrophysiological and ultrastructural measurements we calculated release site probabilities to differ considerably between terminals (0.33 vs. 0.11). We estimated the energy required to release and recycle glutamate from the same measurements. The energy required to remove calcium and sodium ions subsequent to nerve excitation was estimated through microfluorimetric and morphological measurements. We calculated energy efficiency as the number of glutamate molecules released per ATP molecule hydrolyzed, and high probability release site terminals were found to be more efficient (0.13 vs. 0.06). Our analytical model indicates that energy efficiency is optimal (~0.15) at high release site probabilities (~0.76). As limitations in energy supply constrain neural function, high probability release sites might ameliorate such constraints by demanding less energy. Energy efficiency can be viewed as one aspect of nerve terminal function, in balance with others, because high efficiency terminals depress significantly during episodic bursts of activity. PMID:27593375

  18. Utilizing Adjoint-Based Error Estimates for Surrogate Models to Accurately Predict Probabilities of Events

    DOE PAGES

    Butler, Troy; Wildey, Timothy

    2018-01-01

    In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.
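
    A compact sketch of the reliability idea under stand-in models: the surrogate is trusted wherever its error estimate cannot flip the event indicator, and the expensive model is called only for the remaining samples. All functions and tolerances below are illustrative assumptions.

    ```python
    # Sketch: selective high-fidelity evaluation for event-probability estimation.
    import numpy as np

    def high_fidelity(x):           # expensive model (stand-in)
        return np.sin(3 * x) + 0.1 * x

    def surrogate(x):               # cheap approximation (stand-in)
        return np.sin(3 * x)

    def error_estimate(x):          # e.g., an adjoint-based bound per sample
        return np.abs(0.1 * x)

    threshold = 0.5                 # event: q(x) > threshold
    rng = np.random.default_rng(3)
    samples = rng.uniform(0, 2, size=10_000)

    q_s, err = surrogate(samples), error_estimate(samples)
    unreliable = np.abs(q_s - threshold) <= err   # error could flip the indicator

    q = q_s.copy()
    q[unreliable] = high_fidelity(samples[unreliable])
    print(f"P(event) = {np.mean(q > threshold):.4f}, "
          f"high-fidelity calls: {unreliable.sum()} of {samples.size}")
    ```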

  20. Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield

    Treesearch

    Robert B. Thomas

    1986-01-01

    Abstract - SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes, which cannot provide unbiased estimates of variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
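
    A hedged sketch of the general mechanism (simulated data; SALT's actual estimators and rating function are more elaborate): units are selected with probability proportional to an auxiliary rating, and the Hansen-Hurwitz form y_i / (n p_i) keeps the total estimate unbiased even when the rating is imperfect.

    ```python
    # Sketch: variable-probability (PPS) sampling with an unbiased estimator.
    import numpy as np

    rng = np.random.default_rng(4)
    rating = rng.lognormal(0, 1, size=2000)           # predicted sediment flux
    actual = rating * rng.lognormal(0, 0.3, 2000)     # true flux; rating is rough

    p = rating / rating.sum()                         # selection probabilities
    n = 100
    idx = rng.choice(2000, size=n, replace=True, p=p)

    # Hansen-Hurwitz estimator of the total: (1/n) * sum(y_i / p_i).
    estimate = np.mean(actual[idx] / p[idx])
    print(f"true total: {actual.sum():.0f}, PPS estimate: {estimate:.0f}")
    ```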

  1. Moment and maximum likelihood estimators for Weibull distributions under length- and area-biased sampling

    Treesearch

    Jeffrey H. Gove

    2003-01-01

    Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a...
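
    For reference, the weighted ('size-biased') density of order a that underlies these schemes, with a Weibull parent f(x); a = 1 gives length-biased and a = 2 area-biased sampling:

    ```latex
    \[
      f_a(x) \;=\; \frac{x^{a}\, f(x)}{\mathrm{E}\!\left[X^{a}\right]},
      \qquad
      f(x) \;=\; \frac{\beta}{\eta}\left(\frac{x}{\eta}\right)^{\beta-1}
                 e^{-(x/\eta)^{\beta}},
      \quad x > 0,
    \]
    ```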

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kovalev, Andrew N.

    The authors describe a measurement of the top quark mass using events with two charged leptons collected by the CDF II Detector from $p\bar{p}$ collisions with √s = 1.96 TeV at the Fermilab Tevatron. The posterior probability distribution of the top quark pole mass is calculated using the differential cross-section for the $t\bar{t}$ production and decay expressed with respect to observed lepton and jet momenta. The presence of background events in the collected sample is modeled using calculations of the differential cross-sections for major background processes. This measurement represents the first application of this method to events with two charged leptons. In a data sample with integrated luminosity of 340 pb^-1, they observe 33 candidate events and measure M_top = 165.2 ± 6.1 (stat) ± 3.4 (syst) GeV/c^2.

  3. Childhood Trauma and Psychiatric Disorders as Correlates of School Dropout in a National Sample of Young Adults

    PubMed Central

    Porche, Michelle V.; Fortuna, Lisa R.; Lin, Julia; Alegria, Margarita

    2010-01-01

    The effect of childhood trauma, psychiatric diagnoses, and mental health services on school dropout among U.S. born and immigrant youth is examined using data from the Collaborative Psychiatric Epidemiology Surveys (CPES), a nationally representative probability sample of African Americans, Afro-Caribbeans, Asians, Latinos, and non-Latino Whites, including 2532 young adults, ages 21 to 29. The dropout prevalence rate was 16% overall, with variation by childhood trauma, childhood psychiatric diagnosis, race/ethnicity, and nativity. Childhood substance and conduct disorders mediated the relationship between trauma and school dropout. Likelihood of dropout was decreased for Asians, and increased for African Americans and Latinos, compared to non-Latino Whites as a function of psychiatric disorders and trauma. Timing of U.S. immigration during adolescence increased risk of dropout. PMID:21410919

  4. Characteristics of out-of-home caregiving environments provided under child welfare services.

    PubMed

    Barth, Richard P; Green, Rebecca; Webb, Mary Bruce; Wall, Ariana; Gibbons, Claire; Craig, Carlton

    2008-01-01

    A national probability sample of children who have been in child welfare supervised placements for about one year identifies the characteristics (e.g., age, training, education, health, and home) of the foster parents, kinship foster parents, and group home caregivers. Caregiving respondents provided information about their backgrounds. Interviewers also used the HOME-SF to assess the caregiving environments of foster care and kinship care. Comparisons are made to other nationally representative samples, including the U.S. Census and the National Survey of America's Families. Kinship care, foster care, and group care providers are significantly different from each other--and the general population--in age and education. Findings on the numbers of children cared for, understimulating environments, use of punitive punishment, and low educational levels of caregivers generate suggestions for practice with foster families.

  5. A stochastic diffusion process for Lochner's generalized Dirichlet distribution

    DOE PAGES

    Bakosi, J.; Ristorcelli, J. R.

    2013-10-01

    The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N stochastic variables with Lochner's generalized Dirichlet distribution as its asymptotic solution. Individual samples of a discrete ensemble, obtained from the system of stochastic differential equations, equivalent to the Fokker-Planck equation developed here, satisfy a unit-sum constraint at all times and ensure a bounded sample space, similarly to the process developed earlier for the Dirichlet distribution. Consequently, the generalized Dirichlet diffusion process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Compared to the Dirichlet distribution and process, the additional parameters of the generalized Dirichlet distribution allow a more general class of physical processes to be modeled with a more general covariance matrix.

  6. Analyses of flood-flow frequency for selected gaging stations in South Dakota

    USGS Publications Warehouse

    Benson, R.D.; Hoffman, E.B.; Wipf, V.J.

    1985-01-01

    Analyses of flood flow frequency were made for 111 continuous-record gaging stations in South Dakota with 10 or more years of record. The analyses were developed using the log-Pearson Type III procedure recommended by the U.S. Water Resources Council. The procedure characterizes flood occurrence at a single site as a sequence of annual peak flows. The magnitudes of the annual peak flows are assumed to be independent random variables following a log-Pearson Type III probability distribution, which defines the probability that any single annual peak flow will exceed a specified discharge. By considering only annual peak flows, the flood-frequency analysis becomes the estimation of the log-Pearson annual-probability curve using the record of annual peak flows at the site. The recorded data are divided into two classes: systematic and historic. The systematic record includes all annual peak flows determined in the process of conducting a systematic gaging program at a site. In this program, the annual peak flow is determined for each and every year of the program. The systematic record is intended to constitute an unbiased and representative sample of the population of all possible annual peak flows at the site. In contrast to the systematic record, the historic record consists of annual peak flows that would not have been determined except for evidence indicating their unusual magnitude. Flood information acquired from historical sources almost invariably refers to floods of noteworthy, and hence extraordinary, size. Although historic records form a biased and unrepresentative sample, they can be used to supplement the systematic record. (Author's abstract)
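
    A minimal sketch of the fitting step on synthetic peaks (assuming scipy's Pearson III parameterization; this omits the Water Resources Council's weighted-skew and historic-record adjustments):

    ```python
    # Sketch: a log-Pearson Type III annual-probability curve from a systematic
    # record of annual peak flows: fit Pearson III to log10(Q), then read off
    # the quantile exceeded with probability 0.01 (the '100-year' flood).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    peaks = 10 ** rng.normal(3.0, 0.25, size=40)    # 40 years of annual peaks, cfs

    logq = np.log10(peaks)
    skew, loc, scale = stats.pearson3.fit(logq)
    q100 = 10 ** stats.pearson3.ppf(0.99, skew, loc=loc, scale=scale)
    print(f"estimated 100-year peak flow: {q100:.0f} cfs")
    ```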

  7. Predicting risk for childhood asthma by pre-pregnancy, perinatal, and postnatal factors.

    PubMed

    Wen, Hui-Ju; Chiang, Tung-Liang; Lin, Shio-Jean; Guo, Yue Leon

    2015-05-01

    Symptoms of atopic disease start early in human life. Predicting risk for childhood asthma by early-life exposure would contribute to disease prevention. A birth cohort study was conducted to investigate early-life risk factors for childhood asthma and to develop a predictive model for the development of asthma. Nationally representative samples of newborn babies were obtained by multistage stratified systematic sampling from the 2005 Taiwan Birth Registry. Information on potential risk factors and children's health was collected by home interview when babies were 6 months old and 5 yr old, respectively. Backward stepwise regression analysis was used to identify the risk factors of childhood asthma for predictive models that were used to calculate the probability of childhood asthma. A total of 19,192 children completed the study satisfactorily. Physician-diagnosed asthma was reported in 6.6% of 5-yr-old children. Pre-pregnancy factors (parental atopy and socioeconomic status), perinatal factors (place of residence, exposure to indoor mold and painting/renovations during pregnancy), and postnatal factors (maternal postpartum depression and the presence of atopic dermatitis before 6 months of age) were chosen for the predictive models, and the highest predicted probability of asthma in 5-yr-old children was 68.1% in boys and 78.1% in girls; the lowest probability in boys and girls was 4.1% and 3.2%, respectively. This investigation provides a technique for predicting risk of childhood asthma that can be used to develop a preventive strategy against asthma. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. Stable isotope, chemical, and mineral compositions of the Middle Proterozoic Lijiaying Mn deposit, Shaanxi Province, China

    USGS Publications Warehouse

    Yeh, Hsueh-Wen; Hein, James R.; Ye, Jie; Fan, Delian

    1999-01-01

    The Lijiaying Mn deposit, located about 250 km southwest of Xian, is a high-quality ore characterized by low P and Fe contents and a mean Mn content of about 23%. The ore deposit occurs in shallow-water marine sedimentary rocks of probable Middle Proterozoic age. Carbonate minerals in the ore deposit include kutnahorite, calcite, Mn calcite, and Mg calcite. Carbon (−0.4 to −4.0‰) and oxygen (−3.7 to −12.9‰) isotopes show that, with a few exceptions, those carbonate minerals are not pristine low-temperature marine precipitates. All samples are depleted in rare earth elements (REEs) relative to shale and have negative Eu and positive Ce anomalies on chondrite-normalized plots. The Fe/Mn ratios of representative ore samples range from about 0.034 to <0.008 and P/Mn from 0.0023 to <0.001. Based on mineralogical data, the low ends of those ranges of ratios are probably close to ratios for the pure Mn minerals. Manganese contents have a strong positive correlation with Ce anomaly values and a moderate correlation with total REE contents. Compositional data indicate that kutnahorite is a metamorphic mineral and that most calcites formed as low-temperature marine carbonates that were subsequently metamorphosed. The braunite ore precursor mineral was probably a Mn oxyhydroxide, similar to those that formed on the deep ocean-floor during the Cenozoic. Because the Lijiaying precursor mineral formed in a shallow-water marine environment, the atmospheric oxygen content during the Middle Proterozoic may have been lower than it has been during the Cenozoic.

  9. The relationship between problem gambling and mental and physical health correlates among a nationally representative sample of Canadian women.

    PubMed

    Afifi, Tracie O; Cox, Brian J; Martens, Patricia J; Sareen, Jitender; Enns, Murray W

    2010-01-01

    Gambling has become an increasingly common activity among women since the widespread growth of the gambling industry. Currently, our knowledge of the relationship between problem gambling among women and mental and physical correlates is limited. Therefore, important relationships between problem gambling and health and functioning, mental disorders, physical health conditions, and help-seeking behaviours among women were examined using a nationally representative Canadian sample. Data were from the nationally representative Canadian Community Health Survey Cycle 1.2 (CCHS 1.2; n = 10,056 women aged 15 years and older; data collected in 2002). The statistical analysis included binary logistic regression, multinomial logistic regression, and linear regression models. Past 12-month problem gambling was associated with a significantly higher probability of current lower general health, suicidal ideation and attempts, decreased psychological well-being, increased distress, depression, mania, panic attacks, social phobia, agoraphobia, alcohol dependence, any mental disorder, comorbidity of mental disorders, chronic bronchitis, fibromyalgia, migraine headaches, help-seeking from a professional, attending a self-help group, and calling a telephone help line (odds ratios ranged from 1.5 to 8.2). Problem gambling was associated with a broad range of negative health correlates among women. Problem gambling is an important public health concern. These findings can be used to inform healthy public policies on gambling.

  10. ANNz2: Photometric Redshift and Probability Distribution Function Estimation using Machine Learning

    NASA Astrophysics Data System (ADS)

    Sadeh, I.; Abdalla, F. B.; Lahav, O.

    2016-10-01

    We present ANNz2, a new implementation of the public software for photometric redshift (photo-z) estimation of Collister & Lahav, which now includes generation of full probability distribution functions (PDFs). ANNz2 utilizes multiple machine learning methods, such as artificial neural networks and boosted decision/regression trees. The objective of the algorithm is to optimize the performance of the photo-z estimation, to properly derive the associated uncertainties, and to produce both single-value solutions and PDFs. In addition, estimators are made available, which mitigate possible problems of non-representative or incomplete spectroscopic training samples. ANNz2 has already been used as part of the first weak lensing analysis of the Dark Energy Survey, and is included in the experiment's first public data release. Here we illustrate the functionality of the code using data from the tenth data release of the Sloan Digital Sky Survey and the Baryon Oscillation Spectroscopic Survey. The code is available for download at http://github.com/IftachSadeh/ANNZ.

  11. Risk forewarning model for rice grain Cd pollution based on Bayes theory.

    PubMed

    Wu, Bo; Guo, Shuhai; Zhang, Lingyan; Li, Fengmei

    2018-03-15

    Cadmium (Cd) pollution of rice grain caused by Cd-contaminated soils is a common problem in southwest and central south China. In this study, utilizing the advantages of the Bayes classification statistical method, we established a risk forewarning model for rice grain Cd pollution, and put forward two parameters (the prior probability factor and data variability factor). The sensitivity analysis of the model parameters illustrated that sample size and standard deviation influenced the accuracy and applicable range of the model. The accuracy of the model was improved by the self-renewal of the model through adding the posterior data into the priori data. Furthermore, this method can be used to predict the risk probability of rice grain Cd pollution under similar soil environment, tillage and rice varietal conditions. The Bayes approach thus represents a feasible method for risk forewarning of heavy metals pollution of agricultural products caused by contaminated soils. Copyright © 2017 Elsevier B.V. All rights reserved.
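
    A minimal Bayes-rule sketch of the forewarning idea (distributions and numbers are invented for illustration; the study's model and parameters differ): a prior probability of grain-Cd exceedance is combined with class-conditional likelihoods of an observed soil measurement.

    ```python
    # Sketch: posterior risk of rice grain Cd exceedance given one soil reading.
    from scipy.stats import norm

    prior_polluted = 0.2                  # prior probability factor (assumed)
    soil_cd = 0.9                         # observed soil Cd, mg/kg

    like_polluted = norm(1.2, 0.4).pdf(soil_cd)   # soil Cd | grain exceeds limit
    like_clean    = norm(0.4, 0.3).pdf(soil_cd)   # soil Cd | grain within limit

    post = (prior_polluted * like_polluted) / (
        prior_polluted * like_polluted + (1 - prior_polluted) * like_clean
    )
    print(f"posterior risk of grain Cd exceedance: {post:.2f}")
    ```

    The self-renewal described above would correspond to folding each new verified case back into the class-conditional estimates and the prior.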

  12. The Butterflies of Barro Colorado Island, Panama: Local Extinction since the 1930s.

    PubMed

    Basset, Yves; Barrios, Héctor; Segar, Simon; Srygley, Robert B; Aiello, Annette; Warren, Andrew D; Delgado, Francisco; Coronado, James; Lezcano, Jorge; Arizala, Stephany; Rivera, Marleny; Perez, Filonila; Bobadilla, Ricardo; Lopez, Yacksecari; Ramirez, José Alejandro

    2015-01-01

    Few data are available about the regional or local extinction of tropical butterfly species. When confirmed, local extinction was often due to the loss of host-plant species. We used published lists and recent monitoring programs to evaluate changes in butterfly composition on Barro Colorado Island (BCI, Panama) between an old (1923-1943) and a recent (1993-2013) period. Although 601 butterfly species have been recorded from BCI during the 1923-2013 period, we estimate that 390 species are currently breeding on the island, including 34 cryptic species, currently only known by their DNA Barcode Index Number. Twenty-three butterfly species that were considered abundant during the old period could not be collected during the recent period, despite a much higher sampling effort in recent times. We consider these species locally extinct from BCI and they conservatively represent 6% of the estimated local pool of resident species. Extinct species represent distant phylogenetic branches and several families. The butterfly traits most likely to influence the probability of extinction were host growth form, wing size and host specificity, independently of the phylogenetic relationships among butterfly species. On BCI, most likely candidates for extinction were small hesperiids feeding on herbs (35% of extinct species). However, contrary to our working hypothesis, extinction of these species on BCI cannot be attributed to loss of host plants. In most cases these host plants remain extant, but they probably subsist at lower or more fragmented densities. Coupled with low dispersal power, this reduced availability of host plants has probably caused the local extinction of some butterfly species. Many more bird than butterfly species have been lost from BCI recently, confirming that small preserves may be far more effective at conserving invertebrates than vertebrates and, therefore, should not necessarily be neglected from a conservation viewpoint.

  13. Statistics 101 for Radiologists.

    PubMed

    Anvari, Arash; Halpern, Elkan F; Samir, Anthony E

    2015-10-01

    Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
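
    A worked 2x2 example of several of the measures discussed (counts invented for illustration):

    ```python
    # Test performance from a 2x2 table: true/false positives and negatives.
    tp, fn, fp, tn = 90, 10, 30, 170

    sensitivity = tp / (tp + fn)                 # 0.90
    specificity = tn / (tn + fp)                 # 0.85
    accuracy    = (tp + tn) / (tp + fn + fp + tn)
    lr_pos = sensitivity / (1 - specificity)     # likelihood ratio, positive test
    lr_neg = (1 - sensitivity) / specificity     # likelihood ratio, negative test
    print(sensitivity, specificity, round(accuracy, 3),
          round(lr_pos, 2), round(lr_neg, 3))
    ```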

  14. Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations

    NASA Astrophysics Data System (ADS)

    Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit

    2016-07-01

    A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.

  15. Dose-Response Relationships between Second-Hand Smoke Exposure and Depressive Symptoms among Adolescents in Guangzhou, China.

    PubMed

    Huang, Jingya; Xu, Bin; Guo, Dan; Jiang, Ting; Huang, Wei; Liu, Guocong; Ye, Xiaohua

    2018-05-14

    There has been little focus on the possible association between second-hand smoke (SHS) exposure and depressive symptoms among adolescents. Thus, this study aimed to explore the dose-response relationships between SHS exposure and depressive symptoms among adolescents and differentiate these associations in setting-specific exposure and severity-specific outcomes. A cross-sectional study was conducted using a stratified cluster sampling method to obtain a representative sample of high school students in Guangzhou, China. Depressive symptoms were measured using the Center for Epidemiologic Studies Depression Scale. Univariable and multivariable logistic regression models were used to explore the potential associations between SHS exposure and depressive symptoms. Among 3575 nonsmoking students, 29.6% were classified as having probable depressive symptoms and 9.6% had severe depressive symptoms. There were monotonically increasing dose-response relationships between setting-specific (public places, homes, or indoor/outdoor campuses) SHS exposure and severity-specific (probable or severe) depressive symptoms. When examining these relations by source of exposure, we also observed similar dose-response relationships for SHS exposure in campuses from smoking teachers and from smoking classmates. Our findings suggest that regular SHS exposure is associated with a significant, dose-dependent increase in risk of depressive symptoms among adolescents, and highlight the need for smoke-free environments to protect the health of adolescents.

  17. Evaluation of ultrasonic array imaging algorithms for inspection of a coarse grained material

    NASA Astrophysics Data System (ADS)

    Van Pamel, A.; Lowe, M. J. S.; Brett, C. R.

    2014-02-01

    Improving the ultrasound inspection capability for coarse grain metals remains of longstanding interest to industry and the NDE research community and is expected to become increasingly important for next generation power plants. A test sample of coarse grained Inconel 625 which is representative of future power plant components has been manufactured to test the detectability of different inspection techniques. Conventional ultrasonic A, B, and C-scans showed the sample to be extraordinarily difficult to inspect due to its scattering behaviour. However, in recent years, array probes and Full Matrix Capture (FMC) imaging algorithms, which extract the maximum amount of information possible, have unlocked exciting possibilities for improvements. This article proposes a robust methodology to evaluate the detection performance of imaging algorithms, applying this to three FMC imaging algorithms; Total Focusing Method (TFM), Phase Coherent Imaging (PCI), and Decomposition of the Time Reversal Operator with Multiple Scattering (DORT MSF). The methodology considers the statistics of detection, presenting the detection performance as probability of detection (POD) and probability of false alarm (PFA). The data is captured in pulse-echo mode using 64 element array probes at centre frequencies of 1 MHz and 5 MHz. All three algorithms are shown to perform very similarly when comparing their flaw detection capabilities on this particular case.
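
    A hedged sketch of the TFM core on synthetic FMC data (PCI and DORT MSF are not shown): each pixel value is the sum of every transmit-receive A-scan sampled at that pixel's round-trip delay. Geometry, sampling rate, and data below are illustrative.

    ```python
    # Sketch: Total Focusing Method (delay-and-sum) on Full Matrix Capture data.
    import numpy as np

    n_el, fs, c = 16, 50e6, 5900.0              # elements, sample rate, m/s
    pitch = 0.6e-3
    x_el = (np.arange(n_el) - (n_el - 1) / 2) * pitch
    # Stand-in FMC data: A-scan per (transmitter, receiver) pair.
    fmc = np.random.default_rng(6).normal(size=(n_el, n_el, 2000))

    def tfm_pixel(x, z):
        d = np.sqrt((x - x_el) ** 2 + z ** 2)   # element-to-pixel distances
        amp = 0.0
        for tx in range(n_el):
            t = (d[tx] + d) / c                 # round-trip delay per receiver
            idx = np.round(t * fs).astype(int)
            rx = np.flatnonzero(idx < fmc.shape[2])
            amp += fmc[tx, rx, idx[rx]].sum()
        return abs(amp)

    print(tfm_pixel(0.0, 20e-3))                # image amplitude at (0 mm, 20 mm)
    ```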

  18. Sets, Probability and Statistics: The Mathematics of Life Insurance. [Computer Program.] Second Edition.

    ERIC Educational Resources Information Center

    King, James M.; And Others

    The materials described here represent the conversion of a highly popular student workbook "Sets, Probability and Statistics: The Mathematics of Life Insurance" into a computer program. The program is designed to familiarize students with the concepts of sets, probability, and statistics, and to provide practice using real life examples. It also…

  19. Probability of detecting nematode infestations for quarantine sampling with imperfect extraction efficacy

    PubMed Central

    Chen, Peichen; Liu, Shih-Chia; Liu, Hung-I; Chen, Tse-Wei

    2011-01-01

    For quarantine sampling, it is of fundamental importance to determine the probability of finding an infestation when a specified number of units are inspected. In general, current sampling procedures assume 100% probability (perfect) of detecting a pest if it is present within a unit. Ideally, a nematode extraction method should remove all stages of all species with 100% efficiency regardless of season, temperature, or other environmental conditions; in practice however, no method approaches these criteria. In this study we determined the probability of detecting nematode infestations for quarantine sampling with imperfect extraction efficacy. Also, the required sample and the risk involved in detecting nematode infestations with imperfect extraction efficacy are presented. Moreover, we developed a computer program to calculate confidence levels for different scenarios with varying proportions of infestation and efficacy of detection. In addition, a case study, presenting the extraction efficacy of the modified Baermann's Funnel method on Aphelenchoides besseyi, is used to exemplify the use of our program to calculate the probability of detecting nematode infestations in quarantine sampling with imperfect extraction efficacy. The result has important implications for quarantine programs and highlights the need for a very large number of samples if perfect extraction efficacy is not achieved in such programs. We believe that the results of the study will be useful for the determination of realistic goals in the implementation of quarantine sampling. PMID:22791911
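
    The central calculation can be sketched as follows, assuming independent units, infestation proportion theta, and per-unit extraction efficacy e (values illustrative): the probability of at least one detection in n units is 1 - (1 - theta*e)^n, which can be inverted for the required sample size.

    ```python
    # Sketch: required sample size under imperfect extraction efficacy.
    import math

    theta, e, confidence = 0.01, 0.6, 0.95

    def p_detect(n):
        return 1 - (1 - theta * e) ** n

    n_required = math.ceil(math.log(1 - confidence) / math.log(1 - theta * e))
    print(n_required, round(p_detect(n_required), 3))  # ~498 units at e=0.6
    # With perfect extraction (e=1) the same target needs only ~299 units,
    # illustrating how imperfect efficacy inflates quarantine sample sizes.
    ```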

  20. Environmental DNA sampling is more sensitive than a traditional survey technique for detecting an aquatic invader.

    PubMed

    Smart, Adam S; Tingley, Reid; Weeks, Andrew R; van Rooyen, Anthony R; McCarthy, Michael A

    2015-10-01

    Effective management of alien species requires detecting populations in the early stages of invasion. Environmental DNA (eDNA) sampling can detect aquatic species at relatively low densities, but few studies have directly compared detection probabilities of eDNA sampling with those of traditional sampling methods. We compare the ability of a traditional sampling technique (bottle trapping) and eDNA to detect a recently established invader, the smooth newt Lissotriton vulgaris vulgaris, at seven field sites in Melbourne, Australia. Over a four-month period, per-trap detection probabilities ranged from 0.01 to 0.26 among sites where L. v. vulgaris was detected, whereas per-sample eDNA estimates were much higher (0.29-1.0). Detection probabilities of both methods varied temporally (across days and months), but temporal variation appeared to be uncorrelated between methods. Only estimates of spatial variation were strongly correlated across the two sampling techniques. Environmental variables (water depth, rainfall, ambient temperature) were not clearly correlated with detection probabilities estimated via trapping, whereas eDNA detection probabilities were negatively correlated with water depth, possibly reflecting higher eDNA concentrations at lower water levels. Our findings demonstrate that eDNA sampling can be an order of magnitude more sensitive than traditional methods, and illustrate that traditional- and eDNA-based surveys can provide independent information on species distributions when occupancy surveys are conducted over short timescales.
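
    A quick way to see why the per-sample probabilities matter: under an independence assumption (ours, not the paper's occupancy model), the chance of at least one detection in k samples is 1 - (1 - p)^k. The sketch below plugs in illustrative values from the reported ranges.

    ```python
    def cumulative_detection(p_single, k):
        """P(at least one detection) across k independent samples."""
        return 1.0 - (1.0 - p_single) ** k

    # Per-trap p = 0.05 needs many trap-nights; per-sample eDNA p = 0.29 needs few
    print(cumulative_detection(0.05, 20))  # ~0.64 for 20 traps
    print(cumulative_detection(0.29, 4))   # ~0.75 for 4 eDNA samples
    ```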

  1. Effects of track and threat information on judgments of hurricane strike probability.

    PubMed

    Wu, Hao-Che; Lindell, Michael K; Prater, Carla S; Samuelson, Charles D

    2014-06-01

    Although evacuation is one of the best strategies for protecting citizens from hurricane threat, the ways that local elected officials use hurricane data in deciding whether to issue hurricane evacuation orders are not well understood. To begin to address this problem, we examined the effects of hurricane track and intensity information in a laboratory setting where participants judged the probability that hypothetical hurricanes with a constant bearing (i.e., straight line forecast track) would make landfall in each of eight 45 degree sectors around the Gulf of Mexico. The results from 162 participants in a student sample showed that the judged strike probability distributions over the eight sectors within each scenario were, unsurprisingly, unimodal and centered on the sector toward which the forecast track pointed. More significantly, although strike probability judgments for the sector in the direction of the forecast track were generally higher than the corresponding judgments for the other sectors, the latter were not zero. Most significantly, there were no appreciable differences in the patterns of strike probability judgments for hurricane tracks represented by a forecast track only, an uncertainty cone only, or a forecast track with an uncertainty cone, a result consistent with a recent survey of coastal residents threatened by Hurricane Charley. The study results suggest that people are able to correctly process basic information about hurricane tracks, but they do make some errors. More research is needed to understand the sources of these errors and to identify better methods of displaying uncertainty about hurricane parameters. © 2013 Society for Risk Analysis.

  2. Alcohol and labor supply: the case of Iceland.

    PubMed

    Asgeirsdottir, Tinna Laufey; McGeary, Kerry Anne

    2009-10-01

    At a time when the government of Iceland is considering privatization of alcohol sales and a reduction of its governmental fees, it is timely to estimate the potential effects of this policy change. Given that the privatization of sales coupled with a tax reduction should lead to a decrease in the unit price of alcohol, one would expect the quantity consumed to increase. While it is of interest to project the impact of the proposed bill on the market for alcohol, another important consideration is the impact that increased alcohol consumption and, more specifically, probable alcohol misuse would have on other markets in Iceland. The only available study on this subject using Icelandic data yields surprising results. Tómasson et al. (Scand J Public Health 32:47-52, 2004) unexpectedly found no effect of probable alcohol abuse on sick leave. A logical next step would be to examine the effect of probable alcohol abuse on other important labor-market outcomes. Nationally representative survey data from 2002 allow for an analysis of probable misuse of alcohol and labor-supply choices. Labor-supply choices are considered with reference to possible effects of policies already in force, as well as proposed changes to current policies. Contrary to intuition, but in agreement with the previously mentioned Icelandic study, the adverse effects of probable misuse of alcohol on employment status or hours worked are not confirmed within this sample. The reasons for the results are unclear, although some suggestions are hypothesized. Currently, data to test those theories convincingly are not available.

  3. Archaeomagnetic Investigation at Chapultepec, Mexico City: Case Study of Classical Settlers

    NASA Astrophysics Data System (ADS)

    Lopez, V.; Romero, E.; Soler-Arechalde, A. M.; Espinosa, G.

    2007-05-01

    During the restoration campaign at Chapultepec Park in downtown Mexico City, a Teotihuacan settlement was found on the south flank of Chapultepec Hill. The samples come from irregular home kilns with a hole in their central part bounded by andesite rocks. Alternating field demagnetization was employed. Rock magnetic measurements, which included hysteresis, continuous susceptibility, and isothermal remanence experiments, revealed that spinels, most probably magnetite or Ti-poor titanomagnetites, are responsible for the remanence. The archaeomagnetic date obtained here is 525 AD, in good agreement with other evidence of the Teotihuacan Classic Metepec period (450-600 AD).

  4. Software for Data Analysis with Graphical Models

    NASA Technical Reports Server (NTRS)

    Buntine, Wray L.; Roy, H. Scott

    1994-01-01

    Probabilistic graphical models are being used widely in artificial intelligence and statistics, for instance, in diagnosis and expert systems, as a framework for representing and reasoning with probabilities and independencies. They come with corresponding algorithms for performing statistical inference. This offers a unifying framework for prototyping and/or generating data analysis algorithms from graphical specifications. This paper illustrates the framework with an example and then presents some basic techniques for the task: problem decomposition and the calculation of exact Bayes factors. Other tools already developed, such as automatic differentiation, Gibbs sampling, and use of the EM algorithm, make this a broad basis for the generation of data analysis software.

  5. The beta distribution: A statistical model for world cloud cover

    NASA Technical Reports Server (NTRS)

    Falls, L. W.

    1973-01-01

    Much work has been performed in developing empirical global cloud cover models. This investigation was made to determine an underlying theoretical statistical distribution to represent worldwide cloud cover. The beta distribution, whose probability density function on [0, 1] is f(x) = x^(α−1)(1−x)^(β−1)/B(α, β), is proposed to represent the variability of this random variable. It is shown that the beta distribution possesses the versatile statistical characteristics necessary to assume the wide variety of shapes exhibited by cloud cover. A total of 160 representative empirical cloud cover distributions were investigated, and the conclusion was reached that this study provides sufficient statistical evidence to accept the beta probability distribution as the underlying model for world cloud cover.
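
    For readers who want to try this on their own data, the sketch below fits a beta distribution to fractional cloud-cover observations and runs a goodness-of-fit test. It is a minimal illustration with simulated data, not the study's analysis; SciPy's beta.fit is used with the support pinned to [0, 1].

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # Hypothetical cloud-cover fractions in (0, 1); real data would come
    # from sky-cover observations at a station.
    cover = rng.beta(0.8, 0.6, size=500)

    # floc/fscale pin the support to [0, 1] so only the shape parameters are fitted
    alpha, beta, loc, scale = stats.beta.fit(cover, floc=0.0, fscale=1.0)
    print(alpha, beta)

    # Kolmogorov-Smirnov test of the fitted model
    print(stats.kstest(cover, "beta", args=(alpha, beta)))
    ```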

  6. Detection of 14-3-3 sigma (σ) promoter methylation as a noninvasive biomarker using blood samples for breast cancer diagnosis

    PubMed Central

    Ye, Meng; Huang, Tao; Ying, Ying; Li, Jinyun; Yang, Ping; Ni, Chao; Zhou, Chongchang; Chen, Si

    2017-01-01

    As a tumor suppressor gene, 14-3-3 σ has been reported to be frequently methylated in breast cancer. However, the clinical effect of 14-3-3 σ promoter methylation remains to be verified. This study was performed to assess the clinicopathological significance and diagnostic value of 14-3-3 σ promoter methylation in breast cancer. 14-3-3 σ promoter methylation was found to be notably higher in breast cancer than in benign lesions and normal breast tissue samples. We did not observe that 14-3-3 σ promoter methylation was linked to age, tumor grade, clinical stage, lymph node status, histological subtype, ER status, PR status, HER2 status, or overall survival of patients with breast cancer. The combined sensitivity, specificity, AUC (area under the curve), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and post-test probability values (if the pretest probability was 30%) of 14-3-3 σ promoter methylation in blood samples of breast cancer patients vs. healthy subjects were 0.69, 0.99, 0.86, 95, 0.31, 302, and 98%, respectively. Our findings suggest that 14-3-3 σ promoter methylation may be associated with the carcinogenesis of breast cancer and that 14-3-3 σ promoter methylation might represent a useful blood-based biomarker for the clinical diagnosis of breast cancer. PMID:27999208
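
    The post-test probability quoted above follows from the standard odds form of Bayes' theorem: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. The sketch below reproduces the reported 98% figure from the 30% pre-test probability and PLR of 95 (the function name is ours).

    ```python
    def post_test_probability(pretest_p, likelihood_ratio):
        """Pre-test probability -> post-test probability via odds and a
        likelihood ratio (standard Bayes calculation)."""
        pretest_odds = pretest_p / (1.0 - pretest_p)
        post_odds = pretest_odds * likelihood_ratio
        return post_odds / (1.0 + post_odds)

    # PLR of 95 with a 30% pre-test probability -> ~0.98 after a positive test
    print(post_test_probability(0.30, 95))
    # The reported NLR of 0.31 gives ~0.12 after a negative test
    print(post_test_probability(0.30, 0.31))
    ```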

  7. Acetic Acid Detection Threshold in Synthetic Wine Samples of a Portable Electronic Nose

    PubMed Central

    Macías, Miguel Macías; Manso, Antonio García; Orellana, Carlos Javier García; Velasco, Horacio Manuel González; Caballero, Ramón Gallardo; Chamizo, Juan Carlos Peguero

    2013-01-01

    Wine quality is related to its intrinsic visual, taste, or aroma characteristics and is reflected in the price paid for that wine. One of the most important wine faults is an excessive concentration of acetic acid, which can cause a wine to take on vinegar aromas and reduce its varietal character. It is therefore very important for the wine industry to have methods, such as electronic noses, for real-time monitoring of excessive acetic acid concentrations in wines. However, aroma characterization of alcoholic beverages with sensor-array electronic noses is a difficult challenge due to the masking effect of ethanol. In this work, in order to detect the presence of acetic acid in synthetic wine samples (aqueous ethanol solution at 10% v/v), we use a detection unit consisting of a commercial electronic nose and an HSS32 autosampler, in combination with a neural network classifier (MLP). To find the characteristic vector representing the sample to be classified, we first select the sensors, and the section of the sensor response curves, where the probability of detecting acetic acid is highest, and then apply Principal Component Analysis (PCA) so that each sensor response curve is represented by the coefficients of its first principal components. Results show that the PEN3 electronic nose is able to detect and discriminate wine samples doped with acetic acid at concentrations equal to or greater than 2 g/L. PMID:23262483
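
    The feature-extraction pipeline the abstract outlines (PC scores of selected sensor response curves fed to an MLP) can be sketched as follows. This is an assumed reconstruction with random stand-in data, not the authors' code; sensor selection and curve segmentation are omitted.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(2)
    X = rng.normal(size=(80, 120))   # 80 samples x 120 points of one response curve
    y = rng.integers(0, 2, size=80)  # 1 = doped with acetic acid (stand-in labels)

    pca = PCA(n_components=5)
    features = pca.fit_transform(X)  # PC coefficients as the characteristic vector

    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    clf.fit(features, y)
    print(clf.score(features, y))    # training accuracy on the stand-in data
    ```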

  8. Lack of evidence for microplastic contamination in honey.

    PubMed

    Mühlschlegel, Peter; Hauk, Armin; Walter, Ulrich; Sieber, Robert

    2017-11-01

    Honey samples from Switzerland were investigated with regard to their microplastic particle burden. Five representative honey samples of different origin were processed following a standardized protocol to separate plastic-based microparticles from particles of natural origin, such as pollen, propolis, wax, and bee-related debris. The procedure was optimized to minimize post-sampling microplastic cross-contamination in the laboratory. The isolated microplastic particles were characterized and grouped by means of light microscopy as well as chemically characterized by microscopically coupled Raman and Fourier transform infrared spectroscopy. Five particle classes with an abundance significantly above blank levels were identified: black particles (particle count between 1760/kg and 8680/kg), white transparent fibres (particle count between 132/kg and 728/kg), white transparent particles (particle count between 60/kg and 172/kg), coloured fibres (particle count between 32/kg and 108/kg), and coloured particles (particle count between 8/kg and 64/kg). The black particles, which represented the majority of particles, were identified as char or soot and most probably originated from the use of smokers, a widespread practice in beekeeping. The majority of fibres were identified as cellulose or polyethylene terephthalate and were most likely of textile origin. In addition to these particle and fibre groups lower numbers of fragments were detected that were related to glass, polysaccharides or chitin, and few bluish particles contained copper phthalocyanine pigment. We found no indications that the honey samples were significantly contaminated with microplastic particles.

  9. Genetic analysis of haplotype data for 23 Y-chromosome short tandem repeat loci in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina.

    PubMed

    Dogan, Serkan; Primorac, Dragan; Marjanović, Damir

    2014-10-01

    To explore the distribution and polymorphisms of 23 short tandem repeat (STR) loci on the Y chromosome in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina and to investigate its genetic relationships with the homeland Turkish population and neighboring populations. This study included 100 healthy unrelated male individuals from the Turkish population living in Sarajevo. Buccal swab samples were collected as a DNA source. Genomic DNA was extracted using the salting out method and amplification was performed using the PowerPlex Y 23 amplification kit. The studied population was compared to other populations using pairwise genetic distances, which were represented with a multi-dimensional scaling plot. Haplotype and allele frequencies of the sample population were calculated and the results showed that all 100 samples had unique haplotypes. The most polymorphic locus was DYS458, and the least polymorphic DYS391. The observed haplotype diversity was 1.0000 ± 0.0014, with a discrimination capacity of 1.00 and a match probability of 0.01. Rst values showed that our sample population was closely related in both dimensions to the Lebanese and Iraqi populations, while it was more distant from the Bosnian, Croatian, and Macedonian populations. The Turkish population residing in Sarajevo can be regarded as a representative Turkish population, since our results were consistent with those previously published for the homeland Turkish population. Also, this study once again confirmed that geographically close populations are genetically more closely related to each other.
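
    The summary statistics reported here (match probability 0.01, discrimination capacity 1.00, haplotype diversity 1.0000) follow from simple frequency formulas; the sketch below computes them for a sample of 100 unique haplotypes. The formulas are the standard Nei estimators; the function name and toy data are ours.

    ```python
    from collections import Counter

    def haplotype_stats(haplotypes):
        """Match probability, discrimination capacity and Nei's
        haplotype diversity from a list of observed haplotypes."""
        n = len(haplotypes)
        counts = Counter(haplotypes).values()
        match_probability = sum((c / n) ** 2 for c in counts)
        discrimination_capacity = len(counts) / n
        diversity = (n / (n - 1)) * (1.0 - match_probability)
        return match_probability, discrimination_capacity, diversity

    # 100 samples, all unique, as reported: MP = 0.01, DC = 1.00, HD = 1.0
    print(haplotype_stats([f"hap{i}" for i in range(100)]))
    ```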

  10. Design of partially supervised classifiers for multispectral image data

    NASA Technical Reports Server (NTRS)

    Jeon, Byeungwoo; Landgrebe, David

    1993-01-01

    A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.

  11. Social Modulation or Hormonal Causation? Linkages of Testosterone with Sexual Activity and Relationship Quality in a Nationally Representative Longitudinal Sample of Older Adults.

    PubMed

    Das, Aniruddha; Sawin, Nicole

    2016-11-01

    This study used population-representative longitudinal data from the 2005-2006 and 2010-2011 waves of the National Social Life, Health and Aging Project (a probability sample of US adults aged 57-85 at baseline; N = 650 women and 620 men) to examine the causal direction in linkages of endogenous testosterone (T) with sexual activity and relationship quality. For both genders, our autoregressive effects indicated a large amount of temporal stability, not just in individual-level attributes (T, masturbation) but also dyadic ones (partnered sex, relationship quality), indicating a need for more nuanced theories of relational processes. Cross-lagged results suggested gender-specific effects, generally more consistent with sexual or relational modulation of T than with hormonal causation. Specifically, men's findings indicated their T might be elevated by their sexual (masturbatory) activity but not vice versa, although androgen levels did lower men's subsequent relationship quality. Women's T, in contrast, was negatively influenced not just by their higher relationship quality but also by their more frequent partnered sex, perhaps reflecting a changing function of sexual activity in late life.

  12. [Acceptance of lot sampling: its applicability to the evaluation of the primary care services portfolio].

    PubMed

    López-Picazo Ferrer, J

    2001-05-15

    To determine the applicability of lot quality assurance sampling (LQAS) to the primary care service portfolio, comparing its results with those given by classic evaluation. Compliance with the minimum technical norms (MTN) of the diabetic care service was evaluated through the classic methodology (confidence 95%, accuracy 5%, representativeness of area, sample of 376 histories) and by LQAS (confidence 95%, power 80%, representativeness of primary care team (PCT), defining a lot by MTN and PCT, sample of 13 histories/PCT). Effort, information obtained and its operative nature were assessed. 44 PCTs from Murcia Primary Care Region. Classic methodology: compliance with MTN ranged between 91.1% (diagnosis, 95% CI, 84.2-94.0) and 30% (visceral repercussion, 95% CI, 25.4-34.6). Objectives were reached in three MTN (diagnosis, history and EKG). LQAS: no MTN was accepted in all the PCTs, "01-Diagnosis" being the most accepted (42 PCT, 95.6%) and "07-Funduscopy" the least accepted (24 PCT, 55.6%). In 9 PCTs all were accepted (20.4%), and in 2 none were accepted (4.5%). Data were analysed through Pareto charts. The classic methodology offered accurate results but did not identify which centres failed to comply (general focus). LQAS was preferable for evaluating MTN and probably coverage because: 1) it uses small samples, which foment internal quality-improvement initiatives; 2) it is easy and rapid to execute; 3) it identifies the PCT and criteria where there is an opportunity for improvement (specific focus); and 4) it can be used operatively for monitoring.
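
    To see how an LQAS plan with 13 records per lot discriminates between compliant and non-compliant lots, the sketch below computes the binomial probability of accepting a lot as a function of the true per-record compliance. The decision threshold d is our assumption; the paper reports only n = 13, 95% confidence and 80% power.

    ```python
    from math import comb

    def accept_probability(p_compliant, n=13, d=3):
        """Probability of accepting a lot when each sampled record is
        compliant with probability p_compliant and at most d
        non-compliant records are tolerated."""
        return sum(
            comb(n, k) * (1.0 - p_compliant) ** k * p_compliant ** (n - k)
            for k in range(d + 1)
        )

    for p in (0.50, 0.70, 0.90):
        print(p, round(accept_probability(p), 3))  # ~0.05, ~0.50, ~0.97
    ```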

  13. Generalizability of findings from randomized controlled trials: application to the National Institute of Drug Abuse Clinical Trials Network.

    PubMed

    Susukida, Ryoko; Crum, Rosa M; Ebnesajjad, Cyrus; Stuart, Elizabeth A; Mojtabai, Ramin

    2017-07-01

    To compare randomized controlled trial (RCT) sample treatment effects with the population effects of substance use disorder (SUD) treatment. Statistical weighting was used to re-compute the effects from 10 RCTs such that the participants in the trials had characteristics that resembled those of patients in the target populations. Multi-site RCTs and usual SUD treatment settings in the United States. A total of 3592 patients in 10 RCTs and 1,602,226 patients from usual SUD treatment settings between 2001 and 2009. Three outcomes of SUD treatment were examined: retention, urine toxicology and abstinence. We weighted the RCT sample treatment effects using propensity scores representing the conditional probability of participating in RCTs. Weighting the samples changed the significance of estimated sample treatment effects. Most commonly, positive effects of trials became statistically non-significant after weighting (three trials for retention and urine toxicology and one trial for abstinence); also, non-significant effects became significantly positive (one trial for abstinence) and significantly negative effects became non-significant (two trials for abstinence). There was suggestive evidence of treatment effect heterogeneity in subgroups that are under- or over-represented in the trials, some of which were consistent with the differences in average treatment effects between weighted and unweighted results. The findings of randomized controlled trials (RCTs) for substance use disorder treatment do not appear to be directly generalizable to target populations when the RCT samples do not reflect adequately the target populations and there is treatment effect heterogeneity across patient subgroups. © 2017 Society for the Study of Addiction.
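
    The reweighting step described above can be sketched with standard tools: fit a model for the probability of trial participation given patient characteristics, then weight trial participants by the inverse odds of participation so the weighted sample resembles the target population. The code below is an illustrative sketch with simulated data and a made-up effect size, not the study's analysis.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)

    # Stacked data: trial participants (S = 1) and target-population patients (S = 0)
    X = rng.normal(size=(6000, 4))            # patient characteristics
    S = (rng.random(6000) < 0.2).astype(int)  # trial membership indicator

    model = LogisticRegression().fit(X, S)
    p = model.predict_proba(X)[:, 1]          # P(in trial | characteristics)

    trial = S == 1
    w = (1 - p[trial]) / p[trial]             # inverse-odds-of-participation weights

    # Weighted difference in outcomes between trial arms (hypothetical outcome)
    arm = rng.integers(0, 2, size=trial.sum())
    y = rng.normal(0.3 * arm)                 # outcome with a true effect of 0.3
    effect = (np.average(y[arm == 1], weights=w[arm == 1])
              - np.average(y[arm == 0], weights=w[arm == 0]))
    print(effect)
    ```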

  14. Inherent limitations of probabilistic models for protein-DNA binding specificity

    PubMed Central

    Ruan, Shuxiang

    2017-01-01

    The specificities of transcription factors are most commonly represented with probabilistic models. These models provide a probability for each base occurring at each position within the binding site and the positions are assumed to contribute independently. The model is simple and intuitive and is the basis for many motif discovery algorithms. However, the model also has inherent limitations that prevent it from accurately representing true binding probabilities, especially for the highest affinity sites under conditions of high protein concentration. The limitations are not due to the assumption of independence between positions but rather are caused by the non-linear relationship between binding affinity and binding probability and the fact that independent normalization at each position skews the site probabilities. Generally probabilistic models are reasonably good approximations, but new high-throughput methods allow for biophysical models with increased accuracy that should be used whenever possible. PMID:28686588
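
    The nonlinearity argument can be made concrete with a toy example: a PWM assigns each site a probability that is a product over positions, while a biophysical occupancy saturates for the strongest sites at high protein concentration. The sketch below contrasts the two; the matrix, pseudo-energies and chemical potential are illustrative assumptions, not values from the paper.

    ```python
    import numpy as np

    pwm = np.array([          # P(base at position); columns are A, C, G, T
        [0.7, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.7, 0.1],
        [0.1, 0.7, 0.1, 0.1],
    ])
    base_index = {"A": 0, "C": 1, "G": 2, "T": 3}

    def pwm_probability(site):
        """Product of per-position base probabilities (independence assumed)."""
        return float(np.prod([pwm[i, base_index[b]] for i, b in enumerate(site)]))

    def binding_probability(site, mu=2.0):
        """Logistic (Fermi-Dirac style) occupancy: nonlinear in the PWM score
        and saturating for the strongest sites; mu stands in for protein
        concentration."""
        energy = -np.log(pwm_probability(site))  # pseudo binding energy
        return 1.0 / (1.0 + np.exp(energy - mu))

    for site in ("AGC", "AGT", "TTA"):
        print(site, pwm_probability(site), round(binding_probability(site), 3))
    ```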

  15. Lunar Meteorites and Implications for Compositional Remote Sensing of the Lunar Surface

    NASA Technical Reports Server (NTRS)

    Korotev, R. L.

    1999-01-01

    Lunar meteorites (LMs) are rocks found on Earth that were ejected from the Moon by impact of an asteroidal meteoroid. Three factors make the LMs important to remote-sensing studies: (1) Most are breccias composed of regolith or fragmental material; (2) all are rocks that resided (or breccias composed of material that resided) in the upper few meters of the Moon prior to launch; and (3) most apparently come from areas distant from the Apollo sites. How Many Lunar Locations? At this writing (June 1999), there are 18 known lunar meteorite specimens. When unambiguous cases of terrestrial pairing are considered, the number of actual LMs reduces to 13. (Terrestrial pairing is when a single piece of lunar rock entered Earth's atmosphere, but multiple fragments were produced because the meteoroid broke apart on entry, upon hitting the ground or ice, or while being transported through the ice.) We have no reason to believe that LMs preferentially derive from any specific region(s) of the Moon; i.e., we believe that they are samples from random locations. However, we do not know how many different locations are represented by the LMs; mathematically, it could be as few as 1 or as many as 13. The actual maximum is < 13 because in some cases a single impact appears to have yielded more than one LM. Yamato 793169 and Asuka 881757 are considered "source-crater paired" or "launch paired" because they are compositionally and petrographically similar to each other and distinct from the others, and both have similar cosmic-ray exposure (CRE) histories. The same can be said of QUE 94281 and Y 793274. Thus the 13 meteorites probably represent a maximum of 11 locations on the Moon. The minimum number of likely source craters is debated and in flux as new data for different isotopic systems are obtained. Conservatively, considering CRE data only, a minimum of about 5 impacts is required. Compositional and petrographic data offer only probabilistic constraints. An extreme, but not unreasonable, viewpoint is that such data offer no constraint. For example, if one were to cut up the Apollo 17 landing site (which was selected for its diversity) into softball-sized pieces, some of those pieces (e.g., sample 70135) would be crystalline mare basalts like Y 793169 whereas others (e.g., sample 73131) would be feldspathic regolith breccias like MAC 88104/88105. However, nature is not so devious. Warren argues that LMs come from craters of only a few kilometers in diameter. If so, even though CRE data allow, for example, that ALHA 81005 and Y 791197 were launched simultaneously from the same crater, the probability is nevertheless low because the two meteorites are compositionally and mineralogically distinct. Thus, within the allowed range (5-11) for the number of locations represented by the LMs, values at the high end of the range are probably more likely. Mare Meteorites: Three LMs consist almost entirely of mare basalt. Two, Y 793169 and Asuka 881757, are unbrecciated, low-Ti, crystalline rocks that are compositionally and mineralogically similar (but not identical) to each other; they probably derive from a single lunar-mare location. The third, EET 87521/96008, is a fragmental breccia consisting predominantly of VLT mare basalt. Thus, these LMs probably represent only two lunar mare locations. The basaltic LMs have mineral and bulk compositions distinct from Apollo mare basalts.
The petrography of Calcalong Creek has not been described in detail, but compositionally it is unique in that it corresponds to a mixture (breccia) of about one-half feldspathic material (i.e., the mean composition of the feldspathic lunar meteorites, below), one-fourth KREEP norite, one-fourth VLT mare basalt (like EET 87521), and 1% CI chondrite. With 4 µg/g Th and correspondingly high concentrations of other incompatible elements, it is the only lunar meteorite that is likely to have come from within the Procellarum KREEP Terrane (PKT). Yamato 793274 and QUE 94281 are together distinct in being fragmental breccias containing subequal parts of feldspathic highland material and VLT mare basalt. Jolliff et al. estimate a mare-to-highland ratio of 54:46 for QUE 94281 and 62:38 for Y 793274; this difference is well within the range observed for soils collected only centimeters apart (in cores) at interface sites like Apollo 15 and 17 [11]. Although the two meteorites were found on opposite sides of Antarctica, they are probably launch-paired. The strongest evidence is that the pyroclastic glass spherules that occur in both are of two compositional groups and the two groups are essentially the same in both meteorites. Yamato 791197 is nominally a feldspathic lunar meteorite (below), but among FLMs, it probably contains the highest abundance of clasts and glasses of mare derivation. As a consequence, its composition is at the high-Fe, low-Mg end of the range for FLMs and is not included in the FLM average of Table 1. Its composition is consistent with about 10% mare-derived material. Similarly, the two small (Y 82) pieces of Y 82192/82193/86032 are more mafic than the large (Y 86) piece, probably as a result of about 7% mare-derived material. All Apollo missions went to areas in or near the PKT, and, consequently, all Apollo regolith samples are contaminated with Th-rich material from the PKT. At the nominally "typical" highland site, Apollo 16, about 30% of the regolith (<1-mm fines) is Th-rich ejecta from the Imbrium impact and about 6% is mare material probably derived from mare basins. Thus Apollo 16 regolith is not typical of the highlands. Among Apollo rocks, the compositions of the FLMs correspond most closely to the feldspathic granulitic breccias of Apollo 16 and 17. (Additional information is contained in original)

  16. Noble Gases in Iddingsite from the Lafayette Meteorite: Evidence for Liquid Water on Mars in the Last Few Hundred Million Years

    NASA Technical Reports Server (NTRS)

    Swindle, T. D.; Treiman, A. H.; Lindstrom, D. J.; Brkland, M. K.; Cohen, B. A.; Grier, J. A.; Li, B.; Olson, E. K.

    2000-01-01

    We analyzed noble gases from 18 samples of weathering products ("iddingsite") from the Lafayette meteorite. Potassium-argon ages of 12 samples range from near zero to 670 +/- 91 Ma. These ages confirm the martian origin of the iddingsite, but it is not clear whether any or all of the ages represent iddingsite formation as opposed to later alteration or incorporation of martian atmospheric Ar-40. In any case, because iddingsite formation requires liquid water, these data require the presence of liquid water near the surface of Mars at least as recently as 1300 Ma ago, and probably as recently as 650 Ma ago. Krypton and Xe analysis of a single 34 µg sample indicates the presence of fractionated martian atmosphere within the iddingsite, which also confirms its martian origin. The mechanism of incorporation could be either interaction with liquid water during iddingsite formation or shock implantation of adsorbed atmospheric gas.

  17. Presence of polychlorinated biphenyls (PCBs) in bottled drinking water in Mexico City.

    PubMed

    Salinas, Rutilio Ortiz; Bermudez, Beatriz Schettino; Tolentino, Rey Gutiérrez; Gonzalez, Gilberto Díaz; Vega y León, Salvador

    2010-10-01

    This paper describes the concentrations of seven polychlorinated biphenyls (PCBs) in bottled drinking water samples that were collected over 1 year in Mexico City in two sizes (1.5 and 19 L), using gas chromatography with an electron capture detector. PCBs 28 (0.018-0.042 μg/L), 52 (0.006-0.015 μg/L) and 101 (0.001-0.039 μg/L) were the most commonly found and were present in the majority of the samples. However, total concentrations of PCBs in bottled drinking water (0.035-0.039 μg/L) were below the maximum permissible level of 0.50 μg/L stated in Mexican regulations and probably do not represent a hazard to human health. PCBs were detectable in all samples, and we recommend that a monitoring program be established to better understand the quality of bottled drinking water over time; this may help in producing solutions for reducing the presence of organic contaminants.

  18. Watershed-based survey designs

    USGS Publications Warehouse

    Detenbeck, N.E.; Cincotta, D.; Denver, J.M.; Greenlee, S.K.; Olsen, A.R.; Pitchford, A.M.

    2005-01-01

    Watershed-based sampling design and assessment tools help serve the multiple goals for water quality monitoring required under the Clean Water Act, including assessment of regional conditions to meet Section 305(b), identification of impaired water bodies or watersheds to meet Section 303(d), and development of empirical relationships between causes or sources of impairment and biological responses. Creation of GIS databases for hydrography, hydrologically corrected digital elevation models, and hydrologic derivatives such as watershed boundaries and upstream–downstream topology of subcatchments would provide a consistent seamless nationwide framework for these designs. The elements of a watershed-based sample framework can be represented either as a continuous infinite set defined by points along a linear stream network, or as a discrete set of watershed polygons. Watershed-based designs can be developed with existing probabilistic survey methods, including the use of unequal probability weighting, stratification, and two-stage frames for sampling. Case studies for monitoring of Atlantic Coastal Plain streams, West Virginia wadeable streams, and coastal Oregon streams illustrate three different approaches for selecting sites for watershed-based survey designs.

  19. A comparison of exact tests for trend with binary endpoints using Bartholomew's statistic.

    PubMed

    Consiglio, J D; Shan, G; Wilding, G E

    2014-01-01

    Tests for trend are important in a number of scientific fields when trends associated with binary variables are of interest. Implementing the standard Cochran-Armitage trend test requires an arbitrary choice of scores assigned to represent the grouping variable. Bartholomew proposed a test for qualitatively ordered samples using asymptotic critical values, but type I error control can be problematic in finite samples. To our knowledge, use of the exact probability distribution has not been explored, and we study its use in the present paper. Specifically we consider an approach based on conditioning on both sets of marginal totals and three unconditional approaches where only the marginal totals corresponding to the group sample sizes are treated as fixed. While slightly conservative, all four tests are guaranteed to have actual type I error rates below the nominal level. The unconditional tests are found to exhibit far less conservatism than the conditional test and thereby gain a power advantage.

  20. Approach for validating actinide and fission product compositions for burnup credit criticality safety analyses

    DOE PAGES

    Radulescu, Georgeta; Gauld, Ian C.; Ilas, Germina; ...

    2014-11-01

    This paper describes a depletion code validation approach for criticality safety analysis using burnup credit for actinide and fission product nuclides in spent nuclear fuel (SNF) compositions. The technical basis for determining the uncertainties in the calculated nuclide concentrations is comparison of calculations to available measurements obtained from destructive radiochemical assay of SNF samples. Probability distributions developed for the uncertainties in the calculated nuclide concentrations were applied to the SNF compositions of a criticality safety analysis model by the use of a Monte Carlo uncertainty sampling method to determine bias and bias uncertainty in the effective neutron multiplication factor. Application of the Monte Carlo uncertainty sampling approach is demonstrated for representative criticality safety analysis models of pressurized water reactor spent fuel pool storage racks and transportation packages using burnup-dependent nuclide concentrations calculated with SCALE 6.1 and the ENDF/B-VII nuclear data. Furthermore, the validation approach and results support a recent revision of the U.S. Nuclear Regulatory Commission Interim Staff Guidance 8.

  1. Serological survey on Leptospira infection in slaughtered swine in North-Central Italy.

    PubMed

    Bertelloni, F; Turchi, B; Vattiata, E; Viola, P; Pardini, S; Cerri, D; Fratini, F

    2018-05-30

    Swine can act as asymptomatic carriers of some Leptospira serovars. In this study, 1194 sera from 61 farms located in five different regions of North-West Italy were collected from slaughtered healthy pigs, and the presence of antibodies against four Leptospira serovars was evaluated. Overall, 52.5% of the analysed farms presented at least one positive animal and 34.4% presented at least one positive swine with a titre ⩾1:400. Of the sera, 16.6% were positive and 5.9% presented a positive titre ⩾1:400. Tuscany and Lombardy showed the highest percentages of positive farms (64.3% and 54.6%, respectively) and sera (28.5% and 13.3%, respectively), probably due to environmental conditions and potential risk factors which promote the maintenance and spreading of Leptospira in these areas. The most represented serogroups were Australis (21.3% positive farms, 8.2% positive sera) and Pomona (18.0% positive farms, 8.1% positive sera). In swine, these serogroups are the most frequently detected worldwide; however, our results seem to highlight a re-emergence of serogroup Pomona in pigs in the investigated areas. A low percentage of sera (0.6%) scored positive to Canicola, leaving an open question about the role of pigs in the epidemiology of this serovar. Higher antibody titres were detected for serogroups Australis and Pomona. Swine leptospirosis is probably underestimated in Italy and could represent a potential risk for animal and human health.

  2. What Can Quantum Optics Say about Computational Complexity Theory?

    NASA Astrophysics Data System (ADS)

    Rahimi-Keshari, Saleh; Lund, Austin P.; Ralph, Timothy C.

    2015-02-01

    Considering the problem of sampling from the output photon-counting probability distribution of a linear-optical network for input Gaussian states, we obtain results that are of interest from the point of view of both quantum theory and computational complexity theory. We derive a general formula for calculating the output probabilities, and by considering input thermal states, we show that the output probabilities are proportional to permanents of positive-semidefinite Hermitian matrices. It is believed that approximating permanents of complex matrices in general is a #P-hard problem. However, we show that these permanents can be approximated with an algorithm in the BPP^NP complexity class, as there exists an efficient classical algorithm for sampling from the output probability distribution. We further consider input squeezed-vacuum states and discuss the complexity of sampling from the probability distribution at the output.

  3. Repeat migration and disappointment.

    PubMed

    Grant, E K; Vanderkamp, J

    1986-01-01

    This article investigates the determinants of repeat migration among the 44 regions of Canada, using information from a large micro-database which spans the period 1968 to 1971. The explanation of repeat migration probabilities is a difficult task, and this attempt is only partly successful. Many of the explanatory variables are not significant, and the overall explanatory power of the equations is not high. In the area of personal characteristics, the variables related to age, sex, and marital status are generally significant and with expected signs. The distance variable has a strongly positive effect on onward move probabilities. Variables related to prior migration experience have an important impact that differs between return and onward probabilities. In particular, the occurrence of prior moves has a striking effect on the probability of onward migration. The variable representing disappointment, or relative success of the initial move, plays a significant role in explaining repeat migration probabilities. The disappointment variable represents the ratio of actual versus expected wage income in the year after the initial move, and its effect on both return and onward migration probabilities is always negative and almost always highly significant. The repeat probabilities diminish after a year's stay in the destination region, but disappointment in the most recent year still has a bearing on the delayed repeat probabilities. While the quantitative impact of the disappointment variable is not large, it is difficult to draw comparisons since similar estimates are not available elsewhere.

  4. Public Attitudes toward Stuttering in Turkey: Probability versus Convenience Sampling

    ERIC Educational Resources Information Center

    Ozdemir, R. Sertan; St. Louis, Kenneth O.; Topbas, Seyhun

    2011-01-01

    Purpose: A Turkish translation of the "Public Opinion Survey of Human Attributes-Stuttering" ("POSHA-S") was used to compare probability versus convenience sampling to measure public attitudes toward stuttering. Method: A convenience sample of adults in Eskisehir, Turkey was compared with two replicates of a school-based,…

  5. Edge Effects in Line Intersect Sampling With

    Treesearch

    David L. R. Affleck; Timothy G. Gregoire; Harry T. Valentine

    2005-01-01

    Transects consisting of multiple, connected segments with a prescribed configuration are commonly used in ecological applications of line intersect sampling. The transect configuration has implications for the probability with which population elements are selected and for how the selection probabilities can be modified by the boundary of the tract being sampled. As...

  6. Estimating total suspended sediment yield with probability sampling

    Treesearch

    Robert B. Thomas

    1985-01-01

    The ""Selection At List Time"" (SALT) scheme controls sampling of concentration for estimating total suspended sediment yield. The probability of taking a sample is proportional to its estimated contribution to total suspended sediment discharge. This procedure gives unbiased estimates of total suspended sediment yield and the variance of the...

  7. Statistical characterization of a large geochemical database and effect of sample size

    USGS Publications Warehouse

    Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.

    2005-01-01

    The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompasses 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States), and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. Part of the reason relates to the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively smaller numbers of data points showed that few elements passed standard statistical tests for normality or log-normality until sample size decreased to a few hundred data points. Large sample size enhances the power of statistical tests, and leads to rejection of most statistical hypotheses for real data sets. For large sample sizes (e.g., n > 1000), graphical methods such as histogram, stem-and-leaf, and probability plots are recommended for rough judgement of the probability distribution if needed. © 2005 Elsevier Ltd. All rights reserved.
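
    The Q-Q diagnostic described above is easy to reproduce: plot ordered log concentrations against normal quantiles and look for kinks that mark subpopulations. The sketch below does this for a simulated mixture of a background population and a small enriched subpopulation; the data are illustrative, not NGS values.

    ```python
    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(5)
    conc = np.concatenate([
        rng.lognormal(1.0, 0.5, size=950),   # background population
        rng.lognormal(3.5, 0.4, size=50),    # small enriched subpopulation
    ])

    # A kink in this lognormal Q-Q plot separates the two subpopulations
    stats.probplot(np.log(conc), dist="norm", plot=plt)
    plt.ylabel("log concentration")
    plt.show()
    ```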

  8. Using multilevel spatial models to understand salamander site occupancy patterns after wildfire

    USGS Publications Warehouse

    Chelgren, Nathan; Adams, Michael J.; Bailey, Larissa L.; Bury, R. Bruce

    2011-01-01

    Studies of the distribution of elusive forest wildlife have suffered from the confounding of true presence with the uncertainty of detection. Occupancy modeling, which incorporates probabilities of species detection conditional on presence, is an emerging approach for reducing observation bias. However, the current likelihood modeling framework is restrictive for handling unexplained sources of variation in the response that may occur when there are dependence structures such as smaller sampling units that are nested within larger sampling units. We used multilevel Bayesian occupancy modeling to handle dependence structures and to partition sources of variation in occupancy of sites by terrestrial salamanders (family Plethodontidae) within and surrounding an earlier wildfire in western Oregon, USA. Comparison of model fit favored a spatial N-mixture model that accounted for variation in salamander abundance over models that were based on binary detection/non-detection data. Though catch per unit effort was higher in burned areas than unburned, there was strong support that this pattern was due to a higher probability of capture for individuals in burned plots. Within the burn, the odds of capturing an individual given it was present were 2.06 times the odds outside the burn, reflecting reduced complexity of ground cover in the burn. There was weak support that true occupancy was lower within the burned area. While the odds of occupancy in the burn were 0.49 times the odds outside the burn among the five species, the magnitude of variation attributed to the burn was small in comparison to variation attributed to other landscape variables and to unexplained, spatially autocorrelated random variation. While ordinary occupancy models may separate the biological pattern of interest from variation in detection probability when all sources of variation are known, the addition of random effects structures for unexplained sources of variation in occupancy and detection probability may often more appropriately represent levels of uncertainty. © 2011 by the Ecological Society of America.

  9. Performance evaluation of an importance sampling technique in a Jackson network

    NASA Astrophysics Data System (ADS)

    brahim Mahdipour, E.; Masoud Rahmani, Amir; Setayeshi, Saeed

    2014-03-01

    Importance sampling is a technique that is commonly used to speed up Monte Carlo simulation of rare events. However, little is known regarding the design of efficient importance sampling algorithms in the context of queueing networks. The standard approach, which simulates the system using an a priori fixed change of measure suggested by large deviation analysis, has been shown to fail in even the simplest network settings. Estimating probabilities associated with rare events has been a topic of great importance in queueing theory, and in applied probability at large. In this article, we analyse the performance of an importance sampling estimator for a rare event probability in a Jackson network. The article considers strict deadlines in a two-node Jackson network with feedback whose arrival and service rates are modulated by an exogenous finite state Markov process. We have estimated the probability of network blocking for various sets of parameters, and also the probability of missing the deadline of customers for different loads and deadlines. We have finally shown that the probability of total population overflow may be affected by various deadline values, service rates and arrival rates.

  10. Setting Time Limits on Tests

    ERIC Educational Resources Information Center

    van der Linden, Wim J.

    2011-01-01

    It is shown how the time limit on a test can be set to control the probability of a test taker running out of time before completing it. The probability is derived from the item parameters in the lognormal model for response times. Examples of curves representing the probability of running out of time on a test with given parameters as a function…
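
    Under the lognormal response-time model, the probability of running out of time is the probability that the sum of lognormal item times exceeds the limit; this has no closed form but is easy to simulate. The sketch below uses van der Linden's parameterization (log T_ij ~ Normal(beta_j − tau_i, 1/alpha_j^2)); all item parameters and the limit are made-up assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    def p_run_out_of_time(alphas, betas, tau, limit, n_sim=100_000):
        """Monte Carlo estimate of P(total response time > limit) for a
        test taker with speed tau, items with discriminations alphas and
        time intensities betas, and a given time limit (same units)."""
        draws = rng.lognormal(
            mean=np.subtract(betas, tau),
            sigma=1.0 / np.asarray(alphas),
            size=(n_sim, len(betas)),
        )
        return (draws.sum(axis=1) > limit).mean()

    alphas = np.full(40, 2.0)           # hypothetical item discriminations
    betas = np.full(40, np.log(60.0))   # ~60 s typical time per item
    print(p_run_out_of_time(alphas, betas, tau=0.0, limit=45 * 60))
    ```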

  11. [Exploration of the concept of genetic drift in genetics teaching of undergraduates].

    PubMed

    Wang, Chun-ming

    2016-01-01

    Genetic drift is one of the difficult topics in teaching genetics, because its randomness and probability easily cause conceptual misunderstanding. The "sampling error" in its definition is often misunderstood as the error introduced by the research method of "sampling", i.e., as something that disturbs the results, rather than as the random change in allele frequency it actually denotes. I analyzed and compared the definitions of genetic drift in domestic and international genetics textbooks and found that definitions containing "sampling error" are widely adopted but are interpreted correctly in only a few textbooks. Here, the history of research on genetic drift, i.e., the contributions of Wright, Fisher and Kimura, is introduced. Moreover, I describe two recently published representative articles on teaching genetic drift to undergraduates, which point out that misconceptions are inevitable for undergraduates during the studying process and also provide a preliminary solution. Combined with my own teaching practice, I suggest that the definition of genetic drift containing "sampling error" can be adopted with further interpretation: "sampling error" refers to the random sampling among gametes that generates the next generation of alleles, which is equivalent to a random draw from the pool of all gametes participating in mating, and has no relationship with artificial sampling in general genetics studies. This article may provide some help in genetics teaching.
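
    The gamete-sampling interpretation can be demonstrated with a short Wright-Fisher simulation: each generation, the 2N allele copies of the next generation are a binomial draw from the current gamete pool, so the allele frequency wanders with no selection at all. The parameters below are arbitrary illustrations.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def wright_fisher(p0=0.5, n_individuals=50, generations=100):
        """Trajectory of an allele frequency under pure drift."""
        freqs = [p0]
        p = p0
        for _ in range(generations):
            # "Sampling error": the next generation's 2N allele copies are
            # a random sample from the current gamete pool.
            p = rng.binomial(2 * n_individuals, p) / (2 * n_individuals)
            freqs.append(p)
        return freqs

    print(wright_fisher()[-1])  # frequency after 100 generations (may fix at 0 or 1)
    ```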

  12. Probability mass first flush evaluation for combined sewer discharges.

    PubMed

    Park, Inhyeok; Kim, Hongmyeong; Chae, Soo-Kwon; Ha, Sungryong

    2010-01-01

    The Korean government has put a lot of effort into constructing sanitation facilities for controlling non-point source pollution, of which the first flush phenomenon is a prime example. However, to date, several serious problems have arisen in the operation and treatment effectiveness of these facilities due to unsuitable design flow volumes and pollution loads. It is difficult to assess the optimal flow volume and pollution mass when considering both monetary and temporal limitations. The objective of this article was to characterize the discharge of storm runoff pollution from urban catchments in Korea and to estimate the probability of mass first flush (MFFn) using the storm water management model and probability density functions. A review of the storms gauged during the last two years, using probability density functions of rainfall volume, found all gauged storms to be valid representative precipitation events. Both the observed MFFn and the probability MFFn in BE-1 denoted similarly large magnitudes of first flush, with roughly 40% of the total pollution mass contained in the first 20% of the runoff. In the case of BE-2, however, there were significant differences between the observed MFFn and the probability MFFn.

  13. A re-evaluation of a case-control model with contaminated controls for resource selection studies

    Treesearch

    Christopher T. Rota; Joshua J. Millspaugh; Dylan C. Kesler; Chad P. Lehman; Mark A. Rumble; Catherine M. B. Jachowski

    2013-01-01

    A common sampling design in resource selection studies involves measuring resource attributes at sample units used by an animal and at sample units considered available for use. Few models can estimate the absolute probability of using a sample unit from such data, but such approaches are generally preferred over statistical methods that estimate a relative probability...

  14. 77 FR 15376 - State Median Income Estimates for a Four-Person Household: Notice of the Federal Fiscal Year (FFY...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-15

    ... contact the Census Bureau's Social, Economic and Housing Statistics Division at (301) 763-3243. Under the... the use of probability sampling to create the sample. For additional information about the accuracy of... consists of the error that arises from the use of probability sampling to create the sample. These...

  15. 75 FR 26780 - State Median Income Estimate for a Four-Person Family: Notice of the Federal Fiscal Year (FFY...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-12

    ... Household Economic Statistics Division at (301) 763-3243. Under the advice of the Census Bureau, HHS..., which consists of the error that arises from the use of probability sampling to create the sample. For...) Sampling Error, which consists of the error that arises from the use of probability sampling to create the...

  16. Understanding environmental DNA detection probabilities: A case study using a stream-dwelling char Salvelinus fontinalis

    USGS Publications Warehouse

    Wilcox, Taylor M; Mckelvey, Kevin S.; Young, Michael K.; Sepulveda, Adam; Shepard, Bradley B.; Jane, Stephen F; Whiteley, Andrew R.; Lowe, Winsor H.; Schwartz, Michael K.

    2016-01-01

    Environmental DNA sampling (eDNA) has emerged as a powerful tool for detecting aquatic animals. Previous research suggests that eDNA methods are substantially more sensitive than traditional sampling. However, the factors influencing eDNA detection and the resulting sampling costs are still not well understood. Here we use multiple experiments to derive independent estimates of eDNA production rates and downstream persistence from brook trout (Salvelinus fontinalis) in streams. We use these estimates to parameterize models comparing the false negative detection rates of eDNA sampling and traditional backpack electrofishing. We find that using the protocols in this study eDNA had reasonable detection probabilities at extremely low animal densities (e.g., probability of detection 0.18 at densities of one fish per stream kilometer) and very high detection probabilities at population-level densities (e.g., probability of detection > 0.99 at densities of ≥ 3 fish per 100 m). This is substantially more sensitive than traditional electrofishing for determining the presence of brook trout and may translate into important cost savings when animals are rare. Our findings are consistent with a growing body of literature showing that eDNA sampling is a powerful tool for the detection of aquatic species, particularly those that are rare and difficult to sample using traditional methods.

  17. The National Children's Study: Recruitment Outcomes Using the Provider-Based Recruitment Approach.

    PubMed

    Hale, Daniel E; Wyatt, Sharon B; Buka, Stephen; Cherry, Debra; Cislo, Kendall K; Dudley, Donald J; McElfish, Pearl Anna; Norman, Gwendolyn S; Reynolds, Simone A; Siega-Riz, Anna Maria; Wadlinger, Sandra; Walker, Cheryl K; Robbins, James M

    2016-06-01

    In 2009, the National Children's Study (NCS) Vanguard Study tested the feasibility of household-based recruitment and participant enrollment using a birth-rate probability sample. In 2010, the NCS Program Office launched 3 additional recruitment approaches. We tested whether provider-based recruitment could improve recruitment outcomes compared with household-based recruitment. The NCS aimed to recruit 18- to 49-year-old women who were pregnant or at risk for becoming pregnant who lived in designated geographic segments within primary sampling units, generally counties. Using provider-based recruitment, 10 study centers engaged providers to enroll eligible participants at their practice. Recruitment models used different levels of provider engagement (full, intermediate, information-only). The percentage of eligible women per county ranged from 1.5% to 57.3%. Across the centers, 3371 potential participants were approached for screening, 3459 (92%) were screened and 1479 were eligible (43%). Of those 1181 (80.0%) gave consent and 1008 (94%) were retained until delivery. Recruited participants were generally representative of the county population. Provider-based recruitment was successful in recruiting NCS participants. Challenges included time-intensity of engaging the clinical practices, differential willingness of providers to participate, and necessary reliance on providers for participant identification. The vast majority of practices cooperated to some degree. Recruitment from obstetric practices is an effective means of obtaining a representative sample. Copyright © 2016 by the American Academy of Pediatrics.

  18. The National Children’s Study: Recruitment Outcomes Using the Provider-Based Recruitment Approach

    PubMed Central

    Wyatt, Sharon B.; Buka, Stephen; Cherry, Debra; Cislo, Kendall K.; Dudley, Donald J.; McElfish, Pearl Anna; Norman, Gwendolyn S.; Reynolds, Simone A.; Siega-Riz, Anna Maria; Wadlinger, Sandra; Walker, Cheryl K.; Robbins, James M.

    2016-01-01

    OBJECTIVE: In 2009, the National Children’s Study (NCS) Vanguard Study tested the feasibility of household-based recruitment and participant enrollment using a birth-rate probability sample. In 2010, the NCS Program Office launched 3 additional recruitment approaches. We tested whether provider-based recruitment could improve recruitment outcomes compared with household-based recruitment. METHODS: The NCS aimed to recruit 18- to 49-year-old women who were pregnant or at risk for becoming pregnant and who lived in designated geographic segments within primary sampling units, generally counties. Using provider-based recruitment, 10 study centers engaged providers to enroll eligible participants at their practice. Recruitment models used different levels of provider engagement (full, intermediate, information-only). RESULTS: The percentage of eligible women per county ranged from 1.5% to 57.3%. Across the centers, 3371 potential participants were approached for screening, 3459 (92%) were screened, and 1479 (43%) were eligible. Of those, 1181 (80.0%) gave consent and 1008 (94%) were retained until delivery. Recruited participants were generally representative of the county population. CONCLUSIONS: Provider-based recruitment was successful in recruiting NCS participants. Challenges included the time-intensity of engaging the clinical practices, differential willingness of providers to participate, and necessary reliance on providers for participant identification. The vast majority of practices cooperated to some degree. Recruitment from obstetric practices is an effective means of obtaining a representative sample. PMID:27251870

  19. First HIV prevalence estimates of a representative sample of adult sub-Saharan African migrants in a European city. Results of a community-based, cross-sectional study in Antwerp, Belgium.

    PubMed

    Loos, Jasna; Nöstlinger, Christiana; Vuylsteke, Bea; Deblonde, Jessika; Ndungu, Morgan; Kint, Ilse; Manirankunda, Lazare; Reyniers, Thijs; Adobea, Dorothy; Laga, Marie; Colebunders, Robert

    2017-01-01

    While sub-Saharan African migrants are the second largest group affected by HIV in Europe, sound HIV prevalence estimates based on representative samples of these heterogeneous communities are lacking. Such data are needed to inform prevention and public health policy. This community-based, cross-sectional study combined oral fluid HIV testing with an electronic behavioral survey. Using two-stage time-location sampling, HIV prevalence estimates for a representative sample of adult sub-Saharan African migrants in Antwerp, Belgium were obtained. Sample proportions and estimated adjusted population proportions were calculated for all variables. Univariable and multivariable logistic regression analysis explored factors independently associated with HIV infection. Between December 2013 and October 2014, 744 sub-Saharan African migrants were included (37% women). A substantial proportion was socially, legally and economically vulnerable: 21% were probably of undocumented status, 63% had financial problems in the last year and 9% lacked stable housing. Sexual networks were mostly African and crossed national borders, i.e., sexual encounters during travel within Europe and Africa. Concurrency was common: 34% of those in a stable relationship had a partner on the side in the last year. HIV prevalence was 5.9% (95% CI: 3.4%-10.1%) among women and 4.2% (95% CI: 1.6%-10.6%) among men. Although high lifetime HIV testing was reported at community level (73%), 65.2% (95% CI: 32.4%-88.0%) of HIV-infected sub-Saharan African migrants were possibly undiagnosed. Being 45 years or older, unprotected sex when travelling within Europe in the last year, high intentions to use condoms, being unaware of their last sexual partners' HIV status, recent HIV testing and not having encountered partner violence in the last year were independently associated with HIV infection in multivariable logistic regression. In univariable analysis, HIV infection was additionally associated with unemployment. This is the first HIV prevalence study among adult sub-Saharan African migrants resettling in a European city based on a representative sample. HIV prevalence was high and could potentially increase further due to the high number of people with an undiagnosed HIV infection, social vulnerability, high levels of concurrency and mainly African sexual networks. Given this population's mobility, an aligned European combination prevention approach addressing these determinants is urgently needed.

  20. Generation of intervention strategy for a genetic regulatory network represented by a family of Markov Chains.

    PubMed

    Berlow, Noah; Pal, Ranadip

    2011-01-01

    Genetic Regulatory Networks (GRNs) are frequently modeled as Markov Chains providing the transition probabilities of moving from one state of the network to another. The inverse problem of inference of the Markov Chain from noisy and limited experimental data is ill-posed and often generates multiple model possibilities instead of a unique one. In this article, we address the issue of intervention in a genetic regulatory network represented by a family of Markov Chains. The purpose of intervention is to alter the steady state probability distribution of the GRN, as the steady states are considered to be representative of the phenotypes. We consider robust stationary control policies with best expected behavior. The extreme computational complexity involved in the search of robust stationary control policies is mitigated by using a sequential approach to control policy generation and utilizing computationally efficient techniques for updating the stationary probability distribution of a Markov chain following a rank one perturbation.
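
    As context for the steady-state objective, the sketch below computes the stationary distribution of a toy three-state chain and shows how replacing one state's transition law (a rank-one change to the transition matrix) shifts that distribution. It recomputes the distribution with a direct linear solve; the efficient rank-one update formulas referenced in the abstract are not reproduced here, and the matrices are invented:

        # Sketch: stationary distribution pi of a row-stochastic matrix P,
        # solved from (P^T - I) pi = 0 with the normalization sum(pi) = 1.
        import numpy as np

        def stationary(P: np.ndarray) -> np.ndarray:
            n = P.shape[0]
            A = np.vstack([P.T - np.eye(n), np.ones(n)])  # append normalization row
            b = np.zeros(n + 1)
            b[-1] = 1.0
            pi, *_ = np.linalg.lstsq(A, b, rcond=None)
            return pi

        P = np.array([[0.8, 0.1, 0.1],
                      [0.2, 0.7, 0.1],
                      [0.1, 0.2, 0.7]])
        print(stationary(P))                  # steady-state distribution

        # An intervention replacing one state's transition law is a rank-one
        # perturbation of P; here we simply re-solve for the new steady state.
        P2 = P.copy()
        P2[2] = [0.5, 0.25, 0.25]
        print(stationary(P2))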

  1. Characteristics of the first child predict the parents' probability of having another child.

    PubMed

    Jokela, Markus

    2010-07-01

    In a sample of 7,695 families in the prospective, nationally representative British Millennium Cohort Study, this study examined whether characteristics of the 1st-born child predicted parents' timing and probability of having another child within 5 years after the 1st child's birth. Infant temperament was assessed with the Carey Infant Temperament Scale (Carey, 1972; Carey & McDevitt, 1978) at age 9 months, childhood socioemotional and behavioral characteristics with the Strengths and Difficulties Questionnaire (Goodman, 2001), and childhood cognitive ability with the Bracken School Readiness Assessment (Bracken, 2002) test at age 3 years. Survival analysis modeling indicated that the 1st child's low reactivity to novelty in infancy, high prosociality, low conduct problems, and high cognitive ability in childhood were associated with increased probability of parents having another child. Except for reactivity to novelty, these associations became stronger with time. High emotional symptoms were also positively associated with childbearing, but this was likely to reflect reverse causality; that is, the effect of sibling birth on the 1st child's adjustment. The results suggest that child effects, particularly those related to the child's cognitive ability, adaptability to novelty, and prosocial behavior, may be relevant to parents' future childbearing. (PsycINFO Database Record (c) 2010 APA, all rights reserved).

  2. Effects of sampling strategy, detection probability, and independence of counts on the use of point counts

    USGS Publications Warehouse

    Pendleton, G.W.; Ralph, C. John; Sauer, John R.; Droege, Sam

    1995-01-01

    Many factors affect the use of point counts for monitoring bird populations, including sampling strategies, variation in detection rates, and independence of sample points. The most commonly used sampling plans are stratified sampling, cluster sampling, and systematic sampling. Each of these might be most useful for different objectives or field situations. Variation in detection probabilities and lack of independence among sample points can bias estimates and measures of precision. All of these factors should be considered when using point count methods.

  3. CAN'T MISS--conquer any number task by making important statistics simple. Part 2. Probability, populations, samples, and normal distributions.

    PubMed

    Hansen, John P

    2003-01-01

    Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies, all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 2, describes probability, populations, and samples. The uses of descriptive and inferential statistics are outlined. The article also discusses the properties and probability of normal distributions, including the standard normal distribution.
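
    The confidence-interval tools the series builds toward can be sketched in a few lines; the data are invented and the t critical value is hard-coded from a table rather than computed:

        # Sketch: 95% confidence interval for a population mean from a small
        # sample, using the t distribution. Data are illustrative only.
        import math
        import statistics

        data = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.2, 5.8]
        n = len(data)
        mean = statistics.mean(data)
        sem = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
        t_crit = 2.365                               # t(0.975, df = 7), from a t-table

        lo, hi = mean - t_crit * sem, mean + t_crit * sem
        print(f"mean = {mean:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")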

  4. Secondary School Students' Reasoning about Conditional Probability, Samples, and Sampling Procedures

    ERIC Educational Resources Information Center

    Prodromou, Theodosia

    2016-01-01

    In the Australian mathematics curriculum, Year 12 students (aged 16-17) are asked to solve conditional probability problems that involve the representation of the problem situation with two-way tables or three-dimensional diagrams and consider sampling procedures that result in different correct answers. In a small exploratory study, we…

  5. MEASUREMENT OF CHILDREN'S EXPOSURE TO PESTICIDES: ANALYSIS OF URINARY METABOLITE LEVELS IN A PROBABILITY-BASED SAMPLE

    EPA Science Inventory

    The Minnesota Children's Pesticide Exposure Study is a probability-based sample of 102 children 3-13 years old who were monitored for commonly used pesticides. During the summer of 1997, first-morning-void urine samples (1-3 per child) were obtained for 88% of study children a...

  6. 45 CFR 1356.71 - Federal review of the eligibility of children in foster care and the eligibility of foster care...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... by ACF statistical staff from the Adoption and Foster Care Analysis and Reporting System (AFCARS... primary review utilizing probability sampling methodologies. Usually, the chosen methodology will be simple random sampling, but other probability samples may be utilized, when necessary and appropriate. (3...

  7. For what applications can probability and non-probability sampling be used?

    Treesearch

    H. T. Schreuder; T. G. Gregoire; J. P. Weyer

    2001-01-01

    Almost any type of sample has some utility when estimating population quantities. The focus in this paper is to indicate what type or combination of types of sampling can be used in various situations ranging from a sample designed to establish cause-effect or legal challenge to one involving a simple subjective judgment. Several of these methods have little or no...

  8. Optimization of the two-sample rank Neyman-Pearson detector

    NASA Astrophysics Data System (ADS)

    Akimov, P. S.; Barashkov, V. M.

    1984-10-01

    The development of optimal algorithms concerned with rank considerations in the case of finite sample sizes involves considerable mathematical difficulties. The present investigation provides results related to the design and the analysis of an optimal rank detector based on a utilization of the Neyman-Pearson criterion. The detection of a signal in the presence of background noise is considered, taking into account n observations (readings) x1, x2, ... xn in the experimental communications channel. The rank of an observation x is computed on the basis of its relation to the variable y, which represents interference. Attention is given to conditions in the absence of a signal, the probability of the detection of an arriving signal, details regarding the utilization of the Neyman-Pearson criterion, the scheme of an optimal rank multichannel incoherent detector, and an analysis of the detector.
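
    A minimal sketch of the rank idea under a Neyman-Pearson constraint, with the detection threshold set so the simulated false-alarm probability stays near a chosen alpha. The statistic (a rank sum of the channel readings among interference-only readings), the sample sizes, and alpha are illustrative, not the paper's exact design:

        # Sketch: two-sample rank detector. Under H0 the readings x are
        # exchangeable with the interference readings y, so the null
        # distribution of the rank-sum statistic can be simulated and a
        # Neyman-Pearson style threshold chosen for false-alarm rate alpha.
        import random

        def rank_sum(x, y):
            pooled = sorted([(v, 0) for v in y] + [(v, 1) for v in x])
            return sum(i + 1 for i, (_, is_x) in enumerate(pooled) if is_x)

        def threshold(n, m, alpha=0.05, trials=20000):
            rng = random.Random(1)
            stats = sorted(rank_sum([rng.random() for _ in range(n)],
                                    [rng.random() for _ in range(m)])
                           for _ in range(trials))
            return stats[int((1 - alpha) * trials)]

        n, m = 8, 20
        thr = threshold(n, m)
        x = [0.6 + random.random() for _ in range(n)]  # signal-plus-noise readings
        y = [random.random() for _ in range(m)]        # interference-only readings
        print("detect" if rank_sum(x, y) > thr else "no signal")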

  9. Osteoporosis and milk intake among Korean women in California: relationship with acculturation to U.S. lifestyle.

    PubMed

    Irvin, Veronica L; Nichols, Jeanne F; Hofstetter, C Richard; Ojeda, Victoria D; Song, Yoon Ju; Kang, Sunny; Hovell, Melbourne F

    2013-12-01

    The Korean population in the US increased by a third between 2000 and 2010. Korean women in the US report low calcium intake and a relatively high rate of fractures. However, little is known about the prevalence of osteoporosis among Korean American women. This paper examined the relationship between prevalence of osteoporosis and milk consumption, and their relationship with acculturation, among a representative sample of immigrant California women of Korean descent. Bilingual telephone surveys were conducted with a probability sample (N = 590) in 2007. Lower acculturation was significantly related to lower milk consumption for women during the age periods of 12-18 and 19-34 years. Acculturation was related to higher prevalence of osteoporosis among post-menopausal, but not pre-menopausal, Korean women in California. Future research should include larger cohorts, objective measures of osteoporosis, other sources of calcium specific to Korean cuisine, and assessment of bone-loading physical activity.

  10. Family caregiving to those with dementia in rural Alabama: racial similarities and differences.

    PubMed

    Kosberg, Jordan I; Kaufman, Allan V; Burgio, Louis D; Leeper, James D; Sun, Fei

    2007-02-01

    This study explored differences and similarities in the experiences of African American and White family caregivers of dementia patients living in rural Alabama. This cross-sectional survey used a caregiving stress model to investigate the interrelationships between caregiving burden, mediators, and outcomes. Random-digit-dialing telephone interviews were used to obtain data on a probability sample of 74 non-Hispanic White and 67 African American caregivers. White caregivers were more likely to be married and older, used acceptance and humor as coping styles, and had fewer financial problems. African American caregivers gave more hours of care, used religion and denial as coping styles, and were less burdened. The authors have developed a methodology for obtaining a representative sample of African American and White rural caregivers. Further investigations are needed of the interactions between urban/rural location and ethnic/racial backgrounds of dementia caregivers for heuristic and applied reasons.

  11. Computing thermal Wigner densities with the phase integration method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beutier, J.; Borgis, D.; Vuilleumier, R.

    2014-08-28

    We discuss how the Phase Integration Method (PIM), recently developed to compute symmetrized time correlation functions [M. Monteferrante, S. Bonella, and G. Ciccotti, Mol. Phys. 109, 3015 (2011)], can be adapted to sampling/generating the thermal Wigner density, a key ingredient, for example, in many approximate schemes for simulating quantum time dependent properties. PIM combines a path integral representation of the density with a cumulant expansion to represent the Wigner function in a form calculable via existing Monte Carlo algorithms for sampling noisy probability densities. The method is able to capture highly non-classical effects such as correlation among the momenta and coordinates parts of the density, or correlations among the momenta themselves. By using alternatives to cumulants, it can also indicate the presence of negative parts of the Wigner density. Both properties are demonstrated by comparing PIM results to those of reference quantum calculations on a set of model problems.

  12. After-School and Informal STEM Projects: the Effect of Participant Self-Selection

    NASA Astrophysics Data System (ADS)

    Vallett, David B.; Lamb, Richard; Annetta, Leonard

    2017-12-01

    This research represents an unforeseen outcome of the authors' National Science Foundation Innovation Technology Experiences for Students and Teachers (ITEST) program grant in science education. The grant itself focused on the use of serious educational games (SEGs) in the science classroom, both during and after school, to teach science content and affect student perceptions of science and technology. This study consists of a Bayesian artificial neural network analysis, using the preintervention measures of affect, interest, personality, and cognitive ability, in members of both the treatment and comparison groups to generate the probabilities that students would opt into the treatment group or choose not to participate. It appears, from this sample and the sampling methods of other related studies within the field, that despite sometimes profound results from technology interventions in science, interventions are affecting only those who already have a strong interest in STEM due to the manner in which participants are recruited.

  13. Computing thermal Wigner densities with the phase integration method.

    PubMed

    Beutier, J; Borgis, D; Vuilleumier, R; Bonella, S

    2014-08-28

    We discuss how the Phase Integration Method (PIM), recently developed to compute symmetrized time correlation functions [M. Monteferrante, S. Bonella, and G. Ciccotti, Mol. Phys. 109, 3015 (2011)], can be adapted to sampling/generating the thermal Wigner density, a key ingredient, for example, in many approximate schemes for simulating quantum time dependent properties. PIM combines a path integral representation of the density with a cumulant expansion to represent the Wigner function in a form calculable via existing Monte Carlo algorithms for sampling noisy probability densities. The method is able to capture highly non-classical effects such as correlation among the momenta and coordinates parts of the density, or correlations among the momenta themselves. By using alternatives to cumulants, it can also indicate the presence of negative parts of the Wigner density. Both properties are demonstrated by comparing PIM results to those of reference quantum calculations on a set of model problems.

  14. Estimation of the Rate of Unrecognized Cross-Contamination with Mycobacterium tuberculosis in London Microbiology Laboratories

    PubMed Central

    Ruddy, M.; McHugh, T. D.; Dale, J. W.; Banerjee, D.; Maguire, H.; Wilson, P.; Drobniewski, F.; Butcher, P.; Gillespie, S. H.

    2002-01-01

    Isolates from patients with confirmed tuberculosis from London were collected over 2.5 years between 1995 and 1997. Restriction fragment length polymorphism (RFLP) analysis was performed by the international standard technique as part of a multicenter epidemiological study. A total of 2,779 samples representing 2,500 individual patients from 56 laboratories were examined. Analysis of these samples revealed a laboratory cross-contamination rate of between 0.54%, when only presumed cases of cross-contamination were considered, and 0.93%, when presumed and possible cases were counted. Previous studies suggest an extremely wide range of laboratory cross-contamination rates of between 0.1 and 65%. These data indicate that laboratory cross-contamination has not been a common problem in routine practice in the London area, but in several incidents patients did receive full courses of therapy that were probably unnecessary. PMID:12409381

  15. Childhood trauma and psychiatric disorders as correlates of school dropout in a national sample of young adults.

    PubMed

    Porche, Michelle V; Fortuna, Lisa R; Lin, Julia; Alegria, Margarita

    2011-01-01

    The effect of childhood trauma, psychiatric diagnoses, and mental health services on school dropout among U.S.-born and immigrant youth is examined using data from the Collaborative Psychiatric Epidemiology Surveys, a nationally representative probability sample of African Americans, Afro-Caribbeans, Asians, Latinos, and non-Latino Whites, including 2,532 young adults, aged 21-29. The dropout prevalence rate was 16% overall, with variation by childhood trauma, childhood psychiatric diagnosis, race/ethnicity, and nativity. Childhood substance and conduct disorders mediated the relation between trauma and school dropout. Likelihood of dropout was decreased for Asians, and increased for African Americans and Latinos, compared to non-Latino Whites as a function of psychiatric disorders and trauma. Timing of U.S. immigration during adolescence increased risk of dropout. © 2011 The Authors. Child Development © 2011 Society for Research in Child Development, Inc.

  16. "Unequal opportunity": neighbourhood disadvantage and the chance to buy illegal drugs.

    PubMed

    Storr, C L; Chen, C-Y; Anthony, J C

    2004-03-01

    This study investigates whether subgroups of people living in disadvantaged neighbourhoods may be more likely to come into contact with drug dealers as compared with persons living in more advantaged areas, with due attention to male-female and race-ethnicity differences. Standardised survey data were collected in the United States of America in 1998 using stratified, multistage area probability sampling of a nationally representative sample of household residents age 12 or older (n = 25 500). Evidence supports an inference that women are less likely to be approached by someone selling illegal drugs. The study found no more than modest and generally null racial and ethnicity differences, even for residents living within socially disadvantaged neighbourhoods, where chances to buy illegal drugs are found to be more common. Limitations of survey data always merit attention, but this study evidence lends support to the inference that physical and social characteristics of a neighbourhood can set the stage for opportunities to become involved with drugs.

  17. [Validation of the Eating Attitudes Test as a screening instrument for eating disorders in general population].

    PubMed

    Peláez-Fernández, María Angeles; Ruiz-Lázaro, Pedro Manuel; Labrador, Francisco Javier; Raich, Rosa María

    2014-02-20

    To validate the best cut-off point of the Eating Attitudes Test (EAT-40), Spanish version, for the screening of eating disorders (ED) in the general population. This was a cross-sectional study. The EAT-40 Spanish version was administered to a representative sample of 1,543 students, age range 12 to 21 years, in the Region of Madrid. Six hundred and two participants (probable cases and a random sample of controls) were interviewed. The best diagnostic prediction was obtained with a cut-off point of 21, with sensitivity: 88.2%; specificity: 62.1%; positive predictive value: 17.7%; negative predictive value: 62.1%. Use of a cut-off point of 21 is recommended in epidemiological studies of eating disorders in the Spanish general population. Copyright © 2012 Elsevier España, S.L. All rights reserved.

  18. A Stochastic Diffusion Process for the Dirichlet Distribution

    DOE PAGES

    Bakosi, J.; Ristorcelli, J. R.

    2013-03-01

    The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N coupled stochastic variables with the Dirichlet distribution as its asymptotic solution. To ensure a bounded sample space, a coupled nonlinear diffusion process is required: the Wiener processes in the equivalent system of stochastic differential equations are multiplicative with coefficients dependent on all the stochastic variables. Individual samples of a discrete ensemble, obtained from the stochastic process, satisfy a unit-sum constraint at all times. The process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Similar to the multivariate Wright-Fisher process, whose invariant is also Dirichlet, the univariate case yields a process whose invariant is the beta distribution. As a test of the results, Monte Carlo simulations are used to evolve numerical ensembles toward the invariant Dirichlet distribution.
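
    The univariate case with a beta invariant can be sketched with the standard Jacobi (Wright-Fisher type) diffusion dX = theta*(mu - X) dt + sigma*sqrt(X(1 - X)) dW, whose invariant density is Beta(2*theta*mu/sigma^2, 2*theta*(1 - mu)/sigma^2). This is an analogous textbook form used for illustration, not the paper's exact drift and diffusion coefficients:

        # Sketch: Euler-Maruyama simulation of a diffusion on [0, 1] whose
        # invariant is Beta(1.2, 2.8) for the parameters below; the long-run
        # sample mean should approach mu = alpha / (alpha + beta) = 0.3.
        import math
        import random

        theta, mu, sigma = 2.0, 0.3, 1.0
        dt, steps = 1e-3, 200_000
        rng = random.Random(0)

        x, samples = 0.5, []
        for i in range(steps):
            dw = rng.gauss(0.0, math.sqrt(dt))
            x += theta * (mu - x) * dt + sigma * math.sqrt(max(x * (1 - x), 0.0)) * dw
            x = min(max(x, 1e-9), 1 - 1e-9)  # keep the sample in the bounded space
            if i > steps // 2:               # discard the transient, keep the tail
                samples.append(x)

        print(sum(samples) / len(samples))   # should be close to 0.3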

  19. Design and simulation of stratified probability digital receiver with application to the multipath communication

    NASA Technical Reports Server (NTRS)

    Deal, J. H.

    1975-01-01

    One approach to the problem of simplifying complex nonlinear filtering algorithms is to use stratified probability approximations, where the continuous probability density functions of certain random variables are represented by discrete mass approximations. This technique is developed in this paper and used to simplify the filtering algorithms developed for the optimum receiver for signals corrupted by both additive and multiplicative noise.
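
    The discrete mass idea can be sketched by splitting a continuous density into equal-probability strata and placing one mass point in each; here a standard normal is used and each stratum is represented by its midpoint quantile. The density, the number of strata, and the choice of representative point are illustrative rather than the paper's construction:

        # Sketch: discrete-mass (stratified probability) approximation of a
        # standard normal density with k equal-probability point masses.
        from statistics import NormalDist

        def discrete_mass_approx(k: int):
            nd = NormalDist()
            # midpoint quantile of each stratum, each carrying probability 1/k
            points = [nd.inv_cdf((i + 0.5) / k) for i in range(k)]
            return points, [1.0 / k] * k

        pts, wts = discrete_mass_approx(8)
        mean = sum(p * w for p, w in zip(pts, wts))
        var = sum(w * p ** 2 for p, w in zip(pts, wts)) - mean ** 2
        print(pts)
        print(f"approx mean {mean:.3f}, approx variance {var:.3f}")  # vs. 0 and 1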

  20. Human Remains from the Pleistocene-Holocene Transition of Southwest China Suggest a Complex Evolutionary History for East Asians

    PubMed Central

    Curnoe, Darren; Xueping, Ji; Herries, Andy I. R.; Kanning, Bai; Taçon, Paul S. C.; Zhende, Bao; Fink, David; Yunsheng, Zhu; Hellstrom, John; Yun, Luo; Cassis, Gerasimos; Bing, Su; Wroe, Stephen; Shi, Hong; Parr, William C. H.; Shengmin, Huang; Rogers, Natalie

    2012-01-01

    Background Later Pleistocene human evolution in East Asia remains poorly understood owing to a scarcity of well described, reliably classified and accurately dated fossils. Southwest China has been identified from genetic research as a hotspot of human diversity, containing ancient mtDNA and Y-DNA lineages, and has yielded a number of human remains thought to derive from Pleistocene deposits. We have prepared, reconstructed, described and dated a new partial skull from a consolidated sediment block collected in 1979 from the site of Longlin Cave (Guangxi Province). We also undertook new excavations at Maludong (Yunnan Province) to clarify the stratigraphy and dating of a large sample of mostly undescribed human remains from the site. Methodology/Principal Findings We undertook a detailed comparison of cranial, including a virtual endocast for the Maludong calotte, mandibular and dental remains from these two localities. Both samples probably derive from the same population, exhibiting an unusual mixture of modern human traits, characters probably plesiomorphic for later Homo, and some unusual features. We dated charcoal with AMS radiocarbon dating and speleothem with the Uranium-series technique and the results show both samples to be from the Pleistocene-Holocene transition: ∼14.3-11.5 ka. Conclusions/Significance Our analysis suggests two plausible explanations for the morphology sampled at Longlin Cave and Maludong. First, it may represent a late-surviving archaic population, perhaps paralleling the situation seen in North Africa as indicated by remains from Dar-es-Soltane and Temara, and maybe also in southern China at Zhirendong. Alternatively, East Asia may have been colonised during multiple waves during the Pleistocene, with the Longlin-Maludong morphology possibly reflecting deep population substructure in Africa prior to modern humans dispersing into Eurasia. PMID:22431968

  1. Human remains from the Pleistocene-Holocene transition of southwest China suggest a complex evolutionary history for East Asians.

    PubMed

    Curnoe, Darren; Xueping, Ji; Herries, Andy I R; Kanning, Bai; Taçon, Paul S C; Zhende, Bao; Fink, David; Yunsheng, Zhu; Hellstrom, John; Yun, Luo; Cassis, Gerasimos; Bing, Su; Wroe, Stephen; Shi, Hong; Parr, William C H; Shengmin, Huang; Rogers, Natalie

    2012-01-01

    Later Pleistocene human evolution in East Asia remains poorly understood owing to a scarcity of well described, reliably classified and accurately dated fossils. Southwest China has been identified from genetic research as a hotspot of human diversity, containing ancient mtDNA and Y-DNA lineages, and has yielded a number of human remains thought to derive from Pleistocene deposits. We have prepared, reconstructed, described and dated a new partial skull from a consolidated sediment block collected in 1979 from the site of Longlin Cave (Guangxi Province). We also undertook new excavations at Maludong (Yunnan Province) to clarify the stratigraphy and dating of a large sample of mostly undescribed human remains from the site. We undertook a detailed comparison of cranial, including a virtual endocast for the Maludong calotte, mandibular and dental remains from these two localities. Both samples probably derive from the same population, exhibiting an unusual mixture of modern human traits, characters probably plesiomorphic for later Homo, and some unusual features. We dated charcoal with AMS radiocarbon dating and speleothem with the Uranium-series technique and the results show both samples to be from the Pleistocene-Holocene transition: ∼14.3-11.5 ka. Our analysis suggests two plausible explanations for the morphology sampled at Longlin Cave and Maludong. First, it may represent a late-surviving archaic population, perhaps paralleling the situation seen in North Africa as indicated by remains from Dar-es-Soltane and Temara, and maybe also in southern China at Zhirendong. Alternatively, East Asia may have been colonised during multiple waves during the Pleistocene, with the Longlin-Maludong morphology possibly reflecting deep population substructure in Africa prior to modern humans dispersing into Eurasia.

  2. Using known populations of pronghorn to evaluate sampling plans and estimators

    USGS Publications Warehouse

    Kraft, K.M.; Johnson, D.H.; Samuelson, J.M.; Allen, S.H.

    1995-01-01

    Although sampling plans and estimators of abundance have good theoretical properties, their performance in real situations is rarely assessed because true population sizes are unknown. We evaluated widely used sampling plans and estimators of population size on 3 known clustered distributions of pronghorn (Antilocapra americana). Our criteria were accuracy of the estimate, coverage of 95% confidence intervals, and cost. Sampling plans were combinations of sampling intensities (16, 33, and 50%), sample selection (simple random sampling without replacement, systematic sampling, and probability proportional to size sampling with replacement), and stratification. We paired sampling plans with suitable estimators (simple, ratio, and probability proportional to size). We used area of the sampling unit as the auxiliary variable for the ratio and probability proportional to size estimators. All estimators were nearly unbiased, but precision was generally low (overall mean coefficient of variation [CV] = 29). Coverage of 95% confidence intervals was only 89% because of the highly skewed distribution of the pronghorn counts and small sample sizes, especially with stratification. Stratification combined with accurate estimates of optimal stratum sample sizes increased precision, reducing the mean CV from 33 without stratification to 25 with stratification; costs increased 23%. Precise results (mean CV = 13) but poor confidence interval coverage (83%) were obtained with simple and ratio estimators when the allocation scheme included all sampling units in the stratum containing most pronghorn. Although areas of the sampling units varied, ratio estimators and probability proportional to size sampling did not increase precision, possibly because of the clumped distribution of pronghorn. Managers should be cautious in using sampling plans and estimators to estimate abundance of aggregated populations.
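
    Two of the estimators compared here can be illustrated on a synthetic clumped population: the simple expansion estimator scales the sample total by N/n, while the ratio estimator uses unit area as the auxiliary variable. The population, sample size, and degree of clumping below are invented for illustration:

        # Sketch: simple-expansion vs. ratio estimation of a population total
        # from a simple random sample of units, with area as the auxiliary.
        import random

        rng = random.Random(42)
        N = 200
        areas = [rng.uniform(1, 5) for _ in range(N)]
        # clumped counts: most units are empty, a few hold large groups
        counts = [0 if rng.random() < 0.8 else int(rng.uniform(5, 60) * a / 3)
                  for a in areas]
        true_total, total_area = sum(counts), sum(areas)

        n = 40
        idx = rng.sample(range(N), n)               # SRS without replacement
        y = [counts[i] for i in idx]
        x = [areas[i] for i in idx]

        simple = (N / n) * sum(y)                   # expansion estimator
        ratio = (sum(y) / sum(x)) * total_area      # ratio estimator via area
        print(true_total, round(simple), round(ratio))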

  3. Analysis of counting errors in the phase/Doppler particle analyzer

    NASA Technical Reports Server (NTRS)

    Oldenburg, John R.

    1987-01-01

    NASA is investigating the application of the Phase Doppler measurement technique to provide improved drop sizing and liquid water content measurements in icing research. The magnitudes of the counting errors were analyzed because these errors contribute to inaccurate liquid water content measurements. The Phase Doppler Particle Analyzer counting errors due to data transfer losses and coincidence losses were analyzed for data input rates from 10 samples/sec to 70,000 samples/sec. Coincidence losses were calculated by determining the Poisson probability of having more than one event occurring during the droplet signal time. The magnitude of the coincidence loss can be determined, and for less than a 15 percent loss, corrections can be made. The data transfer losses were estimated for representative data transfer rates. With direct memory access enabled, data transfer losses are less than 5 percent for input rates below 2000 samples/sec. With direct memory access disabled, losses exceeded 20 percent at a rate of 50 samples/sec, preventing accurate number density or mass flux measurements. The data transfer losses of a new signal processor were analyzed and found to be less than 1 percent for rates under 65,000 samples/sec.
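
    The coincidence calculation described above is a short Poisson computation: with event rate lam and signal duration tau, the expected count per window is m = lam*tau, and the probability of more than one event is 1 - e^(-m)(1 + m). The signal duration below is an assumed figure, not the instrument's specification:

        # Sketch: Poisson coincidence-loss fraction versus data input rate.
        import math

        def coincidence_loss(lam: float, tau: float) -> float:
            m = lam * tau                    # expected events per signal window
            return 1.0 - math.exp(-m) * (1.0 + m)

        tau = 20e-6                          # assumed 20-microsecond signal time
        for lam in (10, 2000, 50_000, 70_000):   # samples per second
            print(f"{lam:>6}/s -> coincidence loss {coincidence_loss(lam, tau):.4f}")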

  4. Combined target factor analysis and Bayesian soft-classification of interference-contaminated samples: forensic fire debris analysis.

    PubMed

    Williams, Mary R; Sigman, Michael E; Lewis, Jennifer; Pitan, Kelly McHugh

    2012-10-10

    A Bayesian soft classification method combined with target factor analysis (TFA) is described and tested for the analysis of fire debris data. The method relies on analysis of the average mass spectrum across the chromatographic profile (i.e., the total ion spectrum, TIS) from multiple samples taken from a single fire scene. A library of TIS from reference ignitable liquids with assigned ASTM classification is used as the target factors in TFA. The class-conditional distributions of correlations between the target and predicted factors for each ASTM class are represented by kernel functions and analyzed by Bayesian decision theory. The soft classification approach assists in assessing the probability that ignitable liquid residue from a specific ASTM E1618 class is present in a set of samples from a single fire scene, even in the presence of unspecified background contributions from pyrolysis products. The method is demonstrated with sample data sets and then tested on laboratory-scale burn data and large-scale field test burns. The overall performance achieved in laboratory and field tests of the method is approximately 80% correct classification of fire debris samples. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
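
    The soft-classification step can be sketched as kernel density estimates of class-conditional correlation scores feeding Bayes' rule. The training correlations, class names, priors, and kernel bandwidth below are all invented for illustration:

        # Sketch: Bayesian soft classification from class-conditional score
        # distributions represented by Gaussian kernel density estimates.
        import math

        def kde(points, h=0.05):
            def density(x):
                return sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points) \
                       / (len(points) * h * math.sqrt(2 * math.pi))
            return density

        train = {                     # correlations of predicted vs. target factors
            "gasoline":   [0.95, 0.91, 0.93, 0.88, 0.96],
            "background": [0.60, 0.55, 0.70, 0.65, 0.58],
        }
        priors = {"gasoline": 0.5, "background": 0.5}
        likelihoods = {c: kde(v) for c, v in train.items()}

        score = 0.86                  # TFA correlation of a questioned sample
        joint = {c: priors[c] * likelihoods[c](score) for c in train}
        z = sum(joint.values())
        print({c: round(p / z, 3) for c, p in joint.items()})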

  5. Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.

    PubMed

    Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery

    2009-01-01

    We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
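
    Both expressions are simple to evaluate; the population size, database size, and average match probability below are invented purely to show the scale of the quantities:

        # Sketch: plugging illustrative numbers into the two expressions above.
        N, d, p_A = 10_000_000, 1_000_000, 1e-9

        p_other_match = 2 * (N - d) * p_A   # someone outside the database matches
        p_wrong_person = (N - d) * p_A      # the cold hit names the wrong person
        print(p_other_match, p_wrong_person)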

  6. Detection probabilities of electrofishing, hoop nets, and benthic trawls for fishes in two western North American rivers

    USGS Publications Warehouse

    Smith, Christopher D.; Quist, Michael C.; Hardy, Ryan S.

    2015-01-01

    Research comparing different sampling techniques helps improve the efficiency and efficacy of sampling efforts. We compared the effectiveness of three sampling techniques (small-mesh hoop nets, benthic trawls, boat-mounted electrofishing) for 30 species in the Green (WY, USA) and Kootenai (ID, USA) rivers by estimating conditional detection probabilities (probability of detecting a species given its presence at a site). Electrofishing had the highest detection probabilities (generally greater than 0.60) for most species (88%), but hoop nets also had high detectability for several taxa (e.g., adult burbot Lota lota, juvenile northern pikeminnow Ptychocheilus oregonensis). Benthic trawls had low detection probabilities (<0.05) for most taxa (84%). Gear-specific effects were present for most species indicating large differences in gear effectiveness among techniques. In addition to gear effects, habitat characteristics also influenced detectability of fishes. Most species-specific habitat relationships were idiosyncratic and reflected the ecology of the species. Overall findings of our study indicate that boat-mounted electrofishing and hoop nets are the most effective techniques for sampling fish assemblages in large, coldwater rivers.

  7. How representative is pesticide monitoring in Swiss streams?

    NASA Astrophysics Data System (ADS)

    Munz, Nicole; Wittmer, Irene; Strahm, Ivo; Leu, Christian; Stamm, Christian

    2013-04-01

    The surveillance of surface water quality in Switzerland is the task of the 26 cantons. This includes the assessment of the level of pesticide pollution. Each of the cantons may follow different procedures, which makes a comparison difficult and cumbersome. Nevertheless, this study presents the main results of the first nation-wide compilation and interpretation of cantonal and federal monitoring data, as well as results from specific research projects on agricultural and urban pesticides. Overall, more than 345,000 concentration data of 281 biocidal compounds have been analyzed. This set of substances includes 203 compounds: those registered only as an agricultural plant protection product (N = 149), only as an urban biocide (N = 18), or for both uses (N = 36). This data set contains 70 of the 100 most sold agricultural plant protection products in 2010. A comparable assessment of the representativeness of the biocide data is hardly possible due to a lack of systematic use data. The data stem from 565 measuring sites. However, these sites are not representative of all size classes of the Swiss stream network. While about 75% of the total length of the stream network is made up of small streams (Strahler order 1 and 2), only 28% of the measuring sites are located on such streams. In combination with the sampling strategies that have been used - about 50% grab samples and 50% composite samples - it can be concluded that the 2% of measured values > 100 ng L-1 most probably severely underestimates the true level of pesticide pollution in the Swiss stream network. In the future, more emphasis has to be put on small streams, where higher concentrations, and thus actual ecological effects, are expected.

  8. Can we estimate molluscan abundance and biomass on the continental shelf?

    NASA Astrophysics Data System (ADS)

    Powell, Eric N.; Mann, Roger; Ashton-Alcox, Kathryn A.; Kuykendall, Kelsey M.; Chase Long, M.

    2017-11-01

    Few empirical studies have focused on the effect of sample density on the estimate of abundance of the dominant carbonate-producing fauna of the continental shelf. Here, we present such a study and consider the implications of suboptimal sampling design on estimates of abundance and size-frequency distribution. We focus on a principal carbonate producer of the U.S. Atlantic continental shelf, the Atlantic surfclam, Spisula solidissima. To evaluate the degree to which the results are typical, we analyze a dataset for the principal carbonate producer of Mid-Atlantic estuaries, the Eastern oyster Crassostrea virginica, obtained from Delaware Bay. These two species occupy different habitats and display different lifestyles, yet demonstrate similar challenges to survey design and similar trends with sampling density. The median of a series of simulated survey mean abundances, the central tendency obtained over a large number of surveys of the same area, always underestimated true abundance at low sample densities. More dramatic were the trends in the probability of a biased outcome. As sample density declined, the probability of a survey availability event, defined as a survey yielding indices >125% or <75% of the true population abundance, increased and that increase was disproportionately biased towards underestimates. For these cases where a single sample accessed about 0.001-0.004% of the domain, 8-15 random samples were required to reduce the probability of a survey availability event below 40%. The problem of differential bias, in which the probabilities of a biased-high and a biased-low survey index were distinctly unequal, was resolved with fewer samples than the problem of overall bias. These trends suggest that the influence of sampling density on survey design comes with a series of incremental challenges. At woefully inadequate sampling density, the probability of a biased-low survey index will substantially exceed the probability of a biased-high index. The survey time series on the average will return an estimate of the stock that underestimates true stock abundance. If sampling intensity is increased, the frequency of biased indices balances between high and low values. Incrementing sample number from this point steadily reduces the likelihood of a biased survey; however, the number of samples necessary to drive the probability of survey availability events to a preferred level of infrequency may be daunting. Moreover, certain size classes will be disproportionately susceptible to such events and the impact on size frequency will be species specific, depending on the relative dispersion of the size classes.
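
    The "survey availability event" frequency is straightforward to explore by simulation: draw repeated surveys from a synthetic patchy population and count how often the survey mean falls above 125% or below 75% of the true mean (the thresholds used in the text). The population model and survey counts below are invented:

        # Sketch: Monte Carlo frequency of biased survey indices versus the
        # number of random samples per survey, for a patchy population.
        import random

        rng = random.Random(7)
        # patchy domain: 1% of cells hold dense aggregations, the rest are sparse
        population = [rng.expovariate(1 / 200) if rng.random() < 0.01
                      else rng.expovariate(1.0) for _ in range(100_000)]
        true_mean = sum(population) / len(population)

        def availability_rate(n_samples: int, surveys: int = 2000) -> float:
            events = 0
            for _ in range(surveys):
                m = sum(rng.choice(population) for _ in range(n_samples)) / n_samples
                events += (m > 1.25 * true_mean) or (m < 0.75 * true_mean)
            return events / surveys

        for n in (4, 8, 15, 30):
            print(f"n={n:>2}: P(availability event) = {availability_rate(n):.2f}")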

  9. Prevalence and associated factors of Internet gaming disorder among community dwelling adults in Macao, China.

    PubMed

    Wu, Anise M S; Chen, Juliet Honglei; Tong, Kwok-Kit; Yu, Shu; Lau, Joseph T F

    2018-03-01

    Background and aims Internet gaming disorder (IGD) has been mainly studied among adolescents, and no research to date has examined its prevalence in general Chinese adult populations. This study estimated the prevalence of probable IGD in community-dwelling adults in Macao, China. Associations between IGD and psychological distress (i.e., depression and anxiety) as well as IGD and character strength (i.e., psychological resilience and purpose in life) were also tested. Methods A random, representative sample of 1,000 Chinese residents (44% males; mean age = 40.0) was surveyed using a telephone poll design from October to November 2016. Results The estimated prevalence of probable IGD was 2.0% of the overall sample and 4.3% among the recent gamers (n = 473), with no statistically significant sex and age effects observed (p > .05). The two most prevalent IGD symptoms were mood modification and continued engagement, despite negative consequences. Probable IGD respondents were more vulnerable to psychological distress (25.0% and 45.0% for moderate or above levels of depression and anxiety, respectively) than their non-IGD counterparts. They also reported a lower level of psychological resilience than non-IGD respondents. No significant buffering effect of the two character strength variables on the distress-IGD relationship was found. Discussion and conclusions These results provide empirical evidence that IGD is a mental health threat not only to adolescents but also to adults. IGD was significantly associated with psychological distress, which should be addressed in conjunction with IGD symptoms in interventions. Inclusion of gamers of both sexes and different age groups in future prevention programs is also recommended.

  10. Palaeomagnetism of the loess/palaeosol sequence in Viatovo (NE Bulgaria) in the Danube basin

    NASA Astrophysics Data System (ADS)

    Jordanova, Diana; Hus, Jozef; Evlogiev, Jordan; Geeraerts, Raoul

    2008-03-01

    The results of a palaeomagnetic investigation of a 27 m thick loess/palaeosol sequence in Viatovo (NE Bulgaria) are presented in this paper. The sequence consists of topsoil S0, seven loess horizons (L1-L7) and six interbedded palaeosols (S1-S6) overlying a red clay (terra rossa) complex. Magnetic viscosity experiments, IRM acquisition, AMS analysis and NRM stepwise alternating and thermal demagnetisation experiments of pilot samples were implemented for precise determination of the characteristic remanence and construction of a reliable magnetostratigraphical scheme. Analysis of IRM acquisition curves using the expectation-maximization algorithm of Heslop et al. [Heslop, D., Dekkers, M., Kruiver, P., van Oorschot, H., 2002. Analysis of isothermal remanent magnetization acquisition curves using the expectation-maximization algorithm. Geophys. J. Int., 148, 58-64] suggests that the best fitting is obtained with three coercivity components. Component 1 corresponds to SD maghemite/magnetite, while component 2 is probably related to the presence of oxidised detrital magnetites. The third component shows varying coercivities depending on the degree of pedogenic alteration of the samples and probably reflects the presence of detrital magnetite grains oxidised to different degrees. The relevance of the Viatovo section as a key representative sequence for the loess cover in the Danube basin is confirmed by the presence of geomagnetic polarity changes in the lower part of the sequence. The youngest one, recorded in the seventh loess unit L7, can be identified as corresponding to the Matuyama/Brunhes palaeomagnetic polarity transition. Two normal magnetozones were found in the red clay complex, probably corresponding to the Jaramillo and Olduvai subchronozones of the Matuyama chron.

  11. Oxytocin receptor gene polymorphisms, attachment, and PTSD: Results from the National Health and Resilience in Veterans Study.

    PubMed

    Sippel, Lauren M; Han, Shizhong; Watkins, Laura E; Harpaz-Rotem, Ilan; Southwick, Steven M; Krystal, John H; Olff, Miranda; Sherva, Richard; Farrer, Lindsay A; Kranzler, Henry R; Gelernter, Joel; Pietrzak, Robert H

    2017-11-01

    The human oxytocin system is implicated in social behavior and stress recovery. Polymorphisms in the oxytocin receptor gene (OXTR) may interact with attachment style to predict stress-related psychopathology like posttraumatic stress disorder (PTSD). The objective of this study was to examine independent and interactive effects of the OXTR single nucleotide polymorphism (SNP) rs53576, which has been associated with stress reactivity, support-seeking, and PTSD in prior studies, and attachment style on risk for PTSD in a nationally representative sample of 2163 European-American (EA) U.S. military veterans who participated in two independent waves of the National Health and Resilience in Veterans Study (NHRVS). Results revealed that insecure attachment style [adjusted odds ratio (OR) = 4.29; p < 0.001] and the interaction of rs53576 and attachment style (OR = 2.58, p = 0.02) were associated with probable lifetime PTSD. Among individuals with the minor A allele, the prevalence of probable PTSD was significantly higher among those with an insecure attachment style (23.9%) than those with a secure attachment style (2.0%), equivalent to an adjusted OR of 10.7. We attempted to replicate these findings by utilizing dense marker data from a genome-wide association study of 2215 high-risk civilians; one OXTR variant, though not rs53576, was associated with PTSD. Exploratory analyses in the veteran sample revealed that the interaction between this variant and attachment style predicting probable PTSD approached statistical significance. Results indicate that polymorphisms in the OXTR gene and attachment style may contribute to vulnerability to PTSD in U.S. military veterans. Published by Elsevier Ltd.

  12. The determinants of HMOs' contracting with hospitals for bypass surgery.

    PubMed

    Gaskin, Darrell J; Escarce, José J; Schulman, Kevin; Hadley, Jack

    2002-08-01

    Selective contracting with health care providers is one of the mechanisms HMOs (Health Maintenance Organizations) use to lower health care costs for their enrollees. However, are HMOs compromising quality to lower costs? To address this and other questions we identify factors that influence HMOs' selective contracting for coronary artery bypass surgery (CABG). Using a logistic regression analysis, we estimated the effects of hospitals' quality, costliness, and geographic convenience on HMOs' decision to contract with a hospital for CABG services. We also estimated the impact of HMO characteristics and market characteristics on HMOs' contracting decision. A 1997 survey of a nationally representative sample of 50 HMOs that could have potentially contracted with 447 hospitals. About 44 percent of the HMO-hospital pairs had a contract. We found that the probability of an HMO contracting with a hospital increased as hospital quality increased and decreased as distance increased. Hospital costliness had a negative but borderline significant (0.05 < p < 0.10) effect on the probability of a contract across all types of HMOs. However, this effect was much larger for IPA (Independent Practice Association)-model HMOs than for either group/staff or network HMOs. An increase in HMO competition increased the probability of a contract while an increase in hospital competition decreased the probability of a contract. HMO penetration did not affect the probability of contracting. HMO characteristics also had significant effects on contracting decisions. The results suggest that HMOs value quality, geographic convenience, and costliness, and that the importance of quality and costliness vary with HMO type. Greater HMO competition encourages broader hospital networks whereas greater hospital competition leads to more restrictive networks.

  13. The Determinants of HMOs’ Contracting with Hospitals for Bypass Surgery

    PubMed Central

    Gaskin, Darrell J; Escarce, José J; Schulman, Kevin; Hadley, Jack

    2002-01-01

    Objective Selective contracting with health care providers is one of the mechanisms HMOs (Health Maintenance Organizations) use to lower health care costs for their enrollees. However, are HMOs compromising quality to lower costs? To address this and other questions we identify factors that influence HMOs’ selective contracting for coronary artery bypass surgery (CABG). Study Design Using a logistic regression analysis, we estimated the effects of hospitals’ quality, costliness, and geographic convenience on HMOs’ decision to contract with a hospital for CABG services. We also estimated the impact of HMO characteristics and market characteristics on HMOs’ contracting decision. Data Sources A 1997 survey of a nationally representative sample of 50 HMOs that could have potentially contracted with 447 hospitals. Principal Findings About 44 percent of the HMO-hospital pairs had a contract. We found that the probability of an HMO contracting with a hospital increased as hospital quality increased and decreased as distance increased. Hospital costliness had a negative but borderline significant (0.10

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gagné, Jonathan; Lafrenière, David; Doyon, René

    We present Bayesian Analysis for Nearby Young AssociatioNs II (BANYAN II), a modified Bayesian analysis for assessing the membership of later-than-M5 objects to any of several Nearby Young Associations (NYAs). In addition to using kinematic information (from sky position and proper motion), this analysis exploits 2MASS-WISE color-magnitude diagrams in which old and young objects follow distinct sequences. As an improvement over our earlier work, the spatial and kinematic distributions for each association are now modeled as ellipsoids whose axes need not be aligned with the Galactic coordinate axes, and we use prior probabilities matching the expected populations of the NYAs considered versus field stars. We present an extensive contamination analysis to characterize the performance of our new method. We find that Bayesian probabilities are generally representative of contamination rates, except when a parallax measurement is considered. In this case contamination rates become significantly smaller and hence Bayesian probabilities for NYA memberships are pessimistic. We apply this new algorithm to a sample of 158 objects from the literature that are either known to display spectroscopic signs of youth or have unusually red near-infrared colors for their spectral type. Based on our analysis, we identify 25 objects as new highly probable candidates to NYAs, including a new M7.5 bona fide member to Tucana-Horologium, making it the latest-type member. In addition, we reveal that a known L2γ dwarf is co-moving with a bright M5 dwarf, and we show for the first time that two of the currently known ultra red L dwarfs are strong candidates to the AB Doradus moving group. Several objects identified here as highly probable members to NYAs could be free-floating planetary-mass objects if their membership is confirmed.
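
    The core of a prior-weighted membership assignment can be sketched in one dimension: a single observable is scored against Gaussian models for a young association and the field, with prior probabilities matching their expected populations. All numbers are invented, and BANYAN II itself uses multivariate kinematic and photometric models rather than this toy version:

        # Sketch: Bayes' rule with population-matched priors for NYA vs. field.
        import math

        def gauss(x, mu, sd):
            return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

        priors = {"NYA": 0.01, "field": 0.99}                # expected population mix
        models = {"NYA": (25.0, 4.0), "field": (5.0, 15.0)}  # (mean, sd) of observable

        x = 23.0                                             # measured value
        joint = {h: priors[h] * gauss(x, *models[h]) for h in priors}
        z = sum(joint.values())
        print({h: round(p / z, 3) for h, p in joint.items()})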

  15. Geomorphological and sedimentary evidence of probable glaciation in the Jizerské hory Mountains, Central Europe

    NASA Astrophysics Data System (ADS)

    Engel, Zbyněk; Křížek, Marek; Kasprzak, Marek; Traczyk, Andrzej; Hložek, Martin; Krbcová, Klára

    2017-03-01

    The Jizerské hory Mountains in the Czech Republic have traditionally been considered to be a highland that lay beyond the limits of Quaternary glaciations. Recent work on cirque-like valley heads in the central part of the range has shown that niche glaciers could form during the Quaternary. Here we report geomorphological and sedimentary evidence for a small glacier in the Pytlácká jáma Hollow that represents one of the most-enclosed valley heads within the range. Shape and size characteristics of this landform indicate that the hollow is a glacial cirque at a degraded stage of development. Boulder accumulations at the downslope side of the hollow probably represent a relic of terminal moraines, and the grain size distribution of clasts together with micromorphology of quartz grains from the hollow indicate the glacial environment of a small glacier. This glacier represents the lowermost located such system in central Europe and provides evidence for the presence of niche or small cirque glaciers probably during pre-Weichselian glacial periods. The glaciation limit (1000 m asl) and paleo-ELA (900 m asl) proposed for the Jizerské hory Mountains implies that central European ranges lower than 1100 m asl were probably glaciated during the Quaternary.

  16. Strategies for Obtaining Probability Samples of Homeless Youth

    ERIC Educational Resources Information Center

    Golinelli, Daniela; Tucker, Joan S.; Ryan, Gery W.; Wenzel, Suzanne L.

    2015-01-01

    Studies of homeless individuals typically sample subjects from few types of sites or regions within a metropolitan area. This article focuses on the biases that can result from such a practice. We obtained a probability sample of 419 homeless youth from 41 sites (shelters, drop-in centers, and streets) in four regions of Los Angeles County (LAC).…

  17. Application of binomial and multinomial probability statistics to the sampling design process of a global grain tracing and recall system

    USDA-ARS?s Scientific Manuscript database

    Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...
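
    One binomial calculation such a design rests on is sample sizing: if a fraction p of sampled grain units carry a tracer, the number of units n needed to see at least one tracer with confidence c satisfies 1 - (1 - p)^n >= c. The p and c values below are illustrative assumptions:

        # Sketch: binomial sample-size rule for detecting at least one tracer.
        import math

        def samples_needed(p: float, confidence: float = 0.95) -> int:
            return math.ceil(math.log(1 - confidence) / math.log(1 - p))

        for p in (0.01, 0.001):
            print(f"tracer rate {p}: need {samples_needed(p)} sampled units")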

  18. Sexual Behaviors of U.S. Men by Self-Identified Sexual Orientation: Results From the 2012 National Survey of Sexual Health and Behavior.

    PubMed

    Dodge, Brian; Herbenick, Debby; Fu, Tsung-Chieh Jane; Schick, Vanessa; Reece, Michael; Sanders, Stephanie; Fortenberry, J Dennis

    2016-04-01

    Although a large body of previous research has examined sexual behavior and its relation to risk in men of diverse sexual identities, most studies have relied on convenience sampling. As such, the vast majority of research on the sexual behaviors of gay and bisexual men, in particular, might not be generalizable to the general population of these men in the United States. This is of particular concern because many studies are based on samples of men recruited from relatively "high-risk" venues and environments. To provide nationally representative baseline rates for sexual behavior in heterosexual, gay, and bisexual men in the United States and compare findings on sexual behaviors, relationships, and other variables across subgroups. Data were obtained from the 2012 National Survey of Sexual Health and Behavior, which involved the administration of an online questionnaire to a nationally representative probability sample of women and men at least 18 years old in the United States, with oversampling of self-identified gay and bisexual men and women. Results from the male participants are included in this article. Measurements include demographic characteristics, particularly sexual identity, and their relations to diverse sexual behaviors, including masturbation, mutual masturbation, oral sex, vaginal sex, and anal sex. Behaviors with male and female partners were examined. Men of all self-identified sexual identities reported engaging in a range of sexual behaviors (solo and partnered). As in previous studies, sexual identity was not always congruent for gender of lifetime and recent sexual partners. Patterns of sexual behaviors and relationships vary among heterosexual, gay, and bisexual men. Several demographic characteristics, including age, were related to men's sexual behaviors. The results from this probability study highlight the diversity in men's sexual behaviors across sexual identities, and these data allow generalizability to the broader population of gay and bisexual men, in particular, in the United States, which is a major advancement in research focused on individuals in a sexual minority. Copyright © 2016 International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pavlou, A. T.; Betzler, B. R.; Burke, T. P.

    Uncertainties in the composition and fabrication of fuel compacts for the Fort St. Vrain (FSV) high temperature gas reactor have been studied by performing eigenvalue sensitivity studies that represent the key uncertainties for the FSV neutronic analysis. The uncertainties for the TRISO fuel kernels were addressed by developing a suite of models for an 'average' FSV fuel compact that models the fuel as (1) a mixture of two different TRISO fuel particles representing fissile and fertile kernels, (2) a mixture of four different TRISO fuel particles representing small and large fissile kernels and small and large fertile kernels, and (3) a stochastic mixture of the four types of fuel particles where every kernel has its diameter sampled from a continuous probability density function. All of the discrete diameter and continuous diameter fuel models were constrained to have the same fuel loadings and packing fractions. For the non-stochastic discrete diameter cases, the MCNP compact model arranged the TRISO fuel particles on a hexagonal honeycomb lattice. This lattice-based fuel compact was compared to a stochastic compact where the locations (and kernel diameters for the continuous diameter cases) of the fuel particles were randomly sampled. Partial core configurations were modeled by stacking compacts into fuel columns containing graphite. The differences in eigenvalues between the lattice-based and stochastic models were small, but the runtime of the lattice-based fuel model was roughly 20 times shorter than with the stochastic-based fuel model. (authors)
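
    A minimal sketch of the constrained stochastic sampling described in case (3): kernel diameters are drawn from a continuous PDF while the total kernel volume (i.e., the fuel loading) is held fixed. The distribution parameters and target volume are assumptions, not FSV specifications.

        # Sample TRISO kernel diameters from an assumed continuous PDF until the
        # compact's total kernel volume matches a fixed fuel loading.
        import numpy as np

        rng = np.random.default_rng(1)
        mu, sigma = 350e-6, 40e-6          # assumed mean/std of kernel diameter [m]
        target_volume = 2.0e-9             # assumed total kernel volume per compact [m^3]

        diameters, volume = [], 0.0
        while volume < target_volume:
            d = rng.normal(mu, sigma)
            if d <= 0:                     # crude truncation at zero
                continue
            diameters.append(d)
            volume += np.pi * d**3 / 6.0   # spherical kernel volume

        print(len(diameters), "kernels sampled; volume =", volume)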

  20. Geochemistry and stratigraphic correlation of basalt lavas beneath the Idaho Chemical Processing Plant, Idaho National Engineering Laboratory

    USGS Publications Warehouse

    Reed, M.F.; Bartholomay, R.C.; Hughes, S.S.

    1997-01-01

    Thirty-nine samples of basaltic core were collected from wells 121 and 123, located approximately 1.8 km apart north and south of the Idaho Chemical Processing Plant at the Idaho National Engineering Laboratory. Samples were collected from depths ranging from 15 to 221 m below land surface for the purpose of establishing stratigraphic correlations between these two wells. Elemental analyses indicate that the basalts consist of three principal chemical types. Two of these types are each represented by a single basalt flow in each well. The third chemical type is represented by many basalt flows and includes a broad range of chemical compositions that is distinguished from the other two types. Basalt flows within the third type were identified by hierarchical K-cluster analysis of 14 representative elements: Fe, Ca, K, Na, Sc, Co, La, Ce, Sm, Eu, Yb, Hf, Ta, and Th. Cluster analyses indicate correlations of basalt flows between wells 121 and 123 at depths of approximately 38-40 m, 125-128 m, 131-137 m, 149-158 m, and 183-198 m. Probable correlations also are indicated for at least seven other depth intervals. Basalt flows in several depth intervals do not correlate on the basis of chemical compositions, thus reflecting possible flow margins in the sequence between the wells. Multi-element chemical data provide a useful method for determining stratigraphic correlations of basalt in the upper 1-2 km of the eastern Snake River Plain.
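
    A hedged sketch of the kind of hierarchical cluster analysis used to correlate flows: standardize the 14 element concentrations per sample, then cluster and cut the tree into groups. The data array is a random stand-in, not the USGS measurements, and Ward linkage is an assumed choice.

        # Hierarchical clustering of samples by standardized element concentrations.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster

        elements = ["Fe","Ca","K","Na","Sc","Co","La","Ce","Sm","Eu","Yb","Hf","Ta","Th"]
        X = np.random.default_rng(0).normal(size=(39, len(elements)))  # placeholder data
        Z = (X - X.mean(axis=0)) / X.std(axis=0)                       # standardize

        tree = linkage(Z, method="ward")
        groups = fcluster(tree, t=3, criterion="maxclust")  # e.g. 3 chemical types
        print(groups)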

  1. Mineralization in the Uyaijah-Thaaban area, west-central part of the Uyaijah ring structure, Kingdom of Saudi Arabia

    USGS Publications Warehouse

    Dodge, F.C.; Helaby, A.M.

    1975-01-01

    Anomalous amounts of tungsten, molybdenum, and bismuth were found previously in surficial debris collected from the Uyaijah-Thaaban area in the west-central part of the Precambrian Al Uyaijah ring structure. The area is mostly underlain by quartz monzonite. Countless quartz veins ranging from a knife edge to more than 3 m in thickness cut the quartz monzonite; many of these veins contain molybdenite. Detailed mapping and intensive sampling of the molybdenite-bearing quartz veins indicate that their grade and quantity are probably inadequate to permit present-day mining; however, they represent a potential future resource. The tungsten of the area appears to be negligible.

  2. Comparison of three-parameter probability distributions for representing annual extreme and partial duration precipitation series

    NASA Astrophysics Data System (ADS)

    Wilks, Daniel S.

    1993-10-01

    Performance of 8 three-parameter probability distributions for representing annual extreme and partial duration precipitation data at stations in the northeastern and southeastern United States is investigated. Particular attention is paid to fidelity on the right tail, through use of a bootstrap procedure simulating extrapolation on the right tail beyond the data. It is found that the beta-κ distribution best describes the extreme right tail of annual extreme series, and the beta-P distribution is best for the partial duration data. The conventionally employed two-parameter Gumbel distribution is found to substantially underestimate probabilities associated with the larger precipitation amounts for both annual extreme and partial duration data. Fitting the distributions using left-censored data did not result in improved fits to the right tail.
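
    The reported tail behavior can be illustrated with distributions available in scipy. The beta-κ and beta-P families are not in scipy, so the three-parameter GEV serves below as a stand-in against the two-parameter Gumbel; the data are synthetic.

        # Compare right-tail exceedance probabilities from 2- and 3-parameter fits.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(42)
        annual_max = stats.genextreme.rvs(c=-0.2, loc=50, scale=15, size=80,
                                          random_state=rng)

        gum = stats.gumbel_r.fit(annual_max)     # two-parameter fit
        gev = stats.genextreme.fit(annual_max)   # three-parameter fit

        x = 150.0                                # a large precipitation amount
        print("P(X > %.0f): Gumbel %.4g, GEV %.4g" %
              (x, stats.gumbel_r.sf(x, *gum), stats.genextreme.sf(x, *gev)))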

  3. Sampling little fish in big rivers: Larval fish detection probabilities in two Lake Erie tributaries and implications for sampling effort and abundance indices

    USGS Publications Warehouse

    Pritt, Jeremy J.; DuFour, Mark R.; Mayer, Christine M.; Roseman, Edward F.; DeBruyne, Robin L.

    2014-01-01

    Larval fish are frequently sampled in coastal tributaries to determine factors affecting recruitment, evaluate spawning success, and estimate production from spawning habitats. Imperfect detection of larvae is common, because larval fish are small and unevenly distributed in space and time, and coastal tributaries are often large and heterogeneous. We estimated detection probabilities of larval fish from several taxa in the Maumee and Detroit rivers, the two largest tributaries of Lake Erie. We then demonstrated how accounting for imperfect detection influenced (1) the probability of observing taxa as present relative to sampling effort and (2) abundance indices for larval fish of two Detroit River species. We found that detection probabilities ranged from 0.09 to 0.91 but were always less than 1.0, indicating that imperfect detection is common among taxa and between systems. In general, taxa with high fecundities, small larval length at hatching, and no nesting behaviors had the highest detection probabilities. Also, detection probabilities were higher in the Maumee River than in the Detroit River. Accounting for imperfect detection produced up to fourfold increases in abundance indices for Lake Whitefish Coregonus clupeaformis and Gizzard Shad Dorosoma cepedianum. The effect of accounting for imperfect detection in abundance indices was greatest during periods of low abundance for both species. Detection information can be used to determine the appropriate level of sampling effort for larval fishes and may improve management and conservation decisions based on larval fish data.
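
    The basic adjustment behind the fourfold increases reported above is simply dividing raw counts by the estimated detection probability; the counts and probabilities below are hypothetical but lie within the 0.09-0.91 range the study reports.

        # Detection-adjusted abundance index: raw count / detection probability.
        counts = [12, 40, 7]           # raw larval counts from three tows (assumed)
        p_detect = [0.25, 0.80, 0.10]  # estimated detection probability per tow (assumed)

        adjusted = [c / p for c, p in zip(counts, p_detect)]
        print(adjusted)  # low-detection samples are scaled up the most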

  4. How Unusual were Hurricane Harvey's Rains?

    NASA Astrophysics Data System (ADS)

    Emanuel, K.

    2017-12-01

    We apply an advanced technique for hurricane risk assessment to evaluate the probability of hurricane rainfall of Harvey's magnitude. The technique embeds a detailed computational hurricane model in the large-scale conditions represented by climate reanalyses and by climate models. We simulate 3700 hurricane events affecting the state of Texas, from each of three climate reanalyses spanning the period 1980-2016, and 2000 events from each of six climate models for each of two periods: the period 1981-2000 from historical simulations, and the period 2081-2100 from future simulations under Representative Concentration Pathway (RCP) 8.5. On the basis of these simulations, we estimate that hurricane rain of Harvey's magnitude in the state of Texas would have had an annual probability of 0.01 in the late twentieth century, and will have an annual probability of 0.18 by the end of this century, with remarkably small scatter among the six climate models downscaled. If the event frequency is changing linearly over time, this would yield an annual probability of 0.06 in 2017.
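
    The quoted 2017 value follows from linear interpolation between the two period estimates; assuming the estimates anchor the midpoints of their simulation periods (1990 and 2090), the arithmetic is

    \[
    p(2017) \approx 0.01 + (0.18 - 0.01)\,\frac{2017 - 1990}{2090 - 1990} \approx 0.056 \approx 0.06 .
    \]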

  5. Asymptotic Equivalence of Probability Measures and Stochastic Processes

    NASA Astrophysics Data System (ADS)

    Touchette, Hugo

    2018-03-01

    Let P_n and Q_n be two probability measures representing two different probabilistic models of some system (e.g., an n-particle equilibrium system, a set of random graphs with n vertices, or a stochastic process evolving over a time n) and let M_n be a random variable representing a "macrostate" or "global observable" of that system. We provide sufficient conditions, based on the Radon-Nikodym derivative of P_n and Q_n, for the set of typical values of M_n obtained relative to P_n to be the same as the set of typical values obtained relative to Q_n in the limit n→ ∞. This extends to general probability measures and stochastic processes the well-known thermodynamic-limit equivalence of the microcanonical and canonical ensembles, related mathematically to the asymptotic equivalence of conditional and exponentially-tilted measures. In this more general sense, two probability measures that are asymptotically equivalent predict the same typical or macroscopic properties of the system they are meant to model.
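
    As a sketch of the type of condition involved (not the paper's exact theorem), asymptotic equivalence is typically tied to the Radon-Nikodym derivative growing subexponentially in n:

    \[
    \lim_{n \to \infty} \frac{1}{n}\,\ln \frac{dP_n}{dQ_n} = 0
    \]

    in a suitable probabilistic sense, in which case the typical values of M_n agree under P_n and Q_n.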

  6. United States Forest Disturbance Trends Observed Using Landsat Time Series

    NASA Technical Reports Server (NTRS)

    Masek, Jeffrey G.; Goward, Samuel N.; Kennedy, Robert E.; Cohen, Warren B.; Moisen, Gretchen G.; Schleeweis, Karen; Huang, Chengquan

    2013-01-01

    Disturbance events strongly affect the composition, structure, and function of forest ecosystems; however, existing U.S. land management inventories were not designed to monitor disturbance. To begin addressing this gap, the North American Forest Dynamics (NAFD) project has examined a geographic sample of 50 Landsat satellite image time series to assess trends in forest disturbance across the conterminous United States for 1985-2005. The geographic sample design used a probability-based scheme to encompass major forest types and maximize geographic dispersion. For each sample location disturbance was identified in the Landsat series using the Vegetation Change Tracker (VCT) algorithm. The NAFD analysis indicates that, on average, 2.77 Mha of forest were disturbed annually, representing 1.09%/yr of US forestland. These satellite-based national disturbance rate estimates tend to be lower than those derived from land management inventories, reflecting both methodological and definitional differences. In particular the VCT approach used with a biennial time step has limited sensitivity to low-intensity disturbances. Unlike prior satellite studies, our biennial forest disturbance rates vary by nearly a factor of two between high and low years. High western US disturbance rates were associated with active fire years and insect activity, while variability in the east is more strongly related to harvest rates in managed forests. We note that generating a geographic sample based on representing forest type and variability may be problematic since the spatial pattern of disturbance does not necessarily correlate with forest type. We also find that the prevalence of diffuse, non-stand-clearing disturbance in US forests makes the application of a biennial geographic sample problematic. Future satellite-based studies of disturbance at regional and national scales should focus on wall-to-wall analyses with an annual time step for improved accuracy.

  7. Statistical approaches to the analysis of point count data: A little extra information can go a long way

    USGS Publications Warehouse

    Farnsworth, G.L.; Nichols, J.D.; Sauer, J.R.; Fancy, S.G.; Pollock, K.H.; Shriner, S.A.; Simons, T.R.; Ralph, C. John; Rich, Terrell D.

    2005-01-01

    Point counts are a standard sampling procedure for many bird species, but lingering concerns still exist about the quality of information produced from the method. It is well known that variation in observer ability and environmental conditions can influence the detection probability of birds in point counts, but many biologists have been reluctant to abandon point counts in favor of more intensive approaches to counting. However, over the past few years a variety of statistical and methodological developments have begun to provide practical ways of overcoming some of the problems with point counts. We describe some of these approaches, and show how they can be integrated into standard point count protocols to greatly enhance the quality of the information. Several tools now exist for estimation of detection probability of birds during counts, including distance sampling, double observer methods, time-depletion (removal) methods, and hybrid methods that combine these approaches. Many counts are conducted in habitats that make auditory detection of birds much more likely than visual detection. As a framework for understanding detection probability during such counts, we propose separating two components of the probability a bird is detected during a count into (1) the probability a bird vocalizes during the count and (2) the probability this vocalization is detected by an observer. In addition, we propose that some measure of the area sampled during a count is necessary for valid inferences about bird populations. This can be done by employing fixed-radius counts or more sophisticated distance-sampling models. We recommend any studies employing point counts be designed to estimate detection probability and to include a measure of the area sampled.
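
    The proposed decomposition is a simple product of the two component probabilities; the numbers below are hypothetical, purely to illustrate the arithmetic:

    \[
    p = p_{\mathrm{vocal}} \times p_{\mathrm{detect}\mid\mathrm{vocal}},
    \qquad \text{e.g.}\quad 0.7 \times 0.8 = 0.56 .
    \]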

  8. Genetic analysis of haplotype data for 23 Y-chromosome short tandem repeat loci in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina

    PubMed Central

    Dogan, Serkan; Primorac, Dragan; Marjanović, Damir

    2014-01-01

    Aim: To explore the distribution and polymorphisms of 23 short tandem repeat (STR) loci on the Y chromosome in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina, and to investigate its genetic relationships with the homeland Turkish population and neighboring populations. Methods: This study included 100 healthy unrelated male individuals from the Turkish population living in Sarajevo. Buccal swab samples were collected as a DNA source. Genomic DNA was extracted using the salting-out method and amplification was performed using the PowerPlex Y23 amplification kit. The studied population was compared to other populations using pairwise genetic distances, which were represented with a multi-dimensional scaling plot. Results: Haplotype and allele frequencies of the sample population were calculated and the results showed that all 100 samples had unique haplotypes. The most polymorphic locus was DYS458, and the least polymorphic DYS391. The observed haplotype diversity was 1.0000 ± 0.0014, with a discrimination capacity of 1.00 and a match probability of 0.01. Rst values showed that our sample population was closely related in both dimensions to the Lebanese and Iraqi populations, while it was more distant from the Bosnian, Croatian, and Macedonian populations. Conclusion: The Turkish population residing in Sarajevo can be regarded as representative of the Turkish population, since our results were consistent with those previously published for the homeland Turkish population. This study also confirmed that geographically close populations are genetically more related to each other. PMID:25358886
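
    The reported summary statistics follow from standard forensic formulas; the sketch below reproduces them for the case of n = 100 all-unique haplotypes.

        # Match probability = sum(p_i^2); haplotype diversity (Nei) =
        # n/(n-1) * (1 - sum(p_i^2)); discrimination capacity = unique/n.
        n = 100
        counts = [1] * n                            # every haplotype observed once
        p = [c / n for c in counts]

        match_prob = sum(pi**2 for pi in p)         # 0.01
        diversity = n / (n - 1) * (1 - match_prob)  # 1.0000
        capacity = len(counts) / n                  # 1.00
        print(match_prob, round(diversity, 4), capacity)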

  9. High throughput nonparametric probability density estimation.

    PubMed

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under- and over-fitting the data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.

  10. High throughput nonparametric probability density estimation

    PubMed Central

    Farmer, Jenny

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under- and over-fitting the data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference. PMID:29750803

  11. Urban soil pollution and the playfields of small children

    NASA Astrophysics Data System (ADS)

    Jartun, M.; Ottesen, R. T.; Steinnes, E.

    2003-05-01

    The chemical composition of urban surface soil in Tromsø, northern Norway has been mapped to describe the environmental load of toxic elements in different parts of the city. Surface soil samples were collected from 275 locations throughout the city center and nearby suburban areas. Natural background concentrations were determined in samples of the local bedrock. Surface soil in younger, suburban parts of the city shows low concentrations of heavy metals, reflecting the local geochemistry. The inner and older parts of the city are generally polluted with lead (Pb), zinc (Zn) and tin (Sn). The most important sources of this urban soil pollution are probably city fires, industrial and domestic waste, traffic, and shipyards. In this paper two different approaches have been used. First, as a result of the general mapping, 852 soil and sand samples from kindergartens and playgrounds were analyzed. In this study concentrations of arsenic (As) up to 1800 ppm were found, most likely due to the extensive use of CCA (copper, chromium, arsenic) impregnated wood in sandboxes and other playground equipment. This may represent a significant health risk, especially to children having a high oral intake of contaminated sand and soil. Second, a pattern of tin (Sn) concentrations was found in the city of Tromsø, with especially high values near shipyards. Further investigation indicated that this pattern most probably reflected the use of the highly toxic tributyltin (TBT). Thus determination of total Sn in surface soils could be a cost-effective way to localize sources of TBT contamination in the environment.

  12. Development of a methodology to evaluate material accountability in pyroprocess

    NASA Astrophysics Data System (ADS)

    Woo, Seungmin

    This study investigates the effect of the non-uniform nuclide composition in spent fuel on material accountancy in the pyroprocess. High-fidelity depletion simulations are performed using the Monte Carlo code SERPENT in order to determine nuclide composition as a function of axial and radial position within fuel rods and assemblies, and burnup. For improved accuracy, the simulations use short burnup steps (25 days or less), Xe-equilibrium treatment (to avoid oscillations over burnup steps), an axial moderator temperature distribution, and 30 axial meshes. Analytical solutions of the simplified depletion equations are derived to explain the axial non-uniformity of nuclide composition in spent fuel. The cosine shape of the axial neutron flux distribution dominates the axial non-uniformity of the nuclide composition. Cross sections combined with time also generate axial non-uniformity, as the exponential term in the analytical solution consists of the neutron flux, cross section, and time. The axial concentration distribution for a nuclide with a small cross section is steeper than that for a nuclide with a large cross section, because the axial flux is weighted by the cross section in the exponential term of the analytical solution. Similarly, the non-uniformity becomes flatter with increasing burnup, because the time term in the exponential increases. Based on the developed numerical recipes, and by decoupling the axial distributions from predetermined representative radial distributions matched by axial height, the axial and radial composition distributions for representative spent nuclear fuel assemblies (Type-0, -1, and -2 assemblies after 1, 2, and 3 depletion cycles) are obtained. These data are modified appropriately to represent material processing in the head-end steps of the pyroprocess: chopping, voloxidation, and granulation. The expectation and standard deviation of the Pu-to-244Cm ratio under single-granule sampling are calculated using the central limit theorem and the Geary-Hinkley transformation. Uncertainty propagation through the key pyroprocess is then conducted to analyze the Material Unaccounted For (MUF), a random variable defined as the receipts minus the shipments of a process. The random variable LOPu, defined as the original Pu mass minus the Pu mass after a missing scenario, is used to evaluate the non-detection probability at each Key Measurement Point (KMP). The number of assemblies for which LOPu reaches 8 kg is considered in this calculation. The probability of detecting an 8 kg LOPu is evaluated with respect to the granule and powder size using event tree analysis and hypothesis testing. Some cases show a probability of detection for the 8 kg LOPu of less than 95%. To enhance the detection rate, a new Material Balance Area (MBA) model is defined for the key pyroprocess; the probabilities of detection for all spent fuel types based on the new MBA model are greater than 99%. Furthermore, the probability of detection increases significantly when the granule sample sizes used to evaluate the Pu-to-244Cm ratio before the key pyroprocess are increased.
    Based on these observations, although Pu material accountability in the pyroprocess is affected by the non-uniformity of nuclide composition when the Pu-to-244Cm ratio method is applied, this effect can be surmounted by reducing the uncertainty of the measured ratio through larger sample sizes and by modifying the MBAs and KMPs. (Abstract shortened by ProQuest.)
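
    For reference, the Geary-Hinkley transformation mentioned above maps a ratio R = X/Y of jointly (approximately) normal variables to an approximately standard normal variable; in its usual textbook form,

    \[
    T = \frac{\mu_Y R - \mu_X}{\sqrt{\sigma_Y^2 R^2 - 2\rho\,\sigma_X \sigma_Y R + \sigma_X^2}} \;\approx\; N(0,1).
    \]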

  13. Polymerase chain reaction-based clonality testing in tissue samples with reactive lymphoproliferations: usefulness and pitfalls. A report of the BIOMED-2 Concerted Action BMH4-CT98-3936.

    PubMed

    Langerak, A W; Molina, T J; Lavender, F L; Pearson, D; Flohr, T; Sambade, C; Schuuring, E; Al Saati, T; van Dongen, J J M; van Krieken, J H J M

    2007-02-01

    Lymphoproliferations are generally diagnosed via histomorphology and immunohistochemistry. Although mostly conclusive, occasionally the differential diagnosis between reactive lesions and malignant lymphomas is difficult. In such cases molecular clonality studies of immunoglobulin (Ig)/T-cell receptor (TCR) rearrangements can be useful. Here we address the issue of clonality assessment in 106 histologically defined reactive lesions, using the standardized BIOMED-2 Ig/TCR multiplex polymerase chain reaction (PCR) heteroduplex and GeneScan assays. All samples were reviewed nationally; a random 10% of cases and all cases with clonal results were selected for additional international panel review. In total, 75% (79/106) showed only polyclonal Ig/TCR targets (type I), whereas another 15% (16/106) represented probably polyclonal cases, with weak Ig/TCR (oligo)clonality in an otherwise polyclonal background (type II). Interestingly, in 10% (11/106) clear monoclonal Ig/TCR products were observed (types III/IV), which prompted further pathological review. Clonal cases included two lymphomas missed in the national review and nine cases that could be explained as diagnostically difficult cases or probable lymphomas upon additional review. Our data show that the BIOMED-2 Ig/TCR multiplex PCR assays are very helpful in confirming the polyclonal character of the vast majority of reactive lesions. However, clonality detection in a minority should lead to detailed pathological review, including close interaction between pathologist and molecular biologist.

  14. Victimization and PTSD-like states in an Icelandic youth probability sample.

    PubMed

    Bödvarsdóttir, Iris; Elklit, Ask

    2007-10-01

    Although adolescence in many cases is a period of rebellion and experimentation with new behaviors and roles, the exposure of adolescents to life-threatening and violent events has rarely been investigated in national probability studies using a broad range of events. In an Icelandic national representative sample of 206 9th-grade students (mean = 14.5 years), the prevalence of 20 potentially traumatic events and negative life events was reported, along with the psychological impact of these events. Seventy-four percent of the girls and 79 percent of the boys were exposed to at least one event. The most common events were the death of a family member, threat of violence, and traffic accidents. The estimated lifetime prevalence of posttraumatic stress disorder-like states (PTSD; DSM-IV, APA, 1994) was 16 percent, whereas another 12 percent reached a sub-clinical level of PTSD-like states (missing the full diagnosis with one symptom). Following exposure, girls suffered from PTSD-like states almost twice as often as boys. Gender, mothers' education, and single-parenthood were associated with specific events. The odds ratios and 95% CI for PTSD-like states given a specific event are reported. Being exposed to multiple potentially traumatic events was associated with an increase in PTSD-like states. The findings indicate substantial mental health problems in adolescents that are associated with various types of potentially traumatic exposure.

  15. Optimal methods for fitting probability distributions to propagule retention time in studies of zoochorous dispersal.

    PubMed

    Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi

    2016-02-01

    Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
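
    The recommended approach amounts to interval-censored maximum likelihood: maximize the probability mass F(upper) - F(lower) assigned to each sampling interval. The sketch below fits a lognormal this way; interval bounds and counts are hypothetical.

        # Fit a lognormal to interval-censored retention times by maximizing
        # the per-interval probability mass (cumulative-distribution-based MLE).
        import numpy as np
        from scipy import stats, optimize

        lower = np.array([0.0, 2.0, 4.0, 8.0, 16.0])   # interval bounds [h] (assumed)
        upper = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
        counts = np.array([120, 190, 110, 60, 20])     # propagules per interval (assumed)

        def negloglik(theta):
            mu, log_sigma = theta
            sigma = np.exp(log_sigma)
            cdf = lambda t: stats.lognorm.cdf(t, s=sigma, scale=np.exp(mu))
            mass = np.clip(cdf(upper) - cdf(lower), 1e-12, None)
            return -np.sum(counts * np.log(mass))

        fit = optimize.minimize(negloglik, x0=[np.log(4.0), 0.0])
        print(fit.x)  # [mu, log(sigma)] of the fitted lognormal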

  16. Making Sense of 'Big Data' in Provenance Studies

    NASA Astrophysics Data System (ADS)

    Vermeesch, P.

    2014-12-01

    Huge online databases can be 'mined' to reveal previously hidden trends and relationships in society. One could argue that sedimentary geology has entered a similar era of 'Big Data', as modern provenance studies routinely apply multiple proxies to dozens of samples. Just like the Internet, sedimentary geology now requires specialised statistical tools to interpret such large datasets. These can be organised on three levels of progressively higher order:
    A single sample: The most effective way to reveal the provenance information contained in a representative sample of detrital zircon U-Pb ages is via probability density estimators such as histograms and kernel density estimates. The widely popular 'probability density plots' implemented in IsoPlot and AgeDisplay compound analytical uncertainty with geological scatter and are therefore invalid.
    Several samples: Multi-panel diagrams comprising many detrital age distributions or compositional pie charts quickly become unwieldy and uninterpretable. For example, if there are N samples in a study, then the number of pairwise comparisons between samples increases quadratically as N(N-1)/2. This is simply too much information for the human eye to process. To solve this problem, it is necessary to (a) express the 'distance' between two samples as a simple scalar and (b) combine all N(N-1)/2 such values in a single two-dimensional 'map', grouping similar and pulling apart dissimilar samples. This can be easily achieved using simple statistics-based dissimilarity measures and a standard statistical method called Multidimensional Scaling (MDS).
    Several methods: Suppose that we use four provenance proxies: bulk petrography, chemistry, heavy minerals and detrital geochronology. This will result in four MDS maps, each of which likely shows slightly different trends and patterns. To deal with such cases, it may be useful to use a related technique called 'three-way multidimensional scaling'. This results in two graphical outputs: an MDS map, and a map with 'weights' showing to what extent the different provenance proxies influence the horizontal and vertical axes of the MDS map. Thus, detrital data can not only inform the user about the provenance of sediments, but also about the causal relationships between the mineralogy, geochronology and chemistry.
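
    A sketch of the samples-to-map pipeline: a Kolmogorov-Smirnov statistic as the pairwise dissimilarity between detrital age distributions, embedded in two dimensions with MDS. The age data are synthetic, and the choice of the K-S statistic is one of several reasonable dissimilarity measures.

        # Pairwise K-S dissimilarities between age distributions, then 2-D MDS.
        import numpy as np
        from scipy.stats import ks_2samp
        from sklearn.manifold import MDS

        rng = np.random.default_rng(0)
        samples = [rng.normal(loc=mu, scale=150, size=100) for mu in (500, 520, 900, 1700)]

        n = len(samples)
        D = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                D[i, j] = D[j, i] = ks_2samp(samples[i], samples[j]).statistic

        coords = MDS(n_components=2, dissimilarity="precomputed",
                     random_state=0).fit_transform(D)
        print(coords)  # similar samples plot close together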

  17. Estimation of probability of failure for damage-tolerant aerospace structures

    NASA Astrophysics Data System (ADS)

    Halbert, Keith

    The majority of aircraft structures are designed to be damage-tolerant such that safe operation can continue in the presence of minor damage. It is necessary to schedule inspections so that minor damage can be found and repaired. It is generally not possible to perform structural inspections prior to every flight. The scheduling is traditionally accomplished through a deterministic set of methods referred to as Damage Tolerance Analysis (DTA). DTA has proven to produce safe aircraft but does not provide estimates of the probability of failure of future flights or the probability of repair of future inspections. Without these estimates maintenance costs cannot be accurately predicted. Also, estimation of failure probabilities is now a regulatory requirement for some aircraft. The set of methods concerned with the probabilistic formulation of this problem is collectively referred to as Probabilistic Damage Tolerance Analysis (PDTA). The goal of PDTA is to control the failure probability while holding maintenance costs to a reasonable level. This work focuses specifically on PDTA for fatigue cracking of metallic aircraft structures. The growth of a crack (or cracks) must be modeled using all available data and engineering knowledge. The length of a crack can be assessed only indirectly through evidence such as non-destructive inspection results, failures or lack of failures, and the observed severity of usage of the structure. The current set of industry PDTA tools is lacking in several ways: they may in some cases yield poor estimates of failure probabilities, they cannot realistically represent the variety of possible failure and maintenance scenarios, and they do not allow for model updates which incorporate observed evidence. A PDTA modeling methodology must be flexible enough to estimate accurately the failure and repair probabilities under a variety of maintenance scenarios, and be capable of incorporating observed evidence as it becomes available. This dissertation describes and develops new PDTA methodologies that directly address the deficiencies of the currently used tools. The new methods are implemented as a free, publicly licensed and open-source R software package that can be downloaded from the Comprehensive R Archive Network. The tools consist of two main components. First, an explicit (and expensive) Monte Carlo approach is presented which simulates the life of an aircraft structural component flight-by-flight. This straightforward MC routine can be used to provide defensible estimates of the failure probabilities for future flights and repair probabilities for future inspections under a variety of failure and maintenance scenarios. This routine is intended to provide baseline estimates against which to compare the results of other, more efficient approaches. Second, an original approach is described which models the fatigue process and future scheduled inspections as a hidden Markov model. This model is solved using a particle-based approximation and the sequential importance sampling algorithm, which provides an efficient solution to the PDTA problem. Sequential importance sampling is an extension of importance sampling to a Markov process, allowing for efficient Bayesian updating of model parameters. This model updating capability, the benefit of which is demonstrated, is lacking in other PDTA approaches. The results of this approach are shown to agree with the results of the explicit Monte Carlo routine for a number of PDTA problems.
Extensions to the typical PDTA problem, which cannot be solved using currently available tools, are presented and solved in this work. These extensions include incorporating observed evidence (such as non-destructive inspection results), more realistic treatment of possible future repairs, and the modeling of failure involving more than one crack (the so-called continuing damage problem). The described hidden Markov model / sequential importance sampling approach to PDTA has the potential to improve aerospace structural safety and reduce maintenance costs by providing a more accurate assessment of the risk of failure and the likelihood of repairs throughout the life of an aircraft.
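
    A highly simplified illustration of the hidden-Markov / sequential importance sampling idea (not the dissertation's implementation): particles carry crack sizes, crack growth is stochastic, and a "no crack found" inspection outcome reweights each particle by the probability of that outcome before resampling. The growth model and probability-of-detection curve are assumptions.

        # Particle-based updating of a crack-size distribution from inspection outcomes.
        import numpy as np

        rng = np.random.default_rng(7)
        N = 10_000
        a = np.full(N, 0.5)                      # initial crack length [mm] (assumed)
        w = np.full(N, 1.0 / N)                  # particle weights

        def pod(a):                              # assumed probability-of-detection curve
            return 1.0 / (1.0 + np.exp(-(a - 5.0)))

        for block in range(3):                   # three usage blocks, one inspection each
            a *= np.exp(rng.normal(0.3, 0.1, N)) # stochastic crack growth (assumed model)
            w *= 1.0 - pod(a)                    # observed: inspection found nothing
            w /= w.sum()
            idx = rng.choice(N, size=N, p=w)     # resample to combat weight degeneracy
            a, w = a[idx], np.full(N, 1.0 / N)

        print("P(a > 10 mm) after updating:", np.mean(a > 10.0))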

  18. 78 FR 28597 - State Median Income Estimates for a Four-Person Household: Notice of the Federal Fiscal Year (FFY...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-15

    ....gov/acs/www/ or contact the Census Bureau's Social, Economic, and Housing Statistics Division at (301...) Sampling Error, which consists of the error that arises from the use of probability sampling to create the... direction; and (2) Sampling Error, which consists of the error that arises from the use of probability...

  19. Classroom Research: Assessment of Student Understanding of Sampling Distributions of Means and the Central Limit Theorem in Post-Calculus Probability and Statistics Classes

    ERIC Educational Resources Information Center

    Lunsford, M. Leigh; Rowell, Ginger Holmes; Goodson-Espy, Tracy

    2006-01-01

    We applied a classroom research model to investigate student understanding of sampling distributions of sample means and the Central Limit Theorem in post-calculus introductory probability and statistics courses. Using a quantitative assessment tool developed by previous researchers and a qualitative assessment tool developed by the authors, we…
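
    Although the abstract is truncated, the concept being assessed lends itself to a short classroom-style simulation: the sampling distribution of the mean of a skewed population tightens and approaches normality as the sample size grows, with standard deviation sigma/sqrt(n).

        # Simulate sampling distributions of the mean for increasing sample sizes.
        import numpy as np

        rng = np.random.default_rng(0)
        population = rng.exponential(scale=2.0, size=100_000)   # skewed population

        for n in (2, 10, 50):
            means = rng.choice(population, size=(5_000, n)).mean(axis=1)
            print(f"n={n:3d}: mean={means.mean():.3f}, sd={means.std():.3f} "
                  f"(theory sd={population.std()/np.sqrt(n):.3f})")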

  20. Vanguard - a proposed European astrobiology experiment on Mars

    NASA Astrophysics Data System (ADS)

    Ellery, A. A.; Cockell, C. S.; Edwards, H. G. M.; Dickensheets, D. L.; Welch, C. S.

    2002-07-01

    We propose a new type of robotic mission for the exploration of Mars. This mission is called Vanguard and represents the fruits of a collaboration that is both international and multi-disciplinary. Vanguard is designed for sub-surface penetration and investigation using remote instruments and, unlike previous robotic architectures, it offers the opportunity for multiple subsurface site analysis using three moles. The moles increase the probability that a subsurface signature of life can be found and, by accomplishing subsurface analysis across a transect, the statistical rigour of Martian scientific exploration would be improved. There is no provision for returning samples to the surface for analysis by a gas chromatograph/mass spectrometer (GCMS); this minimizes the complexity invoked by sophisticated robotic overheads. The primary scientific instruments to be deployed are the Raman spectrometer, infrared spectrometer and laser-induced breakdown spectroscope; the Raman spectrometer in particular is discussed. We concentrate primarily on the scientific rationale for the Vanguard mission proposal. The Vanguard mission proposal represents a logical opportunity for extending European robotic missions to Mars.

  1. PDF-based heterogeneous multiscale filtration model.

    PubMed

    Gong, Jian; Rutland, Christopher J

    2015-04-21

    Motivated by modeling of gasoline particulate filters (GPFs), a probability density function (PDF) based heterogeneous multiscale filtration (HMF) model is developed to calculate filtration efficiency of clean particulate filters. A new methodology based on statistical theory and classic filtration theory is developed in the HMF model. Based on the analysis of experimental porosimetry data, a pore size probability density function is introduced to represent heterogeneity and multiscale characteristics of the porous wall. The filtration efficiency of a filter can be calculated as the sum of the contributions of individual collectors. The resulting HMF model overcomes the limitations of classic mean filtration models which rely on tuning of the mean collector size. Sensitivity analysis shows that the HMF model recovers the classical mean model when the pore size variance is very small. The HMF model is validated by fundamental filtration experimental data from different scales of filter samples. The model shows a good agreement with experimental data at various operating conditions. The effects of the microstructure of filters on filtration efficiency as well as the most penetrating particle size are correctly predicted by the model.
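
    The HMF model's central computation can be sketched as averaging a single-collector efficiency over the pore-size PDF, E = integral of eta(d) f(d) dd. Both the lognormal pore-size PDF and the efficiency curve eta(d) below are placeholders, not the paper's calibrated forms.

        # Overall filtration efficiency as a PDF-weighted average of a
        # (placeholder) single-collector efficiency over pore size.
        import numpy as np
        from scipy import stats
        from scipy.integrate import quad

        pore_pdf = stats.lognorm(s=0.5, scale=15e-6)   # assumed pore-size PDF [m]

        def eta(d_pore, d_particle=100e-9):
            # placeholder efficiency, decreasing with pore size
            return np.exp(-d_pore / 20e-6) * (1 - np.exp(-d_particle / 50e-9))

        E, _ = quad(lambda d: eta(d) * pore_pdf.pdf(d), 0, 200e-6)
        print("overall filtration efficiency ~", E)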

  2. Integrating biology, field logistics, and simulations to optimize parameter estimation for imperiled species

    USGS Publications Warehouse

    Lanier, Wendy E.; Bailey, Larissa L.; Muths, Erin L.

    2016-01-01

    Conservation of imperiled species often requires knowledge of vital rates and population dynamics. However, these can be difficult to estimate for rare species and small populations. This problem is further exacerbated when individuals are not available for detection during some surveys due to limited access, delaying surveys and creating mismatches between the breeding behavior and survey timing. Here we use simulations to explore the impacts of this issue using four hypothetical boreal toad (Anaxyrus boreas boreas) populations, representing combinations of logistical access (accessible, inaccessible) and breeding behavior (synchronous, asynchronous). We examine the bias and precision of survival and breeding probability estimates generated by survey designs that differ in effort and timing for these populations. Our findings indicate that the logistical access of a site and mismatch between the breeding behavior and survey design can greatly limit the ability to yield accurate and precise estimates of survival and breeding probabilities. Simulations similar to what we have performed can help researchers determine an optimal survey design(s) for their system before initiating sampling efforts.

  3. Radiation Transport in Random Media With Large Fluctuations

    NASA Astrophysics Data System (ADS)

    Olson, Aaron; Prinja, Anil; Franke, Brian

    2017-09-01

    Neutral particle transport in media exhibiting large and complex material property spatial variation is modeled by representing cross sections as lognormal random functions of space and generated through a nonlinear memory-less transformation of a Gaussian process with covariance uniquely determined by the covariance of the cross section. A Karhunen-Loève decomposition of the Gaussian process is implemented to efficiently generate realizations of the random cross sections and Woodcock Monte Carlo used to transport particles on each realization and generate benchmark solutions for the mean and variance of the particle flux as well as probability densities of the particle reflectance and transmittance. A computationally efficient stochastic collocation method is implemented to directly compute the statistical moments such as the mean and variance, while a polynomial chaos expansion in conjunction with stochastic collocation provides a convenient surrogate model that also produces probability densities of output quantities of interest. Extensive numerical testing demonstrates that use of stochastic reduced-order modeling provides an accurate and cost-effective alternative to random sampling for particle transport in random media.
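
    A sketch of the cross-section generator described above: a Gaussian process with (assumed) exponential covariance is expanded via Karhunen-Loève, i.e., through the eigendecomposition of its covariance matrix on a grid, and exponentiated to give lognormal realizations. Correlation length, variance, and log-mean are assumed values.

        # One lognormal cross-section realization via a discrete KL expansion.
        import numpy as np

        x = np.linspace(0.0, 10.0, 200)              # spatial grid
        corr_len, var_g = 1.0, 0.25                  # assumed GP parameters
        C = var_g * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)

        vals, vecs = np.linalg.eigh(C)               # KL modes (eigenpairs)
        vals = np.clip(vals, 0.0, None)              # guard tiny negative eigenvalues

        rng = np.random.default_rng(3)
        xi = rng.standard_normal(len(vals))          # independent standard normals
        g = vecs @ (np.sqrt(vals) * xi)              # one Gaussian realization
        mean_log = 0.0                               # assumed log-mean of the cross section
        sigma_t = np.exp(mean_log + g)               # lognormal cross section
        print(sigma_t[:5])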

  4. Emission-line galaxies in the third list of the Case Low-Dispersion Northern Sky Survey

    NASA Technical Reports Server (NTRS)

    Weistrop, Donna; Downes, Ronald A.

    1991-01-01

    Observations of 47 galaxies in the third Case list are reported. Thirty-five of the galaxies in the sample were selected for the presence of emission lines on the objective prism plates. At the higher spectral dispersion of the data, significant line emission was found in 46 of the 47 galaxies. Twenty-six galaxies are found to be undergoing significant bursts of star formation. Ten additional galaxies may be starburst galaxies with low-excitation spectra. Two galaxies are probably of Seyfert 2 type. The most distant object, CG 200, at a redshift of 0.144, has a strong broad H-alpha emission line, and is probably a Seyfert 1. Seventeen of the galaxies have been detected by IRAS. Eight of the IRAS galaxies have H-II-region-type spectra and eight have low-ionization starburst spectra. The galaxies represent a mixture of types, ranging from intrinsically faint dwarf galaxies with M_B = -16 mag to powerful galaxies with M_B = -23 mag. Galaxies CG 234 and CG 235 are interacting, as are galaxies CG 269 and CG 270.

  5. Molecular identification of Ascaris lumbricoides and Ascaris suum recovered from humans and pigs in Thailand, Lao PDR, and Myanmar.

    PubMed

    Sadaow, Lakkhana; Sanpool, Oranuch; Phosuk, Issarapong; Rodpai, Rutchanee; Thanchomnang, Tongjit; Wijit, Adulsak; Anamnart, Witthaya; Laymanivong, Sakhone; Aung, Win Pa Pa; Janwan, Penchom; Maleewong, Wanchai; Intapan, Pewpan M

    2018-06-02

    Ascaris lumbricoides is the largest roundworm known from the human intestine, while Ascaris suum is an internal parasite of pigs. Ascariasis, caused by Ascaris lumbricoides, has a worldwide distribution. Here, we have provided the first molecular identification of Ascaris eggs and adults recovered from humans and pigs in Thailand, Lao PDR, and Myanmar. We amplified and sequenced nuclear ribosomal DNA (ITS1 and ITS2 regions) and mitochondrial DNA (cox1 gene). Sequence chromatograms of the PCR-amplified ITS1 region revealed a probable hybrid genotype from two human ascariasis cases from Chiang Mai Province, northern Thailand. All complete ITS2 sequences were identical and did not differ between the species. Phylogenetic trees and haplotype analysis of cox1 sequences showed three clusters with 99 haplotypes. Forty-seven samples from the present study represented 14 haplotypes, including 7 new haplotypes. To our knowledge, this is the first molecular confirmation of Ascaris species in Thailand, Lao PDR, and Myanmar. Zoonotic cross-transmission of Ascaris roundworms between pigs and humans probably occurs in these countries.

  6. Sources and distribution of aromatic hydrocarbons in a tropical marine protected area estuary under influence of sugarcane cultivation.

    PubMed

    Arruda-Santos, Roxanny Helen de; Schettini, Carlos Augusto França; Yogui, Gilvan Takeshi; Maciel, Daniele Claudino; Zanardi-Lamardo, Eliete

    2018-05-15

    Goiana estuary is a well-preserved marine protected area (MPA) located on the northeastern coast of Brazil. Despite its current state, human activities in the watershed represent a potential threat to long-term local preservation. Dissolved/dispersed aromatic hydrocarbons and polycyclic aromatic hydrocarbons (PAHs) were investigated in water and sediments across the estuarine salt gradient. Concentrations of aromatic hydrocarbons were low in all samples. According to the results, aromatic hydrocarbons are associated with suspended particulate matter (SPM) carried to the estuary by river waters. An estuarine turbidity maximum (ETM) was identified in the upper estuary, indicating that both sediments and contaminants are trapped prior to an occasional export to the adjacent sea. The distribution of PAHs in sediments was associated with organic matter and mud content. Diagnostic ratios indicated pyrolytic processes as the main local source of PAHs, probably associated with sugarcane burning and combustion engines. Low PAH concentrations probably do not cause adverse biological effects to the local biota, although their presence indicates anthropogenic contamination and pressure on the Goiana estuary MPA. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Tree-ring based reconstructions of interannual to decadal scale precipitation variability for northeastern Utah since 1226 A.D.

    USGS Publications Warehouse

    Gray, S.T.; Jackson, S.T.; Betancourt, J.L.

    2004-01-01

    Samples from 107 piñon pines (Pinus edulis) at four sites were used to develop a proxy record of annual (June to June) precipitation spanning the 1226 to 2001 AD interval for the Uinta Basin Watershed of northeastern Utah. The reconstruction reveals significant precipitation variability at interannual to decadal scales. Single-year dry events before the instrumental period tended to be more severe than those after 1900. In general, decadal scale dry events were longer and more severe prior to 1900. In particular, dry events in the late 13th, 16th, and 18th Centuries surpass the magnitude and duration of droughts seen in the Uinta Basin after 1900. The last four decades of the 20th Century also represent one of the wettest periods in the reconstruction. The proxy record indicates that the instrumental record (approximately 1900 to the Present) underestimates the potential frequency and severity of severe, sustained droughts in this area, while overrepresenting the prominence of wet episodes. In the longer record, the empirical probability of any decadal scale drought exceeding the duration of the 1954 through 1964 drought is 94 percent, while the probability for any wet event exceeding the duration of the 1965 through 1999 wet spell is only 1 percent. Hence, estimates of future water availability in the Uinta Basin and forecasts for exports to the Colorado River, based on the 1961 to 1990 and 1971 to 2000 "normal" periods, may be overly optimistic.

  8. The influence of the uplink noise on the performance of satellite data transmission systems

    NASA Astrophysics Data System (ADS)

    Dewal, Vrinda P.

    The problem of transmission of binary phase shift keying (BPSK) modulated digital data through a bandlimited nonlinear satellite channel in the presence of uplink and downlink Gaussian noise and intersymbol interference is examined. The satellite transponder is represented by a zero-memory bandpass nonlinearity with AM/AM conversion. The proposed optimum linear receiver structure consists of tapped delay lines followed by a decision device. The linear receiver is designed to minimize the mean square error, which is a function of the intersymbol interference and the uplink and downlink noise. The minimum mean square error (MMSE) equalizer is derived using the Wiener-Kolmogorov theory. In this receiver, the decision about the transmitted signal is made by taking into account the received sequence of the present sample and the interfering past and future samples, which represent the intersymbol interference (ISI). Illustrative examples of the receiver structures are considered for nonlinear channels with symmetrical and asymmetrical frequency responses of the transmitter filter. The transponder nonlinearity is simulated by a polynomial using only the first- and third-order terms. A computer simulation determines the tap gain coefficients of the MMSE equalizer, which adapt to the various uplink and downlink noise levels. The performance of the MMSE equalizer is evaluated in terms of an estimate of the average probability of error.
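
    A generic sketch of the Wiener (MMSE) solution for tapped-delay-line equalizer taps, w = R^{-1} p, where R is the autocorrelation matrix of the received samples (channel ISI plus noise) and p the cross-correlation with the desired symbol. The channel taps and noise level are illustrative, not the paper's satellite model, and the nonlinearity is omitted here.

        # MMSE equalizer taps for a linear ISI channel with additive noise.
        import numpy as np

        h = np.array([0.2, 1.0, 0.3])          # assumed overall channel impulse response
        noise_var = 0.05                       # combined uplink/downlink noise power (assumed)
        L = 7                                  # number of equalizer taps
        delay = (L + len(h) - 1) // 2          # decision delay

        # Convolution matrix: received window = H @ symbols + noise (unit-power BPSK)
        H = np.zeros((L, L + len(h) - 1))
        for i in range(L):
            H[i, i:i + len(h)] = h

        R = H @ H.T + noise_var * np.eye(L)    # autocorrelation of received samples
        p = H[:, delay]                        # cross-correlation with desired symbol
        w = np.linalg.solve(R, p)              # MMSE tap weights
        print(w)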

  9. Erectile Dysfunction in Patients with Sleep Apnea--A Nationwide Population-Based Study.

    PubMed

    Chen, Chia-Min; Tsai, Ming-Ju; Wei, Po-Ju; Su, Yu-Chung; Yang, Chih-Jen; Wu, Meng-Ni; Hsu, Chung-Yao; Hwang, Shang-Jyh; Chong, Inn-Wen; Huang, Ming-Shyan

    2015-01-01

    Increased incidence of erectile dysfunction (ED) has been reported among patients with sleep apnea (SA). However, this association has not been confirmed in a large-scale study. We therefore performed a population-based cohort study using the Taiwan National Health Insurance (NHI) database to investigate the association between SA and ED. From the database of one million representative subjects randomly sampled from individuals enrolled in the NHI system in 2010, we identified adult patients having SA and excluded those having a diagnosis of ED prior to SA. From these suspected SA patients, those having an SA diagnosis after polysomnography were defined as probable SA patients. The dates of their first SA diagnosis were defined as their index dates. Each SA patient was matched to 30 randomly selected, age-matched control subjects without any SA diagnosis. The control subjects were assigned the same index dates as their corresponding SA patients and were confirmed to have no ED diagnosis prior to their index dates. In total, 4,835 male patients with suspected SA (including 1,946 probable SA patients) were matched to 145,050 control subjects (including 58,380 subjects matched to probable SA patients). The incidence rate of ED was significantly higher in probable SA patients than in the corresponding control subjects (5.7 vs. 2.3 per 1000 patient-years; adjusted incidence rate ratio = 2.0 [95% CI: 1.8-2.2], p<0.0001). The cumulative incidence was also significantly higher in the probable SA patients (p<0.0001). In multivariable Cox regression analysis, probable SA remained a significant risk factor for the development of ED after adjusting for age, residency, income level and comorbidities (hazard ratio = 2.0 [95% CI: 1.5-2.7], p<0.0001). In line with previous studies, this population-based large-scale study confirmed an increased ED incidence among SA patients in a Chinese population. Physicians need to pay attention to possible underlying SA while treating ED patients.

  10. Statistics provide guidance for indigenous organic carbon detection on Mars missions.

    PubMed

    Sephton, Mark A; Carter, Jonathan N

    2014-08-01

    Data from the Viking and Mars Science Laboratory missions indicate the presence of organic compounds that are not definitively martian in origin. Both contamination and confounding mineralogies have been suggested as alternatives to indigenous organic carbon. Intuitive thought suggests that we are repeatedly obtaining data that confirms the same level of uncertainty. Bayesian statistics may suggest otherwise. If an organic detection method has a true positive to false positive ratio greater than one, then repeated organic matter detection progressively increases the probability of indigeneity. Bayesian statistics also reveal that methods with higher ratios of true positives to false positives give higher overall probabilities and that detection of organic matter in a sample with a higher prior probability of indigenous organic carbon produces greater confidence. Bayesian statistics, therefore, provide guidance for the planning and operation of organic carbon detection activities on Mars. Suggestions for future organic carbon detection missions and instruments are as follows: (i) On Earth, instruments should be tested with analog samples of known organic content to determine their true positive to false positive ratios. (ii) On the mission, for an instrument with a true positive to false positive ratio above one, it should be recognized that each positive detection of organic carbon will result in a progressive increase in the probability of indigenous organic carbon being present; repeated measurements, therefore, can overcome some of the deficiencies of a less-than-definitive test. (iii) For a fixed number of analyses, the highest true positive to false positive ratio method or instrument will provide the greatest probability that indigenous organic carbon is present. (iv) On Mars, analyses should concentrate on samples with highest prior probability of indigenous organic carbon; intuitive desires to contrast samples of high prior probability and low prior probability of indigenous organic carbon should be resisted.
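
    The core of the argument is ordinary Bayesian updating with a likelihood ratio: each positive detection multiplies the prior odds of indigenous organic carbon by the instrument's true-positive to false-positive ratio. A minimal sketch with illustrative rates and prior:

```python
def posterior_after_detections(prior, tp, fp, n_positive):
    """Posterior P(indigenous) after n_positive independent detections by
    an instrument with true-positive rate tp and false-positive rate fp."""
    odds = prior / (1 - prior)
    odds *= (tp / fp) ** n_positive   # one likelihood-ratio factor per detection
    return odds / (1 + odds)

# Illustrative values: prior 0.1, true-positive/false-positive ratio of 2.
for n in range(1, 5):
    print(n, round(posterior_after_detections(0.1, 0.8, 0.4, n), 2))
# Probability climbs 0.10 -> 0.18 -> 0.31 -> 0.47 -> 0.64 with repetition.
```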

  11. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap.

    PubMed

    Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E

    2016-06-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.
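
    A minimal sketch of the weighted Polya-urn step at the heart of such a procedure, as a simplified illustration rather than the authors' full two-stage method: the n sampled units, whose case weights sum to the population size N, are grown into a synthetic population of size N that can then be treated as a simple random sample at the imputation stage.

```python
import numpy as np

rng = np.random.default_rng(0)

def wfpbb_population(values, w):
    """Grow a weighted sample into a synthetic population of size sum(w).

    Weighted Polya urn: unit i is drawn with probability proportional to
    (w_i - 1) plus a bonus for each time it has already been resampled.
    Assumes each case weight w_i >= 1.
    """
    values = np.asarray(values)
    w = np.asarray(w, dtype=float)
    n, N = len(values), int(round(w.sum()))
    counts = np.zeros(n)                  # times unit i already resampled
    pop = list(values)                    # observed units start the population
    for _ in range(N - n):
        probs = w - 1 + counts * (N - n) / n
        probs /= probs.sum()
        i = rng.choice(n, p=probs)
        counts[i] += 1
        pop.append(values[i])
    return np.array(pop)

pop = wfpbb_population([2.0, 5.0, 7.0, 9.0], w=[4, 2, 2, 2])   # N = 10
print(len(pop), pop.mean())
```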

  12. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap

    PubMed Central

    Zhou, Hanzhi; Elliott, Michael R.; Raghunathan, Trivellore E.

    2017-01-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in “Delta-V,” a key crash severity measure. PMID:29226161

  13. Gene capture from across the grass family in the allohexaploid Elymus repens (L.) Gould (Poaceae, Triticeae) as evidenced by ITS, GBSSI, and molecular cytogenetics.

    PubMed

    Mahelka, Václav; Kopecký, David

    2010-06-01

    Four accessions of hexaploid Elymus repens from its native Central European distribution area were analyzed using sequencing of multicopy (internal transcribed spacer, ITS) and single-copy (granule-bound starch synthase I, GBSSI) DNA in concert with genomic and fluorescent in situ hybridization (GISH and FISH) to disentangle its allopolyploid origin. Despite extensive ITS homogenization, nrDNA in E. repens allowed us to identify at least four distinct lineages. Apart from Pseudoroegneria and Hordeum, representing the major genome constituents, the presence of further unexpected alien genetic material, originating from species outside the Triticeae and close to Panicum (Paniceae) and Bromus (Bromeae), was revealed. GBSSI sequences provided information complementary to the ITS. Apart from Pseudoroegneria and Hordeum, two additional gene variants from within the Triticeae were discovered: One was Taeniatherum-like, but the other did not have a close relationship with any of the diploids sampled. GISH results were largely congruent with the sequence-based markers. GISH clearly confirmed Pseudoroegneria and Hordeum as major genome constituents and further showed the presence of a small chromosome segment corresponding to Panicum. It resided in the Hordeum subgenome and probably represents an old acquisition of a Hordeum progenitor. Spotty hybridization signals across all chromosomes after GISH with Taeniatherum and Bromus probes suggested that gene acquisition from these species is more likely due to common ancestry of the grasses or early introgression than to recent hybridization or allopolyploid origin of E. repens. Physical mapping of rDNA loci using FISH revealed that all rDNA loci except one minor were located on Pseudoroegneria-derived chromosomes, which suggests the loss of all Hordeum-derived loci but one. Because homogenization mechanisms seem to operate effectively among Pseudoroegneria-like copies in this species, incomplete ITS homogenization in our samples is probably due to an interstitial position of an individual minor rDNA locus located within the Hordeum-derived subgenome.

  14. Culturable bacteria present in the fluid of the hooded-pitcher plant Sarracenia minor based on 16S rDNA gene sequence data.

    PubMed

    Siragusa, Alex J; Swenson, Janice E; Casamatta, Dale A

    2007-08-01

    The culturable microbial community within the pitcher fluid of 93 Sarracenia minor carnivorous plants was examined over a 2-year study. Many aspects of the plant/bacterial/insect interaction within the pitcher fluid are minimally understood because the bacterial taxa present in these pitchers have not been identified. Thirteen isolates were characterized by 16S rDNA sequencing and subsequent phylogenetic analysis. The Proteobacteria were the most abundant taxa and included representatives from Serratia, Achromobacter, and Pantoea. The actinobacterium Micrococcus was also abundant, while Bacillus, Lactococcus, Chryseobacterium, and Rhodococcus were infrequently encountered. Several isolates conformed to species identifiers (>98% rDNA gene sequence similarity), including Serratia marcescens (isolates found in 27.5% of pitchers), Achromobacter xylosoxidans (37.6%), Micrococcus luteus (40.9%), Bacillus cereus (10.2%), Bacillus thuringiensis (5.4%), Lactococcus lactis (17.2%), and Rhodococcus equi (2.2%). Species-area curves suggest that sampling efforts were sufficient to recover a representative culturable bacterial community. The bacteria present represent a diverse community, probably as a result of introduction by insect vectors, but their ecological significance remains underexplored.

  15. Automatic Item Generation of Probability Word Problems

    ERIC Educational Resources Information Center

    Holling, Heinz; Bertling, Jonas P.; Zeuch, Nina

    2009-01-01

    Mathematical word problems represent a common item format for assessing student competencies. Automatic item generation (AIG) is an effective way of constructing many items with predictable difficulties, based on a set of predefined task parameters. The current study presents a framework for the automatic generation of probability word problems…

  16. Sr, Nd and Pb isotopes in Proterozoic intrusives astride the Grenville Front in Labrador: Implications for crustal contamination and basement mapping

    USGS Publications Warehouse

    Ashwal, L.D.; Wooden, J.L.; Emslie, R.F.

    1986-01-01

    We report Sr, Nd and Pb isotopic compositions of mid-Proterozoic anorthosites and related rocks (1.45-1.65 Ga) and of younger olivine diabase dikes (1.4 Ga) from two complexes on either side of the Grenville Front in Labrador. Anorthositic or diabasic samples from the Mealy Mountains (Grenville Province) and Harp Lake (Nain-Churchill Provinces) complexes have very similar major, minor and trace element compositions, but distinctly different isotopic signatures. All Mealy Mountains samples have ISr = 0.7025-0.7033, εNd = +0.6 to +5.6 and Pb isotopic compositions consistent with derivation from a mantle source depleted with respect to Nd/Sm and Rb/Sr. Pb isotopic compositions for the Mealy Mountains samples are slightly more radiogenic than model mantle compositions. All Harp Lake samples have ISr = 0.7032-0.7066, εNd = -0.3 to -4.4 and variable, but generally unradiogenic 207Pb/204Pb and 206Pb/204Pb compared to model mantle, suggesting mixing between a mantle-derived component and a U-depleted crustal contaminant. Crustal contaminants are probably a variety of Archean high-grade quartzofeldspathic gneisses with low U/Pb ratios and include a component that must be isotopically similar to the early Archean (>3.6 Ga) Uivak gneisses of Labrador or the Amitsoq gneisses of west Greenland. This would imply that the ancient gneiss complex of coastal Labrador and Greenland is larger than indicated by present surface exposure and may extend in the subsurface as far west as the Labrador Trough. If Harp Lake and Mealy Mountains samples were subjected to the same degree of contamination, as suggested by their chemical similarities, then the Mealy contaminants must be much younger, probably early or middle Proterozoic in age. The Labrador segment of the Grenville Front, therefore, appears to coincide with the southern margin of the Archean North Atlantic craton and may represent a pre-mid-Proterozoic suture. © 1986.

  17. Integrating sampling techniques and inverse virtual screening: toward the discovery of artificial peptide-based receptors for ligands.

    PubMed

    Pérez, Germán M; Salomón, Luis A; Montero-Cabrera, Luis A; de la Vega, José M García; Mascini, Marcello

    2016-05-01

    A novel heuristic using an iterative select-and-purge strategy is proposed. It combines statistical techniques for sampling and classification by rigid molecular docking through an inverse virtual screening scheme. This approach aims at the de novo discovery of short peptides that may act as docking receptors for small target molecules when there are no data available about known association complexes between them. The algorithm performs an unbiased stochastic exploration of the sample space, acting as a binary classifier when analyzing the entire peptide population. It uses a novel and effective criterion for weighting the likelihood of a given peptide to form an association complex with a particular ligand molecule based on amino acid sequences. The exploratory analysis relies on chemical information of peptide composition, sequence patterns, and association free energies (docking scores) in order to converge to those peptides forming the association complexes with higher affinities. Statistical estimations support these results, providing an association probability and improving prediction accuracy even in cases where only a fraction of all possible combinations are sampled. The false positive/false negative ratio was also improved with this method. A simple rigid-body docking approach together with the proper information about amino acid sequences was used. The methodology was applied in a retrospective docking study to all 8000 possible tripeptide combinations using the 20 natural amino acids, screened against a training set of 77 different ligands with diverse functional groups. Afterward, all tripeptides were screened against a test set of 82 ligands, also containing different functional groups. Results show that our integrated methodology is capable of finding a representative group of the top-scoring tripeptides. The associated probability of identifying the best receptor or a group of the top-ranked receptors is more than double and about 10 times higher, respectively, when compared to classical random sampling methods.
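
    A schematic version of an iterative select-and-purge loop, not the authors' exact heuristic: score a batch of tripeptides with a docking surrogate, keep the top-scoring fraction, and bias the next batch toward residues enriched among the survivors. The dock_score function is a made-up stand-in for a real rigid-docking calculation.

```python
import random

random.seed(1)
AMINO = "ACDEFGHIKLMNPQRSTVWY"

def dock_score(peptide):
    """Hypothetical docking surrogate: pretend a few residues bind well."""
    favored = set("WYFLK")
    return sum(aa in favored for aa in peptide) + random.random()

def select_and_purge(batch_size=200, rounds=5, keep_frac=0.2):
    bias = {aa: 1.0 for aa in AMINO}          # per-residue sampling weights
    for _ in range(rounds):
        weights = [bias[a] for a in AMINO]
        batch = ["".join(random.choices(AMINO, weights=weights, k=3))
                 for _ in range(batch_size)]
        batch.sort(key=dock_score, reverse=True)
        survivors = batch[:int(keep_frac * batch_size)]
        for pep in survivors:                 # reinforce surviving residues
            for aa in pep:
                bias[aa] += 0.5
    return survivors[:10]

print(select_and_purge())                     # ten top-ranked tripeptides
```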

  18. Methods for sampling geographically mobile female traders in an East African market setting

    PubMed Central

    Achiro, Lillian; Kwena, Zachary A.; McFarland, Willi; Neilands, Torsten B.; Cohen, Craig R.; Bukusi, Elizabeth A.; Camlin, Carol S.

    2018-01-01

    Background The role of migration in the spread of HIV in sub-Saharan Africa is well documented. Yet migration and HIV research have often focused on HIV risks to male migrants and their partners, or migrants overall, often failing to measure the risks to women via their direct involvement in migration. Inconsistent measures of mobility, gender biases in those measures, and limited data sources for sex-specific population-based estimates of mobility have contributed to a paucity of research on the HIV prevention and care needs of migrant and highly mobile women. This study addresses an urgent need for novel methods for developing probability-based, systematic samples of highly mobile women, focusing on a population of female traders operating out of one of the largest open-air markets in East Africa. Our method involves three stages: 1.) identification and mapping of all market stall locations using Global Positioning System (GPS) coordinates; 2.) using female market vendor stall GPS coordinates to build the sampling frame using replicates; and 3.) using maps and GPS data for recruitment of study participants. Results The locations of 6,390 vendor stalls were mapped using GPS. Of these, 4,064 stalls occupied by women (63.6%) were used to draw four replicates of 128 stalls each, and a fifth replicate of 15 pre-selected random alternates, for a total of 527 stalls assigned to one of five replicates. Staff visited 323 stalls from the first three replicates and from these successfully recruited 306 female vendors into the study, for a participation rate of 94.7%. Mobilization strategies and involving traders’ association representatives in participant recruitment were critical to the study’s success. Conclusion The study’s high participation rate suggests that this geospatial sampling method holds promise for development of probability-based samples in other settings that serve as transport hubs for highly mobile populations. PMID:29324780

  19. Variation between Hospitals with Regard to Diagnostic Practice, Coding Accuracy, and Case-Mix. A Retrospective Validation Study of Administrative Data versus Medical Records for Estimating 30-Day Mortality after Hip Fracture.

    PubMed

    Helgeland, Jon; Kristoffersen, Doris Tove; Skyrud, Katrine Damgaard; Lindman, Anja Schou

    2016-01-01

    The purpose of this study was to assess the validity of patient administrative data (PAS) for calculating 30-day mortality after hip fracture as a quality indicator, by a retrospective study of medical records. We used PAS data from all Norwegian hospitals (2005-2009), merged with vital status from the National Registry, to calculate 30-day case-mix-adjusted mortality for each hospital (n = 51). We used stratified sampling to establish a representative sample of both hospitals and cases. The hospitals were stratified according to high, low and medium mortality, from which 4, 3, and 5 hospitals were sampled, respectively. Within hospitals, cases were sampled stratified according to year of admission, age, length of stay, and 30-day vital status (alive/dead). The final study sample included 1043 cases from 11 hospitals. Clinical information was abstracted from the medical records. Diagnostic and clinical information from the medical records and PAS was used to define definite and probable hip fracture. We used logistic regression analysis in order to estimate systematic between-hospital variation in unmeasured confounding. Finally, to study the consequences of unmeasured confounding for identifying mortality outlier hospitals, a sensitivity analysis was performed. The estimated overall positive predictive value was 95.9% for definite and 99.7% for definite or probable hip fracture, with no statistically significant differences between hospitals. The standard deviation of the additional, systematic hospital bias in mortality estimates was 0.044 on the logistic scale. The effect of unmeasured confounding on outlier detection was small to moderate, noticeable only for large hospital volumes. This study showed that PAS data are adequate for identifying cases of hip fracture, and the effect of unmeasured case-mix variation was small. In conclusion, PAS data are adequate for calculating 30-day mortality after hip fracture as a quality indicator in Norway.

  20. Nematode Damage Functions: The Problems of Experimental and Sampling Error

    PubMed Central

    Ferris, H.

    1984-01-01

    The development and use of pest damage functions involves measurement and experimental errors associated with cultural, environmental, and distributional factors. Damage predictions are more valuable if considered with associated probability. Collapsing population densities into a geometric series of population classes allows a pseudo-replication removal of experimental and sampling error in damage function development. Recognition of the nature of sampling error for aggregated populations allows assessment of probability associated with the population estimate. The product of the probabilities incorporated in the damage function and in the population estimate provides a basis for risk analysis of the yield loss prediction and the ensuing management decision. PMID:19295865

  1. Kullback-Leibler information function and the sequential selection of experiments to discriminate among several linear models

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    The error variance of the process, the prior multivariate normal distributions of the parameters of the models, and the prior probabilities of the models being correct are assumed to be specified. A rule for termination of sampling is proposed. Upon termination, the model with the largest posterior probability is chosen as correct. If sampling is not terminated, posterior probabilities of the models and posterior distributions of the parameters are computed. The next experiment is chosen to maximize the expected Kullback-Leibler information function. Monte Carlo simulation experiments were performed to investigate the large and small sample behavior of the sequential adaptive procedure.
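
    A minimal sketch of the sequential scheme under simplifying assumptions (experiments drawn at random rather than chosen to maximize expected Kullback-Leibler information, and known noise variance): each observation updates the posterior probability of every candidate linear model via its Gaussian likelihood, and sampling terminates once one model is probable enough.

```python
import numpy as np

rng = np.random.default_rng(2)

def sequential_discrimination(models, true_model, sigma=1.0, stop_at=0.99):
    """Update model posteriors after each noisy observation; stop early."""
    post = np.full(len(models), 1 / len(models))       # equal priors
    for step in range(1, 201):
        x = rng.uniform(-1, 1)                         # experiment setting
        y = true_model(x) + rng.normal(0, sigma)       # noisy response
        lik = np.array([np.exp(-(y - m(x))**2 / (2 * sigma**2)) for m in models])
        post *= lik
        post /= post.sum()
        if post.max() >= stop_at:
            break
    return step, int(post.argmax()), post.round(3)

models = [lambda x: 1 + 2 * x, lambda x: 1 - 2 * x, lambda x: 3 * x]
print(sequential_discrimination(models, models[0]))    # should pick model 0
```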

  2. An empirical probability model of detecting species at low densities.

    PubMed

    Delaney, David G; Leung, Brian

    2010-06-01

    False negatives, not detecting things that are actually present, are an important but understudied problem. False negatives are the result of our inability to perfectly detect species, especially those at low density such as endangered species or newly arriving introduced species. They reduce our ability to interpret presence-absence survey data and make sound management decisions (e.g., rapid response). To reduce the probability of false negatives, we need to compare the efficacy and sensitivity of different sampling approaches and quantify an unbiased estimate of the probability of detection. We conducted field experiments in the intertidal zone of New England and New York to test the sensitivity of two sampling approaches (quadrat vs. total area search, TAS), given different target characteristics (mobile vs. sessile). Using logistic regression we built detection curves for each sampling approach that related the sampling intensity and the density of targets to the probability of detection. The TAS approach reduced the probability of false negatives and detected targets faster than the quadrat approach. Mobility of targets increased the time to detection but did not affect detection success. Finally, we interpreted two years of presence-absence data on the distribution of the Asian shore crab (Hemigrapsus sanguineus) in New England and New York, using our probability model for false negatives. The type of experimental approach in this paper can help to reduce false negatives and increase our ability to detect species at low densities by refining sampling approaches, which can guide conservation strategies and management decisions in various areas of ecology such as conservation biology and invasion ecology.
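
    A hedged sketch of building such a detection curve: logistic regression of detection (0/1) on target density and sampling effort, then a probability-of-detection estimate for a new survey. The data are simulated, not the paper's field results.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 300
density = rng.uniform(0.1, 5.0, n)          # targets per unit area
effort = rng.uniform(1.0, 10.0, n)          # e.g., minutes searched
logit = -2.0 + 0.9 * density + 0.3 * effort
detected = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([np.ones(n), density, effort])

def nll(beta):                              # negative log-likelihood
    p = np.clip(1 / (1 + np.exp(-X @ beta)), 1e-9, 1 - 1e-9)
    return -np.sum(detected * np.log(p) + (~detected) * np.log(1 - p))

beta = minimize(nll, np.zeros(3)).x
print("fitted coefficients:", beta.round(2))
print("P(detect | density=1, effort=5):",
      round(float(1 / (1 + np.exp(-beta @ [1.0, 1.0, 5.0]))), 3))
```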

  3. HABITAT ASSESSMENT USING A RANDOM PROBABILITY BASED SAMPLING DESIGN: ESCAMBIA RIVER DELTA, FLORIDA

    EPA Science Inventory

    Smith, Lisa M., Darrin D. Dantin and Steve Jordan. In press. Habitat Assessment Using a Random Probability Based Sampling Design: Escambia River Delta, Florida (Abstract). To be presented at the SWS/GERS Fall Joint Society Meeting: Communication and Collaboration: Coastal Systems...

  4. 7 CFR 43.102 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... SAMPLING PLANS Definitions § 43.102 Definitions. Statistical and inspection or sampling terms and their... probability of acceptance (Pa) for the Limited Quality (LQ) lots. The consumer protection is 90 percent in... the standards of this subpart that have a ten percent probability of acceptance are referred to as a...

  5. The Italian national trends in smoking initiation and cessation according to gender and education.

    PubMed

    Sardu, C; Mereu, A; Minerba, L; Contu, P

    2009-09-01

    OBJECTIVES. This study aims to assess the trends in initiation and cessation of smoking across successive birth cohorts, according to gender and education, in order to provide useful suggestions for tobacco control policy. STUDY DESIGN. The study is based on data from the "Health conditions and resort to sanitary services" survey carried out in Italy from October 2004 to September 2005 by the National Institute of Statistics. Through a multistage sampling procedure, a sample representative of the entire national territory was selected. In order to calculate trends in smoking initiation and cessation, data were stratified by birth cohort, gender and education level, and analyzed through the life table method. The cumulative probability of smoking initiation, across subsequent generations, shows a downward trend followed by a plateau. This result indicates that there is no evidence to support the hypothesis that smoking initiation is occurring at ever younger ages. The cumulative probability of quitting, across subsequent generations, follows an upward trend, highlighting the growing tendency of smokers to become "early quitters" who give up before 30 years of age. These results suggest that the Italian antismoking approach, for the most part targeted at preventing the initiation of smoking by emphasising its negative consequences, has an effect on early smoking cessation. Health policies should reinforce the existing trend of early quitting through specific actions. In addition, our results show that men with low education exhibit the highest probability of smoking initiation and the lowest probability of early quitting, and therefore should be targeted with special attention.

  6. Internal Medicine residents use heuristics to estimate disease probability.

    PubMed

    Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin

    2015-01-01

    Training in Bayesian reasoning may have limited impact on accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics, then post-test probability estimates would be increased by non-discriminating clinical features or a high anchor for a target condition. We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent with either 1) use of the representativeness heuristic, by adding non-discriminating prototypical clinical features of the target condition, or 2) use of the anchoring-with-adjustment heuristic, by providing a high or low anchor for the target condition. When presented with additional non-discriminating data, the odds of diagnosing the target condition were increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition were increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Our findings suggest that despite previous exposure to Bayesian reasoning, residents use heuristics such as representativeness and anchoring with adjustment to estimate probabilities. Potential reasons for this attribute substitution include the relative cognitive ease of heuristics compared with Bayesian reasoning, or the possibility that residents in clinical practice rely on gist traces rather than precise probability estimates when diagnosing.

  7. Calculation of a fluctuating entropic force by phase space sampling.

    PubMed

    Waters, James T; Kim, Harold D

    2015-07-01

    A polymer chain pinned in space exerts a fluctuating force on the pin point in thermal equilibrium. The average of such fluctuating force is well understood from statistical mechanics as an entropic force, but little is known about the underlying force distribution. Here, we introduce two phase space sampling methods that can produce the equilibrium distribution of instantaneous forces exerted by a terminally pinned polymer. In these methods, both the positions and momenta of mass points representing a freely jointed chain are perturbed in accordance with the spatial constraints and the Boltzmann distribution of total energy. The constraint force for each conformation and momentum is calculated using Lagrangian dynamics. Using terminally pinned chains in space and on a surface, we show that the force distribution is highly asymmetric with both tensile and compressive forces. Most importantly, the mean of the distribution, which is equal to the entropic force, is not the most probable force even for long chains. Our work provides insights into the mechanistic origin of entropic forces, and an efficient computational tool for unbiased sampling of the phase space of a constrained system.

  8. Identification of a potential toxic hot spot associated with AVS spatial and seasonal variation.

    PubMed

    Campana, O; Rodríguez, A; Blasco, J

    2009-04-01

    In risk assessment of aquatic sediments, much attention is paid to the difference between acid-volatile sulfide (AVS) and simultaneously extracted metals (SEMs) as indicators of metal availability. Ten representative sampling sites were selected along the estuary of the Guadalete River. Surficial sediments were sampled in winter and summer to better understand SEM and AVS spatial and seasonal distributions and to establish priority risk areas. Total SEM concentration (ΣSEM) ranged from 0.3 to 4.7 μmol g⁻¹. It did not differ significantly between seasons; however, it differed significantly between sampling stations. AVS concentrations were much more variable, showing significant spatial and temporal variations, with values ranging from 0.8 to 22.4 μmol g⁻¹. The SEM/AVS ratio was found to be <1 at all stations except one located near the mouth of the estuary. The results provided information on a potential pollution source near the mouth of the estuary, probably associated with vessel-related activities carried out in a local harbor area located near the station.

  9. Propagation Effects and Circuit Performance of Modern Military Radio Systems with Particular Emphasis on those Employing Bandspreading Held in Arcueil (Paris) France 17-21 October 1988

    DTIC Science & Technology

    1989-12-01

    false alarm probability (Pfa) when the signal is not present. The pair (Pnd, Pfa) makes it possible to compare the different hypotheses of... detection of initial sync is required to qualify possible bit slips. The frame acceptance probability PFA, frame rejection probability PFR, frame... a bit slip and those not, represented by PFBS and PFA respectively in Figure 3a. With a frame represented by two halves, the possible outcome may be

  10. Simulating landscape change in the Olympic Peninsula using spatial ecological and socioeconomic data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Flamm, R.O.; Gottfried, R.; Lee, R.G.

    1994-06-01

    Ecological and socioeconomic data were integrated to study landscape change for the Dungeness River basin in the Olympic Peninsula, Washington State. A multinomial logit procedure was used to evaluate twenty-two maps representing various data themes to derive transition probabilities of land cover change. Probabilities of forest disturbance were greater on private land than public. Between 1975 and 1988, forest cover increased, grassy/brushy covers decreased, and the number of forest patches increased about 30%. Simulations were run to estimate future land cover. These results were represented as frequency distributions for proportion cover and patch characteristics.
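
    An illustrative multinomial-logit transition step, with the study's twenty-two map themes reduced to two invented covariates: each land unit's probabilities of remaining forest, becoming grassy/brushy, or converting to another cover follow a softmax over linear scores.

```python
import numpy as np

def transition_probs(slope, private):
    """Transition probabilities for one land unit (invented coefficients)."""
    scores = np.array([
        0.8 - 0.2 * slope - 0.6 * private,    # remain forest
        0.1 + 0.1 * slope + 0.3 * private,    # to grassy/brushy
        -0.5 + 0.9 * private,                 # to other/developed
    ])
    e = np.exp(scores - scores.max())         # numerically stable softmax
    return e / e.sum()

print(transition_probs(slope=0.2, private=1))  # private land: more change
print(transition_probs(slope=0.2, private=0))  # public land: more stable
```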

  11. Spatial distribution of marine airborne bacterial communities

    PubMed Central

    Seifried, Jasmin S; Wichels, Antje; Gerdts, Gunnar

    2015-01-01

    The spatial distribution of bacterial populations in marine bioaerosol samples was investigated during a cruise from the North Sea to the Baltic Sea via Skagerrak and Kattegat. The analysis of the sampled bacterial communities with a pyrosequencing approach revealed that the most abundant phyla were represented by the Proteobacteria (49.3%), Bacteroidetes (22.9%), Actinobacteria (16.3%), and Firmicutes (8.3%). Cyanobacteria were assigned to 1.5% of all bacterial reads. A core of 37 bacterial OTUs made up more than 75% of all bacterial sequences. The most abundant OTU was Sphingomonas sp. which comprised 17% of all bacterial sequences. The most abundant bacterial genera were attributed to distinctly different areas of origin, suggesting highly heterogeneous sources for bioaerosols of marine and coastal environments. Furthermore, the bacterial community was clearly affected by two environmental parameters – temperature as a function of wind direction and the sampling location itself. However, a comparison of the wind directions during the sampling and calculated backward trajectories underlined the need for more detailed information on environmental parameters for bioaerosol investigations. The current findings support the assumption of a bacterial core community in the atmosphere. They may be emitted from strong aerosolizing sources, probably being mixed and dispersed over long distances. PMID:25800495

  12. [Bacteriological quality of air in a ward for sterile pharmaceutical preparations].

    PubMed

    Caorsi P, Beatriz; Sakurada Z, Andrea; Ulloa F, M Teresa; Pezzani V, Marcela; Latorre O, Paz

    2011-02-01

    An extremely clean area is required for the preparation of sterile pharmaceutical compounds, in compliance with international standards, to minimize the probability of microbial contamination. To evaluate the bacteriological quality of the air in the Sterile Pharmaceutical Preparation Unit of the University of Chile's Clinical Hospital and to set up alert and action levels of bacterial growth. We studied eight representative sites of our Unit on a daily basis from January to February 2005 and twice a week from June 2005 to February 2006. We collected 839 air samples by impaction onto Petri dishes. 474 (56.5%) samples were positive; 17 of them (3.5% of positive samples, 2% of total samples) had inappropriate bacterial growth. The samples from sites 1 and 2 (big and small biosafety cabinets) were negative. The countertop and transfer area occasionally exceeded the bacterial growth limits. The most frequently isolated bacteria were coagulase-negative staphylococci, Micrococcus spp and Corynebacterium spp, from the skin microbiota, and Bacillus spp, environmental bacteria. From a microbiological perspective, the air quality in our sterile preparation unit complied with international standards. Setting institutional alert and action levels and appropriately identifying bacteria in sensitive areas permits quantification of the microbial load and application of preventive measures.

  13. RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells.

    PubMed

    Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch

    2017-06-06

    An important aspect of chemoinformatics and material informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets of noise. RANSAC can be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development, and predictions for test set samples using an applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
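
    A minimal generic RANSAC loop, sketching the algorithm the paper adapts rather than the authors' implementation: repeatedly fit a model to a tiny random subset, count inliers within a tolerance, and refit on the best consensus set. It assumes enough iterations that at least one all-inlier subset is drawn.

```python
import numpy as np

rng = np.random.default_rng(4)

def ransac_line(x, y, n_iter=200, tol=0.5, min_inliers=10):
    """Robust line fit: best consensus set over random 2-point models."""
    best = None
    for _ in range(n_iter):
        idx = rng.choice(len(x), size=2, replace=False)
        a, b = np.polyfit(x[idx], y[idx], 1)          # candidate model
        inliers = np.abs(y - (a * x + b)) < tol
        if inliers.sum() >= min_inliers and (best is None
                                             or inliers.sum() > best.sum()):
            best = inliers
    return np.polyfit(x[best], y[best], 1), best      # refit on inliers

x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.2, 100)
y[:20] += rng.uniform(5, 15, 20)                      # 20% gross outliers
coef, inliers = ransac_line(x, y)
print("slope, intercept:", coef.round(2), "| inliers:", int(inliers.sum()))
```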

  14. Oxygen isotope and deuterium composition of snow cover on the profile of Western Siberia from Tomsk to the Gulf of Ob

    NASA Astrophysics Data System (ADS)

    Vasil'chuk, Yu. K.; Shevchenko, V. P.; Lisitzin, A. P.; Budantseva, N. A.; Vorobiov, S. N.; Kirpotin, S. N.; Krizkov, I. V.; Manasypov, R. M.; Pokrovsky, O. S.; Chizhova, Ju. N.

    2016-12-01

    The purpose of this work is to study the variability of the isotope composition (δ18O, δD, d-excess) of the snow cover on a long transect of Western Siberia from the southern taiga to the tundra. The study of the snow cover is of paleogeographic, paleogeocryological, and paleohydrological value. The snow cover of Western Siberia was sampled along a broadly north-south transzonal profile from the environs of Tomsk (southern taiga zone) to the eastern coast of the Gulf of Ob (tundra zone) from February 19 to March 4, 2014. Snow samples were collected at 31 sites. Most of the samples, represented by fresh snow (i.e., snow that had fallen the day before the moment of sampling), were collected in two areas. In the area of Yamburg, the snow specimens collected from the surface are most probably settled snow of different ages. The values of δ18O in the snow from Tomsk to Yamburg varied from -21.89 to -32.82‰, and the values of δD from -163.3 to -261.2‰. The value of deuterium excess was in the range of 4.06-19.53‰.

  15. Estimating site occupancy rates for aquatic plants using spatial sub-sampling designs when detection probabilities are less than one

    USGS Publications Warehouse

    Nielson, Ryan M.; Gray, Brian R.; McDonald, Lyman L.; Heglund, Patricia J.

    2011-01-01

    Estimation of site occupancy rates when detection probabilities are <1 is well established in wildlife science. Data from multiple visits to a sample of sites are used to estimate detection probabilities and the proportion of sites occupied by focal species. In this article we describe how site occupancy methods can be applied to estimate occupancy rates of plants and other sessile organisms. We illustrate this approach and the pitfalls of ignoring incomplete detection using spatial data for 2 aquatic vascular plants collected under the Upper Mississippi River's Long Term Resource Monitoring Program (LTRMP). Site occupancy models considered include: a naïve model that ignores incomplete detection, a simple site occupancy model assuming a constant occupancy rate and a constant probability of detection across sites, several models that allow site occupancy rates and probabilities of detection to vary with habitat characteristics, and mixture models that allow for unexplained variation in detection probabilities. We used information theoretic methods to rank competing models and bootstrapping to evaluate the goodness-of-fit of the final models. Results of our analysis confirm that ignoring incomplete detection can result in biased estimates of occupancy rates. Estimates of site occupancy rates for 2 aquatic plant species were 19–36% higher compared to naive estimates that ignored probabilities of detection <1. Simulations indicate that final models have little bias when 50 or more sites are sampled, and little gains in precision could be expected for sample sizes >300. We recommend applying site occupancy methods for monitoring presence of aquatic species.
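
    A sketch of the simplest such model (constant occupancy psi and constant detection p, simulated data): the likelihood mixes an occupied-site binomial with a zero-detection term for unoccupied sites, and the maximum-likelihood fit can be compared with the naive estimate that ignores imperfect detection.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import comb

rng = np.random.default_rng(5)
S, K, psi_true, p_true = 200, 4, 0.6, 0.4      # sites, visits, true values
occupied = rng.random(S) < psi_true
detections = rng.binomial(K, p_true * occupied)   # visits with detections

def nll(theta):
    psi, p = 1 / (1 + np.exp(-theta))          # map unconstrained -> (0, 1)
    pmf = comb(K, detections) * p**detections * (1 - p)**(K - detections)
    lik = psi * pmf + (1 - psi) * (detections == 0)   # occupancy mixture
    return -np.sum(np.log(lik))

psi_hat, p_hat = 1 / (1 + np.exp(-minimize(nll, np.zeros(2)).x))
print(f"psi = {psi_hat:.2f}, p = {p_hat:.2f}, "
      f"naive occupancy = {np.mean(detections > 0):.2f}")  # naive biased low
```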

  16. Dangerous "spin": the probability myth of evidence-based prescribing - a Merleau-Pontyian approach.

    PubMed

    Morstyn, Ron

    2011-08-01

    The aim of this study was to examine logical positivist statistical probability statements used to support and justify "evidence-based" prescribing rules in psychiatry when viewed from the major philosophical theories of probability, and to propose "phenomenological probability" based on Maurice Merleau-Ponty's philosophy of "phenomenological positivism" as a better clinical and ethical basis for psychiatric prescribing. The logical positivist statistical probability statements which are currently used to support "evidence-based" prescribing rules in psychiatry have little clinical or ethical justification when subjected to critical analysis from any of the major theories of probability and represent dangerous "spin" because they necessarily exclude the individual, intersubjective and ambiguous meaning of mental illness. A concept of "phenomenological probability" founded on Merleau-Ponty's philosophy of "phenomenological positivism" overcomes the clinically destructive "objectivist" and "subjectivist" consequences of logical positivist statistical probability and allows psychopharmacological treatments to be appropriately integrated into psychiatric treatment.

  17. 2D modelling of the light distribution of early-type galaxies in a volume-limited sample - II. Results for real galaxies

    NASA Astrophysics Data System (ADS)

    D'Onofrio, M.

    2001-10-01

    In this paper we analyse the results of the two-dimensional (2D) fit of the light distribution of 73 early-type galaxies belonging to the Virgo and Fornax clusters, a sample volume- and magnitude-limited down to M_B = -17.3, and highly homogeneous. In our previous paper (Paper I) we have presented the adopted 2D models of the surface-brightness distribution - namely the r^(1/n) and (r^(1/n) + exp) models - we have discussed the main sources of error affecting the structural parameters, and we have tested the ability of the chosen minimization algorithm (MINUIT) in determining the fitting parameters using a sample of artificial galaxies. We show that, with the exception of 11 low-luminosity E galaxies, the best fit of the real galaxy sample is always achieved with the two-component (r^(1/n) + exp) model. The improvement in the χ² due to the addition of the exponential component is found to be statistically significant. The best fit is obtained with the exponent n of the generalized r^(1/n) Sérsic law different from the classical de Vaucouleurs value of 4. Nearly 42 per cent of the sample have n < 2, suggesting the presence of exponential 'bulges' also in early-type galaxies. 20 luminous E galaxies are fitted by the two-component model, with a small central exponential structure ('disc') and an outer big spheroid with n > 4. We believe that this is probably due to their resolved core. The resulting scalelengths R_h and R_e of each component peak approximately at ~1 and ~2 kpc, respectively, although with different variances in their distributions. The ratio R_e/R_h peaks at ~0.5, a value typical for normal lenticular galaxies. The first component, represented by the r^(1/n) law, is probably made of two distinct families, 'ordinary' and 'bright', on the basis of their distribution in the μ_e-log(R_e) plane, a result already suggested by Capaccioli, Caon and D'Onofrio. The bulges of spirals and S0 galaxies belong to the 'ordinary' family, while the large spheroids of luminous E galaxies form the 'bright' family. The second component, represented by the exponential law, also shows a wide distribution in the μ_0c-log(R_h) plane. Small discs (or cores) have short scalelengths and high central surface brightness, while normal lenticulars and spiral galaxies generally have scalelengths larger than 0.5 kpc and central surface brightness brighter than 20 mag arcsec⁻² (in the B band). The scalelengths R_e and R_h of the 'bulge' and 'disc' components are probably correlated, indicating that a self-regulating mechanism of galaxy formation may be at work. Alternatively, two regions of the R_e-R_h plane are avoided by galaxies due to dynamical instability effects. The bulge-to-disc (B/D) ratio seems to vary uniformly along the Hubble sequence, going from late-type spirals to E galaxies. At the end of the sequence the ratio between the large spheroidal component and the small inner core can reach B/D ~ 100.
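
    For reference, the two surface-brightness laws being fitted can be written in a few lines; b_n uses the common approximation b_n ≈ 2n - 1/3 (adequate for n of order 1 or larger), and all parameter values are illustrative.

```python
import numpy as np

def sersic_mu(r, mu_e, r_e, n):
    """r^(1/n) (Sersic) law in mag arcsec^-2, with b_n ~ 2n - 1/3."""
    b_n = 2.0 * n - 1.0 / 3.0
    return mu_e + 2.5 * b_n / np.log(10) * ((r / r_e)**(1.0 / n) - 1.0)

def exp_mu(r, mu_0, r_h):
    """Exponential component: mu(r) = mu_0 + (2.5 / ln 10) * r / r_h."""
    return mu_0 + 2.5 / np.log(10) * r / r_h

r = np.linspace(0.5, 8.0, 4)                             # radii in kpc
print(sersic_mu(r, mu_e=22.0, r_e=2.0, n=4).round(2))    # de Vaucouleurs-like
print(exp_mu(r, mu_0=20.0, r_h=1.0).round(2))
```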

  18. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  19. Robust location and spread measures for nonparametric probability density function estimation.

    PubMed

    López-Rubio, Ezequiel

    2009-10-01

    Robustness against outliers is a desirable property of any unsupervised learning scheme. In particular, probability density estimators benefit from incorporating this feature. A possible strategy to achieve this goal is to substitute the sample mean and the sample covariance matrix by more robust location and spread estimators. Here we use the L1-median to develop a nonparametric probability density function (PDF) estimator. We prove its most relevant properties, and we show its performance in density estimation and classification applications.
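
    The L1-median itself can be computed with the classical Weiszfeld iteration; the following is a minimal sketch (production code would also guard against an iterate landing exactly on a data point).

```python
import numpy as np

def l1_median(X, n_iter=100, eps=1e-10):
    """Weiszfeld iteration for the geometric (L1) median of rows of X."""
    m = X.mean(axis=0)                        # start from the non-robust mean
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - m, axis=1), eps)
        w = 1.0 / d                           # inverse-distance weights
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < eps:
            break
        m = m_new
    return m

rng = np.random.default_rng(6)
X = rng.normal(0, 1, size=(100, 2))
X[:10] += 50                                  # 10% gross outliers
print("mean:     ", X.mean(axis=0).round(2))  # dragged toward the outliers
print("L1-median:", l1_median(X).round(2))    # stays near the bulk of the data
```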

  20. Probabilistic Open Set Recognition

    NASA Astrophysics Data System (ADS)

    Jain, Lalit Prithviraj

    Real-world tasks in computer vision, pattern recognition and machine learning often touch upon the open set recognition problem: multi-class recognition with incomplete knowledge of the world and many unknown inputs. An obvious way to approach such problems is to develop a recognition system that thresholds probabilities to reject unknown classes. Traditional rejection techniques are not about the unknown; they are about the uncertain boundary and rejection around that boundary. Thus traditional techniques only represent the "known unknowns". However, a proper open set recognition algorithm is needed to reduce the risk from the "unknown unknowns". This dissertation examines this concept and finds existing probabilistic multi-class recognition approaches are ineffective for true open set recognition. We hypothesize the cause is due to weak ad hoc assumptions combined with closed-world assumptions made by existing calibration techniques. Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the large set of unknown classes even under this assumption of incomplete class knowledge. For this, we formulate the problem as one of modeling positive training data by invoking statistical extreme value theory (EVT) near the decision boundary of positive data with respect to negative data. We provide a new algorithm called the PI-SVM for estimating the unnormalized posterior probability of class inclusion. This dissertation also introduces a new open set recognition model called Compact Abating Probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical EVT for score calibration with one-class and binary support vector machines. Building from the success of statistical EVT based recognition methods such as PI-SVM and W-SVM on the open set problem, we present a new general supervised learning algorithm for multi-class classification and multi-class open set recognition called the Extreme Value Local Basis (EVLB). The design of this algorithm is motivated by the observation that extrema from known negative class distributions are the closest negative points to any positive sample during training, and thus should be used to define the parameters of a probabilistic decision model. In the EVLB, the kernel distribution for each positive training sample is estimated via an EVT distribution fit over the distances to the separating hyperplane between positive training sample and closest negative samples, with a subset of the overall positive training data retained to form a probabilistic decision boundary. Using this subset as a frame of reference, the probability of a sample at test time decreases as it moves away from the positive class. Possessing this property, the EVLB is well suited to open set recognition problems where samples from unknown or novel classes are encountered at test time. Our experimental evaluation shows that the EVLB provides a substantial improvement in scalability compared to standard radial basis function kernel machines, as well as PI-SVM and W-SVM, with improved accuracy in many cases. We evaluate our algorithm on open set variations of the standard visual learning benchmarks, as well as with an open subset of classes from Caltech 256 and ImageNet. Our experiments show that PI-SVM, W-SVM and EVLB provide significant advances over the previous state-of-the-art solutions for the same tasks.
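
    A hedged sketch of EVT-style score calibration in the spirit of these methods, not the dissertation's exact estimator: fit a Weibull to a known class's decision scores and read its CDF as a probability of class inclusion, so a test sample scoring low against every known class can be rejected as unknown. The scores are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
pos_scores = rng.normal(2.0, 0.5, 200).clip(min=0.01)   # known-class scores

# Fit a Weibull (location pinned at zero) to the positive-score distribution.
shape, loc, scale = stats.weibull_min.fit(pos_scores, floc=0)

def p_inclusion(score):
    """Probability-of-inclusion proxy from the fitted Weibull CDF."""
    return stats.weibull_min.cdf(score, shape, loc=loc, scale=scale)

for s in (0.2, 1.0, 2.0, 3.0):
    print(f"score {s:.1f} -> P(known class) ~ {p_inclusion(s):.3f}")
# A sample with low inclusion probability for all known classes is "unknown".
```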

  1. Forestry inventory based on multistage sampling with probability proportional to size

    NASA Technical Reports Server (NTRS)

    Lee, D. C. L.; Hernandez, P., Jr.; Shimabukuro, Y. E.

    1983-01-01

    A multistage sampling technique, with probability proportional to size, is developed for a forest volume inventory using remote sensing data. LANDSAT data, panchromatic aerial photographs, and field data were collected. Based on age and homogeneity, pine and eucalyptus classes were identified. Selection of tertiary sampling units was made through aerial photographs to minimize field work. The sampling errors for eucalyptus and pine ranged from 8.34 to 21.89 percent and from 7.18 to 8.60 percent, respectively.
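
    One stage of selection with probability proportional to size, together with the Hansen-Hurwitz estimator of the total, can be sketched as follows; the unit areas and volumes are invented.

```python
import numpy as np

rng = np.random.default_rng(8)

area = np.array([120.0, 80.0, 40.0, 25.0, 15.0])             # unit "sizes" (ha)
volume = np.array([9000.0, 6500.0, 2800.0, 2100.0, 900.0])   # unit volumes

p = area / area.sum()                       # per-draw selection probabilities
idx = rng.choice(len(area), size=3, p=p)    # 3 PPS draws (with replacement)
estimate = np.mean(volume[idx] / p[idx])    # Hansen-Hurwitz estimator of total
print("selected units:", idx, "| estimated total:", round(float(estimate)))
print("true total:", volume.sum())
```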

  2. Optimized lower leg injury probability curves from postmortem human subject tests under axial impacts.

    PubMed

    Yoganandan, Narayan; Arun, Mike W J; Pintar, Frank A; Szabo, Aniko

    2014-01-01

    Derive optimum injury probability curves to describe human tolerance of the lower leg using parametric survival analysis. The study reexamined lower leg postmortem human subject (PMHS) data from a large group of specimens. Briefly, axial loading experiments were conducted by impacting the plantar surface of the foot. Both injury and noninjury tests were included in the testing process. They were identified by pre- and posttest radiographic images and detailed dissection following the impact test. Fractures included injuries to the calcaneus and distal tibia-fibula complex (including pylon), representing severities at the Abbreviated Injury Score (AIS) level 2+. For the statistical analysis, peak force was chosen as the main explanatory variable and age was chosen as the covariable. Censoring statuses depended on experimental outcomes. Parameters from the parametric survival analysis were estimated using the maximum likelihood approach, and the dfbetas statistic was used to identify overly influential samples. The best fit from the Weibull, log-normal, and log-logistic distributions was chosen based on the Akaike information criterion. Plus and minus 95% confidence intervals were obtained for the optimum injury probability distribution, and the relative sizes of the intervals were determined at predetermined risk levels. Quality indices were described at each of the selected probability levels. The mean age, stature, and weight were 58.2±15.1 years, 1.74±0.08 m, and 74.9±13.8 kg, respectively. Excluding all overly influential tests resulted in the tightest confidence intervals. The Weibull distribution was the most optimum function compared to the other 2 distributions. A majority of quality indices were in the good category for this optimum distribution when results were extracted for the 25-, 45-, and 65-year-old age groups at the 5, 25, and 50% risk levels for lower leg fracture. For 25, 45, and 65 years, peak forces were 8.1, 6.5, and 5.1 kN at 5% risk; 9.6, 7.7, and 6.1 kN at 25% risk; and 10.4, 8.3, and 6.6 kN at 50% risk, respectively. This study derived axial loading-induced injury risk curves based on survival analysis using peak force and specimen age, adopting different censoring schemes, considering overly influential samples in the analysis, and assessing the quality of the distribution at discrete probability levels. Because the procedures used in the present survival analysis are accepted by international automotive communities, the current optimum human injury probability distributions can be used at all risk levels with more confidence in future crashworthiness applications for automotive and other disciplines.
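
    A risk curve of this general form is simple to evaluate once its parameters are fixed; the shape and the age-dependent scale below are invented for illustration and only loosely mimic the reported force levels.

```python
import numpy as np

def injury_risk(force_kN, age, shape=4.0):
    """Weibull injury risk: P(fracture at force <= F) = 1 - exp(-(F/scale)^shape).

    The linear age dependence of the scale is a hypothetical placeholder.
    """
    scale = 14.0 - 0.08 * age            # tolerance decreases with age (kN)
    return 1.0 - np.exp(-(force_kN / scale)**shape)

for age in (25, 45, 65):
    print(age, [round(float(injury_risk(f, age)), 2) for f in (5.0, 8.0, 10.0)])
```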

  3. Cost-effective binomial sequential sampling of western bean cutworm, Striacosta albicosta (Lepidoptera: Noctuidae), egg masses in corn.

    PubMed

    Paula-Moraes, S; Burkness, E C; Hunt, T E; Wright, R J; Hein, G L; Hutchison, W D

    2011-12-01

    Striacosta albicosta (Smith) (Lepidoptera: Noctuidae), is a native pest of dry beans (Phaseolus vulgaris L.) and corn (Zea mays L.). As a result of larval feeding damage on corn ears, S. albicosta has a narrow treatment window; thus, early detection of the pest in the field is essential, and egg mass sampling has become a popular monitoring tool. Three action thresholds for field and sweet corn currently are used by crop consultants, including 4% of plants infested with egg masses on sweet corn in the silking-tasseling stage, 8% of plants infested with egg masses on field corn with approximately 95% tasseled, and 20% of plants infested with egg masses on field corn during mid-milk-stage corn. The current monitoring recommendation is to sample 20 plants at each of five locations per field (100 plants total). In an effort to develop a more cost-effective sampling plan for S. albicosta egg masses, several alternative binomial sampling plans were developed using Wald's sequential probability ratio test, and validated using Resampling for Validation of Sampling Plans (RVSP) software. The benefit-cost ratio also was calculated and used to determine the final selection of sampling plans. Based on final sampling plans selected for each action threshold, the average sample number required to reach a treat or no-treat decision ranged from 38 to 41 plants per field. This represents a significant savings in sampling cost over the current recommendation of 100 plants.
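
    The decision lines of Wald's binomial sequential probability ratio test can be computed directly: p0 and p1 bracket the action threshold (proportions of infested plants deemed tolerable and treatable), and alpha/beta are the tolerated error rates. All values below are illustrative, not the published plan parameters.

```python
import math

def sprt_boundaries(n, p0=0.05, p1=0.12, alpha=0.10, beta=0.10):
    """(no-treat, treat) limits on cumulative infested plants after n plants."""
    denom = math.log(p1 / p0) + math.log((1 - p0) / (1 - p1))
    s = math.log((1 - p0) / (1 - p1)) / denom       # common boundary slope
    h_treat = math.log((1 - beta) / alpha) / denom  # upper intercept
    h_stop = math.log((1 - alpha) / beta) / denom   # lower intercept
    return s * n - h_stop, s * n + h_treat

for n in (10, 20, 30, 40):
    lo, hi = sprt_boundaries(n)
    print(f"after {n} plants: no-treat if <= {lo:.1f}, treat if >= {hi:.1f}")
# Sampling continues while the running count stays between the two lines.
```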

  4. The Butterflies of Barro Colorado Island, Panama: Local Extinction since the 1930s

    PubMed Central

    Basset, Yves; Barrios, Héctor; Segar, Simon; Srygley, Robert B.; Aiello, Annette; Warren, Andrew D.; Delgado, Francisco; Coronado, James; Lezcano, Jorge; Arizala, Stephany; Rivera, Marleny; Perez, Filonila; Bobadilla, Ricardo; Lopez, Yacksecari; Ramirez, José Alejandro

    2015-01-01

    Few data are available about the regional or local extinction of tropical butterfly species. When confirmed, local extinction was often due to the loss of host-plant species. We used published lists and recent monitoring programs to evaluate changes in butterfly composition on Barro Colorado Island (BCI, Panama) between an old (1923–1943) and a recent (1993–2013) period. Although 601 butterfly species have been recorded from BCI during the 1923–2013 period, we estimate that 390 species are currently breeding on the island, including 34 cryptic species, currently only known by their DNA Barcode Index Number. Twenty-three butterfly species that were considered abundant during the old period could not be collected during the recent period, despite a much higher sampling effort in recent times. We consider these species locally extinct from BCI and they conservatively represent 6% of the estimated local pool of resident species. Extinct species represent distant phylogenetic branches and several families. The butterfly traits most likely to influence the probability of extinction were host growth form, wing size and host specificity, independently of the phylogenetic relationships among butterfly species. On BCI, most likely candidates for extinction were small hesperiids feeding on herbs (35% of extinct species). However, contrary to our working hypothesis, extinction of these species on BCI cannot be attributed to loss of host plants. In most cases these host plants remain extant, but they probably subsist at lower or more fragmented densities. Coupled with low dispersal power, this reduced availability of host plants has probably caused the local extinction of some butterfly species. Many more bird than butterfly species have been lost from BCI recently, confirming that small preserves may be far more effective at conserving invertebrates than vertebrates and, therefore, should not necessarily be neglected from a conservation viewpoint. PMID:26305111

  5. Negative values of quasidistributions and quantum wave and number statistics

    NASA Astrophysics Data System (ADS)

    Peřina, J.; Křepelka, J.

    2018-04-01

    We consider nonclassical wave and number quantum statistics, and perform a decomposition of quasidistributions for nonlinear optical down-conversion processes using Bessel functions. We show that negative values of the quasidistribution do not directly represent probabilities; however, they directly influence measurable number statistics. Negative terms in the decomposition related to the nonclassical behavior with negative amplitudes of probability can be interpreted as positive amplitudes of probability in the negative orthogonal Bessel basis, whereas positive amplitudes of probability in the positive basis describe classical cases. However, probabilities are positive in all cases, including negative values of quasidistributions. Negative and positive contributions of decompositions to quasidistributions are estimated. The approach can be adapted to quantum coherence functions.

  6. A collision probability analysis of the double-heterogeneity problem

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hebert, A.

    1993-10-01

    A practical collision probability model is presented for the description of geometries with many levels of heterogeneity. Regular regions of the macrogeometry are assumed to contain a stochastic mixture of spherical grains or cylindrical tubes. Simple expressions for the collision probabilities in the global geometry are obtained as a function of the collision probabilities in the macro- and microgeometries. This model was successfully implemented in the collision probability kernel of the APOLLO-1, APOLLO-2, and DRAGON lattice codes for the description of a broad range of reactor physics problems. Resonance self-shielding and depletion calculations in the microgeometries are possible because each microregion is explicitly represented.

  7. Contaminants in fish tissue from US lakes and reservoirs: A ...

    EPA Pesticide Factsheets

    An unequal probability design was used to develop national estimates for 268 persistent, bioaccumulative, and toxic chemicals in fish tissue from lakes and reservoirs of the conterminous United States (excluding the Laurentian Great Lakes and Great Salt Lake). Predator (fillet) and bottom-dweller (whole-body) composites were collected from 500 lakes selected randomly from the target population of 147,343 lakes in the lower 48 states. Each of these composite types comprised nationally representative samples whose results were extrapolated to the sampled population of an estimated 76,559 lakes for predators and 46,190 lakes for bottom dwellers. Mercury and PCBs were detected in all fish samples. Dioxins and furans were detected in 81% and 99% of predator and bottom-dweller samples, respectively. Cumulative frequency distributions showed that mercury concentrations exceeded the EPA 300 ppb mercury fish tissue criterion at nearly half of the lakes in the sampled population. Total PCB concentrations exceeded a 12 ppb human health risk-based consumption limit at nearly 17% of lakes, and dioxins and furans exceeded a 0.15 ppt (toxic equivalent or TEQ) risk-based threshold at nearly 8% of lakes in the sampled population. In contrast, 43 target chemicals were not detected in any samples. No detections were reported for nine organophosphate pesticides, one PCB congener, 16 polycyclic aromatic hydrocarbons, or 17 other semivolatile organic chemicals.

  8. Random sampling of elementary flux modes in large-scale metabolic networks.

    PubMed

    Machado, Daniel; Soons, Zita; Patil, Kiran Raosaheb; Ferreira, Eugénio C; Rocha, Isabel

    2012-09-15

    The description of a metabolic network in terms of elementary (flux) modes (EMs) provides an important framework for metabolic pathway analysis. However, their application to large networks has been hampered by the combinatorial explosion in the number of modes. In this work, we develop a method for generating random samples of EMs without computing the whole set. Our algorithm is an adaptation of the canonical basis approach, where we add an additional filtering step which, at each iteration, selects a random subset of the new combinations of modes. In order to obtain an unbiased sample, all candidates are assigned the same probability of getting selected. This approach avoids the exponential growth of the number of modes during computation, thus generating a random sample of the complete set of EMs within reasonable time. We generated samples of different sizes for a metabolic network of Escherichia coli, and observed that they preserve several properties of the full EM set. It is also shown that EM sampling can be used for rational strain design. A well distributed sample, that is representative of the complete set of EMs, should be suitable to most EM-based methods for analysis and optimization of metabolic networks. Source code for a cross-platform implementation in Python is freely available at http://code.google.com/p/emsampler. dmachado@deb.uminho.pt Supplementary data are available at Bioinformatics online.
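
    The filtering step described above is easy to sketch: at each iteration a uniform random subset of the newly generated candidate modes is retained, so every candidate has the same inclusion probability. The sketch below shows only that step, with placeholder mode objects; the mode-combination algebra of the canonical basis approach is omitted.

        import random

        def filter_candidates(candidates, keep, rng=random):
            """Retain a uniform random subset of candidate modes; every
            candidate has the same inclusion probability keep/len(candidates),
            which keeps the growing sample unbiased."""
            if len(candidates) <= keep:
                return list(candidates)
            return rng.sample(candidates, keep)

        # Toy usage with hypothetical placeholders for elementary modes.
        new_modes = [f"mode_{i}" for i in range(1000)]
        print(len(filter_candidates(new_modes, 100)))   # -> 100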

  9. Contaminants in fish tissue from US lakes and reservoirs: a national probabilistic study.

    PubMed

    Stahl, Leanne L; Snyder, Blaine D; Olsen, Anthony R; Pitt, Jennifer L

    2009-03-01

    An unequal probability design was used to develop national estimates for 268 persistent, bioaccumulative, and toxic chemicals in fish tissue from lakes and reservoirs of the conterminous United States (excluding the Laurentian Great Lakes and Great Salt Lake). Predator (fillet) and bottom-dweller (whole body) composites were collected from 500 lakes selected randomly from the target population of 147,343 lakes in the lower 48 states. Each of these composite types comprised nationally representative samples whose results were extrapolated to the sampled population of an estimated 76,559 lakes for predators and 46,190 lakes for bottom dwellers. Mercury and PCBs were detected in all fish samples. Dioxins and furans were detected in 81% and 99% of predator and bottom-dweller samples, respectively. Cumulative frequency distributions showed that mercury concentrations exceeded the EPA 300 ppb mercury fish tissue criterion at nearly half of the lakes in the sampled population. Total PCB concentrations exceeded a 12 ppb human health risk-based consumption limit at nearly 17% of lakes, and dioxins and furans exceeded a 0.15 ppt (toxic equivalent or TEQ) risk-based threshold at nearly 8% of lakes in the sampled population. In contrast, 43 target chemicals were not detected in any samples. No detections were reported for nine organophosphate pesticides, one PCB congener, 16 polycyclic aromatic hydrocarbons, or 17 other semivolatile organic chemicals.
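
    The step from 500 sampled lakes to national estimates rests on the design's unequal inclusion probabilities. A minimal Horvitz-Thompson sketch is shown below; the inclusion probabilities and exceedance flags are invented and are not the study's data.

        # Horvitz-Thompson estimates: each sampled lake stands in for
        # 1/pi lakes in the target population (pi = inclusion probability).
        lakes = [
            {"pi": 0.0040, "exceeds": True},
            {"pi": 0.0012, "exceeds": False},
            {"pi": 0.0007, "exceeds": True},
        ]
        n_exceeding = sum(1 / lake["pi"] for lake in lakes if lake["exceeds"])
        n_total = sum(1 / lake["pi"] for lake in lakes)
        print(f"estimated lakes exceeding criterion: {n_exceeding:.0f} of {n_total:.0f}")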

  10. Using open robust design models to estimate temporary emigration from capture-recapture data.

    PubMed

    Kendall, W L; Bjorkland, R

    2001-12-01

    Capture-recapture studies are crucial in many circumstances for estimating demographic parameters for wildlife and fish populations. Pollock's robust design, involving multiple sampling occasions per period of interest, provides several advantages over classical approaches. This includes the ability to estimate the probability of being present and available for detection, which in some situations is equivalent to breeding probability. We present a model for estimating availability for detection that relaxes two assumptions required in previous approaches. The first is that the sampled population is closed to additions and deletions across samples within a period of interest. The second is that each member of the population has the same probability of being available for detection in a given period. We apply our model to estimate survival and breeding probability in a study of hawksbill sea turtles (Eretmochelys imbricata), where previous approaches are not appropriate.

  11. Using open robust design models to estimate temporary emigration from capture-recapture data

    USGS Publications Warehouse

    Kendall, W.L.; Bjorkland, R.

    2001-01-01

    Capture-recapture studies are crucial in many circumstances for estimating demographic parameters for wildlife and fish populations. Pollock's robust design, involving multiple sampling occasions per period of interest, provides several advantages over classical approaches. This includes the ability to estimate the probability of being present and available for detection, which in some situations is equivalent to breeding probability. We present a model for estimating availability for detection that relaxes two assumptions required in previous approaches. The first is that the sampled population is closed to additions and deletions across samples within a period of interest. The second is that each member of the population has the same probability of being available for detection in a given period. We apply our model to estimate survival and breeding probability in a study of hawksbill sea turtles (Eretmochelys imbricata), where previous approaches are not appropriate.
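
    The robust-design models described here are too involved for a short sketch, but the underlying logic, estimating detection from repeated sampling of marked animals, can be illustrated with the far simpler classical two-sample estimator (Chapman's bias-corrected Lincoln-Petersen). The numbers below are invented.

        def lincoln_petersen(n1, n2, m2):
            """Chapman's bias-corrected Lincoln-Petersen abundance estimate.

            n1: animals marked on occasion 1; n2: animals caught on occasion 2;
            m2: marked animals recaptured on occasion 2. Assumes closure and
            equal catchability, the assumptions the robust design relaxes.
            """
            n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
            p_hat = m2 / n1          # crude detection probability on occasion 2
            return n_hat, p_hat

        n_hat, p_hat = lincoln_petersen(n1=120, n2=100, m2=30)
        print(f"N = {n_hat:.0f}, detection = {p_hat:.2f}")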

  12. Application of the FINDER system to the search for epithermal vein gold-silver deposits : Kushikino, Japan, a case study

    USGS Publications Warehouse

    Singer, Donald A.; Kouda, Ryoichi

    1991-01-01

    The FINDER system employs geometric probability, Bayesian statistics, and the normal probability density function to integrate spatial and frequency information to produce a map of probabilities of target centers. Target centers can be mineral deposits, alteration associated with mineral deposits, or any other target that can be represented by a regular shape on a two-dimensional map. The size, shape, mean, and standard deviation for each variable are characterized in a control area and the results applied by means of FINDER to the study area. The Kushikino deposit consists of groups of quartz-calcite-adularia veins that have produced 55 tonnes of gold and 456 tonnes of silver since 1660. Part of a 6 by 10 km area near Kushikino served as a control area. Within the control area, data plotting, contouring, and cluster analysis were used to identify the barren and mineralized populations. Sodium was found to be depleted in an elliptically shaped area 3.1 by 1.6 km, potassium was both depleted and enriched locally in an elliptically shaped area 3.0 by 1.3 km, and sulfur was enriched in an elliptically shaped area 5.8 by 1.6 km. The potassium, sodium, and sulfur contents of 233 surface rock samples were each used in FINDER to produce probability maps for the 12 by 30 km study area that includes Kushikino. High-probability areas for each of the individual variables lie over and offset up to 4 km eastward from the main Kushikino veins. In general, high-probability areas identified by FINDER are displaced from the main veins and cover not only the host andesite and the dacite-andesite that is about the same age as the Kushikino mineralization, but also younger sedimentary rocks, andesite, and tuff units east and northeast of Kushikino. The maps also display the same patterns observed near Kushikino, but with somewhat lower probabilities, about 1.5 km east of the old gold prospect, Hajima, and in a broad zone 2.5 km east-west and 1 km north-south, centered 2 km west of the old gold prospect, Yaeyama.
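
    The central computation, scoring each map cell by how well its geochemistry matches the mineralized signature through normal probability density functions, can be sketched briefly. The transect values and the mineralized means/SDs below are invented; FINDER itself also integrates target geometry, which is omitted here.

        import numpy as np

        # Invented 1-D transect of element concentrations; in FINDER these
        # would be gridded K, Na, and S maps calibrated in a control area.
        obs = {"Na": np.array([2.1, 1.4, 0.6, 0.5, 1.9]),
               "S":  np.array([0.2, 0.9, 1.8, 2.0, 0.3])}
        # Assumed mineralized-ground signature: Na depleted, S enriched.
        mineralized = {"Na": (0.5, 0.2), "S": (1.8, 0.4)}

        def gaussian_pdf(x, mu, sigma):
            return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

        lik = np.ones(5)
        for elem, values in obs.items():
            mu, sigma = mineralized[elem]
            lik = lik * gaussian_pdf(values, mu, sigma)   # independence assumed
        posterior = lik / lik.sum()   # relative probabilities under a uniform prior
        print(posterior.round(3))    # the cells matching the signature dominate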

  13. Influence of the HLA characteristics of Italian patients on donor search outcome in unrelated hematopoietic stem cell transplantation.

    PubMed

    Testi, M; Andreani, M; Locatelli, F; Arcese, W; Troiano, M; Battarra, M; Gaziev, J; Lucarelli, G

    2014-08-01

    The information regarding the probability of finding a matched unrelated donor (MUD) within a relatively short time is crucial for the success of hematopoietic stem cell transplantation (HSCT), particularly in patients with malignancies. In this study, we retrospectively analyzed 315 Italian patients who started a search for a MUD, in order to assess the distribution of human leukocyte antigen (HLA) alleles and haplotypes in this population of patients and to evaluate the probability of finding a donor. Comparing two groups of patients based on whether or not a 10/10 HLA-matched donor was available, we found that patients who had a fully matched MUD possessed at least one frequent haplotype more often than the others (45.6% vs 14.3%; P = 0.000003). In addition, analysis of data pertaining to the HLA class I allele distribution showed that, in the first group of patients, less common alleles were under-represented (20.2% vs 40.0%; P = 0.006). Therefore, the presence of less frequent alleles represents a negative factor for a successful search for a compatible donor, whereas the presence of one frequent haplotype represents a positive predictive factor. Antigenic differences between patient and donor observed at the C and DQB1 loci were mostly represented by particular B/C or DRB1/DQB1 allelic associations. Thus, having a particular B or DRB1 allele, linked to multiple C or DQB1 alleles, respectively, might be considered to be associated with a lower probability of a successful search. Taken together, these data may help determine in advance the probability of finding a suitable unrelated donor for an Italian patient. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  14. Uncommon knowledge of a common phenomenon: intuitions and statistical thinking about gender birth ratio

    NASA Astrophysics Data System (ADS)

    Peled, Ofra N.; Peled, Irit; Peled, Jonathan U.

    2013-01-01

    The phenomenon of birth of a baby is a common and familiar one, and yet college students participating in a general biology class did not possess the expected common knowledge of the equal probability of gender births. We found that these students held strikingly skewed conceptions regarding gender birth ratio, estimating the number of female births to be more than twice the number of male births. Possible sources of these beliefs were analysed, showing flaws in statistical thinking such as viewing small unplanned samples as representing the whole population and making inferences from an inappropriate population. Some educational implications are discussed and a short teaching example (using data assembly) demonstrates an instructional direction that might facilitate conceptual change.

  15. Intelligence and homosexuality.

    PubMed

    Kanazawa, Satoshi

    2012-09-01

    The origin of preferences and values is an unresolved theoretical problem in behavioural sciences. The Savanna-IQ Interaction Hypothesis, derived from the Savanna Principle and a theory of the evolution of general intelligence, suggests that more intelligent individuals are more likely to acquire and espouse evolutionarily novel preferences and values than less intelligent individuals, but general intelligence has no effect on the acquisition and espousal of evolutionarily familiar preferences and values. Ethnographies of traditional societies suggest that exclusively homosexual behaviour was probably rare in the ancestral environment, so the Hypothesis would predict that more intelligent individuals are more likely to identify themselves as homosexual and engage in homosexual behaviour. Analyses of three large, nationally representative samples (two of which are prospectively longitudinal) from two different nations confirm the prediction.

  16. New Trends in Gender and Mathematics Performance: A Meta-Analysis

    PubMed Central

    Lindberg, Sara M.; Hyde, Janet Shibley; Petersen, Jennifer L.; Linn, Marcia C.

    2010-01-01

    In this paper, we use meta-analysis to analyze gender differences in recent studies of mathematics performance. First, we meta-analyzed data from 242 studies published between 1990 and 2007, representing the testing of 1,286,350 people. Overall, d = .05, indicating no gender difference, and VR = 1.08, indicating nearly equal male and female variances. Second, we analyzed data from large data sets based on probability sampling of U.S. adolescents over the past 20 years: the NLSY, NELS88, LSAY, and NAEP. Effect sizes for the gender difference ranged between −0.15 and +0.22. Variance ratios ranged from 0.88 to 1.34. Taken together these findings support the view that males and females perform similarly in mathematics. PMID:21038941
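
    Both summary statistics used here, Cohen's d and the variance ratio VR, are simple functions of group means, standard deviations, and sample sizes. The sketch below uses invented group statistics, not data from the meta-analysis.

        import math

        def d_and_vr(m1, s1, n1, m2, s2, n2):
            """Cohen's d with pooled SD, and male:female variance ratio."""
            pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
            d = (m1 - m2) / math.sqrt(pooled_var)
            return d, s1**2 / s2**2

        # Invented group statistics for illustration.
        d, vr = d_and_vr(m1=50.2, s1=10.4, n1=1000, m2=49.7, s2=10.0, n2=1000)
        print(f"d = {d:.2f}, VR = {vr:.2f}")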

  17. Palliative care and the intensive care nurses: feelings that endure.

    PubMed

    Silveira, Natyele Rippel; Nascimento, Eliane Regina Pereira do; Rosa, Luciana Martins da; Jung, Walnice; Martins, Sabrina Regina; Fontes, Moisés Dos Santos

    2016-01-01

    The aim was to know the feelings of nurses regarding palliative care in adult intensive care units. This qualitative study adopted the theoretical framework of Social Representations and was carried out with 30 nurses from the state of Santa Catarina, recruited by snowball sampling. Data were collected through semi-structured interviews conducted from April to August 2015, and organized and analyzed through the Collective Subject Discourse. The results showed that the central ideas were related to feelings of comfort, frustration, insecurity and anguish, in addition to the perception that professional training and practice are focused on cure. The social representations of nurses regarding palliative care are expressed mainly through negative feelings, probably as a consequence of the context in which care is provided.

  18. Using GIS to generate spatially balanced random survey designs for natural resource applications.

    PubMed

    Theobald, David M; Stevens, Don L; White, Denis; Urquhart, N Scott; Olsen, Anthony R; Norman, John B

    2007-07-01

    Sampling of a population is frequently required to understand trends and patterns in natural resource management because financial and time constraints preclude a complete census. A rigorous probability-based survey design specifies where to sample so that inferences from the sample apply to the entire population. Probability survey designs should be used in natural resource and environmental management situations because they provide the mathematical foundation for statistical inference. Development of long-term monitoring designs demands survey designs that achieve statistical rigor and are efficient but remain flexible to inevitable logistical or practical constraints during field data collection. Here we describe an approach to probability-based survey design, called the Reversed Randomized Quadrant-Recursive Raster, based on the concept of spatially balanced sampling and implemented in a geographic information system. This provides environmental managers with a practical tool to generate flexible and efficient survey designs for natural resource applications. Factors commonly used to modify sampling intensity, such as categories, gradients, or accessibility, can be readily incorporated into the spatially balanced sample design.
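
    The spatially balanced idea can be sketched without the raster machinery: give every point a hierarchical quadrant address, randomize the quadrant labels, and take a systematic sample along the resulting ordering. The code below is a much-simplified illustration of that idea, not the published RRQRR algorithm (which randomizes hierarchically on a raster and reverses address digits).

        import random

        def quadrant_address(x, y, depth=8):
            """Hierarchical quadrant address of a point in the unit square."""
            digits = []
            for _ in range(depth):
                qx, qy = int(x >= 0.5), int(y >= 0.5)
                digits.append(2 * qy + qx)        # quadrant label 0..3
                x, y = 2 * x - qx, 2 * y - qy     # zoom into that quadrant
            return digits

        def spatially_balanced_sample(points, n, depth=8, seed=1):
            """Sort points by randomized quadrant address, then sample
            systematically; nearby points share address prefixes, so spreading
            the sample along the ordering spreads it in space."""
            rng = random.Random(seed)
            perms = [rng.sample(range(4), 4) for _ in range(depth)]
            def key(pt):
                addr = quadrant_address(pt[0], pt[1], depth)
                return tuple(p[d] for p, d in zip(perms, addr))
            ordered = sorted(points, key=key)
            step = len(ordered) / n
            return [ordered[int(i * step)] for i in range(n)]

        pts = [(random.random(), random.random()) for _ in range(200)]
        print(spatially_balanced_sample(pts, 10)[:3])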

  19. Probability, Cambridge Conference on School Mathematics Feasibility Study No. 7.

    ERIC Educational Resources Information Center

    Davis, R.

    These materials were written with the aim of reflecting the thinking of the Cambridge Conference on School Mathematics (CCSM) regarding the goals and objectives for school mathematics. They represent a practical response to a proposal by CCSM that some elements of probability be introduced in the elementary grades. These materials provide children…

  20. Probability & Perception: The Representativeness Heuristic in Action

    ERIC Educational Resources Information Center

    Lu, Yun; Vasko, Francis J.; Drummond, Trevor J.; Vasko, Lisa E.

    2014-01-01

    If the prospective students of probability lack a background in mathematical proofs, hands-on classroom activities may work well to help them to learn to analyze problems correctly. For example, students may physically roll a die twice to count and compare the frequency of the sequences. Tools such as graphing calculators or Microsoft Excel®…
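
    The two-roll activity is easy to mirror in simulation, which makes the representativeness trap concrete: a specific ordered double such as 6-6 is half as likely as "one 6 and one 5 in either order", even though the mixed outcome feels more "random". The simulation below is an invented illustration in the spirit of the activity, not material from the article.

        import random

        rng = random.Random(42)
        trials = 100_000
        six_six = mixed_65 = 0
        for _ in range(trials):
            a, b = rng.randint(1, 6), rng.randint(1, 6)
            six_six += (a == 6 and b == 6)
            mixed_65 += (a, b) in ((6, 5), (5, 6))
        print(f"P(6 then 6)        ~ {six_six / trials:.4f}")   # ~1/36 = 0.0278
        print(f"P(one 6 and one 5) ~ {mixed_65 / trials:.4f}")  # ~2/36 = 0.0556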

  1. The Effects and Side-Effects of Statistics Education: Psychology Students' (Mis-)Conceptions of Probability

    ERIC Educational Resources Information Center

    Morsanyi, Kinga; Primi, Caterina; Chiesi, Francesca; Handley, Simon

    2009-01-01

    In three studies we looked at two typical misconceptions of probability: the representativeness heuristic, and the equiprobability bias. The literature on statistics education predicts that some typical errors and biases (e.g., the equiprobability bias) increase with education, whereas others decrease. This is in contrast with reasoning theorists'…

  2. The Neural Correlates of Health Risk Perception in Individuals with Low and High Numeracy

    ERIC Educational Resources Information Center

    Vogel, Stephan E.; Keller, Carmen; Koschutnig, Karl; Reishofer, Gernot; Ebner, Franz; Dohle, Simone; Siegrist, Michael; Grabner, Roland H.

    2016-01-01

    The ability to use numerical information in different contexts is a major goal of mathematics education. In health risk communication, outcomes of a medical condition are frequently expressed in probabilities. Difficulties to accurately represent probability information can result in unfavourable medical decisions. To support individuals with…

  3. Statistical analysis of CSP plants by simulating extensive meteorological series

    NASA Astrophysics Data System (ADS)

    Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana

    2017-06-01

    The feasibility analysis of any power plant project needs the estimation of the amount of energy it will be able to deliver to the grid during its lifetime. To achieve this, its feasibility study requires a precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability-of-exceedance scenarios of the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by the simulation of the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for the elaboration of DNI time series representative of a given probability of exceedance of annual DNI.
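
    Probability-of-exceedance figures such as P50 and P90 are just quantiles of the annual series. A minimal sketch follows; the 34 annual DNI totals are synthetic stand-ins for measured data.

        import numpy as np

        rng = np.random.default_rng(0)
        annual_dni = rng.normal(2100, 120, size=34)   # invented kWh/m2/yr totals

        def p_exceedance(values, p):
            """Annual value exceeded with probability p (P90 -> 10th percentile)."""
            return np.percentile(values, 100 * (1 - p))

        print(f"P50 = {p_exceedance(annual_dni, 0.50):.0f} kWh/m2/yr")
        print(f"P90 = {p_exceedance(annual_dni, 0.90):.0f} kWh/m2/yr")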

  4. Adaptively biased sequential importance sampling for rare events in reaction networks with comparison to exact solutions from finite buffer dCME method

    PubMed Central

    Cao, Youfang; Liang, Jie

    2013-01-01

    Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape. PMID:23862966

  5. Adaptively biased sequential importance sampling for rare events in reaction networks with comparison to exact solutions from finite buffer dCME method

    NASA Astrophysics Data System (ADS)

    Cao, Youfang; Liang, Jie

    2013-07-01

    Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape.

  6. Adaptively biased sequential importance sampling for rare events in reaction networks with comparison to exact solutions from finite buffer dCME method.

    PubMed

    Cao, Youfang; Liang, Jie

    2013-07-14

    Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape.
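
    The reweighting mechanism that ABSIS adapts step by step can be shown in fixed-bias form on a toy problem: simulate a birth-death walk under an inflated birth probability, and correct each trajectory that reaches the rare threshold by its likelihood ratio. This is plain importance sampling on an invented model, not the ABSIS look-ahead scheme itself.

        import random

        def rare_event_prob(threshold=25, p_birth=0.3, q_birth=0.6,
                            runs=20000, seed=7):
            """Importance-sampling estimate of P(walk started at 1 reaches
            `threshold` before 0) when the true birth probability is p_birth;
            paths are simulated under q_birth and reweighted."""
            rng = random.Random(seed)
            total = 0.0
            for _ in range(runs):
                state, weight = 1, 1.0
                while 0 < state < threshold:
                    if rng.random() < q_birth:
                        state += 1
                        weight *= p_birth / q_birth
                    else:
                        state -= 1
                        weight *= (1 - p_birth) / (1 - q_birth)
                if state >= threshold:
                    total += weight
            return total / runs

        print(f"estimated rare-event probability: {rare_event_prob():.3e}")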

  7. Estimating juvenile Chinook salmon (Oncorhynchus tshawytscha) abundance from beach seine data collected in the Sacramento–San Joaquin Delta and San Francisco Bay, California

    USGS Publications Warehouse

    Perry, Russell W.; Kirsch, Joseph E.; Hendrix, A. Noble

    2016-06-17

    Resource managers rely on abundance or density metrics derived from beach seine surveys to make vital decisions that affect fish population dynamics and assemblage structure. However, abundance and density metrics may be biased by imperfect capture and lack of geographic closure during sampling. Currently, there is considerable uncertainty about the capture efficiency of juvenile Chinook salmon (Oncorhynchus tshawytscha) by beach seines. Heterogeneity in capture can occur through unrealistic assumptions of closure and from variation in the probability of capture caused by environmental conditions. We evaluated the assumptions of closure and the influence of environmental conditions on capture efficiency and abundance estimates of Chinook salmon from beach seining within the Sacramento–San Joaquin Delta and the San Francisco Bay. Beach seine capture efficiency was measured using a stratified random sampling design combined with open and closed replicate depletion sampling. A total of 56 samples were collected during the spring of 2014. To assess variability in capture probability and the absolute abundance of juvenile Chinook salmon, beach seine capture efficiency data were fitted to the paired depletion design using modified N-mixture models. These models allowed us to explicitly test the closure assumption and estimate environmental effects on the probability of capture. We determined that our updated method allowing for lack of closure between depletion samples drastically outperformed traditional data analysis that assumes closure among replicate samples. The best-fit model (lowest-valued Akaike Information Criterion model) included the probability of fish being available for capture (relaxed closure assumption), capture probability modeled as a function of water velocity and percent coverage of fine sediment, and abundance modeled as a function of sample area, temperature, and water velocity. Given that beach seining is a ubiquitous sampling technique for many species, our improved sampling design and analysis could provide significant improvements in density and abundance estimation.
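
    For intuition about what the depletion design estimates, the classical two-pass removal estimator is sketched below. It assumes closure and equal capture probability across passes, exactly the assumptions the paper's open N-mixture models relax, and the catches are invented.

        def two_pass_removal(c1, c2):
            """Two-pass removal (depletion) estimates, assuming closure and
            equal per-pass capture probability; requires c1 > c2."""
            p_hat = (c1 - c2) / c1          # per-pass capture probability
            n_hat = c1 * c1 / (c1 - c2)     # abundance in the seined area
            return n_hat, p_hat

        n_hat, p_hat = two_pass_removal(c1=40, c2=15)
        print(f"N = {n_hat:.0f}, capture probability = {p_hat:.2f}")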

  8. Multinomial Logistic Regression & Bootstrapping for Bayesian Estimation of Vertical Facies Prediction in Heterogeneous Sandstone Reservoirs

    NASA Astrophysics Data System (ADS)

    Al-Mudhafar, W. J.

    2013-12-01

    Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate the properties of non-cored intervals. It also helps to accurately identify the spatial facies distribution, enabling an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation was carried out through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, a Robust Sequential Imputation Algorithm was used to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix; the observation is then added to the complete data matrix and the algorithm continues with the next observation with missing values. MLR was chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies and the core and log data. MLR predicts the probabilities of the different possible facies given each independent variable by constructing a linear predictor function with a set of weights that are linearly combined with the independent variables using a dot product. A Beta distribution of facies was considered as prior knowledge, and the resulting predicted probability (posterior) was estimated from MLR based on Bayes' theorem, which relates the predicted probability (posterior) to the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap was carried out to estimate extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size as the original training set; the procedure can be repeated N times to produce N bootstrap datasets, and the model re-fitted accordingly to decrease the squared difference between the estimated and observed categorical variables (facies), thereby reducing the degree of uncertainty.
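
    A compact sketch of the modeling core, multinomial logistic regression plus a bootstrap of the prediction error, is given below with synthetic stand-ins for the log and core data; it illustrates the general technique, not the paper's fitted model.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 4))      # stand-ins for GR, density, porosity, Sw
        y = rng.integers(0, 3, size=300)   # three synthetic facies classes

        model = LogisticRegression(max_iter=1000).fit(X, y)   # softmax over classes
        print("facies probabilities:", model.predict_proba(X[:1]).round(3))

        # Bootstrap: refit on resampled training sets to gauge error variability.
        errors = []
        for _ in range(200):
            idx = rng.integers(0, len(X), size=len(X))   # rows with replacement
            m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
            errors.append(np.mean(m.predict(X) != y))
        print(f"bootstrap error: {np.mean(errors):.3f} +/- {np.std(errors):.3f}")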

  9. Sampling techniques for burbot in a western non-wadeable river

    USGS Publications Warehouse

    Klein, Z. B.; Quist, Michael C.; Rhea, D.T.; Senecal, A. C.

    2015-01-01

    Burbot, Lota lota (L.), populations are declining throughout much of their native distribution. Although numerous aspects of burbot ecology are well understood, less is known about effective sampling techniques for burbot in lotic systems. Occupancy models were used to estimate the probability of detection for three gears (6.4- and 19-mm bar mesh hoop nets, night electric fishing), within the context of various habitat characteristics. During the summer, night electric fishing had the highest estimated detection probability for both juvenile (0.35; 95% C.I. 0.26–0.46) and adult (0.30; 0.20–0.41) burbot. However, small-mesh hoop nets (6.4-mm bar mesh) had detection probabilities similar to night electric fishing for both juvenile (0.26; 0.17–0.36) and adult (0.27; 0.18–0.39) burbot during the summer. In autumn, a similar overlap between detection probabilities was observed for juvenile and adult burbot. Small-mesh hoop nets had the highest estimated probability of detection for both juvenile and adult burbot (0.46; 0.33–0.59), whereas night electric fishing had a detection probability of 0.39 (0.28–0.52) for juvenile and adult burbot. By using detection probabilities to compare gears, the most effective sampling technique can be identified, leading to increased species detections and more effective management of burbot.
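
    The estimation idea, separating occupancy from imperfect detection, can be sketched with a minimal single-season, single-method likelihood, far simpler than the gear- and habitat-specific models fitted in the paper. The detection histories below are invented.

        import numpy as np
        from scipy.optimize import minimize

        # Invented histories: rows = sites, columns = repeat visits (1 = detected).
        Y = np.array([[0, 1, 1], [0, 0, 0], [1, 0, 1], [0, 0, 0],
                      [1, 1, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]])

        def neg_log_lik(theta):
            psi, p = 1 / (1 + np.exp(-theta))   # occupancy, detection (logit scale)
            ll = 0.0
            for row in Y:
                k, d = len(row), row.sum()
                occupied = psi * p**d * (1 - p)**(k - d)
                # all-zero history: either occupied-but-missed or truly absent
                ll += np.log(occupied if d > 0 else occupied + (1 - psi))
            return -ll

        fit = minimize(neg_log_lik, x0=[0.0, 0.0])
        psi, p = 1 / (1 + np.exp(-fit.x))
        print(f"occupancy = {psi:.2f}, per-visit detection = {p:.2f}")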

  10. The value of Bayes' theorem for interpreting abnormal test scores in cognitively healthy and clinical samples.

    PubMed

    Gavett, Brandon E

    2015-03-01

    The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed.
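
    The arithmetic is worth making explicit. Using the base rates reported here (at least one abnormal score in 65.4% of the standardization sample and 85.6% of the mixed clinical sample) and an assumed prior probability of normal cognition of 0.80 (invented for illustration), Bayes' theorem gives the posterior probability that an examinee with at least one abnormal score is nonetheless cognitively healthy.

        def p_normal_given_abnormal(prior_normal, p_abn_normal, p_abn_impaired):
            """Bayes' theorem: P(normal | >= 1 abnormal score)."""
            num = p_abn_normal * prior_normal
            return num / (num + p_abn_impaired * (1 - prior_normal))

        # 0.654 and 0.856 are the abstract's base rates; 0.80 is an assumed prior.
        print(f"{p_normal_given_abnormal(0.80, 0.654, 0.856):.2f}")   # ~0.75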

  11. Nonparametric Coupled Bayesian Dictionary and Classifier Learning for Hyperspectral Classification.

    PubMed

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  12. Second-order contrast based on the expectation of effort and reinforcement.

    PubMed

    Clement, Tricia S; Zentall, Thomas R

    2002-01-01

    Pigeons prefer signals for reinforcement that require greater effort (or time) to obtain over those that require less effort to obtain (T. S. Clement, J. Feltus, D. H. Kaiser, & T. R. Zentall, 2000). Preference was attributed to contrast (or to the relatively greater improvement in conditions) produced by the appearance of the signal when it was preceded by greater effort. In Experiment 1, the authors of the present study demonstrated that the expectation of greater effort was sufficient to produce such a preference (a second-order contrast effect). In Experiments 2 and 3, low versus high probability of reinforcement was substituted for high versus low effort, respectively, with similar results. In Experiment 3, the authors found that the stimulus preference could be attributed to positive contrast (when the discriminative stimuli represented an improvement in the probability of reinforcement) and perhaps also negative contrast (when the discriminative stimuli represented reduction in the probability of reinforcement).

  13. Quantum decision-maker theory and simulation

    NASA Astrophysics Data System (ADS)

    Zak, Michail; Meyers, Ronald E.; Deacon, Keith S.

    2000-07-01

    A quantum device simulating the human decision making process is introduced. It consists of quantum recurrent nets generating stochastic processes which represent the motor dynamics, and of classical neural nets describing the evolution of probabilities of these processes which represent the mental dynamics. The autonomy of the decision making process is achieved by a feedback from the mental to motor dynamics which changes the stochastic matrix based upon the probability distribution. This feedback replaces unavailable external information by an internal knowledge base stored in the mental model in the form of probability distributions. As a result, the coupled motor-mental dynamics is described by a nonlinear version of Markov chains which can decrease entropy without an external source of information. Applications to common-sense-based decisions as well as to evolutionary games are discussed. An example exhibiting self-organization is computed using quantum computer simulation. Force-on-force and mutual aircraft engagements using the quantum decision maker dynamics are considered.

  14. Fungal Communities Including Plant Pathogens in Near Surface Air Are Similar across Northwestern Europe.

    PubMed

    Nicolaisen, Mogens; West, Jonathan S; Sapkota, Rumakanta; Canning, Gail G M; Schoen, Cor; Justesen, Annemarie F

    2017-01-01

    Information on the diversity of fungal spores in air is limited, and airborne spores of fungal plant pathogens are particularly understudied. In the present study, a total of 152 air samples were taken from rooftops at urban settings in Slagelse (DK), Wageningen (NL), and Rothamsted (UK), together with 41 samples from above oilseed rape fields in Rothamsted. Samples were taken during 10-day periods in spring and autumn, each sample representing 1 day of sampling. The fungal content of samples was analyzed by metabarcoding of the fungal internal transcribed spacer 1 (ITS1) and by qPCR for specific fungi. The metabarcoding results demonstrated that season had significant effects on airborne fungal communities. In contrast, location did not have strong effects on the communities, even though locations were separated by up to 900 km. Also, a number of plant pathogens had strikingly similar patterns of abundance at the three locations. Rooftop samples were more diverse than samples taken above fields, probably reflecting greater mixing of air from a range of microenvironments for the rooftop sites. Pathogens that were known to be present in the crop were also found in air samples taken above the field. This paper is one of the first detailed studies of fungal composition in air with a focus on plant pathogens, and shows that it is possible to detect a range of pathogens in rooftop air samplers using metabarcoding.

  15. Influence of item distribution pattern and abundance on efficiency of benthic core sampling

    USGS Publications Warehouse

    Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.

    2014-01-01

    Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small-diameter core samples was always more time-efficient than taking fewer large-diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
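
    The simulation itself is easy to reproduce in miniature: scatter items at a known density, drop core samplers at random, and compare the estimated density with the truth. The sketch below uses square cores and randomly (not clumped) distributed items, ignores plot-edge effects, and all parameters are invented.

        import random

        def simulate_core_sampling(density=2000, core_area_cm2=50,
                                   n_cores=20, seed=3):
            """Simulate core sampling of randomly placed benthic items on a
            1 m^2 plot; returns (estimated, true) density in items/m^2."""
            rng = random.Random(seed)
            items = [(rng.random(), rng.random()) for _ in range(density)]
            core_m2 = core_area_cm2 / 10_000     # cm^2 -> m^2
            half = core_m2 ** 0.5 / 2            # half-side of a square core
            caught = 0
            for _ in range(n_cores):
                cx, cy = rng.random(), rng.random()
                caught += sum(1 for x, y in items
                              if abs(x - cx) <= half and abs(y - cy) <= half)
            return caught / (n_cores * core_m2), density

        print(simulate_core_sampling())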

  16. "I Don't Really Understand Probability at All": Final Year Pre-Service Teachers' Understanding of Probability

    ERIC Educational Resources Information Center

    Maher, Nicole; Muir, Tracey

    2014-01-01

    This paper reports on one aspect of a wider study that investigated a selection of final year pre-service primary teachers' responses to four probability tasks. The tasks focused on foundational ideas of probability including sample space, independence, variation and expectation. Responses suggested that strongly held intuitions appeared to…

  17. Sample Size Determination for Rasch Model Tests

    ERIC Educational Resources Information Center

    Draxler, Clemens

    2010-01-01

    This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…

  18. MEASUREMENT OF MULTI-POLLUTANT AND MULTI-PATHWAY EXPOSURES IN A PROBABILITY-BASED SAMPLE OF CHILDREN: PRACTICAL STRATEGIES FOR EFFECTIVE FIELD STUDIES

    EPA Science Inventory

    The purpose of this manuscript is to describe the practical strategies developed for the implementation of the Minnesota Children's Pesticide Exposure Study (MNCPES), which is one of the first probability-based samples of multi-pathway and multi-pesticide exposures in children....

  19. Multi-scale occupancy estimation and modelling using multiple detection methods

    USGS Publications Warehouse

    Nichols, James D.; Bailey, Larissa L.; O'Connell, Allan F.; Talancy, Neil W.; Grant, Evan H. Campbell; Gilbert, Andrew T.; Annand, Elizabeth M.; Husband, Thomas P.; Hines, James E.

    2008-01-01

    Occupancy estimation and modelling based on detection–nondetection data provide an effective way of exploring change in a species’ distribution across time and space in cases where the species is not always detected with certainty. Today, many monitoring programmes target multiple species, or life stages within a species, requiring the use of multiple detection methods. When multiple methods or devices are used at the same sample sites, animals can be detected by more than one method. We develop occupancy models for multiple detection methods that permit simultaneous use of data from all methods for inference about method-specific detection probabilities. Moreover, the approach permits estimation of occupancy at two spatial scales: the larger scale corresponds to species’ use of a sample unit, whereas the smaller scale corresponds to presence of the species at the local sample station or site. We apply the models to data collected on two different vertebrate species: striped skunks Mephitis mephitis and red salamanders Pseudotriton ruber. For striped skunks, large-scale occupancy estimates were consistent between two sampling seasons. Small-scale occupancy probabilities were slightly lower in the late winter/spring when skunks tend to conserve energy, and movements are limited to males in search of females for breeding. There was strong evidence of method-specific detection probabilities for skunks. As anticipated, large- and small-scale occupancy areas completely overlapped for red salamanders. The analyses provided weak evidence of method-specific detection probabilities for this species. Synthesis and applications: Increasingly, many studies are utilizing multiple detection methods at sampling locations. The modelling approach presented here makes efficient use of detections from multiple methods to estimate occupancy probabilities at two spatial scales and to compare detection probabilities associated with different detection methods. The models can be viewed as another variation of Pollock's robust design and may be applicable to a wide variety of scenarios where species occur in an area but are not always near the sampled locations. The estimation approach is likely to be especially useful in multispecies conservation programmes by providing efficient estimates using multiple detection devices and by providing device-specific detection probability estimates for use in survey design.

  20. A rational decision rule with extreme events.

    PubMed

    Basili, Marcello

    2006-12-01

    Risks induced by extreme events are characterized by small or ambiguous probabilities, catastrophic losses, or windfall gains. Through a new functional that mimics the restricted Bayes-Hurwicz criterion within the Choquet expected utility approach, it is possible to represent the decision-maker's behavior facing both risky (large and reliable probability) and extreme (small or ambiguous probability) events. A new formalization of the precautionary principle (PP) is presented, together with a new functional that encompasses both extreme outcomes and the expectation of all possible results for every act.

  1. Conditional Probabilities and Collapse in Quantum Measurements

    NASA Astrophysics Data System (ADS)

    Laura, Roberto; Vanni, Leonardo

    2008-09-01

    We show that, by including both the system and the apparatus in the quantum description of the measurement process and by using the concept of conditional probabilities, it is possible to deduce the statistical operator of the system after a measurement with a given result, which gives the probability distribution for all possible consecutive measurements on the system. This statistical operator, representing the state of the system after the first measurement, is in general not the same as the one that would be obtained using the postulate of collapse.

  2. The use of Stable Isotopes to Assess Climatic Controls on Groundwater Recharge in the Southern Sacramento Mountains, New Mexico

    NASA Astrophysics Data System (ADS)

    Newton, B. T.; Timmons, S. S.; Rawling, G. C.; Kludt, T.; Eastoe, C. J.

    2008-12-01

    We used the stable isotopes of hydrogen and oxygen to relate the temporal variability of groundwater recharge to climatic conditions in the southern Sacramento Mountains as a part of a larger regional hydrogeologic study. The southern Sacramento Mountains are the primary recharge source not only to local aquifers, but also to the Lower Pecos River Basin, the Roswell Artesian aquifer and aquifers in the Salt Basin. Aquifers in the study area mainly consist of fractured limestone. In years prior to 2006, groundwater levels within the study area showed a steady decline. We observed a significant increase in regional groundwater levels and spring discharge during and shortly after the unusually wet 2006 monsoon season. We developed a local meteoric water line (LMWL) in δ18O vs. δD space based on precipitation samples collected from several different elevations over a period of two years. The stable isotopic compositions of streams during base flow conditions define an evaporation line with a slope of 5.5 that intersects the LMWL in the region that represents winter precipitation. Spring and well samples collected in 2003 and spring samples collected in 2008 exhibit isotopic compositions that plot near the evaporation line, indicating that groundwater recharge is largely snow melt that has subsequently undergone evaporation in local streams. After the unusually wet 2006 monsoon season, the isotopic compositions of springs sampled in fall of 2006 and wells sampled in spring of 2007 deviated from the evaporation line, plotting closer to the LMWL. This observed isotopic trend is thought to represent a large input of 2006 monsoon precipitation to the groundwater system via relatively short fracture-dominated flow paths. Stable isotope results indicate that while snow melt is probably the main source of groundwater recharge in the southern Sacramento Mountains, as exhibited by the 2003 and 2008 samples, above average summer precipitation events, such as in 2006, can also contribute to significant groundwater recharge.

  3. Testing the Paradigm that Ultra-Luminous X-Ray Sources as a Class Represent Accreting Intermediate

    NASA Technical Reports Server (NTRS)

    Berghea, C. T.; Weaver, K. A.; Colbert, E. J. M.; Roberts, T. P.

    2008-01-01

    To test the idea that ultraluminous X-ray sources (ULXs) in external galaxies represent a class of accreting intermediate-mass black holes (IMBHs), we have undertaken a program to identify ULXs and a lower luminosity X-ray comparison sample with the highest quality data in the Chandra archive. We establish as a general property of ULXs that the most X-ray-luminous objects possess the flattest X-ray spectra (in the Chandra bandpass). No prior sample studies have established the general hardening of ULX spectra with luminosity. This hardening occurs at the highest luminosities (absorbed luminosity ≥ 5 × 10^39 erg/s) and is in line with recent models arguing that ULXs are actually stellar-mass black holes. From spectral modeling, we show that the evidence originally taken to mean that ULXs are IMBHs - i.e., the "simple IMBH model" - is nowhere near as compelling when a large sample of ULXs is looked at properly. During the last couple of years, XMM-Newton spectroscopy of ULXs has to a large extent begun to negate the simple IMBH model based on fewer objects. We confirm and expand these results, which validates the XMM-Newton work in a broader sense with independent X-ray data. We find (1) that cool disk components are present with roughly equal probability and total flux fraction for any given ULX, regardless of luminosity, and (2) that cool disk components extend below the standard ULX luminosity cutoff of 10^39 erg/s, down to our sample limit of 10^38.3 erg/s. The fact that cool disk components are not correlated with luminosity damages the argument that cool disks indicate IMBHs in ULXs, for which strong statistical support was never found.

  4. Environmental exposures to lead, mercury, and cadmium among South Korean teenagers (KNHANES 2010-2013): Body burden and risk factors.

    PubMed

    Kim, Nam-Soo; Ahn, Jaeouk; Lee, Byung-Kook; Park, Jungsun; Kim, Yangho

    2017-07-01

    Limited information is available on the association of age and sex with blood concentrations of heavy metals in teenagers. In addition, factors such as a shared family environment may have an association. We analyzed data from the Korean National Health and Nutrition Examination Survey (KNHANES, 2010-2013) to determine whether blood levels of heavy metals differ by risk factors such as age, sex, and shared family environment in a representative sample of teenagers. This study used data obtained in the KNHANES 2010-2013, which had a rolling sampling design that involved a complex, stratified, multistage, probability-cluster survey of a representative sample of the non-institutionalized civilian population in South Korea. Our cross-sectional analysis was restricted to teenagers and their parents who completed the health examination survey, and for whom blood measurements of cadmium, lead, and mercury were available. The final analytical sample consisted of 1585 teenagers, and 376 fathers and 399 mothers who provided measurements of blood heavy metal concentrations. Male teenagers had greater blood levels of lead and mercury, but sex had no association with blood cadmium level. There were age-related increases in blood cadmium, but blood lead decreased with age, and age had little association with blood mercury. The concentrations of cadmium and mercury declined from 2010 to 2013. The blood concentrations of lead, cadmium, and mercury in teenagers were positively associated with the levels in their parents after adjustment for covariates. Our results show that blood heavy metal concentrations differ by risk factors such as age, sex, and shared family environment in teenagers. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Heavy metal pollution and ecological risk assessment of the paddy soils near a zinc-lead mining area in Hunan.

    PubMed

    Lu, Sijin; Wang, Yeyao; Teng, Yanguo; Yu, Xuan

    2015-10-01

    Soil pollution by Cd, Hg, As, Pb, Cr, Cu, and Zn was characterized in an area of mining and smelting of metal ores at Guiyang, in the northeast of Hunan Province. A total of 150 topsoil (0-20 cm) samples were collected in May 2012 at a nominal density of one sample per 4 km(2). High concentrations of heavy metals, especially Cd, Zn, and Pb, were found in many of the samples taken from the surrounding paddy soil, indicating some spread of heavy metal pollution. A sequential extraction technique and the risk assessment code (RAC) were used to study the mobility of the chemical forms of the heavy metals in the soils and their ecological risk. The results reveal that Cd poses a high ecological risk because of its high percentage of exchangeable and carbonate fractions; Zn and Cu pose a medium risk, and the remaining metals a low environmental risk. The potential ecological risk calculated with the risk index (RI) ranged from 123.5 to 2791.2, indicating considerable-to-high ecological risk in the study area, especially in and around the mining area. Additionally, cluster analysis suggested that metals such as Pb, As, Hg, Zn, and Cd could derive from the same sources, probably related to acidic drainage and wind transport of dust; it also clearly distinguished samples with similar characteristics according to their spatial distribution. The results could be used in the ecological risk screening stage, in conjunction with total concentrations and metal fractionation values, to better estimate ecological risk.
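
    Both indices in this abstract reduce to simple arithmetic. Below is a minimal Python sketch of the Hakanson risk index (RI) and the risk assessment code (RAC) as they are commonly defined in the soil-pollution literature; the toxic-response factors are the usual Hakanson values, and the soil and background concentrations are illustrative, not the paper's data.

```python
# Hedged sketch of the Hakanson potential ecological risk index (RI) and
# the risk assessment code (RAC). Values below are illustrative only.

# Commonly used toxic-response factors (Hakanson 1980).
TR = {"Cd": 30, "Hg": 40, "As": 10, "Pb": 5, "Cr": 2, "Cu": 5, "Zn": 1}

def risk_index(sample, background):
    """Er_i = Tr_i * (C_i / C_bg_i); RI = sum of Er_i over metals."""
    er = {m: TR[m] * sample[m] / background[m] for m in sample}
    return er, sum(er.values())

def rac_class(exchangeable_pct, carbonate_pct):
    """RAC: share (%) of the two most mobile fractions, banded by risk."""
    pct = exchangeable_pct + carbonate_pct
    if pct < 1:
        label = "no risk"
    elif pct <= 10:
        label = "low"
    elif pct <= 30:
        label = "medium"
    elif pct <= 50:
        label = "high"
    else:
        label = "very high"
    return pct, label

# Illustrative concentrations (mg/kg): paddy topsoil vs. local background.
soil = {"Cd": 2.5, "Pb": 210.0, "Zn": 480.0}
background = {"Cd": 0.1, "Pb": 25.0, "Zn": 70.0}
er, ri = risk_index(soil, background)
print(er, ri)                  # per-metal Er and total RI
print(rac_class(28.0, 15.0))   # e.g. Cd with 43% mobile fractions -> "high"
```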

  6. Testing the Paradigm that Ultraluminous X-Ray Sources as a Class Represent Accreting Intermediate-Mass Black Holes

    NASA Astrophysics Data System (ADS)

    Berghea, C. T.; Weaver, K. A.; Colbert, E. J. M.; Roberts, T. P.

    2008-11-01

    To test the idea that ultraluminous X-ray sources (ULXs) in external galaxies represent a class of accreting intermediate-mass black holes (IMBHs), we have undertaken a program to identify ULXs and a lower luminosity X-ray comparison sample with the highest quality data in the Chandra archive. We establish as a general property of ULXs that the most X-ray-luminous objects possess the flattest X-ray spectra (in the Chandra bandpass). No prior sample studies have established the general hardening of ULX spectra with luminosity. This hardening occurs at the highest luminosities (absorbed luminosity >= 5 × 10^39 erg s^-1) and is in line with recent models arguing that ULXs are actually stellar mass black holes. From spectral modeling, we show that the evidence originally taken to mean that ULXs are IMBHs—i.e., the "simple IMBH model"—is nowhere near as compelling when a large sample of ULXs is looked at properly. During the last couple of years, XMM-Newton spectroscopy of ULXs has to a large extent begun to negate the simple IMBH model based on fewer objects. We confirm and expand these results, which validates the XMM-Newton work in a broader sense with independent X-ray data. We find that (1) cool-disk components are present with roughly equal probability and total flux fraction for any given ULX, regardless of luminosity, and (2) cool-disk components extend below the standard ULX luminosity cutoff of 10^39 erg s^-1, down to our sample limit of 10^38.3 erg s^-1. The fact that cool-disk components are not correlated with luminosity damages the argument that cool disks indicate IMBHs in ULXs, for which strong statistical support was never found.

  7. The structure of tracheobronchial mucins from cystic fibrosis and control patients.

    PubMed

    Gupta, R; Jentoft, N

    1992-02-15

    Tracheobronchial mucin samples from control and cystic fibrosis patients were purified by gel filtration chromatography on Sephacryl S-1000 and by density gradient centrifugation. Normal secretions contained high molecular weight (approximately 10(7)) mucins, whereas the cystic fibrosis secretions contained relatively small amounts of high molecular weight mucin together with larger quantities of lower molecular weight mucin fragments. These probably represent products of protease digestion. Reducing the disulfide bonds in either the control or cystic fibrosis high molecular weight mucin fractions released subunits of approximately 2000 kDa. Treating these subunits with trypsin released glycopeptides of 300 kDa. Trypsin treatment of unreduced mucin also released fragments of 2000 kDa that could be converted into 300-kDa glycopeptides upon disulfide bond reduction. Thus, protease-susceptible linkages within these mucins must be cross-linked by disulfide bonds so that the full effects of proteolytic degradation of mucins remain cryptic until disulfide bonds are reduced. Since various combinations of protease treatment and disulfide bond reduction release either 2000- or 300-kDa fragments, these fragments must represent important elements of mucin structure. The high molecular weight fractions of cystic fibrosis mucins appear to be indistinguishable from control mucins. Their amino acid compositions are the same, and various combinations of disulfide bond reduction and protease treatment release products of identical size and amino acid composition. Sulfate and carbohydrate compositions did vary considerably from sample to sample, but the limited number of samples tested did not demonstrate a cystic fibrosis-specific pattern. Thus, tracheobronchial mucins from cystic fibrosis and control patients are very similar, and both share the same generalized structure previously determined for salivary, cervical, and intestinal mucins.

  8. Zonal management of arsenic contaminated ground water in Northwestern Bangladesh.

    PubMed

    Hill, Jason; Hossain, Faisal; Bagtzoglou, Amvrossios C

    2009-09-01

    This paper used ordinary kriging to spatially map arsenic contamination in the shallow aquifers of Northwestern Bangladesh (total area approximately 35,000 km(2)). The Northwestern region was selected because it represents a relatively safer source of large-scale and affordable water supply for the rest of Bangladesh, which currently faces extensive arsenic contamination of drinking water (particularly the Southern regions). The work thus explored sustainability issues by building on a previously published study (Hossain et al., 2007; Water Resources Management, vol. 21: 1245-1261) in which the potential of kriging for a more general nation-wide assessment was identified. The reference arsenic database comprised the nation-wide survey (of 3534 drinking wells) completed in 1999 by the British Geological Survey (BGS) in collaboration with the Department of Public Health Engineering (DPHE) of Bangladesh. Randomly sampled networks of zones from this reference database were used to develop an empirical variogram and maps of zonal arsenic concentration for the Northwestern region; the remaining non-sampled zones were used to assess the accuracy of the kriged maps. Two additional criteria were explored: (1) the ability of geostatistical interpolators such as kriging to extrapolate information on the spatial structure of arsenic contamination beyond small-scale exploratory domains; and (2) the impact of a priori knowledge of anisotropic variability on the effectiveness of geostatistically based management. On average, the kriging method was found to have a 90% probability of successfully predicting safe zones according to the WHO safe limit of 10 ppb; for the Bangladesh safe limit of 50 ppb, the safe-zone prediction probability was 97%. Compared with the previous study by Hossain et al. (2007) over the rest of the contaminated countryside, the probability of successful detection of safe zones in the Northwest was about 25% higher. A priori knowledge of anisotropy was found to have an inconclusive impact on the effectiveness of kriging; it was hypothesized, however, that a preferential sampling strategy honoring anisotropy may be necessary to reach a more definitive conclusion on this issue.
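
    As a concrete illustration of the interpolation step, here is a minimal ordinary-kriging sketch in Python. The exponential variogram, its parameters, and the synthetic well data are assumptions for illustration; they are not the variogram fitted in the paper.

```python
import numpy as np

def variogram(h, sill=1.0, rng_km=30.0, nugget=0.1):
    # Exponential model; gamma(0) = 0 exactly on the diagonal.
    g = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / rng_km))
    return np.where(h > 0, g, 0.0)

def ordinary_krige(xy, z, xy0):
    """Ordinary kriging of z at location xy0 from observations (xy, z)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)          # gamma between observation pairs
    A[-1, -1] = 0.0                   # Lagrange-multiplier row/column
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - xy0, axis=1))
    w = np.linalg.solve(A, b)         # weights plus Lagrange multiplier
    return w[:n] @ z, w @ b           # estimate and kriging variance

rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(50, 2))   # synthetic well locations (km)
z = rng.lognormal(2.0, 1.0, size=50)     # synthetic arsenic values (ppb)
est, var = ordinary_krige(xy, z, np.array([50.0, 50.0]))
# A "safe zone" call against the 10 ppb WHO limit could then use a normal
# approximation: P(As <= 10) ~ Phi((10 - est) / sqrt(var)).
print(est, var)
```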

  9. Drying step optimization to obtain large-size transparent magnesium-aluminate spinel samples

    NASA Astrophysics Data System (ADS)

    Petit, Johan; Lallemant, Lucile

    2017-05-01

    In transparent ceramics processing, the green-body elaboration step is probably the most critical one. Among the known techniques, wet shaping processes are particularly interesting because they enable the particles to find an optimum position on their own. Nevertheless, the presence of water molecules leads to drying issues: during water removal, the concentration gradient induces cracks that limit the sample size. Laboratory samples are generally less damaged because of their small size, but upscaling the samples for industrial applications leads to an increasing cracking probability. By optimizing the drying step, large-size spinel samples were obtained.

  10. Fast-NPS-A Markov Chain Monte Carlo-based analysis tool to obtain structural information from single-molecule FRET measurements

    NASA Astrophysics Data System (ADS)

    Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens

    2017-10-01

    The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm, a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use, general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC-based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For efficient local exploration, we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. To handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility, and thus the smFRET efficiency, we introduce dye models that can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail, leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided.
    Program files doi: http://dx.doi.org/10.17632/7ztzj63r68.1
    Licensing provisions: Apache-2.0
    Programming language: GUI in MATLAB (The MathWorks); core sampling engine in C++
    Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data.
    Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
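
    The tempering scheme described above can be illustrated with a generic sketch. The following Python snippet (not Fast-NPS code, which is MATLAB/C++) runs plain Metropolis chains on a bimodal stand-in target at several fixed temperatures and proposes swaps between neighbouring chains; the adaptive proposal kernel and adaptive temperature ladder of Fast-NPS are omitted.

```python
import numpy as np

def log_target(x):
    # Bimodal 1-D target: a mixture of two Gaussians, standing in for a
    # multimodal smFRET posterior.
    return np.logaddexp(-0.5 * (x - 3) ** 2, -0.5 * (x + 3) ** 2)

def parallel_tempering(n_steps=20000, temps=(1.0, 2.0, 4.0, 8.0),
                       step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(len(temps))            # one chain per temperature
    cold = []
    for _ in range(n_steps):
        for i, T in enumerate(temps):   # Metropolis move in each chain
            prop = x[i] + step * rng.normal()
            if np.log(rng.random()) < (log_target(prop) - log_target(x[i])) / T:
                x[i] = prop
        i = rng.integers(len(temps) - 1)  # propose one adjacent swap
        a = (1 / temps[i] - 1 / temps[i + 1]) * (
            log_target(x[i + 1]) - log_target(x[i]))
        if np.log(rng.random()) < a:
            x[i], x[i + 1] = x[i + 1], x[i]
        cold.append(x[0])                 # keep only the T = 1 chain
    return np.array(cold)

samples = parallel_tempering()
print(samples.mean(), samples.std())  # should cover both modes near +/- 3
```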

  11. Internal Medicine residents use heuristics to estimate disease probability

    PubMed Central

    Phang, Sen Han; Ravani, Pietro; Schaefer, Jeffrey; Wright, Bruce; McLaughlin, Kevin

    2015-01-01

    Background: Training in Bayesian reasoning may have limited impact on the accuracy of probability estimates. In this study, our goal was to explore whether residents previously exposed to Bayesian reasoning use heuristics rather than Bayesian reasoning to estimate disease probabilities. We predicted that if residents use heuristics, then post-test probability estimates would be increased by non-discriminating clinical features or by a high anchor for a target condition. Method: We randomized 55 Internal Medicine residents to different versions of four clinical vignettes and asked them to estimate the probabilities of target conditions. We manipulated the clinical data for each vignette to be consistent with either (1) use of the representativeness heuristic, by adding non-discriminating prototypical clinical features of the target condition, or (2) use of the anchoring-with-adjustment heuristic, by providing a high or low anchor for the target condition. Results: When presented with additional non-discriminating data, the odds of diagnosing the target condition increased (odds ratio (OR) 2.83, 95% confidence interval [1.30, 6.15], p = 0.009). Similarly, the odds of diagnosing the target condition increased when a high anchor preceded the vignette (OR 2.04, [1.09, 3.81], p = 0.025). Conclusions: Our findings suggest that despite previous exposure to Bayesian reasoning, residents use heuristics such as representativeness and anchoring with adjustment to estimate probabilities. Potential explanations for this attribute substitution include the relative cognitive ease of heuristics compared with Bayesian reasoning, or the possibility that residents in clinical practice diagnose using gist traces rather than precise probability estimates. PMID:27004080

  12. Intrinsic Multi-Scale Dynamic Behaviors of Complex Financial Systems.

    PubMed

    Ouyang, Fang-Yan; Zheng, Bo; Jiang, Xiong-Fei

    2015-01-01

    The empirical mode decomposition is applied to analyze the intrinsic multi-scale dynamic behaviors of complex financial systems. In this approach, the time series of price returns of each stock is decomposed into a small number of intrinsic mode functions, which represent the price motion from high frequency to low frequency. These intrinsic mode functions are then grouped into three modes: the fast mode, the medium mode, and the slow mode. The probability distribution of returns and the auto-correlation of volatilities for the fast and medium modes exhibit behaviors similar to those of the full time series, i.e., these characteristics are rather robust across time scales. However, the cross-correlation between individual stocks and the return-volatility correlation are time-scale dependent. The structure of business sectors is mainly governed by the fast mode when returns are sampled over a few days, and by the medium mode when returns are sampled over dozens of days. More importantly, the leverage and anti-leverage effects are dominated by the medium mode.
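
    A minimal version of the decomposition-and-grouping step is sketched below in Python, assuming the third-party PyEMD package ("EMD-signal" on PyPI) is available; the three-way split of the intrinsic mode functions is illustrative, since the paper's grouping criteria are not reproduced here.

```python
import numpy as np
from PyEMD import EMD   # assumes the EMD-signal package is installed

rng = np.random.default_rng(1)
returns = rng.standard_normal(2048)  # stand-in for a stock's price returns

# Decompose into intrinsic mode functions, ordered high to low frequency.
imfs = EMD().emd(returns)

# Group the IMFs into fast / medium / slow modes; the split points are
# illustrative, not the paper's criteria.
n = len(imfs)
fast = imfs[: n // 3].sum(axis=0)
medium = imfs[n // 3 : 2 * n // 3].sum(axis=0)
slow = imfs[2 * n // 3 :].sum(axis=0)

print(n, fast.shape, medium.shape, slow.shape)
```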

  13. Lower white blood cell counts in elite athletes training for highly aerobic sports.

    PubMed

    Horn, P L; Pyne, D B; Hopkins, W G; Barnes, C J

    2010-11-01

    White cell counts at rest might be lower in athletes participating in selected endurance-type sports. Here, we analysed blood tests of elite athletes collected over a 10-year period. Reference ranges were established for 14 female and 14 male sports from 3,679 samples from 937 females and 4,654 samples from 1,310 males. Total white blood cell counts and counts of neutrophils, lymphocytes and monocytes were quantified. Each sport was scaled (1-5) for its perceived metabolic stress (aerobic-anaerobic) and mechanical stress (concentric-eccentric) by 13 sports physiologists. Substantially lower total white cell and neutrophil counts were observed in the aerobic sports of cycling and triathlon (~16% of test results below the normal reference range) than in team or skill-based sports such as water polo, cricket and volleyball. The mechanical stress of sports had less effect on the distribution of cell counts. The lower white cell counts in athletes in aerobic sports probably represent an adaptive response, not underlying pathology.

  14. If you provide the test, they will take it: factors associated with HIV/STI Testing in a representative sample of homeless youth in Los Angeles.

    PubMed

    Ober, Allison J; Martino, Steven C; Ewing, Brett; Tucker, Joan S

    2012-08-01

    Homeless youth are at high risk for human immunodeficiency virus (HIV) and other sexually transmitted infections (STI), yet those at greatest risk may never have been tested for HIV or STI. In a probability sample of sexually active homeless youth in Los Angeles (n = 305), this study identifies factors associated with HIV/STI testing status. Most youth (85%) had ever been tested and 47% had been tested in the past 3 months. Recent testing was significantly more likely among youth who self-identified as gay, were Hispanic, injected drugs, and used drop-in centers, and marginally more likely among youth with more depressive symptoms. Drop-in center use mediated the association of injection drug use with HIV/STI testing. HIV/STI testing was unrelated to sexual risk behavior. Drop-in centers can play an important role in facilitating testing, including among injection drug users, but more outreach is needed to encourage testing in other at-risk subgroups.

  15. If You Provide the Test, They will Take It: Factors Associated with HIV/STI Testing in a Representative Sample of Homeless Youth in Los Angeles

    PubMed Central

    Ober, Allison J.; Martino, Steven C.; Ewing, Brett; Tucker, Joan S.

    2012-01-01

    Homeless youth are at high risk for human immunodeficiency virus (HIV) and other sexually transmitted infections (STI), yet those at greatest risk may never have been tested for HIV or STI. In a probability sample of sexually active homeless youth in Los Angeles (n = 305), this study identifies factors associated with HIV/STI testing status. Most youth (85%) had ever been tested and 47% had been tested in the past 3 months. Recent testing was significantly more likely among youth who self-identified as gay, were Hispanic, injected drugs, and used drop-in centers, and marginally more likely among youth with more depressive symptoms. Drop-in center use mediated the association of injection drug use with HIV/STI testing. HIV/STI testing was unrelated to sexual risk behavior. Drop-in centers can play an important role in facilitating testing, including among injection drug users, but more outreach is needed to encourage testing in other at-risk subgroups. PMID:22827904

  16. Break-up of New Orleans Households after Hurricane Katrina

    PubMed Central

    Rendall, Michael S.

    2011-01-01

    Theory and evidence on disaster-induced population displacement have focused on individual and population-subgroup characteristics. Less is known about impacts on households. I estimate excess incidence of household break-up due to Hurricane Katrina by comparing a probability sample of pre-Katrina New Orleans resident adult household heads and non–household heads (N = 242), traced just over a year later, with a matched sample from a nationally representative survey over an equivalent period. One in three among all adult non–household heads, and one in two among adult children of household heads, had separated from the household head 1 year post-Katrina. These rates were, respectively, 2.2 and 2.7 times higher than national rates. A 50% higher prevalence of adult children living with parents in pre-Katrina New Orleans than nationally increased the hurricane’s impact on household break-up. Attention to living arrangements as a dimension of social vulnerability in disaster recovery is suggested. PMID:21709733

  17. Modeling Common-Sense Decisions in Artificial Intelligence

    NASA Technical Reports Server (NTRS)

    Zak, Michail

    2010-01-01

    A methodology has been conceived for efficient synthesis of dynamical models that simulate common-sense decision-making processes. This methodology is intended to contribute to the design of artificial-intelligence systems that could imitate human common-sense decision making or assist humans in making correct decisions in unanticipated circumstances. It is a product of continuing research on mathematical models of the behaviors of single- and multi-agent systems known in biology, economics, and sociology, ranging from a single-cell organism at one extreme to the whole of human society at the other. Earlier results of this research were reported in several prior NASA Tech Briefs articles, the three most recent and relevant being Characteristics of Dynamics of Intelligent Systems (NPO-21037), NASA Tech Briefs, Vol. 26, No. 12 (December 2002), page 48; Self-Supervised Dynamical Systems (NPO-30634), NASA Tech Briefs, Vol. 27, No. 3 (March 2003), page 72; and Complexity for Survival of Living Systems (NPO-43302), NASA Tech Briefs, Vol. 33, No. 7 (July 2009), page 62. The methodology involves the concepts reported previously, albeit viewed from a different perspective. One of the main underlying ideas is to extend the application of physical first principles to the behaviors of living systems. Models of motor dynamics are used to simulate the observable behaviors of systems or objects of interest, and models of mental dynamics are used to represent the evolution of the corresponding knowledge bases. For a given system, the knowledge base is modeled in the form of probability distributions, and the mental dynamics is represented by models of the evolution of the probability densities or, equivalently, models of flows of information. Autonomy is imparted to the decision-making process by feedback from mental to motor dynamics. This feedback replaces unavailable external information with information stored in the internal knowledge base. Representing the dynamical models in parameterized form reduces the task of common-sense-based decision making to the solution of the following hetero-associative-memory problem: store a set of m predetermined stochastic processes, given by their probability distributions, in such a way that when the system is presented with an unexpected change in the form of an input out of the set of M inputs, the coupled motor-mental dynamics converges to the corresponding one of the m pre-assigned stochastic processes, and a sample of this process represents the decision.

  18. Computer simulation of the probability that endangered whales will interact with oil spills, Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reed, M.; Jayko, K.; Bowles, A.

    1986-10-01

    A numerical model system was developed to quantitatively assess the probability that endangered bowhead and gray whales will encounter spilled oil in Alaskan waters. Bowhead and gray whale migration and diving-surfacing models, together with an oil-spill-trajectory model, comprise the system. The migration models were developed from conceptual considerations, then calibrated with and tested against observations. The distribution of animals is represented in space and time by discrete points, each of which may represent one or more whales. The movement of a whale point is governed by a random-walk algorithm that stochastically follows a migratory pathway.
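
    The random-walk movement rule lends itself to a compact simulation. Below is a toy Python sketch of the scheme the report describes: whale points follow a preferred migratory heading with stochastic angular scatter, and the fraction of points ending inside a circular "spill" estimates an encounter probability. All parameters and the spill geometry are illustrative assumptions, not the report's calibrated values.

```python
import numpy as np

def migrate(n_points=200, n_steps=100, heading_deg=45.0, speed_km=5.0,
            scatter_deg=30.0, seed=0):
    """Random-walk migration: each point (one or more whales) takes steps
    along a preferred heading with Gaussian angular scatter."""
    rng = np.random.default_rng(seed)
    pos = np.zeros((n_points, 2))   # common starting area
    for _ in range(n_steps):
        theta = np.deg2rad(
            heading_deg + scatter_deg * rng.standard_normal(n_points))
        pos += speed_km * np.column_stack((np.cos(theta), np.sin(theta)))
    return pos

# Encounter check against a circular "spill" of radius 25 km at (300, 300);
# the spill footprint stands in for the trajectory model's output.
pos = migrate()
hit = np.linalg.norm(pos - np.array([300.0, 300.0]), axis=1) < 25.0
print(hit.mean())   # fraction of whale points that intersect the spill
```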

  19. Sample design effects in landscape genetics

    USGS Publications Warehouse

    Oyler-McCance, Sara J.; Fedy, Bradley C.; Landguth, Erin L.

    2012-01-01

    An important research gap in landscape genetics is the impact of different field sampling designs on the ability to detect the effects of landscape pattern on gene flow. We evaluated how five different sampling regimes (random, linear, systematic, cluster, and single study site) affected the probability of correctly identifying the landscape process generating population structure. The sampling regimes were chosen to represent a suite of designs common in field studies. We used genetic data generated from a spatially explicit, individual-based program and simulated gene flow in a continuous population across a landscape with gradual spatial changes in resistance to movement. Additionally, we evaluated the sampling regimes using realistic and obtainable numbers of loci (10 and 20), numbers of alleles per locus (5 and 10), numbers of individuals sampled (10-300), and generational times after the landscape was introduced (20 and 400). For a simulated continuously distributed species, we found that the random, linear, and systematic sampling regimes performed well with high sample sizes (>200), levels of polymorphism (10 alleles per locus), and numbers of molecular markers (20). The cluster and single-study-site sampling regimes were not able to correctly identify the generating process under any conditions and thus are not advisable strategies for scenarios similar to our simulations. Our research emphasizes the importance of sampling data at ecologically appropriate spatial and temporal scales and suggests careful consideration of sampling near landscape components that are likely to most influence the genetic structure of the species. In addition, simulating sampling designs a priori could help guide field data collection efforts.
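
    For concreteness, here is one way to draw the five sampling regimes from a simulated continuously distributed population, written as a hedged Python sketch; the landscape extent, sample sizes, and geometric rules below are illustrative stand-ins for the paper's actual designs.

```python
import numpy as np

rng = np.random.default_rng(2)
inds = rng.uniform(0, 100, size=(3000, 2))   # simulated individual locations

def sample_regime(regime, n=200):
    """Return sampled individual coordinates under one of five regimes."""
    if regime == "random":
        return inds[rng.choice(len(inds), n, replace=False)]
    if regime == "linear":                   # individuals nearest a transect
        d = np.abs(inds[:, 1] - inds[:, 0])  # distance proxy to line y = x
        return inds[np.argsort(d)[:n]]
    if regime == "systematic":               # nearest to regular grid nodes
        ticks = np.linspace(5, 95, 15)
        grid = np.stack(np.meshgrid(ticks, ticks), -1).reshape(-1, 2)
        idx = [np.argmin(np.linalg.norm(inds - p, axis=1)) for p in grid[:n]]
        return inds[idx]
    if regime == "cluster":                  # nearest to a few random centres
        centres = rng.uniform(0, 100, size=(5, 2))
        d = np.linalg.norm(inds[:, None] - centres[None], axis=-1).min(axis=1)
        return inds[np.argsort(d)[:n]]
    if regime == "single":                   # one 20 x 20 study site
        box = (inds[:, 0] < 20) & (inds[:, 1] < 20)
        return inds[box][:n]                 # may return fewer than n

for r in ("random", "linear", "systematic", "cluster", "single"):
    print(r, sample_regime(r).shape)
```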

  20. [Effect of environmental factors and fishing effort allocation on catch of the Spotted Eagle Ray Aetobatus narinari (Rajiformes: Myliobatidae) in Southern Gulf of Mexico].

    PubMed

    Cuevas, Elizabeth; Pérez, Juan Carlos; Méndez, Iván

    2013-09-01

    Aetobatus narinari is a fisheries target in the Southern Gulf of Mexico and is currently considered a Near Threatened species on the IUCN Red List. The information available on this batoid fish covers some biological and fishery aspects; nevertheless, little is known about the factors influencing fishing operations and catches. To evaluate the effect of environmental factors and of the allocation of fishing effort by vessels on the target fishery of A. narinari in this area, daily sampling was carried out on four small-scale vessels from January to July 2009 (the entire fishing season) in two fishing localities (Campeche and Seybaplaya). A total of 896 rays were recorded from 280 fishing trips. A general linear model was used to predict the effect of these factors on the probability that fishing operations occurred, and on the probability of capturing at least one, three, or five rays per vessel-trip. The probability that fishing operations occurred off Campeche was predicted by the lunar cycle, with the highest probability in the new moon period (66%) and probabilities below 35% in the other periods. The probability that fishing operations occurred off Seybaplaya was predicted by wind velocity, with higher probabilities at low than at high wind velocities and a 50% probability of fishing operations at 12-15 km/h. Catch rates off Seybaplaya were predicted by the vessel factor (the effect of fishing effort allocation), the North wind season, and sea surface temperature. The probability of capturing at least one or at least three rays per vessel-trip was predicted by the vessel factor and the North wind season: one vessel had a higher catch probability (83% for at least one ray; 43% for at least three) than the others (69-70% for at least one ray; 26% for at least three), and the catch probability was higher during the North wind season (96% for at least one ray; 72% for at least three) than outside it (68% for at least one ray; 21% for at least three). The probability of capturing at least five rays per vessel-trip was predicted by sea surface temperature and the North wind season: at 23 degrees C the catch probability was 49%, diminishing gradually to 4% at 28 degrees C, and it was higher during the North wind season (40%) than outside it (7%). This study shows that environmental factors and fishermen's perceptions and experience (fishing effort allocation) influence the catch rate of A. narinari, and that these factors should be considered in future studies of elasmobranch fisheries, especially when catch rates are compared among seasons or regions.
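
    The modelling step described above amounts to a binomial regression for each catch threshold. The sketch below fits such a model with statsmodels on synthetic vessel-trip data; the covariates, effect sizes, and the use of a logit-link GLM are assumptions for illustration, not the paper's exact specification.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic vessel-trip data: wind season, sea surface temperature, vessel.
rng = np.random.default_rng(3)
n = 280
north_season = rng.integers(0, 2, n)   # 1 = North wind season
sst = rng.uniform(23, 28, n)           # sea surface temperature (deg C)
vessel = rng.integers(0, 4, n)         # four small-scale vessels

# Generate "caught at least one ray" outcomes from an assumed logit model.
logit_p = 1.5 * north_season - 0.6 * (sst - 25.5) + 0.8 * (vessel == 0)
caught = rng.random(n) < 1.0 / (1.0 + np.exp(-logit_p))

# Binomial GLM of catch probability on season, SST, and a vessel indicator.
X = sm.add_constant(
    np.column_stack([north_season, sst, (vessel == 0).astype(float)]))
fit = sm.GLM(caught.astype(float), X, family=sm.families.Binomial()).fit()
print(fit.summary())   # coefficients are on the logit scale
```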
